Re: [HACKERS] Row Level Security Documentation

2017-08-03 Thread Rod Taylor
On Thu, Jul 13, 2017 at 5:49 AM, Fabien COELHO <coe...@cri.ensmp.fr> wrote:

>
> Hello Rod,
>
> This version of the table attempts to stipulate which section of the
>> process the rule applies to.
>>
>
> The table should be referenced from the description, something like "Table
> xxx summarizes the ..."
>

Added the below which seemed consistent with other "see something else"
messages.

A summary of the application of policies to a command is found in .



> ISTM that it would be clearer to split the Policy column into "FOR xxx
> ..." and "USING" or "WITH CHECK", and to merge the rows which have the same
> "FOR xxx ..." contents, something like:
>
>        POLICY     |            |
>   ----------------+------------+-----
>                   | USING      | ...
>    FOR ALL ...    +------------+-----
>                   | WITH CHECK | ...
>   ----------------+------------+-----
>    FOR SELECT ... | USING      | ...
>
> So that it is clear that only ALL & UPDATE can get both USING & WITH
> CHECK. This can be done with "morerows=1" on an entry so that it spans more
> rows.
>

Done. I couldn't figure out a morecols=1 equivalent to keep everything
under the Policy heading without a full colspec.



> For empty cells, maybe a dash would be clearer. Not sure.


Looked cluttered to me. Tried N/A first which was even worse.

-- 
Rod Taylor
diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml
index c0dfe1ea4b..52a868e65d 100644
--- a/doc/src/sgml/ref/create_policy.sgml
+++ b/doc/src/sgml/ref/create_policy.sgml
@@ -94,6 +94,11 @@ CREATE POLICY name ON default deny policy is assumed, so that no rows will
be visible or updatable.
   
+
+  
+   A summary of the application of policies to a command is found
+   in .
+  
  
 
  
@@ -389,6 +394,80 @@ CREATE POLICY name ON 
 

+
+   Policies Applied During Statement
+
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+  
+   Policy
+   SELECT
+   INSERT
+   UPDATE
+   DELETE
+  
+ 
+ 
+  
+   FOR ALL ...
+   USING
+   WHERE clause
+   RETURNING clause
+   WHERE and RETURNING clause
+   WHERE and RETURNING clause
+  
+  
+   WITH CHECK
+   
+   new tuple
+   new tuple
+   new tuple
+  
+  
+   FOR SELECT ... USING
+   WHERE clause
+   RETURNING clause
+   WHERE and RETURNING clause
+   WHERE and RETURNING clause
+  
+  
+   FOR INSERT ... WITH CHECK
+   
+   new tuple
+   
+   
+  
+  
+   FOR UPDATE ...
+   USING
+   
+   
+   WHERE clause
+   
+  
+  
+   WITH CHECK
+   
+   
+   new tuple
+   
+  
+  
+   FOR DELETE ... USING
+   
+   
+   
+   WHERE clause
+  
+ 
+
+   
+
   
  
 



Re: [HACKERS] RLS in CTE incorrect permission failure

2017-06-21 Thread Rod Taylor
On Wed, Jun 21, 2017 at 7:46 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Rod Taylor <rod.tay...@gmail.com> writes:
> > In the attached script, the second insert into t2 (as part of the CTE)
> > should succeed.
>
> No, I don't think so.  You declared the check function as STABLE which
> means it is confined to seeing the same snapshot as the surrounding query.
> So it can't see anything inserted by that query.
>
> Possibly it'd work as you wish with a VOLATILE function.
>

Indeed, that works as expected.

Sorry for the noise.


-- 
Rod Taylor


[HACKERS] RLS in CTE incorrect permission failure

2017-06-21 Thread Rod Taylor
In the attached script, the second insert into t2 (as part of the CTE)
should succeed. My actual use case isn't much more complex; the function is
used primarily to allow peeking at columns that the function definer has
access to but a typical user does not. The function also makes it easy to
copy this policy to a number of structures.

The function within the policy doesn't seem to be able to see records
inserted by earlier statements in the CTE. Perhaps this is as simple as
adding a command counter increment in the right place?

Fails in 9.5.7 and HEAD.
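
A minimal sketch of the same shape of failure is below (the real
cte_rls_fail.sql attachment is not reproduced here; every table, function,
and policy name is invented for illustration):

-- Hypothetical reproduction sketch, not the original attachment.
CREATE TABLE t1 (id int PRIMARY KEY);
CREATE TABLE t2 (id int);

-- STABLE, so it only sees the snapshot taken at the start of the query.
CREATE FUNCTION t1_has(int) RETURNS boolean
    LANGUAGE sql STABLE SECURITY DEFINER
    AS 'SELECT EXISTS (SELECT 1 FROM t1 WHERE id = $1)';

ALTER TABLE t2 ENABLE ROW LEVEL SECURITY;
ALTER TABLE t2 FORCE ROW LEVEL SECURITY;  -- apply the policy to the owner too
CREATE POLICY p ON t2 USING (true) WITH CHECK (t1_has(id));

-- The WITH CHECK call cannot see the row inserted into t1 by the CTE, so
-- the insert into t2 fails; per Tom Lane's reply in this thread, declaring
-- the function VOLATILE makes it behave as expected.
WITH new_parent AS (
    INSERT INTO t1 VALUES (1) RETURNING id
)
INSERT INTO t2 SELECT id FROM new_parent;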

-- 
Rod Taylor


cte_rls_fail.sql
Description: application/sql



Re: [HACKERS] Row Level Security Documentation

2017-05-11 Thread Rod Taylor
Of course, better thoughts appear immediately after hitting the send button.

This version of the table attempts to stipulate which section of the
process the rule applies to.

On Thu, May 11, 2017 at 8:40 PM, Rod Taylor <rod.tay...@gmail.com> wrote:

> I think the biggest piece missing is something to summarize the giant
> blocks of text.
>
> Attached is a table that has commands and policy types, and a "yes" if it
> applies.
>
> --
> Rod Taylor
>



-- 
Rod Taylor
diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml
index 3b24e5e95e..4c997a956d 100644
--- a/doc/src/sgml/ref/create_policy.sgml
+++ b/doc/src/sgml/ref/create_policy.sgml
@@ -389,6 +389,72 @@ CREATE POLICY name ON 
 

+
+   Policies Applied During Statement
+
+ 
+  
+   Policy
+   SELECT
+   INSERT
+   UPDATE
+   DELETE
+  
+ 
+ 
+  
+   FOR ALL ... USING
+   WHERE clause
+   RETURNING clause
+   WHERE and RETURNING clause
+   WHERE and RETURNING clause
+  
+  
+   FOR ALL ... WITH CHECK
+   
+   new tuple
+   new tuple
+   new tuple
+  
+  
+   FOR SELECT ... USING
+   WHERE clause
+   RETURNING clause
+   WHERE and RETURNING clause
+   WHERE and RETURNING clause
+  
+  
+   FOR INSERT ... WITH CHECK
+   
+   new tuple
+   
+   
+  
+  
+   FOR UPDATE ... USING
+   
+   
+   WHERE clause
+   
+  
+  
+   FOR UPDATE ... WITH CHECK
+   
+   
+   new tuple
+   
+  
+  
+   FOR DELETE ... USING
+   
+   
+   
+   WHERE clause
+  
+ 
+
+   
+
   
  
 



[HACKERS] Row Level Security Documentation

2017-05-11 Thread Rod Taylor
I think the biggest piece missing is something to summarize the giant
blocks of text.

Attached is a table that has commands and policy types, and a "yes" if it
applies.

-- 
Rod Taylor
diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml
index 3b24e5e95e..c737f9e884 100644
--- a/doc/src/sgml/ref/create_policy.sgml
+++ b/doc/src/sgml/ref/create_policy.sgml
@@ -389,6 +389,96 @@ CREATE POLICY name ON 
 

+
+   Policies Applied During Statement
+
+ 
+  
+   Policy
+   SELECT
+   INSERT
+   INSERT RETURNING
+   UPDATE WHERE
+   UPDATE RETURNING
+   DELETE WHERE
+   DELETE RETURNING
+  
+ 
+ 
+  
+   FOR ALL ... USING
+   yes
+   
+   
+   yes
+   yes
+   yes
+   yes
+  
+  
+   FOR ALL ... WITH CHECK
+   
+   yes
+   yes
+   yes
+   yes
+   
+   
+  
+  
+   FOR SELECT ... USING
+   yes
+   
+   yes
+   yes
+   yes
+   yes
+   yes
+  
+  
+   FOR INSERT ... WITH CHECK
+   
+   yes
+   yes
+   
+   
+   
+   
+  
+  
+   FOR UPDATE ... USING
+   
+   
+   
+   yes
+   yes
+   
+   
+  
+  
+   FOR UPDATE ... WITH CHECK
+   
+   
+   
+   yes
+   yes
+   
+   
+  
+  
+   FOR DELETE ... USING
+   
+   
+   
+   
+   
+   yes
+   yes
+  
+ 
+
+   
+
   
  
 



Re: [HACKERS] Row Level Security UPDATE Confusion

2017-05-05 Thread Rod Taylor
Hmm.

UPDATE seems to work as described (unable to create records you cannot
select); both the single rule version and multi-rule version seem to work
the same.

This combination works too, though it seems funny that UPDATE can create an
entry it cannot reverse. I can set the value to 100 (going to -1 fails) but
the UPDATE cannot see the record to set it back. I can see use cases for
it; for example, you might be able to promote someone to manager but not
demote a manager to front-desk. We also allow INSERT on tables you cannot
delete from, so it's not inconsistent.

CREATE POLICY split_select ON t FOR SELECT TO split USING (value > 0);
CREATE POLICY split_update ON t FOR UPDATE TO split USING (value < 10) WITH
CHECK (value > 2);
SET session authorization split;
update t set value = 100 where value = 4; -- 1 record changed
update t set value = 5 where value = 100; -- 0 records changed


However, despite INSERT also functioning the same for both styles of
commands, it's definitely not obeying the "cannot give away records" rule:

CREATE USER simple;
CREATE USER split;
CREATE TABLE t(value int);
grant select, update, insert, delete on table t to simple, split;

INSERT INTO t values (1), (2);

ALTER TABLE t ENABLE ROW LEVEL SECURITY;
CREATE POLICY simple_all ON t TO simple USING (value > 0) WITH CHECK (true);


CREATE POLICY split_select ON t FOR SELECT TO split USING (value > 0);
CREATE POLICY split_insert ON t FOR INSERT TO split WITH CHECK (true);


SET session authorization simple;
INSERT INTO t VALUES (3), (-3);
SELECT * FROM t;
 value
---
 1
 2
 3
(3 rows)


SET session authorization split;
INSERT INTO t VALUES (4), (-4);
SELECT * FROM t;
 value
---
 1
 2
 3
 4
(4 rows)


SET session authorization default;
SELECT * FROM t;
 value
---
 1
 2
 3
-3
 4
-4
(6 rows)


regards,

Rod



On Fri, May 5, 2017 at 12:10 PM, Stephen Frost <sfr...@snowman.net> wrote:

> Rod, Robert,
>
> * Robert Haas (robertmh...@gmail.com) wrote:
> > On Fri, Apr 14, 2017 at 9:16 AM, Stephen Frost <sfr...@snowman.net>
> wrote:
> > > I agreed already up-thread that there's an issue there and will be
> > > looking to fix it.  That comment was simply replying to Rod's point
> that
> > > the documentation could also be improved.
> >
> > OK, thanks.  The wrap for the next set of minor releases is, according
> > to my understanding, scheduled for Monday, so you'd better jump on
> > this soon if you're hoping to get a fix out this time around.
>
> The attached patch against master fixes this issue.  Rod, if you get a
> chance, would be great for you to check that you no longer see a
> difference between the single ALL policy and the split SELECT/UPDATE
> policies.
>
> Thanks!
>
> Stephen
>



-- 
Rod Taylor


Re: [HACKERS] Row Level Security UPDATE Confusion

2017-04-14 Thread Rod Taylor
On Fri, Apr 14, 2017 at 9:09 AM, Stephen Frost <sfr...@snowman.net> wrote:

> Rod,
>
> * Rod Taylor (rod.tay...@gmail.com) wrote:
> > My actual use-case involves a range. Most users can see and manipulate
> the
> > record when CURRENT_TIMESTAMP is within active_period. Some users
> > (staff/account admins) can see recently dead records too. And a 3rd group
> > (senior staff) have no time restriction, though there are a few customers
> > they cannot see due to their information being a touch more sensitive.
> > I've simplified the below rules to just deal with active_period and the
> > majority of user view (@> CURRENT_TIMESTAMP).
>
> Interesting.
>
> > NOTE: the active_period range is '[)' by default, so records with
> upper() =
> > CURRENT_TIMESTAMP are not visible with @> CURRENT_TIMESTAMP restriction.
>
> Is that really what you intend/want though?  For records with
> upper() = CURRENT_TIMESTAMP to not be visible?  You are able to change
> the range returned from tstzrange by specifying what you want, eg:
>

Yeah, think of it like a delete. Once a record is deleted you want it to
disappear. From the point of view of a user who doesn't have time-travel
privileges, a row UPDATEd to upper() = CURRENT_TIMESTAMP should disappear
from any actions that take place later in the transaction.

A more common way of implementing this is an archive table: have a DELETE
trigger shuffle the record to another storage area. But with many dependent
tuples via foreign keys this can be very time consuming. Flipping a time
period is consistently fast, with the caveat that SELECT pays a price.
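
A minimal sketch of that archive-table pattern, assuming the simplified
table t from this thread; the archive table and trigger names are
illustrative only:

-- Illustrative only: copy each deleted row into an archive table.
CREATE TABLE t_archive (LIKE t);

CREATE FUNCTION archive_t() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO t_archive SELECT OLD.*;  -- keep a copy of the old row
    RETURN OLD;                          -- then let the DELETE proceed
END;
$$;

CREATE TRIGGER t_archive_on_delete
    BEFORE DELETE ON t
    FOR EACH ROW EXECUTE PROCEDURE archive_t();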

If you decide Pg shouldn't allow a user to make a tuple disappear, I would
probably make a DO INSTEAD SECURITY DEFINER function that triggers on
DELETE for that role only and changes the time range. In reality, after
about one week (enough time for customers to contact their account
administrator and say "I accidentally deleted X") the record would likely
be moved to an archive structure anyway.


select tstzrange(current_timestamp, current_timestamp, '[]');
>
> > CREATE TABLE t (id integer, active_period tstzrange NOT NULL DEFAULT
> > tstzrange(current_timestamp, NULL));
>
> Why NULL instead of 'infinity'...?
>

Disk space. NULL works (almost) the same as infinity, but the storage is
quite a bit smaller.
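
One quick way to see the difference (a sketch; exact byte counts depend on
the build):

-- A NULL (open-ended) bound is simply not stored, while 'infinity' is a
-- full 8-byte timestamp, so the NULL variant is a few bytes smaller per row.
SELECT pg_column_size(tstzrange(now(), NULL))       AS null_upper,
       pg_column_size(tstzrange(now(), 'infinity')) AS infinity_upper;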



>
> > -- Disallowed due to hide_old_select policy.
> > UPDATE t SET active_period = tstzrange(lower(active_period),
> > CURRENT_TIMESTAMP);
>
> Guess I'm still trying to figure out if you really intend for this to
> make the records invisible to the 'most users' case.
>

Yep. It's equivalent to a DELETE or DEACTIVATE. RLS may not be the right
facility but it was very close to working exactly the way I wanted in FOR
ALL mode.

-- 
Rod Taylor


Re: [HACKERS] Row Level Security UPDATE Confusion

2017-04-14 Thread Rod Taylor
On Fri, Apr 14, 2017 at 7:41 AM, Rod Taylor <rod.tay...@gmail.com> wrote:

>
>
>
> On Thu, Apr 13, 2017 at 5:31 PM, Stephen Frost <sfr...@snowman.net> wrote:
>
>> Rod, all,
>>
>> * Joe Conway (m...@joeconway.com) wrote:
>> > On 04/13/2017 01:31 PM, Stephen Frost wrote:
>> > > * Robert Haas (robertmh...@gmail.com) wrote:
>> > >> On Thu, Apr 6, 2017 at 4:05 PM, Rod Taylor <rod.tay...@gmail.com>
>> wrote:
>> > >> > I'm a little confused on why a SELECT policy fires against the NEW
>> record
>> > >> > for an UPDATE when using multiple FOR policies. The ALL policy
>> doesn't seem
>> > >> > to have that restriction.
>> > >>
>> > >> My guess is that you have found a bug.
>> > >
>> > > Indeed.  Joe's been looking into it and I'm hoping to find some time
>> to
>> > > dig into it shortly.
>> >
>> > >> CREATE POLICY split_select ON t FOR SELECT TO split
>> > >> USING (value > 0);
>> > >> CREATE POLICY split_update ON t FOR UPDATE TO split
>> > >> USING (true) WITH CHECK (true);
>> >
>> > Yes -- from what I can see in gdb:
>>
>> Actually, looking at this again, the complaint appears to be that you
>> can't "give away" records.  That was a topic of much discussion and I'm
>> reasonably sure that was what we ended up deciding made the most sense.
>> You have to be able to see records to be able to update them (you can't
>> update records you can't see), and you have to be able to see the result
>> of your update.  I don't doubt that we could improve the documentation
>> around this (and apparently the code comments, according to Joe..).
>>
>>
> Then there is a bug in the simpler statement which happily lets you give
> away records.
>
> CREATE POLICY simple_all ON t TO simple USING (value > 0) WITH CHECK
> (true);
>
> SET session authorization simple;
> SELECT * FROM t;
> UPDATE t SET value = value * -1 WHERE value = 1;
> -- No error and value is -1 at the end.
>


My actual use-case involves a range. Most users can see and manipulate the
record when CURRENT_TIMESTAMP is within active_period. Some users
(staff/account admins) can see recently dead records too. And a 3rd group
(senior staff) have no time restriction, though there are a few customers
they cannot see due to their information being a touch more sensitive.
I've simplified the below rules to just deal with active_period and the
majority of user view (@> CURRENT_TIMESTAMP).

NOTE: the active_period range is '[)' by default, so records with upper() =
CURRENT_TIMESTAMP are not visible with @> CURRENT_TIMESTAMP restriction.


CREATE TABLE t (id integer, active_period tstzrange NOT NULL DEFAULT
tstzrange(current_timestamp, NULL));


The below policy is allowed but requires that 1ms slop to accommodate the
WITH CHECK clause.

The updated record is invisible to USING, but this requires a trigger to
enforce specific upper and starting values. I have a trigger enforcing
specific upper/lower values for the range for specific ROLEs, so I had the
thought that I might move the ROLE-specific trigger logic into the RLS
mechanism.

CREATE POLICY hide_old ON t TO s
  USING ( active_period @> CURRENT_TIMESTAMP)
 WITH CHECK ( active_period && tstzrange(current_timestamp - interval
'0.001 seconds', current_timestamp, '[]'));

-- This is effectively a delete for the above policy. It becomes
immediately invisible due to USING restriction.
UPDATE t SET active_period = tstzrange(lower(active_period),
CURRENT_TIMESTAMP);
SELECT count(*) FROM t; -- 0 records



I tried to tighten the above rules, so INSERT must have upper of NULL and
UPDATE must
set upper to exactly CURRENT_TIMESTAMP. Clearly I can achieve this using
triggers for
enforcement but I tried to abuse RLS instead because it is a role specific
restriction.

I was surprised when hide_old_select->USING was preventing the UPDATE when
the simple single policy version let it through.

CREATE POLICY hide_old_select ON t FOR SELECT TO s
  USING ( active_period @> CURRENT_TIMESTAMP);
CREATE POLICY hide_old_insert ON t FOR INSERT to s
 WITH CHECK ( lower(active_period) = CURRENT_TIMESTAMP AND
upper(active_period) IS NULL);

CREATE POLICY hide_old_update ON t FOR UPDATE TO s
  USING ( active_period @> CURRENT_TIMESTAMP)
 WITH CHECK ( upper(active_period) = CURRENT_TIMESTAMP);

-- Disallowed due to hide_old_select policy.
UPDATE t SET active_period = tstzrange(lower(active_period),
CURRENT_TIMESTAMP);



I'm happy to help with testing and documentation but first I need to
understand what the intended functionality was. Right now it seems
inconsistent between the simple single policy version and the multi policy
version; the docs imply the single policy version is correct (it only seems
to mention SELECT checks on RETURNING clauses).


-- 
Rod Taylor


Re: [HACKERS] Row Level Security UPDATE Confusion

2017-04-14 Thread Rod Taylor
On Thu, Apr 13, 2017 at 5:31 PM, Stephen Frost <sfr...@snowman.net> wrote:

> Rod, all,
>
> * Joe Conway (m...@joeconway.com) wrote:
> > On 04/13/2017 01:31 PM, Stephen Frost wrote:
> > > * Robert Haas (robertmh...@gmail.com) wrote:
> > >> On Thu, Apr 6, 2017 at 4:05 PM, Rod Taylor <rod.tay...@gmail.com>
> wrote:
> > >> > I'm a little confused on why a SELECT policy fires against the NEW
> record
> > >> > for an UPDATE when using multiple FOR policies. The ALL policy
> doesn't seem
> > >> > to have that restriction.
> > >>
> > >> My guess is that you have found a bug.
> > >
> > > Indeed.  Joe's been looking into it and I'm hoping to find some time to
> > > dig into it shortly.
> >
> > >> CREATE POLICY split_select ON t FOR SELECT TO split
> > >> USING (value > 0);
> > >> CREATE POLICY split_update ON t FOR UPDATE TO split
> > >> USING (true) WITH CHECK (true);
> >
> > Yes -- from what I can see in gdb:
>
> Actually, looking at this again, the complaint appears to be that you
> can't "give away" records.  That was a topic of much discussion and I'm
> reasonably sure that was what we ended up deciding made the most sense.
> You have to be able to see records to be able to update them (you can't
> update records you can't see), and you have to be able to see the result
> of your update.  I don't doubt that we could improve the documentation
> around this (and apparently the code comments, according to Joe..).
>
>
Then there is a bug in the simpler statement which happily lets you give
away records.

CREATE POLICY simple_all ON t TO simple USING (value > 0) WITH CHECK (true);

SET session authorization simple;
SELECT * FROM t;
UPDATE t SET value = value * -1 WHERE value = 1;
-- No error and value is -1 at the end.



-- 
Rod Taylor


[HACKERS] Row Level Security UPDATE Confusion

2017-04-06 Thread Rod Taylor
I'm a little confused on why a SELECT policy fires against the NEW record
for an UPDATE when using multiple FOR policies. The ALL policy doesn't seem
to have that restriction.


DROP TABLE IF EXISTS t;

CREATE USER simple;
CREATE USER split;
CREATE TABLE t(value int);
grant select, update on table t to simple, split;

INSERT INTO t values (1), (2);


ALTER TABLE t ENABLE ROW LEVEL SECURITY;
CREATE POLICY simple_all ON t TO simple USING (value > 0) WITH CHECK (true);

CREATE POLICY split_select ON t FOR SELECT TO split USING (value > 0);
CREATE POLICY split_update ON t FOR UPDATE TO split USING (true) WITH CHECK
(true);


SET session authorization simple;
SELECT * FROM t;
UPDATE t SET value = value * -1 WHERE value = 1;

SET session authorization split;
SELECT * FROM t;
UPDATE t SET value = value * -1 WHERE value = 2;
/* UPDATE fails with below error:
psql:/tmp/t.sql:24: ERROR:  42501: new row violates row-level security
policy for table "t"
LOCATION:  ExecWithCheckOptions, execMain.c:2045
*/

SET session authorization default;
SELECT * FROM t;

This seems consistent in both Pg 9.5 and upcoming Pg 10.


-- 
Rod Taylor


Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-02-04 Thread Rod Taylor
On Thu, Feb 2, 2017 at 11:40 AM, Alvaro Herrera wrote:

> Pavel Stehule wrote:
>
> > Identification of unjoined tables should be very useful - but it is far
> to
> > original proposal - so it can be solved separately.
> >
> > This patch is simple - and usually we prefer more simple patches than one
> > bigger.
> >
> > Better to enhance this feature step by step.
>
> Agreed -- IMO this is a reasonable first step, except that I would
> rename the proposed extension so that it doesn't focus solely on the
> first step.  I'd pick a name that suggests that various kinds of checks
> are applied to queries, so "require_where" would be only one of various
> options that can be enabled.
>

A general SQL-Critic would be a very welcome extension.


Re: [HACKERS] Design for In-Core Logical Replication

2016-07-20 Thread Rod Taylor
On Wed, Jul 20, 2016 at 4:08 AM, Simon Riggs <si...@2ndquadrant.com> wrote:


>
>   
> And on Subscriber database:
> 
> CREATE SUBSCRIPTION mysub WITH CONNECTION dbname=foo host=bar
> user=repuser PUBLICATION mypub;
> 
>   
>   
> The above will start the replication process which synchronizes the
> initial table contents of users and
> departments tables and then starts replicating
> incremental changes to those tables.
>   
> 
> 
>

I think it's important for communication channels to be defined separately
from the subscriptions.

If I have nodes 1/2 + 3/4 which operate in pairs, I don't really want to
have to have a script reconfigure replication on 3/4 every time we do
maintenance on 1 or 2.

3/4 need to know they subscribe to mypub and that they have connections to
machine 1 and machine 2. The replication system should be able to figure
out which (of 1/2) has the most recently available data.


So, I'd rather have:

CREATE CONNECTION machine1;
CREATE CONNECTION machine2;
CREATE SUBSCRIPTION TO PUBLICATION mypub;

Notice I explicitly did not tell it how to get the publication but if we
did have a preference the DNS weighting model might be appropriate.

I'm not certain the subscription needs to be named. IMO, a publication
should have the same properties on all nodes (so any node may become the
primary source). If a subscriber needs different behaviour for a
publication, it should be created as a different publication.

Documenting that ThisPub is different from ThatPub is easier than
documenting that ThisPub on node 1/2/4 is different from ThisPub on node
7/8, except Node 7 is temporarily on Node 4 too (database X instead of
database Y) due to that power problem.


Clearly this is advanced. An initial implementation may only allow mypub
from a single connection.


I also suspect multiple publications will be normal even with only 2 nodes.
Old slow-moving data almost always gets different treatment than fast-moving
data, even if only to define which set needs to hit the other node first and
which set can trickle through later.

regards,

Rod Taylor


Re: [HACKERS] array of domain types

2016-06-02 Thread Rod Taylor
On Thu, Jun 2, 2016 at 10:42 AM, Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

> On 02.06.2016 17:22, Tom Lane wrote:
>
>> konstantin knizhnik  writes:
>>
>>> Attached please find patch for DefineDomain function.
>>>
>> You didn't attach the patch,
>>
>
> Sorry, but I did attached the patch - I see the attachment in my mail
> received from the group.
> Multidimensional arrays work fine:
>
> knizhnik=# SELECT '{{14},{20}}'::teenager[][];
> ERROR:  value for domain teenager violates check constraint
> "teenager_check"
> LINE 1: SELECT '{{14},{20}}'::teenager[][];
>^
> knizhnik=# SELECT '{{14},{19}}'::teenager[][];
>   teenager
> -
>  {{14},{19}}
> (1 row)
>
> knizhnik=# SELECT ('{{14},{19}}'::teenager[][])[1][1];
>  teenager
> --
>14
> (1 row)
>
>
> Domain of array of domain also works:
>
>
I applied the domain.patch from above on HEAD, and all I get is cache
lookup failures. The type_sanity regression test fails too.

postgres=# CREATE DOMAIN teenager AS int CHECK (VALUE BETWEEN 13 AND 20);
CREATE DOMAIN
postgres=# CREATE DOMAIN teenager_groups AS teenager[];
CREATE DOMAIN
postgres=# CREATE TABLE x (col teenager_groups);
ERROR:  cache lookup failed for type 0


Anyway, if that had worked for me I would have done this, which I expect
will succeed when it shouldn't.

INSERT INTO x VALUES (ARRAY[13,14,20]);
ALTER DOMAIN teenager DROP CONSTRAINT teenager_check;
ALTER DOMAIN teenager ADD CHECK (VALUE BETWEEN 13 AND 19);


Re: [HACKERS] LOCK TABLE .. DEFERRABLE

2016-04-05 Thread Rod Taylor
On Tue, Apr 5, 2016 at 1:10 PM, Simon Riggs  wrote:

> If a lock is successfully obtained on one table, but not on all tables, it
>> releases that lock and will retry to get them as a group in the future.
>> Since inheritance acts as a group of tables (top + recursive cascade to
>> children), this implementation is necessary even if only a single table is
>> specified in the command.
>>
>
> I'd prefer to see this as a lock wait mode where it sits in the normal
> lock queue BUT other lock requestors are allowed to queue jump past it.
> That should be just a few lines changed in the lock conflict checker and
> some sleight of hand in the lock queue code.
>
> That way we avoid the busy-wait loop and multiple DEFERRABLE lock waiters
> queue up normally.
>

Yeah, that would be better. I can see how to handle a single structure in
that way, but I'm not at all certain how to handle multiple tables, and
inheritance means multiple tables even with a single command.

X1 inherits from X

There is a long-running task on X1.

Someone requests LOCK TABLE X IN ACCESS EXCLUSIVE MODE WAIT PATIENTLY.
Internally this also grabs X1.

The lock on X might be granted immediately and now blocks all other access
to that table.

There would need to be a Locking Group kind of thing so various LockTags are
treated as a single entity and granted simultaneously. That seems pretty
invasive; at least I don't see anything like that today.


[HACKERS] LOCK TABLE .. DEFERRABLE

2016-04-05 Thread Rod Taylor
The intention of this feature is to give the ability to slip into a normal
workload for non-urgent maintenance work. In essence, instead of lock
waiters being in a queue, DEFERRABLE causes the current lock statement to
always be last. It was discussed at last year's pgCon as useful for
replication tools adding/removing triggers. I've also seen more than one
plpgsql loop using subtransactions and LOCK TABLE .. NOWAIT to achieve a
similar effect. IMO, it's much cleaner built in.
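
For reference, that NOWAIT-style workaround looks roughly like this (a
sketch only; the table name and retry interval are invented):

DO $$
BEGIN
    LOOP
        BEGIN
            -- try to take the lock without queueing behind other waiters
            LOCK TABLE some_table IN ACCESS EXCLUSIVE MODE NOWAIT;
            EXIT;                          -- got it; held until commit
        EXCEPTION WHEN lock_not_available THEN
            PERFORM pg_sleep(0.1);         -- back off and retry
        END;
    END LOOP;
    -- ... perform the maintenance work here, in the same transaction ...
END;
$$;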


If a lock is successfully obtained on one table, but not on all tables, it
releases that lock and will retry to get them as a group in the future.
Since inheritance acts as a group of tables (top + recursive cascade to
children), this implementation is necessary even if only a single table is
specified in the command.


Like various CONCURRENT commands, it waits on a set of transactions which
were found to be blocking it. This puts it into the "waiting" state and
allows isolation testing to work as expected. I started with a simple loop
with a timer (and a GUC) but it didn't feel right without pg_stat_activity
showing the waiting state. statement_timeout is suggested for a time
restriction.
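
With the patch applied, the same maintenance window would be expressed
directly; for example (DEFERRABLE here is the proposed syntax from this
patch, not released PostgreSQL):

SET statement_timeout = '5min';  -- bound how long we are willing to keep trying
BEGIN;
LOCK TABLE some_table IN ACCESS EXCLUSIVE MODE DEFERRABLE;
-- ... add/drop replication triggers or other non-urgent maintenance ...
COMMIT;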


Possibly Ugly stuff:

SetLocktagRelationOid() no longer static inline. Better option? My C foo
isn't all that it should be. Lock Table allows locking shared tables so I
can't just assume MyDatabaseId is sufficient for the lock tag.

Return value InvalidOid in RangeVarGetRelidExtended() can now appear in 2
different situations; relation missing if missing_ok enabled and relation
unlockable if LockWaitPolicy LockWaitNonBlock. No callers currently use
both of these options at this time.

LockTableRecurse() returns the OID of the relation it could not lock in
order to wait on the processes holding those locks. It also keeps a list of
everything it did lock so they can be unlocked if necessary.


I'll add it to the open November commitfest.

regards,

Rod Taylor
diff --git a/doc/src/sgml/ref/lock.sgml b/doc/src/sgml/ref/lock.sgml
index b946eab..e852f1d 100644
--- a/doc/src/sgml/ref/lock.sgml
+++ b/doc/src/sgml/ref/lock.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-LOCK [ TABLE ] [ ONLY ] name [ * ] [, ...] [ IN lockmode MODE ] [ NOWAIT ]
+LOCK [ TABLE ] [ ONLY ] name [ * ] [, ...] [ IN lockmode MODE ] [ NOWAIT | DEFERRABLE ]
 
 where lockmode is one of:
 
@@ -39,7 +39,23 @@ LOCK [ TABLE ] [ ONLY ] name [ * ]
NOWAIT is specified, LOCK
TABLE does not wait to acquire the desired lock: if it
cannot be acquired immediately, the command is aborted and an
-   error is emitted.  Once obtained, the lock is held for the
+   error is emitted.  
+  
+
+  
+   If DEFERRABLE is specified,
+   LOCK TABLE will wait without blocking for the
+   duration of 
+   for all locks to become available. If all locks cannot be obtained
+   simultaneously before the timeout then none of the structures 
+   will be locked and an error is emitted. Since it is non-blocking,
+   other transactions may obtain locks freely and may cause the
+   required wait time to be infinite. Use statement_timeout
+   for to restrict the wait time.
+  
+
+  
+   Once obtained, the lock is held for the
remainder of the current transaction.  (There is no UNLOCK
TABLE command; locks are always released at transaction
end.)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 34ba385..4259072 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4865,6 +4865,9 @@ l3:
 		RelationGetRelationName(relation;
 
 		break;
+	case LockWaitNonBlock:
+		elog(ERROR, "unsupported lock wait_policy LockWaitNonBlock");
+		break;
 }
 
 /*
@@ -4902,6 +4905,9 @@ l3:
 	 errmsg("could not obtain lock on row in relation \"%s\"",
 		RelationGetRelationName(relation;
 		break;
+	case LockWaitNonBlock:
+		elog(ERROR, "unsupported lock wait_policy LockWaitNonBlock");
+		break;
 }
 			}
 
@@ -5125,6 +5131,9 @@ heap_acquire_tuplock(Relation relation, ItemPointer tid, LockTupleMode mode,
 	errmsg("could not obtain lock on row in relation \"%s\"",
 		   RelationGetRelationName(relation;
 			break;
+		case LockWaitNonBlock:
+			elog(ERROR, "unsupported lock wait_policy LockWaitNonBlock");
+			break;
 	}
 	*have_tuple_lock = true;
 
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 446b2ac..a0c4e56 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -46,6 +46,7 @@
 #include "nodes/makefuncs.h"
 #include "parser/parse_func.h"
 #include "storage/ipc.h"
+#include "storage/lock.h"
 #include "storage/lmgr.h"
 #include "storage/sinval.h"
 #include "utils/acl.h"
@@ -223,14 +224,19 @@ Datum		pg_is_other_temp_schema(PG_FUNCTION_AR

Re: [HACKERS][PROPOSAL] Covering + unique indexes.

2015-09-15 Thread Rod Taylor
On Tue, Sep 15, 2015 at 12:57 PM, Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

>
> Proposal Clarification.
> I see that discussion become too complicated. So, I'd like to clarify
> what we are talking about.
>
> We are discussing 2 different improvements of index.
> The one  is "partially unique index" and the other  "index with included
> columns".
> Let's look at example.
>
> - We have a table tbl(f1, f2, f3, f4).
> - We want to have an unique index on (f1,f2).
> - We want to have an index on (f1, f2, f3) which allow us to use index for
> complex "where" clauses.
>


Can someone write a query where F3 being ordered is a contribution?

If F1 and F2 are unique, adding F3 to a where or order by clause doesn't
seem to contribute anything.

-- Already fully ordered by F1,F2
SELECT ... ORDER BY F1, F2, F3;


-- F3 isn't in a known order without specifying F2
SELECT ... WHERE F1 = ? ORDER BY F1, F3;


-- Index resolves to a single record; nothing to order
SELECT ... WHERE F1 = ? AND F2 = ? ORDER BY F3;


-- Without a where clause, the index isn't helpful unless F3 is the first
column
SELECT ... ORDER BY F3;


What is it that I'm missing?


Re: [HACKERS] Allow snapshot too old error, to prevent bloat

2015-02-19 Thread Rod Taylor
On Wed, Feb 18, 2015 at 4:57 PM, Kevin Grittner kgri...@ymail.com wrote:

  But max_standby_streaming_delay, max_standby_archive_delay and
  hot_standby_feedback are among the most frequent triggers for
  questions and complaints that I/we see.
 
  Agreed.
  And a really bad one used to be vacuum_defer_cleanup_age, because
  of confusing units amongst other things. Which in terms seems
  fairly close to Kevins suggestions, unfortunately.

 Particularly my initial suggestion, which was to base snapshot
 age it on the number of transaction IDs assigned.  Does this look
 any better to you if it is something that can be set to '20min' or
 '1h'?  Just to restate, that would not automatically cancel the
 snapshots past that age; it would allow vacuum of any tuples which
 became dead that long ago, and would cause a snapshot too old
 message for any read of a page modified more than that long ago
 using a snapshot which was older than that.


I like this thought. One of the first things I do in a new Pg environment
is set up a cronjob that watches pg_stat_activity and terminates most
backends over N minutes in age (about 5x the length of normal work), with an
exception for a handful of accounts doing backups and other maintenance
operations.  This prevents a stuck client from jamming up the database.
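
The watchdog query is roughly of this shape (a sketch; the cutoff and the
exempted role names are assumptions):

-- Terminate long-lived sessions, sparing designated maintenance accounts.
SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
 WHERE now() - backend_start > interval '30 minutes'
   AND usename NOT IN ('backup_user', 'maintenance');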

Would pg_dump be able to opt-out of such a restriction?

regards,

Rod


Re: [HACKERS] Column Redaction

2014-10-10 Thread Rod Taylor
On Fri, Oct 10, 2014 at 10:56 AM, Stephen Frost sfr...@snowman.net wrote:

 * Thom Brown (t...@linux.com) wrote:
  On 10 October 2014 12:45, Stephen Frost sfr...@snowman.net wrote:
   There's a difference between intending that there shouldn't be a way
   past security and just making access a matter of walking a longer
   route.
  
   Throwing random 16-digit numbers and associated information at a credit
   card processor could be viewed as walking a longer route too.  The
   same goes for random key searches or password guesses.
 
  But those would need to be exhaustive, and in nearly all cases,
  impractical.

 That would be exactly the idea with this- we make it impractical to get
 at the unredacted information.


For fun I gave the search a try.


create table cards (id serial, cc bigint);
insert into cards (cc)
  SELECT CAST(random() * 9999999999999999 AS bigint) FROM
generate_series(1,10000);

\timing on
WITH RECURSIVE t(id, range_min, range_max) AS (
  SELECT id, 1::bigint, 9999999999999999 FROM cards
  UNION ALL
SELECT id
 , CASE WHEN cc >= range_avg THEN range_avg ELSE range_min END
 , CASE WHEN cc <= range_avg THEN range_avg ELSE range_max END
  FROM (SELECT id, (range_min + range_max) / 2 AS range_avg, range_min,
range_max
  FROM t
   ) AS t_avg
  JOIN cards USING (id)
 WHERE range_min != range_max
)
SELECT id, range_min AS cc FROM t WHERE range_min = range_max;


On my laptop I can pull all 10,000 card numbers in less than 1 second. For
a text based item I don't imagine it would be much different. Numbers are
pretty easy to work with though.


Re: [HACKERS] A worst case for qsort

2014-08-07 Thread Rod Taylor
On Thu, Aug 7, 2014 at 3:06 PM, Peter Geoghegan p...@heroku.com wrote:

 I think that pre-sorted, all-unique text datums, that have all
 differences beyond the first 8 bytes, that the user happens to
 actually want to sort are fairly rare.


While I'm sure it's not common, I've seen a couple of ten-million tuple
tables having a URL column as primary key where 98% of the entries begin
with 'http://www.'

So, that exact scenario is out there.


Re: [HACKERS] A worst case for qsort

2014-08-07 Thread Rod Taylor
Sigh. Found another example.

A table with 15 million entries and a unique key on filesystem location for
things users created via a web interface.

Entries all begin with /usr/home/ ...

This one is frequently sorted as batch operations against the files are
performed in alphabetical order to reduce conflict issues that a random
ordering may cause between jobs.

regards,

Rod




On Thu, Aug 7, 2014 at 5:23 PM, Rod Taylor rod.tay...@gmail.com wrote:


 On Thu, Aug 7, 2014 at 3:06 PM, Peter Geoghegan p...@heroku.com wrote:

 I think that pre-sorted, all-unique text datums, that have all
 differences beyond the first 8 bytes, that the user happens to
 actually want to sort are fairly rare.


 While I'm sure it's not common, I've seen a couple of ten-million tuple
 tables having a URL column as primary key where 98% of the entries begin
 with 'http://www.'

 So, that exact scenario is out there.



Re: [HACKERS] Suppressing unused subquery output columns

2014-06-05 Thread Rod Taylor
On Thu, Jun 5, 2014 at 10:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 I'm not entirely convinced that it's worth the extra planning cycles,
 though.  Given the small number of complaints to date, it might not
 be worth doing this.  Thoughts?



Would this avoid execution of expensive functions in views when their
output is discarded?

-- On 9.3
CREATE TABLE data (col1 serial primary key);
INSERT INTO data DEFAULT VALUES;
INSERT INTO data DEFAULT VALUES;

CREATE OR REPLACE VIEW v AS select *, (pg_sleep(1))::text FROM data;

t=# explain analyze select col1 from v;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Subquery Scan on v  (cost=0.00..76.00 rows=2400 width=4) (actual time=1001.086..2002.217 rows=2 loops=1)
   ->  Seq Scan on data  (cost=0.00..52.00 rows=2400 width=4) (actual time=1001.083..2002.210 rows=2 loops=1)
 Total runtime: 2002.268 ms
(3 rows)


regards,

Rod


Re: [HACKERS] How can we make beta testing better?

2014-04-15 Thread Rod Taylor
On Tue, Apr 15, 2014 at 5:47 PM, Josh Berkus j...@agliodbs.com wrote:

 Hackers,

 I think 9.3 has given us evidence that our users aren't giving new
 versions of PostgreSQL substantial beta testing, or if they are, they
 aren't sharing the results with us.

 How can we make beta testing better and more effective?  How can we get
 more users to actually throw serious workloads at new versions and share
 the results?

 I've tried a couple of things over the last two years and they haven't
 worked all that well.  Since we're about to go into another beta testing
 period, we need something new.  Ideas?


I think it boils down to making it really easy to create a workload
generator. Most companies have simple single-threaded regression tests for
functionality but very few companies have good parallel workload generators
which reflect activities in their production environment.

A documented beta test process/toolset which does the following would help:
1) Enables full query logging
2) Creates a replica of a production DB, record $TIME when it stops.
3) Allow user to make changes (upgrade to 9.4, change hardware, change
kernel settings, ...)
4) Plays queries from the CSV logs starting from $TIME mimicking actual
timing and transaction boundaries

If Pg can make it easy to duplicate activities currently going on in
production inside another environment, I would be pleased to fire a couple
billion queries through it over the next few weeks.

#4 should include reporting useful to the project, such as a sampling of
queries which performed significantly worse and a few relative performance
stats for overall execution time.


Re: [HACKERS] Changeset Extraction v7.3

2014-01-28 Thread Rod Taylor
On Tue, Jan 28, 2014 at 4:56 PM, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-01-28 21:48:09 +, Thom Brown wrote:
  On 28 January 2014 21:37, Robert Haas robertmh...@gmail.com wrote:
   On Tue, Jan 28, 2014 at 11:53 AM, Robert Haas robertmh...@gmail.com
 wrote:
   I've rebased it here and am hacking on it still.
  
   Andres and I are going back and forth between our respective git repos
   hacking on this, and I think we're getting there, but I have a
   terminological question which I'd like to submit to a wider audience:
  
   The point of Andres's patch set is to introduce a new technology
   called logical decoding; that is, the ability to get a replication
   stream that is based on changes to tuples rather than changes to
   blocks.  It could also be called logical replication.  In these
   patches, our existing replication is referred to as physical
   replication, which sounds kind of funny to me.  Anyone have another
   suggestion?
 
  Logical and Binary replication?

 Unfortunately changeset extraction output's can be binary data...


Perhaps Logical and Block?

The existing replication mechanism is similar to block-based disk backups.
It's the whole thing (not parts) and doesn't have any concept of
database/directory.


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-18 Thread Rod Taylor
On Fri, Nov 15, 2013 at 2:42 PM, Alexander Korotkov aekorot...@gmail.com wrote:

 On Fri, Nov 15, 2013 at 11:39 PM, Rod Taylor rod.tay...@gmail.com wrote:


 The patched index is 58% of the 9.4 master size. 212 MB instead of 365 MB.


 Good. That's meet my expectations :)
 You mention that both master and patched versions was compiled with debug.
 Was cassert enabled?


Just debug. I try not to do performance tests with assertions on.

Patch 7 gives the results I was looking for on this small sampling of data.


gin-fast-scan.6.patch/9.4 master performance
============================================
the:                1147.413 ms  1159.360 ms  1122.549 ms
and:                1035.540 ms   999.514 ms  1003.042 ms
hotel:                57.670 ms    61.152 ms    58.862 ms
and & the & hotel:   266.121 ms   256.711 ms   267.011 ms
hotel & and & the:   260.213 ms   254.055 ms   255.611 ms


gin-fast-scan.7.patch
=====================
the:                1091.735 ms  1068.909 ms  1076.474 ms
and:                 985.690 ms   972.833 ms   948.286 ms
hotel:                60.756 ms    59.028 ms    57.836 ms
and & the & hotel:    50.391 ms    38.715 ms    46.168 ms
hotel & and & the:    45.395 ms    40.880 ms    43.978 ms

Thanks,

Rod


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-18 Thread Rod Taylor
I checked out master and put together a test case using a small percentage
of production data for a known problem we have with Pg 9.2 and text search
scans.

A small percentage in this case means 10 million records randomly selected;
the full production data set has a few billion records.


Tests ran for master successfully and I recorded timings.



Applied the patch included here to master along with
gin-packed-postinglists-14.patch.
Run make clean; ./configure; make; make install.
make check (All 141 tests passed.)

initdb, import dump


The GIN index fails to build with a segfault.

DETAIL:  Failed process was running: CREATE INDEX textsearch_gin_idx ON kp
USING gin (to_tsvector('simple'::regconfig, string)) WHERE (score1 IS NOT
NULL);


#0  XLogCheckBuffer (holdsExclusiveLock=1 '\001', lsn=lsn@entry=0x7fffcf341920,
bkpb=bkpb@entry=0x7fffcf341960, rdata=0x468f11 ginFindLeafPage+529,
rdata=0x468f11 ginFindLeafPage+529) at xlog.c:2339
#1  0x004b9ddd in XLogInsert (rmid=rmid@entry=13 '\r',
info=info@entry=16 '\020', rdata=rdata@entry=0x7fffcf341bf0) at xlog.c:936
#2  0x00468a9e in createPostingTree (index=0x7fa4e8d31030,
items=items@entry=0xfb55680, nitems=nitems@entry=762,
buildStats=buildStats@entry=0x7fffcf343dd0) at gindatapage.c:1324
#3  0x004630c0 in buildFreshLeafTuple (buildStats=0x7fffcf343dd0,
nitem=762, items=0xfb55680, category=optimized out, key=34078256,
attnum=optimized out, ginstate=0x7fffcf341df0) at gininsert.c:281
#4  ginEntryInsert (ginstate=ginstate@entry=0x7fffcf341df0,
attnum=optimized out, key=34078256, category=optimized out,
items=0xfb55680, nitem=762,
buildStats=buildStats@entry=0x7fffcf343dd0) at gininsert.c:351
#5  0x004635b0 in ginbuild (fcinfo=optimized out) at
gininsert.c:531
#6  0x00718637 in OidFunctionCall3Coll
(functionId=functionId@entry=2738,
collation=collation@entry=0, arg1=arg1@entry=140346257507968,
arg2=arg2@entry=140346257510448, arg3=arg3@entry=32826432) at
fmgr.c:1649
#7  0x004ce1da in index_build
(heapRelation=heapRelation@entry=0x7fa4e8d30680,
indexRelation=indexRelation@entry=0x7fa4e8d31030,
indexInfo=indexInfo@entry=0x1f4e440, isprimary=isprimary@entry=0
'\000', isreindex=isreindex@entry=0 '\000') at index.c:1963
#8  0x004ceeaa in index_create
(heapRelation=heapRelation@entry=0x7fa4e8d30680,

indexRelationName=indexRelationName@entry=0x1f4e660
textsearch_gin_knn_idx, indexRelationId=16395, indexRelationId@entry=0,
relFileNode=optimized out, indexInfo=indexInfo@entry=0x1f4e440,
indexColNames=indexColNames@entry=0x1f4f728,
accessMethodObjectId=accessMethodObjectId@entry=2742,
tableSpaceId=tableSpaceId@entry=0,
collationObjectId=collationObjectId@entry=0x1f4fcc8,

classObjectId=classObjectId@entry=0x1f4fce0,
coloptions=coloptions@entry=0x1f4fcf8,
reloptions=reloptions@entry=0, isprimary=0 '\000',
isconstraint=0 '\000', deferrable=0 '\000', initdeferred=0 '\000',
allow_system_table_mods=0 '\000', skip_build=0 '\000', concurrent=0 '\000',
is_internal=0 '\000') at index.c:1082
#9  0x00546a78 in DefineIndex (stmt=optimized out,
indexRelationId=indexRelationId@entry=0, is_alter_table=is_alter_table@entry=0
'\000',
check_rights=check_rights@entry=1 '\001', skip_build=skip_build@entry=0
'\000', quiet=quiet@entry=0 '\000') at indexcmds.c:594
#10 0x0065147e in ProcessUtilitySlow
(parsetree=parsetree@entry=0x1f7fb68,

queryString=0x1f7eb10 CREATE INDEX textsearch_gin_idx ON kp USING gin
(to_tsvector('simple'::regconfig, string)) WHERE (score1 IS NOT NULL);,
context=optimized out, params=params@entry=0x0,
completionTag=completionTag@entry=0x7fffcf344c10 , dest=optimized out)
at utility.c:1163
#11 0x0065079e in standard_ProcessUtility (parsetree=0x1f7fb68,
queryString=optimized out, context=optimized out, params=0x0,
dest=optimized out, completionTag=0x7fffcf344c10 ) at utility.c:873
#12 0x0064de61 in PortalRunUtility (portal=portal@entry=0x1f4c350,
utilityStmt=utilityStmt@entry=0x1f7fb68, isTopLevel=isTopLevel@entry=1
'\001',
dest=dest@entry=0x1f7ff08, completionTag=completionTag@entry=0x7fffcf344c10
) at pquery.c:1187
#13 0x0064e9e5 in PortalRunMulti (portal=portal@entry=0x1f4c350,
isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1f7ff08,
altdest=altdest@entry=0x1f7ff08,
completionTag=completionTag@entry=0x7fffcf344c10
) at pquery.c:1318
#14 0x0064f459 in PortalRun (portal=portal@entry=0x1f4c350,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1
'\001',
dest=dest@entry=0x1f7ff08, altdest=altdest@entry=0x1f7ff08,
completionTag=completionTag@entry=0x7fffcf344c10 ) at pquery.c:816
#15 0x0064d2d5 in exec_simple_query (
query_string=0x1f7eb10 CREATE INDEX textsearch_gin_idx ON kp USING gin
(to_tsvector('simple'::regconfig, string)) WHERE (score1 IS NOT NULL);) at
postgres.c:1048
#16 PostgresMain (argc=optimized out, argv=argv@entry=0x1f2ad40,
dbname=0x1f2abf8 rbt, username=optimized out) at 

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-18 Thread Rod Taylor
I tried again this morning using gin-packed-postinglists-16.patch and
gin-fast-scan.6.patch. No crashes.

It is about a 0.1% random sample of production data (10,000,000 records)
with the below structure. Pg was compiled with debug enabled in both cases.

  Table public.kp
 Column |  Type   | Modifiers
+-+---
 id | bigint  | not null
 string | text| not null
 score1 | integer |
 score2 | integer |
 score3 | integer |
 score4 | integer |
Indexes:
kp_pkey PRIMARY KEY, btree (id)
kp_string_key UNIQUE CONSTRAINT, btree (string)
textsearch_gin_idx gin (to_tsvector('simple'::regconfig, string))
WHERE score1 IS NOT NULL



This is the query tested. All data is in the Pg buffer cache for these
timings. Words like "the" and "and" are very common (~9% of entries each)
and a word like "hotel" is much less common (~0.2% of entries).

  SELECT id,string
FROM kp
   WHERE score1 IS NOT NULL
 AND to_tsvector('simple', string) @@ to_tsquery('simple', ?)
 -- ? is substituted with the query strings
ORDER BY score1 DESC, score2 ASC
LIMIT 1000;

 Limit  (cost=56.04..56.04 rows=1 width=37) (actual time=250.010..250.032 rows=142 loops=1)
   ->  Sort  (cost=56.04..56.04 rows=1 width=37) (actual time=250.008..250.017 rows=142 loops=1)
         Sort Key: score1, score2
         Sort Method: quicksort  Memory: 36kB
         ->  Bitmap Heap Scan on kp  (cost=52.01..56.03 rows=1 width=37) (actual time=249.711..249.945 rows=142 loops=1)
               Recheck Cond: ((to_tsvector('simple'::regconfig, string) @@ '''hotel'' & ''and'' & ''the'''::tsquery) AND (score1 IS NOT NULL))
               ->  Bitmap Index Scan on textsearch_gin_idx  (cost=0.00..52.01 rows=1 width=0) (actual time=249.681..249.681 rows=142 loops=1)
                     Index Cond: (to_tsvector('simple'::regconfig, string) @@ '''hotel'' & ''and'' & ''the'''::tsquery)
 Total runtime: 250.096 ms



Times are from \timing on.

MASTER
======
the:                 888.436 ms   926.609 ms   885.502 ms
and:                 944.052 ms   937.732 ms   920.050 ms
hotel:                53.992 ms    57.039 ms    65.581 ms
and & the & hotel:   260.308 ms   248.275 ms   248.098 ms

These numbers roughly match what we get with Pg 9.2. The time savings
between 'the' and 'and & the & hotel' is mostly heap lookups for the score
and the final sort.



The size of the index on disk is about 2% smaller in the patched version.

PATCHED
=======
the:                1055.169 ms  1081.976 ms  1083.021 ms
and:                 912.173 ms   949.364 ms   965.261 ms
hotel:                62.591 ms    64.341 ms    62.923 ms
and & the & hotel:   268.577 ms   259.293 ms   257.408 ms
hotel & and & the:   253.574 ms   258.071 ms   250.280 ms

I was hoping that the 'and & the & hotel' case would improve with this
patch to be closer to the 'hotel' search, as I thought that was the kind of
thing it targeted. Unfortunately, it did not. I actually applied the
patches, compiled, initdb/load data, and ran it again thinking I made a
mistake.

Reordering the terms 'hotel & and & the' doesn't change the result.





On Fri, Nov 15, 2013 at 1:51 AM, Alexander Korotkov aekorot...@gmail.com wrote:

 On Fri, Nov 15, 2013 at 3:25 AM, Rod Taylor r...@simple-knowledge.com wrote:

 I checked out master and put together a test case using a small
 percentage of production data for a known problem we have with Pg 9.2 and
 text search scans.

 A small percentage in this case means 10 million records randomly
 selected; has a few billion records.


 Tests ran for master successfully and I recorded timings.



 Applied the patch included here to master along with
 gin-packed-postinglists-14.patch.
 Run make clean; ./configure; make; make install.
 make check (All 141 tests passed.)

 initdb, import dump


 The GIN index fails to build with a segfault.


 Thanks for testing. See fixed version in thread about packed posting lists.

 --
 With best regards,
 Alexander Korotkov.



Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
I tried again this morning using gin-packed-postinglists-16.patch and
gin-fast-scan.6.patch. No crashes during index building.

Pg was compiled with debug enabled in both cases. The data is a ~0.1%
random sample of production data (10,000,000 records for the test) with the
below structure.

  Table public.kp
 Column |  Type   | Modifiers
+-+---
 id | bigint  | not null
 string | text| not null
 score1 | integer |
 score2 | integer |
 score3 | integer |
 score4 | integer |
Indexes:
kp_pkey PRIMARY KEY, btree (id)
kp_string_key UNIQUE CONSTRAINT, btree (string)
textsearch_gin_idx gin (to_tsvector('simple'::regconfig, string))
WHERE score1 IS NOT NULL



All data is in the Pg buffer cache for these timings. Words like "the" and
"and" are very common (~9% of entries each) and a word like "hotel" is
much less common (~0.2% of entries). Below is the query tested:

  SELECT id,string
FROM kp
   WHERE score1 IS NOT NULL
 AND to_tsvector('simple', string) @@ to_tsquery('simple', ?)
 -- ? is substituted with the query strings
ORDER BY score1 DESC, score2 ASC
LIMIT 1000;

 Limit  (cost=56.04..56.04 rows=1 width=37) (actual time=250.010..250.032 rows=142 loops=1)
   ->  Sort  (cost=56.04..56.04 rows=1 width=37) (actual time=250.008..250.017 rows=142 loops=1)
         Sort Key: score1, score2
         Sort Method: quicksort  Memory: 36kB
         ->  Bitmap Heap Scan on kp  (cost=52.01..56.03 rows=1 width=37) (actual time=249.711..249.945 rows=142 loops=1)
               Recheck Cond: ((to_tsvector('simple'::regconfig, string) @@ '''hotel'' & ''and'' & ''the'''::tsquery) AND (score1 IS NOT NULL))
               ->  Bitmap Index Scan on textsearch_gin_idx  (cost=0.00..52.01 rows=1 width=0) (actual time=249.681..249.681 rows=142 loops=1)
                     Index Cond: (to_tsvector('simple'::regconfig, string) @@ '''hotel'' & ''and'' & ''the'''::tsquery)
 Total runtime: 250.096 ms



Times are from \timing on.

MASTER
======
the:                 888.436 ms   926.609 ms   885.502 ms
and:                 944.052 ms   937.732 ms   920.050 ms
hotel:                53.992 ms    57.039 ms    65.581 ms
and & the & hotel:   260.308 ms   248.275 ms   248.098 ms

These numbers roughly match what we get with Pg 9.2. The time savings
between 'the' and 'and & the & hotel' is mostly heap lookups for the score
and the final sort.



The size of the index on disk is about 2% smaller in the patched version.

PATCHED
=======
the:                1055.169 ms  1081.976 ms  1083.021 ms
and:                 912.173 ms   949.364 ms   965.261 ms
hotel:                62.591 ms    64.341 ms    62.923 ms
and & the & hotel:   268.577 ms   259.293 ms   257.408 ms
hotel & and & the:   253.574 ms   258.071 ms   250.280 ms

I was hoping that the 'and & the & hotel' case would improve with this
patch to be closer to the 'hotel' search, as I thought that was the kind of
thing it targeted. Unfortunately, it did not. I actually applied the
patches, compiled, initdb/load data, and ran it again thinking I made a
mistake.

Reordering the terms 'hotel & and & the' doesn't change the result.



On Fri, Nov 15, 2013 at 1:51 AM, Alexander Korotkov aekorot...@gmail.com wrote:

 On Fri, Nov 15, 2013 at 3:25 AM, Rod Taylor r...@simple-knowledge.com wrote:

 I checked out master and put together a test case using a small
 percentage of production data for a known problem we have with Pg 9.2 and
 text search scans.

 A small percentage in this case means 10 million records randomly
 selected; has a few billion records.


 Tests ran for master successfully and I recorded timings.



 Applied the patch included here to master along with
 gin-packed-postinglists-14.patch.
 Run make clean; ./configure; make; make install.
 make check (All 141 tests passed.)

 initdb, import dump


 The GIN index fails to build with a segfault.


 Thanks for testing. See fixed version in thread about packed posting lists.

 --
 With best regards,
 Alexander Korotkov.



Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
2%.

It's essentially sentence fragments from 1 to 5 words in length. I wasn't
expecting it to be much smaller.

10 recent value selections:

 white vinegar reduce color running
 vinegar cure uti
 cane vinegar acidity depends parameter
 how remedy fir clogged shower
 use vinegar sensitive skin
 home remedies removing rust heating
 does non raw apple cider
 home remedies help maintain healthy
 can vinegar mess up your
 apple cide vineger ph balance

regards,

Rod



On Fri, Nov 15, 2013 at 12:51 PM, Alexander Korotkov
aekorot...@gmail.com wrote:

 On Fri, Nov 15, 2013 at 6:57 PM, Rod Taylor p...@rbt.ca wrote:

 I tried again this morning using gin-packed-postinglists-16.patch and
 gin-fast-scan.6.patch. No crashes.

 It is about a 0.1% random sample of production data (10,000,000 records)
 with the below structure. Pg was compiled with debug enabled in both cases.

   Table public.kp
  Column |  Type   | Modifiers
 +-+---
  id | bigint  | not null
  string | text| not null
  score1 | integer |
  score2 | integer |
  score3 | integer |
  score4 | integer |
 Indexes:
 kp_pkey PRIMARY KEY, btree (id)
 kp_string_key UNIQUE CONSTRAINT, btree (string)
 textsearch_gin_idx gin (to_tsvector('simple'::regconfig, string))
 WHERE score1 IS NOT NULL



 This is a query tested. All data is in Pg buffer cache for these timings.
 Words like the and and are very common (~9% of entries, each) and a
 word like hotel is much less common (~0.2% of entries).

   SELECT id,string
 FROM kp
WHERE score1 IS NOT NULL
  AND to_tsvector('simple', string) @@ to_tsquery('simple', ?)
  -- ? is substituted with the query strings
 ORDER BY score1 DESC, score2 ASC
 LIMIT 1000;

  Limit  (cost=56.04..56.04 rows=1 width=37) (actual time=250.010..250.032
 rows=142 loops=1)
    ->  Sort  (cost=56.04..56.04 rows=1 width=37) (actual
 time=250.008..250.017 rows=142 loops=1)
          Sort Key: score1, score2
          Sort Method: quicksort  Memory: 36kB
          ->  Bitmap Heap Scan on kp  (cost=52.01..56.03 rows=1 width=37)
 (actual time=249.711..249.945 rows=142 loops=1)
                Recheck Cond: ((to_tsvector('simple'::regconfig, string)
 @@ '''hotel'' & ''and'' & ''the'''::tsquery) AND (score1 IS NOT NULL))
                ->  Bitmap Index Scan on textsearch_gin_idx
 (cost=0.00..52.01 rows=1 width=0) (actual time=249.681..249.681 rows=142
 loops=1)
                      Index Cond: (to_tsvector('simple'::regconfig,
 string) @@ '''hotel'' & ''and'' & ''the'''::tsquery)
  Total runtime: 250.096 ms



 Times are from \timing on.

 MASTER
 ===
 the:   888.436 ms   926.609 ms   885.502 ms
 and:   944.052 ms   937.732 ms   920.050 ms
 hotel:  53.992 ms    57.039 ms    65.581 ms
 and & the & hotel: 260.308 ms   248.275 ms   248.098 ms

 These numbers roughly match what we get with Pg 9.2. The time savings
 between 'the' and 'and & the & hotel' is mostly heap lookups for the score
 and the final sort.



 The size of the index on disk is about 2% smaller in the patched version.

 PATCHED
 ===
 the:  1055.169 ms   1081.976 ms   1083.021 ms
 and:   912.173 ms    949.364 ms    965.261 ms
 hotel:  62.591 ms     64.341 ms     62.923 ms
 and & the & hotel: 268.577 ms   259.293 ms   257.408 ms
 hotel & and & the: 253.574 ms   258.071 ms   250.280 ms

 I was hoping that the 'and & the & hotel' case would improve with this
 patch to be closer to the 'hotel' search, as I thought that was the kind of
 thing it targeted. Unfortunately, it did not. I actually applied the
 patches, compiled, initdb/load data, and ran it again thinking I made a
 mistake.

 Reordering the terms to 'hotel & and & the' doesn't change the result.


 Oh, in this path new consistent method isn't implemented for tsvector
 opclass, for array only. Will be fixed soon.
 BTW, was index 2% smaller or 2 times smaller? If it's 2% smaller then I
 need to know more about your dataset :)

 --
 With best regards,
 Alexander Korotkov.




Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
On Fri, Nov 15, 2013 at 2:26 PM, Alexander Korotkov aekorot...@gmail.com wrote:

 On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor rod.tay...@gmail.com wrote:

 2%.

 It's essentially sentence fragments from 1 to 5 words in length. I wasn't
 expecting it to be much smaller.

 10 recent value selections:

  white vinegar reduce color running
  vinegar cure uti
  cane vinegar acidity depends parameter
  how remedy fir clogged shower
  use vinegar sensitive skin
  home remedies removing rust heating
  does non raw apple cider
  home remedies help maintain healthy
  can vinegar mess up your
  apple cide vineger ph balance


 I didn't get why it's not significantly smaller. Is it possible to share
 dump?


Sorry, I reported that incorrectly. It's not something I was actually
looking for and didn't pay much attention to at the time.

The patched index is 58% of the 9.4 master size. 212 MB instead of 365 MB.


Re: [HACKERS] pgbench progress report improvements

2013-09-21 Thread Rod Taylor
On Sat, Sep 21, 2013 at 4:55 AM, Fabien COELHO coe...@cri.ensmp.fr wrote:



  I suggest getting the term stddev in there somehow, maybe like this:

  progress: 37.0 s, 115.2 tps, latency avg=8.678 ms  stddev=1.792


 My issue is to try to keep the line width under control so as to avoid
 line breaks in the terminal. Under throttling, there is also the time lag
 appended to the line.

 Moreover, using 'xxx=figure' breaks simple cut pipelining to extract the
 figures, so I would prefer to stick to spaces.

 Maybe:

   progress: 36.0 s, 115.2 tps, lat avg 9.678 ms stddev 1.792, lag 0.143 ms

 but I liked my +- approach:-)


100 +- 3 implies a range of 97 to 103 and no values are outside of that
range.


Re: [HACKERS] record identical operator

2013-09-17 Thread Rod Taylor
On Tue, Sep 17, 2013 at 8:23 AM, Kevin Grittner kgri...@ymail.com wrote:


 Of course, that begs the question of whether == is already taken.
 If not, we could knock one '=' off of everything above except for
 equals.  What existing uses are known for == ?


== is already taken as a common typo in plpgsql scripts. I strongly prefer
if this remained an error.

IF foo == bar THEN ...


[HACKERS] FKey not enforced resulting in broken Dump/Reload

2013-07-19 Thread Rod Taylor
A poorly coded trigger on the referencing table has the ability to break
foreign keys, and as a result create a database which cannot be dumped and
reloaded.

The BEFORE DELETE trigger accidentally does RETURN NEW, which suppresses
the DELETE action by the foreign key trigger. This allows the record from
the referenced table to be deleted and the record in the referencing table
to remain in place.

While I don't expect Pg to do what the coder meant, it should throw an
error and not leave foreign key'd data in an invalid state.

This applies to both 9.1 and 9.2.


Please see attached bug.sql.


bug.sql
Description: Binary data
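
Since the attachment itself isn't readable in the archive, here is a hedged
reconstruction of the scenario described above. The names are made up, and
ON DELETE CASCADE is an assumption based on the FK trigger's DELETE being
suppressed:

CREATE TABLE parent (id integer PRIMARY KEY);
CREATE TABLE child  (id integer PRIMARY KEY,
                     parent_id integer REFERENCES parent (id) ON DELETE CASCADE);

CREATE FUNCTION broken_before_delete() RETURNS trigger AS $$
BEGIN
    -- Wrong for a BEFORE DELETE trigger: NEW is NULL here, so returning it
    -- silently suppresses the row deletion.
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER child_before_delete BEFORE DELETE ON child
    FOR EACH ROW EXECUTE PROCEDURE broken_before_delete();

INSERT INTO parent VALUES (1);
INSERT INTO child  VALUES (1, 1);

-- The cascaded delete of the child row is suppressed by the trigger, leaving
-- a child row that references a parent which no longer exists.
DELETE FROM parent WHERE id = 1;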



[HACKERS] GIN over array of ENUMs

2013-01-11 Thread Rod Taylor
I wish to create this data structure but GIN does not currently support an
array of ENUM. Is intarray() a good place to look into adding ENUM support,
or is there already an operator class that supports enums which I
simply don't see at the moment?

This is being done as an alternative to a very large number of boolean
columns which are rarely true (under 1%).


CREATE TYPE feature AS ENUM ('item1', 'item2', 'item3');
CREATE TABLE test (id serial PRIMARY KEY, features feature[]);

CREATE INDEX test_features_idx ON test USING GIN (features, id);

ERROR:  data type feature[] has no default operator class for access method
gin
HINT:  You must specify an operator class for the index or define a default
operator class for the data type.
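
A hedged workaround sketch, in the absence of a GIN operator class for enum
arrays, is to store the flags as text[] (which does have a default GIN
opclass) and constrain the allowed values. The names reuse the example
above; the id column is left out of the index since a plain integer column
would additionally need the btree_gin module:

CREATE TABLE test (
    id       serial PRIMARY KEY,
    features text[] NOT NULL DEFAULT '{}',
    CHECK (features <@ ARRAY['item1', 'item2', 'item3'])
);

CREATE INDEX test_features_idx ON test USING GIN (features);

-- Containment queries can then use the index, e.g.:
-- SELECT id FROM test WHERE features @> ARRAY['item2'];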


Thanks in advance,

Rod


[HACKERS] Time bug with small years

2011-11-24 Thread Rod Taylor
I have no idea what is going on with the minutes/seconds, particularly for
years under 1895 where they get appended onto the timezone component.


sk_test=# select version();

version

 PostgreSQL 9.1.1 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real
(Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit
(1 row)
-- uname -a output: Linux rbt-dell 3.0.0-13-generic #22-Ubuntu SMP Wed Nov
2 13:27:26 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux


sk_test=# select '1894-01-01'::timestamp with time zone;
 timestamptz
--
 1894-01-01 00:00:00-05:17:32
(1 row)

sk_test=# select '1895-01-01'::timestamp with time zone;
  timestamptz

 1895-01-01 00:17:32-05
(1 row)

sk_test=# select '1896-01-01'::timestamp with time zone;
  timestamptz

 1896-01-01 00:00:00-05
(1 row)

sk_test=# show timezone;
 TimeZone
---
 localtime
(1 row)

sk_test=# set timezone= 'est5edt';
SET
sk_test=# select '1895-01-01'::timestamp with time zone;
  timestamptz

 1895-01-01 00:00:00-05
(1 row)

sk_test=# select '1894-01-01'::timestamp with time zone;
  timestamptz

 1894-01-01 00:00:00-05
(1 row)



I can duplicate with the exact same version of Pg on Intel hardware with
kernel:

Linux infongd2888
2.6.28.8-20101130b-iscsi-ntacker-fasync-mremap-amd-sec6-grsec #1 SMP Tue
Nov 30 18:27:29 CET 2010 i686 GNU/Linux


Re: [HACKERS] foreign keys for array/period contains relationships

2011-03-20 Thread Rod Taylor
On Mon, Oct 25, 2010 at 15:11, Peter Eisentraut pete...@gmx.net wrote:

 Example #4: PK is period, FK is timestamp.  FK must be contained in some
 PK period.

 CREATE TABLE pk (a period PRIMARY KEY, ...);

 CREATE TABLE fk (x timestamp REFERENCES pk (a), ...);

 As above, we can probably arrange the operator knowledge to make these
 checks.  But I think additionally, you'd need an exclusion constraint on
 the PK side to ensure nonoverlapping arrays/periods so that on
 update/delete restrict as well as cascading deletes work.


Additional interesting examples involve IP network containment using
 inet/cidr or ip4/ip4r.  There, you'd probably need additional syntax to
 tell the system explicitly which operators to use.


There are a large number of use-cases for this type of foreign key with
geometry ( PostGIS ) types as well. Point references Area or Line, Area
references Area, etc.


Re: [HACKERS] WIP: cross column correlation ...

2011-02-28 Thread Rod Taylor
  But it's not the same as tracking *sections of a table*.

 I dunno.  I imagine if you have a section of a table in different
 storage than other sections, you created a tablespace and moved the
 partition holding that section there.  Otherwise, how do you prevent the
 tuples from moving to other sections?  (We don't really have a concept
 of sections of a table.)


Section could be as simple as being on the inner or outer part of a single
disk, or as complicated as being on the SSD cache of a spinning disk, or in
the multi-gigabyte cache on the raid card or SAN due to being consistently
accessed.

Section is the wrong word. If primary key values under 10 million are
consistently accessed, they will be cached even if they do get moved through
the structure. Values over 10M may be fast if on the same page as the other
value but probably aren't.

This is very evident when dealing with time based data in what can be a very
large structure. 1% may be very hot and in memory while 99% is not.

Partitioning only helps if you can predict what will be hot in the future.
Sometimes an outside source (world events) impacts what section of the
structure is hot.

regards,

Rod


Re: [HACKERS] WIP: cross column correlation ...

2011-02-26 Thread Rod Taylor
On Fri, Feb 25, 2011 at 14:26, Alvaro Herrera alvhe...@commandprompt.com wrote:

 Excerpts from Rod Taylor's message of vie feb 25 14:03:58 -0300 2011:

  How practical would it be for analyze to keep a record of response times
 for
  given sections of a table as it randomly accesses them and generate some
  kind of a map for expected response times for the pieces of data it is
  analysing?

 I think what you want is random_page_cost that can be tailored per
 tablespace.


Yes, that can certainly help but does nothing to help with finding typical
hot-spots or cached sections of the table and sending that information to
the planner.

Between ANALYZE's random sampling and perhaps some metric gathered during the
actual IO of queries, we should be able to determine and record which pieces of
data tend to be hot/in cache, or readily available, and what data tends not
to be.


If the planner knew that the value 1 tends to have a much lower cost to
fetch than any other value in the table (it is cached or otherwise readily
available), it can choose a plan better suited toward that.


Re: [HACKERS] WIP: cross column correlation ...

2011-02-25 Thread Rod Taylor
 4. Even if we could accurately estimate the percentage of the table
 that is cached, what then?  For example, suppose that a user issues a
 query which retrieves 1% of a table, and we know that 1% of that table
 is cached.  How much of the data that the user asked for is cache?
 Hard to say, right?  It could be none of it or all of it.  The second
 scenario is easy to imagine - just suppose the query's been executed
 twice.  The first scenario isn't hard to imagine either.


I have a set of slow disks which can impact performance nearly as much as
being cached in memory versus being on the fast disks.

How practical would it be for analyze to keep a record of response times for
given sections of a table as it randomly accesses them and generate some
kind of a map for expected response times for the pieces of data it is
analysing?

It may well discover, on its own, that recent data (1 month old or less)
has a random read response time of N, older data (1 year old) in a different
section of the relation tends to have a response time of 1000N, and really
old data (5 years old) tends to have a response time of 3000N.


[HACKERS] 8.3 to 8.4 Upgrade issues

2010-08-10 Thread Rod Taylor
We recently upgraded from 8.3 to 8.4 and have seen a performance
degradation which we are trying to explain, and I have been asked to
get a second opinion on the cost of going from LATIN1 to UTF8
(Collation and CType) where the encoding remained SQL_ASCII.

Does anybody have experience on the cost, if any, of making this change?

Pg 8.3:
Encoding: SQL_ASCII
LC_COLLATE: en_US
LC_CTYPE: en_US

Pg 8.4:
Encoding: SQL_ASCII
Collation: en_US.UTF-8
Ctype: en_US.UTF-8



Re: [HACKERS] 8.3 to 8.4 Upgrade issues

2010-08-10 Thread Rod Taylor
On Tue, Aug 10, 2010 at 13:49, Tom Lane t...@sss.pgh.pa.us wrote:
 Rod Taylor rod.tay...@gmail.com writes:
 Does anybody have experience on the cost, if any, of making this change?

 Pg 8.3:
 Encoding: SQL_ASCII
 LC_COLLATE: en_US
 LC_CTYPE: en_US

 Pg 8.4:
 Encoding: SQL_ASCII
 Collation: en_US.UTF-8
 Ctype: en_US.UTF-8

 Well, *both* of those settings collections are fundamentally
 wrong/bogus; any collation/ctype setting other than C is unsafe if
 you've got encoding set to SQL_ASCII.  But without knowing what your
 platform thinks en_US means, it's difficult to speculate about what
 the difference between them is.  I suppose that your libc's default
 assumption about encoding is not UTF-8, else these would be equivalent.
 If it had been assuming a single-byte encoding, then telling it UTF8
 instead could lead to a significant slowdown in strcoll() speed ...
 but I would think that would mainly be a problem if you had a lot of
 non-ASCII data, and if you did, you'd be having a lot of problems other
 than just performance.  Have you noticed any change in sorting behavior?

Agreed with it being an interesting choice of settings. Nearly all of
the data is 7-bit ASCII and what isn't seems to be a mix of UTF8,
LATIN1, and LATIN15.

I'm pretty sure it interpreted en_US to be LATIN1. There haven't been
any noticeable changes in sorting order that I know of.



[HACKERS] CIText and pattern_ops

2010-04-23 Thread Rod Taylor
Is there any particular reason why the citext module doesn't have a
citext_pattern_ops operator family?

Specifically, I wish to index for this type of query:

... WHERE citext_column LIKE 'Foo%';

This, of course, is equivalent to ILIKE 'Foo%' which does not appear
to be indexable without using a functional index (
lower(citext_column) ).
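
For reference, a hedged sketch of that functional-index approach; the table
and column names here are hypothetical:

CREATE INDEX some_table_lower_pattern_idx
    ON some_table (lower(citext_column) text_pattern_ops);

-- The query has to match the indexed expression for the planner to use it:
SELECT * FROM some_table WHERE lower(citext_column) LIKE lower('Foo%');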



Re: [HACKERS] PG 9.0 and standard_conforming_strings

2010-02-03 Thread Rod Taylor
On Wed, Feb 3, 2010 at 13:20, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Feb 3, 2010 at 12:34 PM, Greg Sabino Mullane g...@turnstep.com 
 wrote:
 Perl (DBD::Pg anyway) has been compatible since May 2008.

 I would interpret that to mean that there is a significant possibility
 that a too-old DBD::Pg could get used with a new PostgreSQL, and
 therefore we shouldn't change anything for 9.0.  May 2008 is not that
 long ago, especially for people running systems like RHEL with
 five-year major release cycles.

I fall into this camp with a few machines still running standard RHEL
4 which I believe has DBD::Pg 1.32 installed. We do keep up to date
with PostgreSQL but the machines connecting to it include everything
from brand new web servers through to ancient machines in accounting
running reports.

As much as I would like GUCs to disappear I think this one should
proceed cautiously and probably be a 9.1 or even 9.2 item.



[HACKERS] update_process_title=off and logger, wal, ... processes

2009-12-29 Thread Rod Taylor
With the update_process_title parameter set to off some PostgreSQL
processes still change their ps title to a different name than the
default. I appreciate this setting came about for performance reasons
which the logger, wal writer, autovacuum, and stats collector would
not have, but I actually require the default name and have patched
init_ps_display() to skip changing the name.

It just surprised me that the titles still changed a single time.

I (stupidly?) installed PostgreSQL into a hostile environment which
didn't like this and decided to kill the processes as a result.
Unfortunately, I cannot change the environment.



[HACKERS] NOT IN Doesn't use Anti Joins?

2009-12-17 Thread Rod Taylor
I'm sure there is a good reason why NOT IN will not use an Anti-Join
plan equivalent to NOT EXISTS due to NULL handling, but in this
particular case the value being compared is in the PRIMARY KEY of both
structures being joined.

The NOT IN plan was killed after 10 minutes. The NOT EXISTS plan
returned data in roughly 10ms.

Is there a reason why the NOT IN plan could not use Anti-Joins when
the column being compared against is guaranteed to be NOT NULL? Too
much planner overhead to determine nullness of the column?


sk=# explain select * from source_reb_listing where listing_id not in
(select listing_id from source_reb_listing_specs) order by file_id
desc limit 5;

                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Limit  (cost=729015.39..3420463.83 rows=5 width=28)
   ->  Index Scan Backward using source_reb_listing_fileid_idx on
source_reb_listing  (cost=729015.39..169537219655.96 rows=314954
width=28)
         Filter: (NOT (SubPlan 1))
         SubPlan 1
           ->  Materialize  (cost=729015.39..1185280.74 rows=32810035 width=8)
                 ->  Seq Scan on source_reb_listing_specs
(cost=0.00..568040.35 rows=32810035 width=8)
(6 rows)

sk=# explain select * from source_reb_listing where not exists (select
* from source_reb_listing_specs as t where t.listing_id =
source_reb_listing.listing_id) order by file_id desc limit 5;
                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..35.31 rows=5 width=28)
   ->  Nested Loop Anti Join  (cost=0.00..3880495.87 rows=549496 width=28)
         ->  Index Scan Backward using source_reb_listing_fileid_idx
on source_reb_listing  (cost=0.00..1107142.20 rows=629907 width=28)
         ->  Index Scan using source_reb_listing_specs_pkey on
source_reb_listing_specs t  (cost=0.00..1592.74 rows=408 width=8)
               Index Cond: (t.listing_id = source_reb_listing.listing_id)
(5 rows)



Re: [HACKERS] Using views for row-level access control is leaky

2009-10-22 Thread Rod Taylor

 \c - secretary

 CREATE OR REPLACE FUNCTION expose_person (person text, phone text)
 RETURNS bool AS $$
 begin
  RAISE NOTICE 'person: % number: %', person, phone;
  RETURN true;
 END; $$ LANGUAGE plpgsql COST 0.01;

 postgres=  SELECT * FROM phone_number WHERE expose_person(person, phone);
 NOTICE:  person: public person number: 12345
 NOTICE:  person: secret person number: 67890
    person     | phone
 ---+---
  public person | 12345
 (1 row)


Given RAISE is easily replaced with INSERT into a logging table or
another recording mechanism, there needs to be something to push back
execution of user-based parameters OR something to push forward
security clauses.


Is there any way of exposing the information using standard SQL or is
a procedure required?



If a procedure is required, then we simply need a way of ensuring the
SECURITY clauses or functions run before all of the things which can
expose information (procedures at the moment).



How about some kind of a marker which allows security-based
constraints to be pushed forward rather than the entire view?

CREATE VIEW phone_number AS
SELECT person, phone FROM phone_data WHERE SECURITY(phone NOT LIKE '6%');


This still allows complex views and queries to be mostly optimized
with a few filters that run very early and in the order they are
defined in.

Perhaps we go one step further and encourage security filters to be
applied to the table directly where possible:

CREATE VIEW phone_number AS
SELECT person, phone
FROM phone_data USING SECURITY FILTER(phone NOT LIKE '6%');


This still allows many optimizations to be applied in complex cases. The planner
can still reorder and optimize a more involved view, for example:

CREATE VIEW phone_number AS
SELECT person, phone, company
FROM phone_data USING SECURITY FILTER(phone NOT LIKE '6%')
   JOIN person USING (person_id)
   JOIN company USING (company_id)
AND person.active AND company.active;

\c - secretary
SELECT * FROM  phone_number WHERE company = 'Frankies Co.';


This still allows a query against phone_number to use the company data
first, find the single person (Frankie) within that company, then get
his phone number out.

The scan against phone_data would be an index scan for person_id BUT
applies the SECURITY FILTER as the node immediately around the index
scan as a Recheck Condition, similar to how bitmap scans ensure they
got the correct and only the correct information.

person.active and company.active, and the joins can still be optimized
in standard ways.


More complex SECURITY FILTER clauses might be applied in the where clause. I.e.

CREATE VIEW phone_number AS
SELECT person, phone, company
FROM phone_data USING SECURITY CLAUSE (phone NOT LIKE '6%')
   JOIN person USING (person_id)
   JOIN company USING (company_id)
 WHERE SECURITY CLAUSE (person.status = company.status)
AND person.active AND company.active;


This would result in the security check (person.status =
company.status) occurring as a filter tied to the join node for person
and company which cannot be moved around.


Layering is tricky, using the above view:

\c - secretary
CREATE VIEW company_number AS
 SELECT * FROM phone_number SECURITY CLAUSE (expose_person(person, phone));

SELECT * FROM company_number;


The security clauses are bound to run in the order they are found, in
the node closest to the data they use.

phone_data is immediately run through a Recheck Cond. person/company
join node is checked immediately after. Finally, the expose_person()
function is run against the now clean data.



Oh, this all has the nice side effect of knowing what to hide in
explain analyze as well since the specific clauses are marked up.  If
the user running the query is super user or owner of the view, they
see the security clause filters. If they are not, then they get a line
like this:

SELECT * FROM phone_number WHERE phone = '555-555-';

 Bitmap Heap Scan on phone_data  (cost=14.25..61.47 rows=258 width=185)
   Security Cond: ** Hidden due to permissions **
   ->  Bitmap Index Scan on phone_data_index  (cost=0.00..14.19
rows=258 width=0)
         Index Cond: (phone = '555-555-')



Re: [HACKERS] Could regexp_matches be immutable?

2009-10-21 Thread Rod Taylor
 So, having dismissed my original off-the-cuff answer to Rod, the next
 question is what's really going wrong for him.  I get this from
 a quick trial:

I wish I had kept specific notes on what I was actually trying to do.
I tried to_number first then the expression as seen below. I guess I
saw the error again and assumed it was the same as for to_number.

sk=# BEGIN;
BEGIN
sk=#
sk=# create table t1 (col1 text);
CREATE TABLE
sk=# INSERT INTO t1 values ('Z342432');
INSERT 0 1
sk=# INSERT INTO t1 values ('REW9432');
INSERT 0 1
sk=#
sk=# SELECT (regexp_matches(col1, '(\d+)$'))[1] from t1;
 regexp_matches

 342432
 9432
(2 rows)

sk=#
sk=# create index t1_idx ON t1 ((  (regexp_matches(col1, '(\d+)$'))[1]  ));
ERROR:  index expression cannot return a set
sk=#
sk=# ROLLBACK;
ROLLBACK



It is interesting that citext seems to be functional with exactly
the same statements.

sk=# BEGIN;
BEGIN
sk=#
sk=# create table t1 (col1 citext);
CREATE TABLE
sk=# INSERT INTO t1 values ('Z342432');
INSERT 0 1
sk=# INSERT INTO t1 values ('REW9432');
INSERT 0 1
sk=#
sk=# SELECT (regexp_matches(col1, '(\d+)$'))[1] from t1;
 regexp_matches

 342432
 9432
(2 rows)

sk=#
sk=# create index t1_idx ON t1 ((  (regexp_matches(col1, '(\d+)$'))[1]  ));
CREATE INDEX
sk=#
sk=# ROLLBACK;
ROLLBACK



The function regexp_replace(col1, '^[^0-9]+', '') does seem to do the
trick for text.
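
For example, a hedged sketch against the same toy table as above
(regexp_replace() is immutable, so the expression is accepted in an index):

CREATE INDEX t1_suffix_idx ON t1 ((regexp_replace(col1, '^[^0-9]+', '')));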



[HACKERS] Could regexp_matches be immutable?

2009-10-14 Thread Rod Taylor
I tried making a functional index based on an expression containing
the 2 argument regexp_matches() function. Is there a reason why this
function is not marked immutable instead of normal?

regards,

Rod Taylor



Re: [HACKERS] COPY enhancements

2009-10-08 Thread Rod Taylor
 Yeah.  I think it's going to be hard to make this work without having
 standalone transactions.  One idea would be to start a subtransaction,
 insert tuples until one fails, then rollback the subtransaction and
 start a new one, and continue on until the error limit is reached.


I've found performance is reasonable, for data with low numbers of errors
(say 1 per 100,000 records or less) doing the following:

SAVEPOINT bulk;
Insert 1000 records using COPY.

If there is an error, rollback to bulk, and step through each line
individually within its own subtransaction. All good lines are
kept and bad lines are logged; client-side control makes logging trivial.

The next set of 1000 records is done in bulk again.

1000 records per savepoint seems to be a good point for my data without too
much time lost to overhead or too many records to retry due to a failing
record. Of course, it is controlled by the client side rather than server
side so reporting back broken records is trivial.
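
As a rough illustration, the statement flow looks something like this hedged
sketch; the table and values are made up, and all of the retry logic lives
in the client:

BEGIN;

SAVEPOINT bulk;
COPY import_target (id, payload) FROM STDIN;  -- client streams the next 1000 lines
-- If the server reports an error for this batch:
ROLLBACK TO SAVEPOINT bulk;

-- ...the client then retries the same 1000 lines one at a time:
SAVEPOINT one_row;
INSERT INTO import_target (id, payload) VALUES (42, 'example row');
RELEASE SAVEPOINT one_row;
-- On failure: ROLLBACK TO SAVEPOINT one_row and log the bad line client-side.

-- Clean batches simply continue under a fresh SAVEPOINT bulk.
COMMIT;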


It may be possible to boost performance by:

1) Having copy remember which specific line caused the error. So it can
replay lines 1 through 487 in a subtransaction since it knows those are
successful. Run 488 in its own subtransaction. Run 489 through ... in a new
subtransaction.
2) Increasing the number of records per subtransaction if data is clean. It
wouldn't take long until you were inserting millions of records per
subtransaction for a large data set. This should make the subtransaction
overhead minimal. Small imports would still run slower but very large
imports of clean data should be essentially the same speed in the end.


Re: [HACKERS] remove flatfiles.c

2009-09-01 Thread Rod Taylor
On Tue, Sep 1, 2009 at 19:34, Greg Stark gsst...@mit.edu wrote:

 On Wed, Sep 2, 2009 at 12:01 AM, Alvaro
 Herreraalvhe...@commandprompt.com wrote:
  The use cases where VACUUM FULL wins currently are where storing two
  copies of the table and its indexes concurrently just isn't practical.
 
  Yeah, but then do you really need to use VACUUM FULL?  If that's really
  a problem then there ain't that many dead tuples around.

 That's what I want to believe. But picture if you have, say a
 1-terabyte table which is 50% dead tuples and you don't have a spare
 1-terabytes to rewrite the whole table.



It would be interesting if there was something between VACUUM FULL and
CLUSTER which could, say, work on a single 1GB segment at a time in a manner
similar to cluster.

You would still end up with index bloat like vacuum full, though perhaps not
as bad, but shuffling around the tuples should be faster.


The idea here is that the files can be truncated individually. Two 500MB
files is pretty much the same as a single 1GB file on disk.


Of course, I'm hand waving and don't have the technical expertise to figure
out if it can be done easily within PostgreSQL.


Re: [HACKERS] Out parameters handling

2009-03-07 Thread Rod Taylor
It wouldn't be so bad if you could assign internal and external column names.

Within the function you call the column v_foo but the caller of the
function receives column foo instead.

OUT v_foo varchar AS foo


Another alternative is requiring a prefix like plout for the
replacement to occur:

( OUT foo varchar )

BEGIN
  SELECT foo.somename INTO plout.foo FROM foo WHERE id = 10;

  RETURN NEXT;

  RETURN;
END;


On Sat, Mar 7, 2009 at 8:50 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Mar 6, 2009 at 8:44 PM, Josh Berkus j...@agliodbs.com wrote:
 Robert,

 Thing is, anybody can institute their own naming convention.  I've long used
 v_ as a prefix.  Allowing : would save me some keystrokes, but that's about
 it.

 --Josh

 True... but there doesn't seem to be any shortage of people who are
 annoyed by the current behavior.  Maybe we should all just learn to
 live with it.

 ...Robert





Re: [HACKERS] Out parameters handling

2009-03-07 Thread Rod Taylor
 actually - function name should be used as label now. This code is working:

Not helpful for me. The most typical conflict I have is actually the
OUT parameter and table name, not a column of the table.

Really don't want to prefix all tables with a hardcoded schema or do
variable substitution for loading the document.

Not fond of prefixing with function name either as a) many of my
functions have very long names and b) they change names occasionally,
particularly during development.

A short prefix like out would be useful. I would immediately start
prefixing all uses.

rbt=# begin;
BEGIN
rbt=# create table b (col integer);
CREATE TABLE
rbt=# insert into b values (2);
INSERT 0 1
rbt=# create or replace function fx2(a integer, out b integer) as $$
rbt$# begin
rbt$#   SELECT col
rbt$# INTO fx2.b
rbt$# FROM b;
rbt$#
rbt$#   return;
rbt$# end; $$ language plpgsql;
ERROR:  syntax error at or near $1
LINE 1: SELECT col FROM  $1
 ^
QUERY:  SELECT col FROM  $1
CONTEXT:  SQL statement in PL/PgSQL function fx2 near line 4
rbt=#
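
For completeness, a hedged sketch of the schema-qualification workaround
mentioned above (not the behaviour I want, but it avoids the OUT parameter
capturing the table name):

create or replace function fx2(a integer, out b integer) as $$
begin
  SELECT col
    INTO fx2.b
    FROM public.b;   -- schema-qualified, so the OUT parameter is not substituted

  return;
end; $$ language plpgsql;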



Re: [HACKERS] Out parameters handling

2009-03-07 Thread Rod Taylor
On Sat, Mar 7, 2009 at 11:32 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sat, Mar 7, 2009 at 9:08 AM, Rod Taylor rod.tay...@gmail.com wrote:
 It wouldn't be so bad if you could assign internal and external column 
 names.

 This is a good point.  Uglifying the parameter names is sort of OK for
 input parameters, but is much more annoying for output parameters.

 How much of this pain would go away if we changed over to the arguably
 correct (as in Or*cle does it that way) scoping for names, wherein the
 parser first tries to match a name against column names of tables of the
 current SQL statement, and only failing that looks to see if they are
 plpgsql variables?

This would solve all of my conflicts correctly. I nearly always use
RETURN QUERY with OUT parameters.

An alternative would be the requirement to prefix out parameters with
out, export, or something similar, so the plain non-prefixed name
is never replaced.

b in the below is the table.

I hit this quite a bit since my historical table name might be
foo_bar_baz which is the same as the most relevant name for the out
parameter.

I've debated renaming all of my tables t_* on more than one occasion
as a workaround in applications which exclusively use functions to
access/write data.


create or replace function read_some_data_from_data_region(a integer,
out b integer) as $$
begin
  SELECT col
INTO out.b
FROM b;

  return;
end; $$ language plpgsql;



Re: FWD: Re: [HACKERS] Updated backslash consistency patch

2009-01-15 Thread Rod Taylor
I would settle for just following the search path as set by the user.

If you explicitly include pg_catalog in the search path, then you should see
those settings.

If you do not explicitly include pg_catalog on the search_path, then it
should not find those items.


Right now pg_catalog sneaks its way onto the search_path for everybody. That
is fine for execution but information listing like this should probably
ignore those additions.



On Thu, Jan 15, 2009 at 11:50 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Bruce Momjian br...@momjian.us writes:
  Tom Lane wrote:
  I think this falls in the category of be careful what you wish for,
  you might get it.  It is now blindingly obvious that the folks asking
  for that had not actually lived with the behavior for any period of
  time.

  I got several emails thanking me for applying the patch, so there is
  clearly user-demand for 'S'.

 Were any of them from people who had actually *used* the patch for more
 than five minutes?  I think this is clearly a case of allowing abstract
 consistency considerations to override usability.

 The real problem here is that the 'S' suffix for \dt is a bad precedent
 for everything else.  If you want consistency then we need to change
 that end of things.  I think that the idea of a switch to omit system
 objects, rather than include them, might work.

regards, tom lane




Re: [HACKERS] WIP: default values for function parameters

2008-12-12 Thread Rod Taylor
How about IS or INTO?

param_name IS 3
param_name IS 'some string value'

3 INTO param_name
'some string value' INTO param_name




On Fri, Dec 12, 2008 at 8:47 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2008/12/12 David E. Wheeler da...@kineticode.com:
 On Dec 12, 2008, at 2:39 PM, Tom Lane wrote:

 So I think that really this is never going to fly unless it uses a
 keyword-looking reserved word.  And we're not going to take some short
 word that's not reserved now and suddenly make it so.  So, despite
 Pavel's objection that the AS syntax proposal might be confused with
 other uses of AS, I seriously doubt that any proposal is going to get
 accepted that doesn't recycle AS or some other existing reserved word.

 when I should exactly identify the param name, then we should use some symbols.


 I'm okay with AS if that's the way it has to be, but what about a colon
 right at the end of the label? A cast is two colons, so no conflict there:

  SELECT foo(1, name: 'bar', quantity: 10);

 it looks well, but I still prefer some combination with =

 name: = ''
 name: = '''
 :name = ''
 $name = ..
 $name = ..

 Maybe I am too conservative
 Pavel


 No doubt I'm missing something…

 Best

 David





Re: [HACKERS] New feature request: FlashBack Query

2007-02-20 Thread Rod Taylor


Wrong.  When Oracle says it's committed, it's committed.  No
difference between when, where, and how.  In Oracle, the committed
version is *always* the first presented to the user... it takes time
to go back and look at older versions; but why shouldn't that be a bit
slower, it isn't common practice anyway.  Same with rollbacks... why
should they optimize for them when 97% of transactions commit?


Do 97% of transactions commit because Oracle has slow rollbacks and  
developers are working around that performance issue, or because they  
really commit?


I have watched several developers that would prefer to issue numerous  
selects to verify things like foreign keys in the application in  
order to avoid a rollback.


Anyway, I don't have experience with big Oracle applications but I'm  
not so sure that 97% of transactions would commit if rollbacks were  
cheaper.






Re: [HACKERS] Autovacuum on by default?

2006-08-17 Thread Rod Taylor
On Thu, 2006-08-17 at 18:32 +0200, Peter Eisentraut wrote:
 Is it time to turn on autovacuum by default in 8.2?  I know we wanted to 
 be on the side of caution with 8.1, but perhaps we should evaluate the 
 experiences now.  Comments?

I would say yes.

I use it on 2 databases over the 200GB mark without any difficulties.
One is OLTP and the other acts more like a data warehouse.

The defaults could be a little more aggressive for both vacuum and
analyze scale_factor settings; 10% and 5% respectively.
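
In postgresql.conf terms the suggestion would look roughly like this (a
hedged sketch of the suggested values, not the shipped defaults):

autovacuum_vacuum_scale_factor = 0.1    # vacuum after ~10% of a table has changed
autovacuum_analyze_scale_factor = 0.05  # analyze after ~5% of a table has changed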

-- 




Re: [HACKERS] Constraint exclusion is not general enough

2006-08-07 Thread Rod Taylor
On Mon, 2006-08-07 at 16:54 +0100, Simon Riggs wrote:
 On Fri, 2006-08-04 at 14:40 -0400, Tom Lane wrote:
  I was just looking at Martin Lesser's gripe here:
  http://archives.postgresql.org/pgsql-performance/2006-08/msg00053.php
  about how the planner is not real bright about the filter conditions
  it generates for a simple partitioning layout.  In particular it's
  generating scans involving self-contradictory conditions:
  
    Result  (cost=0.00..33.20 rows=6 width=36)
      ->  Append  (cost=0.00..33.20 rows=6 width=36)
            ->  Seq Scan on t_parted  (cost=0.00..33.20 rows=6 width=36)
                  Filter: ((id1 >= 0) AND (id1 < 100) AND (id1 >= 900) AND
                           (id1 < 1000))
  
  which it seems we ought to be bright enough to notice.  In particular
  I would argue that turning on constraint_exclusion ought to instruct
  the planner to catch this sort of thing, whereas when it's off we
  ought not expend the cycles.  I have a preliminary patch (below)
  that seems to fix it.
  
  The problem I'm having is that this isn't constraint exclusion anymore
  --- it will in fact make useful deductions without a table constraint
  anywhere in sight.  Should we rename the GUC variable, and if so to what?
  Or just live with the misnomer?  I guess plan C would be to invent a
  separate GUC variable for the other kind of test, but I can't see that
  it's worth having two.  Thoughts?
 
 In general, I'd prefer a control that allowed amount of planning to be
 specified, much in the same way we rate error messages. We really want
 just one simple knob that can be turned up or down, no matter how many
 new optimizations we add.
 
 planning_effort = LOW | MEDIUM | HIGH | VERYHIGH | EXHAUSTIVE

A simple way of doing this might be to use a minimum cost number?

# Minimum cost of query is over 100 before applying
mutual_exclusion = 100

Once applied, if the filter accomplished something, the query is replanned
or adjusted to take that change into account.

If there were a large number of constraints on t_parted it may well have
taken longer to plan than to execute on the 6 rows. If there were 1M
rows in the structure, the extra effort would have been well worth it.


Ideally we could set the planning time as a percentage of total
execution time and let PostgreSQL figure out what should be tried and
when, but that means giving a cost to planner functionality and having
PostgreSQL plan how to plan.

planning_effort = 5%

-- 




Re: [HACKERS] Constraint exclusion is not general enough

2006-08-07 Thread Rod Taylor
On Mon, 2006-08-07 at 13:44 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  A simple way of doing this might be to use a minimum cost number?
 
 But you don't have any cost numbers until after you've done the plan.

Isn't it possible to find the cost using the straightforward (fast)
method, find out what order of magnitude it is in, then do a second pass
with additional planner bells and whistles turned on if the first plan
had a high estimate?

-- 




Re: [HACKERS] Constraint exclusion is not general enough

2006-08-07 Thread Rod Taylor
On Mon, 2006-08-07 at 22:01 -0400, Tom Lane wrote:
 Florian G. Pflug [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  But you don't have any cost numbers until after you've done the plan.
 
  Couldn't this work similar to geqo_effort? The planner could
  try planning the query using only cheap algorithmns, and if
  the cost exceeds a certain value, it'd restart, and use
  more sophisticated methods.
 
 AFAICS this would be a net loss on average.  Most of the time, the
 constraint exclusion code doesn't win, and so throwing away all your
 planning work to try it is going to be a loser most of the time.

If constraint exclusion does not make any changes that mark the plan as
invalid, then there is no need to replan.

 1. Generate plan cheaply
 2. If under $threshold, execute query. The cost of further planning
is significant compared to executing this potentially
non-optimal plan.
 3. Run constraint exclusion. If it changes the clauses due to
constraint exclusion, mark the plan as invalid. I assume
constraint exclusion is relatively self contained.
 4. Invalid plan is replanned. Still valid plan (no potential
improvements can be made) is executed.

-- 




Re: GENERATED ... AS IDENTITY, Was: Re: [HACKERS] Feature Freeze

2006-08-01 Thread Rod Taylor
On Tue, 2006-08-01 at 18:10 +0200, Zoltan Boszormenyi wrote:
 Hi,
 
 I have progressed a bit with my pet project, a.k.a $SUBJECT.
 
 Now GENERATED ALWAYS AS IDENTITY and
 GENERATED ALWAYS AS ( expr ) work as
 intended. Documentation was also extended.

I'm only commenting because I debated trying to implement this feature a
couple of times. The ugliness required for pg_dump put me off of doing
it.

I did not see a test for enforcement during COPY. UPDATE restrictions
appear to have been missed as well:

4) If <set clause> SC specifies an <object column> that
references a column of which some underlying column is either a
generated column or an identity column whose descriptor
indicates that values are always generated, then the <update
source> specified in SC shall consist of a <default
specification>.

<object column> is the update target, or the left hand side of the
equation. In short, if a column marked GENERATED ALWAYS is updated then
it must be to DEFAULT or not provided as an update target.

CREATE TABLE tab (col integer GENERATED ALWAYS AS IDENTITY);
UPDATE tab SET col = DEFAULT; -- ACCEPTED
UPDATE tab SET col = 1; -- ERROR


For db restoration (pg_dump), how do you restore to the same values as
previously if it is always regenerated? By making ALWAYS a suggestion
for some users instead of always enforced and providing an override
mechanism for it. I assume it only works for relation owners but I've
not figured out how the spec does permissions.

<override clause> ::=
    OVERRIDING USER VALUE
  | OVERRIDING SYSTEM VALUE

In short, pg_dump should append OVERRIDING SYSTEM VALUE to any insert or
copy for relations with an GENERATED ALWAYS identity column and the
backend will need to respect that.
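
In other words, something along these lines (a hedged sketch of the proposed
dump output, reusing the tab example above; the clause placement follows the
standard's <override clause>):

INSERT INTO tab (col) OVERRIDING SYSTEM VALUE VALUES (10);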

ALWAYS is really only enforced for anyone who doesn't have permission to
specify otherwise.


Another one that got me is what do you do if you do this:

CREATE TABLE tab (col integer);
INSERT INTO tab VALUES (10);
ALTER TABLE tab ALTER col GENERATED ALWAYS AS IDENTITY;

What is the value for tab.col? It would seem that the table should
be rewritten with all values for col recalculated -- thus it would be
'1'. But wait! Can we add the override clause here too to keep the old
values and change the enforcement for new tuples only?


 Some test cases are also included, that shows
 that ALTER TABLE ALTER TYPE keeps both
 the sequence and the GENERATED ALWAYS
 property. Gzipped patch is attached.
 
 Next steps are:
 - pg_dump support
 - more ALTER TABLE support for adding and
   dropping IDENTITY and GENERATED ALWAYS
   features
 - more testing
 
 I still maintain that I don't see any standard
 requirement between the GENERATED AS IDENTITY
 and NEXT VALUE FOR but obviously both
 require SEQUENCE as supported feature
 in parallel. I can be proven wrong, though,
 but please, quote section# and text where
 it can be found in the standard.
 
 As for why GENERATED ALWAYS AS IDENTITY
 is useful? Consider someone who is coming from
 another DBMS (Informix, Access, etc.) where
 INSERT INTO table (id, ...) VALUES (0, ...);
 inserts the next value for the autoincrementer field
 instead of 0. Leaving out fields from INSERT is
 not allowed in the source because of documentation
 reasons and writing DEFAULT is not handy or not
 found in that legacy DBMS' features.
 Multiply it with N applications that was written
 that way over the years of the lifespan of a large
 project, count in the human resistance to learn
 something new (say 2.5x multiplier, but that may be
 under-estimated :-) ) and a feature that help porting
 easier will be a cheered feature. IIRC Bruce Momjian
 himself wrote in this list that ease-of-use features
 can boost PostgreSQL userbase pretty quickly.
 
 So, please, review my patch in it's current state
 and decide whether it's a 8.2-worthy feature.
 
 BTW, is there anyone working on COPY FROM ( select ) feature?
 
 Thanks in advance and best regards,
 Zoltán Böszörményi
 
-- 




Re: GENERATED ... AS IDENTITY, Was: Re: [HACKERS] Feature Freeze

2006-08-01 Thread Rod Taylor
  For db restoration (pg_dump), how do you restore to the same values as
  previously if it is always regenerated? By making ALWAYS a suggestion
  for some users instead of always enforced and providing an override
  mechanism for it. I assume it only works for relation owners but I've
  not figured out how the spec does permissions.
 
  override clause ::=
  OVERRIDING USER VALUE
| OVERRIDING SYSTEM VALUE
 
  In short, pg_dump should append OVERRIDING SYSTEM VALUE to any insert or
  copy for relations with an GENERATED ALWAYS identity column and the
  backend will need to respect that.

 
 Aren't INSERT and COPY distinguished in code paths?

Yes, they are separate but they also use the same permission set.

Any user can copy into a structure at any time and virtually every
restriction will be applied normally (CHECK, DEFAULT, etc.). Copy
bypasses Rules, significant parsing overhead since there is no need to
look for subselects, and possibly some triggers are bypassed. I'm fairly
sure that foreign key triggers fire.

In short, COPY doesn't bypass enforcement. GENERATED ALWAYS is part of
that enforcement and should be included in that.

If it is not included, we cannot recommend GENERATED ALWAYS for uses
like recording CURRENT_USER in an audit log since the data could be
fudged.

  ALWAYS is really only enforced for anyone who doesn't have permission to
  specify otherwise.
 
 
  Another one that got me is what do you do if you do this:
 
  CREATE TABLE tab (col integer);
  INSERT INTO tab VALUES (10);
  ALTER TABLE tab ALTER col GENERATED ALWAYS AS IDENTITY;
 
  What is the value for tab.col? It would seem that the table should
  be rewritten with all values for col recalculated -- thus it would be
  '1'. But wait! Can we add the override clause here too to keep the old
  values and change the enforcement for new tuples only?

 
 I don't think we should rewrite existing rows because
 when it was inserted, the stored value was valid
 according to the rules at that time. What if you
 have more than one rows in that table?

SERIAL has, until recently, been described as a macro. A tool for
setting things up quickly but many parts of which can be changed by hand
after-ward. It's not exactly a good source for information on how this
structure should work. For one, you can easily override the suggested
default a serial gives at any time as any user. The intention of ALWAYS
is to prevent exactly that behaviour.

I don't have an opinion on ALTER TABLE changes for this one way or the
other. It was my intention to advise that a group decision is required
and some research into what other databases do in this case. I believe
MSSQL and DB2 both implement this functionality.

Oh, and one more item. These expressions have the same abilities as a
CHECK constraint for referencing other columns.

This example comes from an IBM Guide:

CREATE TABLE T1(c1 INT, c2 DOUBLE,
                c3 DOUBLE GENERATED ALWAYS AS (c1 + c2),
                c4 SMALLINT GENERATED ALWAYS AS
                   (CASE
                      WHEN c1 > c2 THEN 1
                      ELSE NULL
                    END)
               );

Here is what IBM has to say about ALTER TABLE and GENERATED ALWAYS:

GENERATED 
Specifies that DB2 generates values for the column.

ALWAYS
Specifies that DB2 will always generate a value for the column when a row
is inserted into the table, or whenever the result value of the
generation-expression might change. The result of the expression is stored
in the table. GENERATED ALWAYS is the recommended option unless data
propagation or unload and reload operations are being performed. GENERATED
ALWAYS is the required option for generated columns.

BY DEFAULT
Specifies that DB2 will generate a value for the column when a row is
inserted into the table, or updated, specifying DEFAULT for the column,
unless an explicit value is specified. BY DEFAULT is the recommended option
when using data propagation or performing unload and reload operations.

identity-options 
This clause cannot be specified when adding a 

[HACKERS] Connection limit and Superuser

2006-07-31 Thread Rod Taylor
It appears that the superuser does not have connection limit
enforcement. I think this should be changed.

Slony in particular does not need more than N connections but does
require being a super user.

-- 




Re: [HACKERS] Connection limit and Superuser

2006-07-31 Thread Rod Taylor
On Mon, 2006-07-31 at 09:06 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  It appears that the superuser does not have connection limit
  enforcement. I think this should be changed.
 
 If you're superuser, you are not subject to access restrictions,
 by definition.  I cannot imagine any scenario under which the
 above would be a good idea.  (Hint: it would be more likely to
 lock out manual admin connections than Slony.)

If you don't want an admin user to have a connection limit, give them
-1 or no connection limit.

Anyway, you're right that Slony should not require superuser status but
at the moment that is rather tricky to accomplish since it wants to muck
about in the system catalogues, use pg_cancel_backend, among other
things.
-- 




Re: [HACKERS] Connection limit and Superuser

2006-07-31 Thread Rod Taylor
On Mon, 2006-07-31 at 15:07 +0200, Csaba Nagy wrote:
 On Mon, 2006-07-31 at 15:00, Martijn van Oosterhout wrote:
  On Mon, Jul 31, 2006 at 08:47:38AM -0400, Rod Taylor wrote:
   It appears that the superuser does not have connection limit
   enforcement. I think this should be changed.
  
  So if some admin process goes awry and uses up all the connection
  slots, how does the admin get in to see what's happening? If there's a
  limit you're not really superuser, are you?
 
 I thought there is a limit for super-users too... citation from:
 http://www.postgresql.org/docs/8.1/static/runtime-config-connection.html#RUNTIME-CONFIG-CONNECTION-SETTINGS

Sorry for not being more specific. I was speaking about ALTER ROLE WITH
CONNECTION LIMIT.
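
For example (the role name is hypothetical):

ALTER ROLE slony WITH CONNECTION LIMIT 5;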

-- 




Re: [HACKERS] Connection limit and Superuser

2006-07-31 Thread Rod Taylor
On Mon, 2006-07-31 at 15:00 +0200, Martijn van Oosterhout wrote:
 On Mon, Jul 31, 2006 at 08:47:38AM -0400, Rod Taylor wrote:
  It appears that the superuser does not have connection limit
  enforcement. I think this should be changed.
 
 So if some admin process goes awry and uses up all the connection
 slots, how does the admin get in to see what's happening? If there's a
 limit you're not really superuser, are you?

Work this one through.

If an admin process goes awry and uses up all the connection slots it
has reached max_connections AND used superuser_reserved_connections as
well.

This means an admin cannot get in to see what is happening.

That's what happens today.

I would much prefer that Superuser 'a' reaches WITH CONNECTION LIMIT for
user 'a' and superuser 'b' can get in to see what is happening.

  Slony in particular does not need more than N connections but does
  require being a super user.
 
 Maybe someone should look into enabling slony to not run as a
 superuser?
 
 Have a nice day,
-- 




Re: [HACKERS] Connection limit and Superuser

2006-07-31 Thread Rod Taylor
On Mon, 2006-07-31 at 09:52 -0400, Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  Martijn van Oosterhout wrote:
  Maybe someone should look into enabling slony to not run as a
  superuser?
 
  That was my initial reaction to this suggestion. But then I realised 
  that it might well make sense to have a separate connection-limited 
  superuser for Slony purposes (or any other special purpose) alongside an 
  unlimited superuser.
 
 Actually, the real question in my mind is why Slony can't be trusted
 to use the right number of connections to start with.  If you don't
 trust it that far, what are you doing letting it into your database as
 superuser to start with?

I generally try to apply reasonable restrictions on all activities that
take place on my systems unless the machine was dedicated for that task
(in which case the limitations are those of the machine).

When things go wrong, and they almost always do eventually, these types
of restrictions ensure that only the one process grinds to a halt
instead of the entire environment.


Cron jobs are another area that is frequently implemented incorrectly.
Implementing checks to see if a job is already running is overlooked often
enough that I would like to restrict them as well.

This is less important since roles now allow multiple users to take
ownership of a relation; fewer jobs need to run as a superuser.
-- 




Re: [HACKERS] [PATCHES] extension for sql update

2006-07-31 Thread Rod Taylor
On Mon, 2006-07-31 at 17:26 +0200, Peter Eisentraut wrote:
 Am Mittwoch, 26. Juli 2006 22:58 schrieb Tom Lane:
  The reason people want this syntax is that they expect to be
  able to write, say,
 
  UPDATE mytab SET (foo, bar, baz) =
  (SELECT alpha, beta, gamma FROM othertab WHERE key = mytab.key);
 
 I don't find any derivation in the standard that would permit this.  The only 
 thing I could find are variations on
 
 SET (a) = x  -- no parentheses
 SET (a, b) = (x, y)
 SET (a, b) = ROW (x, y)
 
 where x and y are some sort of value expression.  I would have expected the 
 sort of thing that you describe, but if you know how to derive that, I'd like 
 to see it.

I believe <contextually typed row value constructor element list> can be
one or more value expressions, which includes a <row value expression>.
<row value expression> gives us the <row subquery> option.

For that matter, the below portion of <contextually typed row value
constructor> gives us:
  | <left paren> <contextually typed row value constructor element>
    <comma>
    <contextually typed row value constructor element list>
    <right paren>

This breaks down into one or more comma-separated row subqueries.

UPDATE tab SET (...) = ((SELECT foo, bar from a), (select bif,baz from
b));

-- 




Re: [HACKERS] [PATCHES] extension for sql update

2006-07-30 Thread Rod Taylor
On Sun, 2006-07-30 at 20:20 -0400, Robert Treat wrote:
 On Thursday 27 July 2006 09:28, Bruce Momjian wrote:
  Tom Lane wrote:
   Bruce Momjian [EMAIL PROTECTED] writes:
Tom Lane wrote:
UPDATE mytab SET (foo, bar, baz) =
(SELECT alpha, beta, gamma FROM othertab WHERE key = mytab.key);
   
That UPDATE example is interesting because I remember when using
Informix that I had to do a separate SELECT statement for each UPDATE
column I wanted to update.  I didn't realize that you could group
columns and assign them from a single select --- clearly that is a
powerful syntax we should support some day.
  
   No question.  The decision at hand is whether we want to look like
   we support it, when we don't yet.  I'd vote not, because I think the
   main use-case for the row-on-the-left syntax is exactly this, and
   so I fear people will just get frustrated if they see it in the
   syntax synopsis and try to use it.
 
 
 I'm not a big fan of implementing partial solutions (remember left-joins are 
 not implemented messages :-) way back when) , however in my experience with 
 this form of the update command, the primary usage is not to use a subselect 
 to derive the values, but to make it easier to generate sql, using a single 

I disagree. UPDATE mytab SET (foo, bar, baz) = (SELECT ...) is the
specification's way of doing an update with a join. That is its primary
purpose.

UPDATE ... FROM is a PostgreSQL alternative to the above.
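
As a sketch of the difference (the spec form was not supported in
PostgreSQL at the time; table and column names are invented for
illustration):

-- spec-style: multi-column assignment from a row subquery
UPDATE mytab SET (foo, bar) =
    (SELECT alpha, beta FROM othertab WHERE othertab.key = mytab.key);

-- the PostgreSQL way of writing the same join
UPDATE mytab SET foo = o.alpha, bar = o.beta
  FROM othertab o
 WHERE o.key = mytab.key;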
-- 


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Rod Taylor
   SHARED CREATE INDEX 

 Comments?

CREATE [UNIQUE] INDEX foo [WITH NOLOCK] ON ...


-- 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Rod Taylor
Sorry, hit send too quickly.

NOLOCK is kinda like NOWAIT, except that it implies the command will not
take a strong lock, rather than stating that it will not wait for one.
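
A quick sketch of the distinction (the WITH NOLOCK form is purely the
hypothetical syntax being floated here, not an existing feature):

-- hypothetical: build the index without taking the usual strong lock
CREATE UNIQUE INDEX foo_idx WITH NOLOCK ON foo (bar);

-- existing NOWAIT behaviour: error out rather than wait for a lock
LOCK TABLE foo IN ACCESS EXCLUSIVE MODE NOWAIT;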

On Mon, 2006-07-24 at 11:20 -0400, Rod Taylor wrote:
  SHARED CREATE INDEX 
 
  Comments?
 
 CREATE [UNIQUE] INDEX foo [WITH NOLOCK] ON ...
 
 
-- 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Three weeks left until feature freeze

2006-07-13 Thread Rod Taylor
On Thu, 2006-07-13 at 11:03 -0400, Jonah H. Harris wrote:
 On 7/13/06, Lukas Smith wrote:
  However I do think that PostgreSQL is missing out in
  getting new users aboard that are in the early stages
  of evalutation and simply only consider features that
  they get along with a default installation (mostly due
  to lack of better knowledge about places like pgfoundry).
 
 This is my point exactly.  As with many things, we keep skirting the
 real issue by going with an improve the smaller component approach
 such as promote pgfoundry more.  I have never seen this approach
 work, but maybe someone has an example of another OSS project that has
 successfully excluded major components like this?

Personally, I prefer the GNOME approach. Most components are developed
and bundled independently, but once the code meets certain stability and
usability requirements the component is promoted to the standard website
with standard documentation, bug reporting, download locations, etc.

On PostgreSQL.org, aside from the Downloads tab, it is very difficult
to find these items. PGFoundry does not attempt to differentiate projects
by their state; the Top downloads list is the closest thing to this.


XML-based DocBook can easily pull in documentation from multiple remote
sources (when available -- substituted when not) for a single
documentation build via XIncludes. This would allow for a PL/Java chapter
in the standard documentation set online.

PostgreSQL.org Support could pretty easily link to the various
locations for bug reports -- Bugzilla makes component selection a common
requirement.

A slight restructuring of the FTP tree should probably be made. It's
currently very easy to find the main pgsql, pgadmin and odbc components.
Finding pl/java (what the heck is that gborg or pgfoundry project?) is
pretty difficult.


The only real tricky part is defining what a plugin or add-on application
such as pgadmin needs in order to be considered production ready. This will
relate to testing practices, documentation, code quality, ongoing
maintenance, and expected supported lifetime. For lifetime: something
released for core 7.3 should either still work with 7.3 today, or old
versions should still receive security and bug fixes.


-- 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] pg_terminate_backend idea

2006-07-10 Thread Rod Taylor
On Tue, 2005-06-21 at 23:34 -0400, Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  Tom Lane wrote:
  In any case the correct way to solve the problem is to find out what's
  being left corrupt by SIGTERM, rather than install more messiness in
  order to avoid facing the real issue ...
 
  I am confused.  Are you talking about the client SIGTERM or the server? 
 
 I am talking about Rod Taylor's reports that SIGTERM'ing individual
 backends tends to lead to lock table corrupted crashes awhile later.
 Now, I've been playing the part of Chicken Little on this for awhile,
 but seeing an actual report of problems from the field certainly
 strengthens my feelings about it.

Bringing this thread back to life.

I have not seen a lock table corruption issue with SIGTERM in 8.1 on
Solaris/Sun IV, Linux/AMD64, or Linux/Intel. I don't recall seeing one
on 8.0.3 either though I'm pretty sure there were several on 8.0.1.

There are times when locks for a process hang around for a few minutes
before getting cleared. I don't recall whether they were ungranted table
locks or entries waiting on a transaction ID lock, but the source was
Slony and a large pg_listener structure with more than 2 pages (yes,
pages not tuples).

I have also seen processes refusing to acknowledge the signal and exit
during btree index builds, but that's not a data corruption issue.
-- 


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Rod Taylor
  Here we have for example some tables which are frequently updated but
  contain 100 million rows. Vacuuming that takes hours. And the dead row
  candidates are the ones which are updated again and again and looked up
  frequently...
 
 This demonstrates that archival material and active data should be
 kept separately.
 
 They have different access patterns; kludging them into the same table
 turns out badly.

Rightfully it should be up to the database engine to ensure that both of
these patterns work against the same structure. Splitting up the data
according to its access patterns is the job of partitions (hidden from the
end user, preferably).

Same table: good. Same partition, and possibly the same tablespace: bad.



-- 


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Rod Taylor
  You mean systems that are designed so exactly, that they can't take 10%
  performance change ?
 
 No, that's not really the point, performance degrades over time, in one
 minute it degraded 10%.
 
 The update to session ratio has a HUGE impact on PostgreSQL. If you have a
 thousand active sessions, it may take a minute to degrade 10% assuming
 some level of active vs operations per session per action.

So don't do an update. Multiple updates to the same row block each other
anyway, which is generally not something you want.

If you INSERT into multiple partitions (by time -- say one per minute)
and TRUNCATE periodically (30-minute-old partitions for a 30 minute
expiry) it works much better. Expiring sessions is quite fast as well
since they'll go away with the truncate.

Index on sessionid and time and grab the row with the most recent time.
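
A rough sketch of that layout, with invented names and one partition per
minute via inheritance:

CREATE TABLE sessions (sessionid integer, ts timestamptz, data text);
CREATE TABLE sessions_1200 () INHERITS (sessions);
CREATE TABLE sessions_1201 () INHERITS (sessions);
CREATE INDEX sessions_1201_idx ON sessions_1201 (sessionid, ts);

-- expiry: drop a whole minute's worth of sessions at once
TRUNCATE sessions_1200;

-- latest state for one session
SELECT * FROM sessions WHERE sessionid = 42 ORDER BY ts DESC LIMIT 1;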
-- 


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] vacuum, performance, and MVCC

2006-06-22 Thread Rod Taylor
On Thu, 2006-06-22 at 13:42 -0400, Jonah H. Harris wrote:
 On 6/22/06, Rod Taylor [EMAIL PROTECTED] wrote:
  If you INSERT into multiple partitions (by time -- say one per minute)
  and TRUNCATE periodically (30 minute old partitions for 30 minute
  expiry) it works much better. Expiring the session is quite fast as well
  since they'll go away with the truncate.
 
 Forgive me if this sounds rude because I'm not meaning it as such, but
 this suggestion sounds like a MySQL-ish hack.  No doubt it would work,
 but should an application-writer have to incur the cost of writing
 something different because the database can't handle it?  I remember
 having to write subselect code in PHP because MySQL couldn't do it...
 not what I would call a *solution* by any mean.

The application writer doesn't need to do anything. That's what rules,
views, etc. are for.

The only snag is that PostgreSQL doesn't 'eliminate' a partition by
itself, so the truncate is a manual requirement.

-- 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] postgresql and process titles

2006-06-16 Thread Rod Taylor
  I did have dbt2 pretty close to functional on FreeBSD a year ago but
  it's probably gone back into linuxisms since then.
 
 :(
 
 I won't have the chance to work on this further for another 2 months,
 but if you have patches I could see about picking up on them when I
 get back.

Everything has been applied to the dbt2 tree.

-- 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] postgresql and process titles

2006-06-15 Thread Rod Taylor
On Tue, 2006-06-13 at 14:18 -0400, Kris Kennaway wrote:
 On Tue, Jun 13, 2006 at 12:29:14PM -0500, Jim C. Nasby wrote:
 
  Unless supersmack has improved substantially, you're unlikely to find
  much interest. Last I heard it was a pretty brain-dead benchmark. DBT2/3
  (http://sourceforge.net/projects/osdldbt) is much more realistic (based
  on TPC-C and TPC-H).
 
 Have you tried to compile this on FreeBSD?  It looks like it (dbt1 at
 least) will need a moderate amount of hacking - there are some Linux
 assumptions in the source and the configure script makes assumptions
 about where things are installed that cannot be overridden on the
 commandline.

I did have dbt2 pretty close to functional on FreeBSD a year ago but
it's probably gone back into linuxisms since then.
-- 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] CSV mode option for pg_dump

2006-06-13 Thread Rod Taylor
On Mon, 2006-06-12 at 16:28 -0400, Bill Bartlett wrote:
 Can't -- the main production database is over at a CoLo site with access
 only available via SSH, and tightly-restricted SSH at that. Generally
 one of the developers will SSH over to the server, pull out whatever
 data is needed into a text file via psql or pg_dump, scp the file(s)
 back here and send them to the user.

I don't get it. If you can use psql then you already have CSV support.

psql -c 'COPY pg_class TO STDOUT WITH CSV' postgres > pg_class.csv

  -Original Message-
  From: Joshua D. Drake [mailto:[EMAIL PROTECTED] 
  Sent: Monday, June 12, 2006 4:15 PM
  To: Bill Bartlett
  Cc: 'Andrew Dunstan'; 'Tom Lane'; 'PG Hackers'
  Subject: Re: [HACKERS] CSV mode option for pg_dump
  
  
  Bill Bartlett wrote:
   Here's me speaking up -- I'd definitely use it!   As a 
  quick way to pull
   data into Excel to do basic reports or analysis, a CSV 
  format would be 
   great.
  
  Why not just use ODBC?
  
  Joshua D. Drake
  -- 
  
   === The PostgreSQL Company: Command Prompt, Inc. ===
 Sales/Support: +1.503.667.4564 || 24x7/Emergency: 
  +1.800.492.2240
 Providing the most comprehensive  PostgreSQL solutions 
  since 1997
http://www.commandprompt.com/
  
  
 
 
 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
 
-- 


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Extended SERIAL parsing

2006-06-11 Thread Rod Taylor
 The condition (column->is_serial && column->force_default)
 can help enforcing GENERATED ALWAYS at INSERT time
 and can also help fixing the two TODO entries about SERIAL.

You will need to include the INSERT components of the spec which allow for
overriding GENERATED ALWAYS during an INSERT, extend that to COPY,
and teach pg_dump how to use them.
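
For reference, the spec's override clause looks roughly like this
(sketch only; PostgreSQL did not support identity columns at the time):

CREATE TABLE foo (id integer GENERATED ALWAYS AS IDENTITY, t text);

-- normally rejected for a GENERATED ALWAYS column, allowed with the
-- override for bulk loads:
INSERT INTO foo (id, t) OVERRIDING SYSTEM VALUE VALUES (42, 'bulk load');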
-- 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] How to avoid transaction ID wrap

2006-06-06 Thread Rod Taylor
On Tue, 2006-06-06 at 13:53 -0400, Christopher Browne wrote:
 Clinging to sanity, [EMAIL PROTECTED] (Hannu Krosing) mumbled into her beard:
  Ühel kenal päeval, T, 2006-06-06 kell 08:42, kirjutas Mark Woodward:
  OK, here's my problem, I have a nature study where we have about 10 video
  cameras taking 15 frames per second.
  For each frame we make a few transactions on a PostgreSQL database.
  We want to keep about a years worth of data at any specific time.
 
  partition by month, then you have better chances of removing old data
  without causing overload/data loss;
 
 It's certainly worth something to be able to TRUNCATE an elderly
 partition; that cleans things out very nicely...

With one potential snafu -- it blocks new SELECTs against the parent
table while the truncate runs on the child (this happens with constraint
exclusion as well).

If your transactions are short then it won't be an issue. If you have
mixed-length transactions (many short ones with the occasional long select)
then it becomes tricky, since those short transactions will be blocked.

  We have triggers that fire is something interesting is found on insert.
  We want this thing to run for a log time.
  From the numbers, you can see the PostgreSQL database is VERY loaded.
  Running VACUUM may not always be possible without losing data.
 
  why ? just run it with very friendly delay settings.
 
 Friendly delay settings can have adverse effects; it is likely to
 make vacuum run on the order of 3x as long, which means that if you
 have a very large table that takes 12h to VACUUM, vacuum delay will
 increase that to 36h, which means you'll have a transaction open for
 36h.

Sounds like this is almost strictly inserts and selects though. If there
is limited garbage collection (updates, deletes, rollbacks of inserts)
required then it isn't all that bad.
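
For reference, the friendly delay settings mentioned above boil down to
the cost-based vacuum parameters; values here are purely illustrative and
the table name is invented:

SET vacuum_cost_delay = 20;    -- ms to sleep when the cost limit is reached
SET vacuum_cost_limit = 200;   -- accumulated cost that triggers the sleep
VACUUM ANALYZE some_large_table;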

-- 


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-02 Thread Rod Taylor
 One objection to this is that after moving off the gold standard of
 1.0 = one page fetch, there is no longer any clear meaning to the
 cost estimate units; you're faced with the fact that they're just an
 arbitrary scale.  I'm not sure that's such a bad thing, though.  For
 instance, some people might want to try to tune their settings so that
 the estimates are actually comparable to milliseconds of real time.

Any chance that the correspondence to time could be made part of the
design on purpose, with people generally advised to follow that rule? If we
could tell people to run *benchmark* and use those numbers directly as a
first-approximation tuning, it could help quite a bit for people new to
PostgreSQL who are experiencing poor performance.

effective_cache_size then becomes essentially the last hand-set variable
for medium-sized installations.
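
A sketch of what such first-approximation tuning might look like
(parameter values are placeholders, not recommendations; seq_page_cost
only appeared in a later release):

SET seq_page_cost = 1.0;
SET random_page_cost = 2.5;
SET cpu_tuple_cost = 0.01;
SET effective_cache_size = 524288;   -- in 8kB pages here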
-- 


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [OT] MySQL is bad, but THIS bad?

2006-05-19 Thread Rod Taylor
On Fri, 2006-05-19 at 09:11 -0700, Joshua D. Drake wrote:
 Martijn van Oosterhout wrote:
  On Thu, May 18, 2006 at 02:58:11PM -0400, Mark Woodward wrote:
  The reality is that MySQL is widely supported by some very, shall we say,
  interesting open source projects and using these products with
  PostgreSQL would be a plus.
  
  The biggest headache I find with using postgres is that various GPL
  licenced programs have trouble directly shipping postgresql support
  because of our use of OpenSSL. Each and every one of those program
  needs to add an exception to their licence for distributors to
  distribute postgresql support.
 
 Why would that be the case... OpenSSL and PostgreSQL both are BSD 
 licensed... Am I missing something?

OpenSSL is not under the 3-clause BSD license; it also includes a number of
advertising clauses that the GPL has never liked -- the GPL must not be
modified for derivatives, but the advertising clauses are in addition to
the GPL, so the license would have to be modified for the combination.

Exceptions exist in the GPL for libraries and tools included in the
operating system and this is enough in most cases. GPL applications on
Windows may have problems.


http://www.openssl.org/support/faq.html#LEGAL2
2. Can I use OpenSSL with GPL software?

On many systems including the major Linux and BSD distributions, yes
(the GPL does not place restrictions on using libraries that are part of
the normal operating system distribution). 

On other systems, the situation is less clear. Some GPL software
copyright holders claim that you infringe on their rights if you use
OpenSSL with their software on operating systems that don't normally
include OpenSSL.

If you develop open source software that uses OpenSSL, you may find it
useful to choose an other license than the GPL, or state explicitly that
This program is released under the GPL with the additional exemption
that compiling, linking, and/or using OpenSSL is allowed. If you are
using GPL software developed by others, you may want to ask the
copyright holder for permission to use their software with OpenSSL.




OpenSSL License
 * 1. Redistributions of source code must retain the above copyright
 *notice, this list of conditions and the following disclaimer. 
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *notice, this list of conditions and the following disclaimer in
 *the documentation and/or other materials provided with the
 *distribution.
 *
 * 3. All advertising materials mentioning features or use of this
 *software must display the following acknowledgment:
 *This product includes software developed by the OpenSSL Project
 *for use in the OpenSSL Toolkit. (http://www.openssl.org/)
 *
 * 4. The names OpenSSL Toolkit and OpenSSL Project must not be used
to
 *endorse or promote products derived from this software without
 *prior written permission. For written permission, please contact
 *[EMAIL PROTECTED]
 *
 * 5. Products derived from this software may not be called OpenSSL
 *nor may OpenSSL appear in their names without prior written
 *permission of the OpenSSL Project.
 *
 * 6. Redistributions of any form whatsoever must retain the following
 *acknowledgment:
 *This product includes software developed by the OpenSSL Project
 *for use in the OpenSSL Toolkit (http://www.openssl.org/)



  I'm thinking particularly of FreeRadius but there are others. More than
  once I thought while waiting for stuff to compile: if I'd chosen mysql
  I'd be done by now...
  
  Have a nice day,
-- 


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Compression and on-disk sorting

2006-05-17 Thread Rod Taylor
 Actually, I suspect in most cases it won't matter; I don't think people
 make a habit of trying to sort their entire database. :) But we'd want
 to protect for the oddball cases... yech.

I can make query result sets that are far larger than the database
itself.

create table fat_table_with_few_tuples(fat_status_id serial primary key,
fat_1 text, fat_2 text);

create table thin_table_with_many_tuples(fat_status_id integer
references fat_table_with_few_tuples, thin_1 integer, thin_2 integer);

SELECT * FROM thin_table_with_many_tuples NATURAL JOIN
fat_table_with_few_tuples order by fat_1, thin_1, thin_2, fat_2;


I would be asking the folks trying to use PostgreSQL for data
warehousing what their opinion is. A few fact tables in an audit query
could easily result in a very large amount of temporary diskspace being
required.

-- 


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Number of dimensions of an array parameter

2006-05-08 Thread Rod Taylor
 thhal=# CREATE DOMAIN twodims as int[][];
 CREATE DOMAIN

While still not perfect, you can use a CHECK constraint on the domain to
enforce the dimensions.

It's not perfect because domain constraints are not enforced in all
locations in versions earlier than 8.2. Adding extra explicit casts can
often work around that, though.

ru=# create domain twodims as int[][] check(array_dims(value) =
'[1:2][1:2]');

ru=# select
array_dims('{{{1,2},{3,4}},{{5,3},{9,9}}}'::twodims);
ERROR:  value for domain twodims violates check constraint
twodims_check

ru=# select array_dims('{{1,2},{3,4}}'::twodims);
 array_dims

 [1:2][1:2]
(1 row)

If you want to be fancy, use something like this:

check(array_dims(value) ~ '^\\[1:\\d+\\]\\[1:\\d+\\]$');


-- 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] 8.1.3 and unused files

2006-05-05 Thread Rod Taylor
Am I correct in the thought that the various files listed below are not
used by the database and can be safely removed? There were no other
active db connections when I issued this command.

I think truncate (Slony) left them behind.

ssdb=# select file
from pg_ls_dir('base/'|| (select oid from pg_database where datname =
'ssdb')) as tab(file) 
where file !~ '\\..*$'
and file not in (select relfilenode from pg_class)
and file not in ('PG_VERSION', 'pgsql_tmp');

  file
-
 1434986
 1434984
 1434985
(3 rows)

[EMAIL PROTECTED] 16384]# ls -la 143498[456]*
-rw---   1 rbt  sysadmin 1073741824 May  1 20:56 1434984
-rw---   1 rbt  sysadmin 1073741824 May  1 21:11 1434984.1
-rw---   1 rbt  sysadmin 1073741824 May  1 23:21 1434984.10
-rw---   1 rbt  sysadmin 1073741824 May  1 23:36 1434984.11
-rw---   1 rbt  sysadmin 1073741824 May  1 23:50 1434984.12
-rw---   1 rbt  sysadmin 1073741824 May  2 00:06 1434984.13
-rw---   1 rbt  sysadmin 1073741824 May  2 00:23 1434984.14
-rw---   1 rbt  sysadmin 1073741824 May  2 00:39 1434984.15
-rw---   1 rbt  sysadmin 1073741824 May  2 00:57 1434984.16
-rw---   1 rbt  sysadmin 1073741824 May  2 01:14 1434984.17
-rw---   1 rbt  sysadmin 1073741824 May  2 01:31 1434984.18
-rw---   1 rbt  sysadmin 1073741824 May  2 01:50 1434984.19
-rw---   1 rbt  sysadmin 1073741824 May  1 21:25 1434984.2
-rw---   1 rbt  sysadmin 1073741824 May  2 02:07 1434984.20
-rw---   1 rbt  sysadmin 1073741824 May  2 02:23 1434984.21
-rw---   1 rbt  sysadmin 1073741824 May  2 02:41 1434984.22
-rw---   1 rbt  sysadmin 1073741824 May  2 02:55 1434984.23
-rw---   1 rbt  sysadmin 1073741824 May  2 03:09 1434984.24
-rw---   1 rbt  sysadmin 1073741824 May  2 03:24 1434984.25
-rw---   1 rbt  sysadmin 1073741824 May  2 03:37 1434984.26
-rw---   1 rbt  sysadmin 1073741824 May  2 03:53 1434984.27
-rw---   1 rbt  sysadmin 1073741824 May  2 04:09 1434984.28
-rw---   1 rbt  sysadmin 1073741824 May  2 04:24 1434984.29
-rw---   1 rbt  sysadmin 1073741824 May  1 21:40 1434984.3
-rw---   1 rbt  sysadmin 1073741824 May  2 04:40 1434984.30
-rw---   1 rbt  sysadmin 1073741824 May  2 04:56 1434984.31
-rw---   1 rbt  sysadmin 990912512 May  2 05:09 1434984.32
-rw---   1 rbt  sysadmin 1073741824 May  1 21:54 1434984.4
-rw---   1 rbt  sysadmin 1073741824 May  1 22:08 1434984.5
-rw---   1 rbt  sysadmin 1073741824 May  1 22:23 1434984.6
-rw---   1 rbt  sysadmin 1073741824 May  1 22:36 1434984.7
-rw---   1 rbt  sysadmin 1073741824 May  1 22:52 1434984.8
-rw---   1 rbt  sysadmin 1073741824 May  1 23:07 1434984.9
-rw---   1 rbt  sysadmin8192 May  1 20:40 1434985
-rw---   1 rbt  sysadmin 1073741824 May  2 11:27 1434986
-rw---   1 rbt  sysadmin 1073741824 May  2 11:39 1434986.1
-rw---   1 rbt  sysadmin 121733120 May  2 16:53 1434986.10
-rw---   1 rbt  sysadmin 1073741824 May  2 11:56 1434986.2
-rw---   1 rbt  sysadmin 1073741824 May  2 12:15 1434986.3
-rw---   1 rbt  sysadmin 1073741824 May  2 12:43 1434986.4
-rw---   1 rbt  sysadmin 1073741824 May  2 13:15 1434986.5
-rw---   1 rbt  sysadmin 1073741824 May  2 13:53 1434986.6
-rw---   1 rbt  sysadmin 1073741824 May  2 14:35 1434986.7
-rw---   1 rbt  sysadmin 1073741824 May  2 15:38 1434986.8
-rw---   1 rbt  sysadmin 1073741824 May  2 16:53 1434986.9
-- 


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] 8.1.3 and unused files

2006-05-05 Thread Rod Taylor
On Fri, 2006-05-05 at 14:09 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  Am I correct in the thought that the various files listed below are not
  used by the database and can be safely removed? There were no other
  active db connections when I issued this command.
 
  I think truncate (Slony) left them behind.
 
 I don't particularly like the casual assumption that truncate is broken.

 If I were you I'd be looking harder for a plausible explanation about
 where these files came from, especially seeing how large they are.
 Have you tried dumping the file contents to see if the data looks
 recognizable at all?

Hardware is perfectly functional and has been for about 18 months in
production with 8.0.x.

It is a completely new 8.1 database and Slony is the only entity that
has been working in it. There are not very many possibilities.


I'm fairly confident I know exactly which table they are/were a part of.
1434984 is the table data, 1434986 is the primary key of the table (its only
index), and 1434985 is probably the toast structure.

The structures have different relfilenode values and valid data at this
time.

At some point it must have failed in copying the data across, aborted,
and restarted.


So it would have been something like this:

BEGIN; 
TRUNCATE; 
decouple indexes -- ask Jan; 
COPY; 
recouple indexes; 
REINDEX; then a crash, abort, or something else causes Slony to restart;

reconnect
BEGIN; 
TRUNCATE; 
decouple indexes -- ask Jan; 
COPY; recouple indexes; 
REINDEX;
COMMIT;

-- 


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 8.1.3 and unused files

2006-05-05 Thread Rod Taylor
On Fri, 2006-05-05 at 14:31 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  At some point it must have failed in copying the data across, aborted,
  and restarted.
 
 Unless you had an actual backend crash, that's not an adequate
 explanation.  Transaction abort does clean up created files.

The only reason I noticed is because pg_database_size didn't match
sum(pg_total_relation_size()) and I was investigating what I thought was a
bug in one of those functions.


I'm afraid we don't have all of the monitoring, logging, and change
control bits hooked up to the non-production DBs, so that is pretty much
all I have other than conjecture.

The only thing I can come up with is that perhaps someone forcefully
gave it a kick. SIGTERM is a necessary action once in a while to unwedge
a stuck db connection (killing the client script doesn't always get it
immediately).

Slony holds open a transaction on the master while reindexing the slave,
so perhaps someone thought the slave needed help. Making a copy of the
master takes several weeks. They may have killed slony, found the
statements still working away, SIGTERM'd them both, then restarted
slony. It wouldn't be an unusual pattern of events, particularly since
they've not been taught about pg_cancel_backend() yet (no 8.1 training).

How about this?

BEGIN;
TRUNCATE;
COPY;
REINDEX; SIGTERM arrives during the REINDEX;


pg_class references old files. New files in their aborted state are left
behind?


-- 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] 8.1.3 and unused files

2006-05-05 Thread Rod Taylor
On Fri, 2006-05-05 at 15:10 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  On Fri, 2006-05-05 at 14:31 -0400, Tom Lane wrote:
  Unless you had an actual backend crash, that's not an adequate
  explanation.  Transaction abort does clean up created files.
 
  The only thing I can come up with is that perhaps someone forcefully
  gave it a kick. SIGTERM is a necessary action once in a while to unwedge
  a stuck db connection (killing the client script doesn't always get it
  immediately).
 
 SIGTERM wouldn't cause that either.  I hope your people are not in the
 habit of using kill -9?

Command line records, etc. are not available, but I did track down a
snippet of logs from the backups (daily log rotation). Sorry, I didn't
realize there were backups initially -- it's unusual. It appears it did
crash or get killed in some way (exited with exit code 1).

It's a temp DB to try a different database encoding (prep for 8.1
upgrade) with production data.

Is there something you would like me to try doing in an attempt to
reproduce? Preferably with a smaller structure.

The truncate would have happened as part of the prepareTableForCopy()
call.

slony%ssdb 10171 4621947 2006-05-02 05:09:40 EDTLOG:  0: duration: 
30526368.316 ms  statement: select _ssrep.prepareTableForCopy(1010); copy 
SNIP from stdin;
slony%ssdb 10171 4621947 2006-05-02 05:09:40 EDTLOCATION:  exec_simple_query, 
postgres.c:1103
slony%ssdb 10171 4621947 2006-05-02 05:09:40 EDTLOCATION:  exec_simple_query, 
postgres.c:1103
slony%ssdb 10181 0 2006-05-02 15:32:06 EDTLOG:  08P01: unexpected EOF on client 
connection
slony%ssdb 10181 0 2006-05-02 15:32:06 EDTLOCATION:  SocketBackend, 
postgres.c:295
slony%ssdb 10154 0 2006-05-02 15:32:06 EDTLOG:  08P01: unexpected EOF on client 
connection
slony%ssdb 10154 0 2006-05-02 15:32:06 EDTLOCATION:  SocketBackend, 
postgres.c:295
slony%ssdb 10173 0 2006-05-02 16:30:53 EDTLOG:  08P01: unexpected EOF on client 
connection
slony%ssdb 10173 0 2006-05-02 16:30:53 EDTLOCATION:  SocketBackend, 
postgres.c:295
slony%ssdb 10755 0 2006-05-02 16:30:53 EDTLOG:  08P01: unexpected EOF on client 
connection
slony%ssdb 10755 0 2006-05-02 16:30:53 EDTLOCATION:  SocketBackend, 
postgres.c:295
slony%ssdb 300 0 2006-05-02 16:55:18 EDTLOG:  08P01: unexpected EOF on client 
connection
slony%ssdb 300 0 2006-05-02 16:55:18 EDTLOCATION:  SocketBackend, postgres.c:295
slony%ssdb 301 0 2006-05-02 16:55:18 EDTLOG:  08P01: unexpected EOF on client 
connection
slony%ssdb 301 0 2006-05-02 16:55:18 EDTLOCATION:  SocketBackend, postgres.c:295
% 1960  2006-05-02 17:03:19 EDTLOG:  0: server process (PID 10171) exited 
with exit code 1
% 1960  2006-05-02 17:03:19 EDTLOCATION:  LogChildExit, postmaster.c:2416
% 1960  2006-05-02 17:03:19 EDTLOG:  0: terminating any other active server 
processes
% 1960  2006-05-02 17:03:19 EDTLOCATION:  HandleChildCrash, postmaster.c:2306
% 1960  2006-05-02 17:03:19 EDTLOCATION:  HandleChildCrash, postmaster.c:2306
% 1960  2006-05-02 17:03:19 EDTLOG:  0: all server processes terminated; 
reinitializing
% 1960  2006-05-02 17:03:19 EDTLOCATION:  reaper, postmaster.c:2206
 snip connection attempts 
% 5826  2006-05-02 17:03:22 EDTLOG:  0: database system was interrupted at 
2006-05-02 16:06:20 EDT
% 5826  2006-05-02 17:03:22 EDTLOCATION:  StartupXLOG, xlog.c:4374
% 5826  2006-05-02 17:03:22 EDTLOG:  0: checkpoint record is at 59/E0B56920
% 5826  2006-05-02 17:03:22 EDTLOCATION:  StartupXLOG, xlog.c:4442
% 5826  2006-05-02 17:03:22 EDTLOG:  0: redo record is at 59/E0B56920; undo 
record is at 0/0; shutdown FALSE
% 5826  2006-05-02 17:03:22 EDTLOCATION:  StartupXLOG, xlog.c:4469
% 5826  2006-05-02 17:03:22 EDTLOG:  0: next transaction ID: 4863932; next 
OID: 1441853
% 5826  2006-05-02 17:03:22 EDTLOCATION:  StartupXLOG, xlog.c:4472
% 5826  2006-05-02 17:03:22 EDTLOG:  0: next MultiXactId: 1; next 
MultiXactOffset: 0
% 5826  2006-05-02 17:03:22 EDTLOCATION:  StartupXLOG, xlog.c:4475
% 5826  2006-05-02 17:03:22 EDTLOG:  0: database system was not properly 
shut down; automatic recovery in progress
% 5826  2006-05-02 17:03:22 EDTLOCATION:  StartupXLOG, xlog.c:4532
% 5826  2006-05-02 17:03:22 EDTLOG:  0: redo starts at 59/E0B56970
% 5826  2006-05-02 17:03:22 EDTLOCATION:  StartupXLOG, xlog.c:4569
% 5826  2006-05-02 17:03:22 EDTLOG:  0: record with zero length at 
59/E0E429B8
% 5826  2006-05-02 17:03:22 EDTLOCATION:  ReadRecord, xlog.c:2764
% 5826  2006-05-02 17:03:22 EDTLOG:  0: redo done at 59/E0E42988
% 5826  2006-05-02 17:03:22 EDTLOCATION:  StartupXLOG, xlog.c:4627
% 5826  2006-05-02 17:03:23 EDTLOG:  0: database system is ready
% 5826  2006-05-02 17:03:23 EDTLOCATION:  StartupXLOG, xlog.c:4821
% 5826  2006-05-02 17:03:23 EDTLOG:  0: transaction ID wrap limit is 
1073749769, limited by database ssdb
% 5826  2006-05-02 17:03:23 EDTLOCATION:  SetTransactionIdLimit, varsup.c:234
-- 


---(end of broadcast

Re: [HACKERS] 8.1.3 and unused files

2006-05-05 Thread Rod Taylor
On Fri, 2006-05-05 at 16:11 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  % 1960  2006-05-02 17:03:19 EDTLOG:  0: server process (PID 10171) 
  exited with exit code 1
 
 Hm.  I wonder if there are any uses of exit(1) in the Slony triggers.

It doesn't appear so. It does have this though:

Datum
_Slony_I_killBackend(PG_FUNCTION_ARGS)
{
int32   pid;
int32   signo;
text   *signame;

if (!superuser())
elog(ERROR, "Slony-I: insufficient privilege for killBackend");

pid = PG_GETARG_INT32(0);
signame = PG_GETARG_TEXT_P(1);

if (VARSIZE(signame) == VARHDRSZ + 4 &&
memcmp(VARDATA(signame), "NULL", 0) == 0)
{
signo = 0;
}
else if (VARSIZE(signame) == VARHDRSZ + 4 &&
memcmp(VARDATA(signame), "TERM", 0) == 0)
{
signo = SIGTERM;
}
else
{
elog(ERROR, "Slony-I: unsupported signal");
}

if (kill(pid, signo) < 0)
PG_RETURN_INT32(-1);

PG_RETURN_INT32(0);
}

-- 


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] 8.1.3 and unused files

2006-05-05 Thread Rod Taylor
On Fri, 2006-05-05 at 18:53 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  On Fri, 2006-05-05 at 16:11 -0400, Tom Lane wrote:
  Hm.  I wonder if there are any uses of exit(1) in the Slony triggers.
 
  It doesn't appear so. It does have this though:
 
 Well, a SIGTERM would have resulted in a bleat in the postmaster log.
 The striking thing about your log is that the backend went down without
 saying a word; which would be understandable if it had crashed (eg SEGV
 or kill -9) but then the postmaster would have seen some other exit
 status.  I'm fairly certain there are no paths in the standard backend
 code that will exit(1) without any attempt to print a message.  That's
 why I'm wondering about add-ons.

Add-ons are slim: Slony. We don't have any C-based functions and only a
few plpgsql functions in that DB.

I did trim out a ton of autovacuum log entries (it likes to log once a
minute) but I don't see anything interesting in that area, nor was
autovacuum the pid that exited.

My knowledge of signal handling is pretty basic. Any chance that
multiple SIGTERMs could have caused it to avoid the logging?

-- 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] autovacuum logging, part deux.

2006-05-04 Thread Rod Taylor
I don't know about anyone else, but the only time I look at that mess is
to find poor tuple/table or tuple/index ratios and other indications
that vacuum isn't working as well as it should be.

How about this instead:

Log when the actual autovacuum_vacuum_scale_factor (dead space cleaned
up) was more than 2 times the autovacuum_vacuum_scale_factor listed in
postgresql.conf. This means autovacuum isn't keeping up with what you want
it to do.

Another interesting case would be a large amount of empty space in the
index or table (say 3x autovacuum_vacuum_scale_factor). This may
indicate unnecessary bloat and something to fix.

Aside from that, the raw numbers don't really interest me.
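
A rough monitoring query along those lines, assuming the
n_live_tup/n_dead_tup columns that later appeared in pg_stat_user_tables:

SELECT relname, n_live_tup, n_dead_tup,
       n_dead_tup::float8 / nullif(n_live_tup, 0) AS dead_ratio
  FROM pg_stat_user_tables
 WHERE n_dead_tup::float8 / nullif(n_live_tup, 0) >
       2 * current_setting('autovacuum_vacuum_scale_factor')::float8;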

On Thu, 2006-05-04 at 14:46 +, Chris Browne wrote:
 [EMAIL PROTECTED] (Larry Rosenman) writes:
  Gentlepeople,
  Now that the patch is out for keeping the last
  autovacuum/vacuum/analyze/autoanalyze
  timestamp in the stats system is pending, what's the consensus view on
  what, if any,
  logging changes are wanted for autovacuum?
 
  I have the time and inclination to cut code quickly for it.
 
 It would be Really Nice if it could draw in the verbose stats as to
 what the VACUUM did...
 
 e.g. - to collect some portion (INFO?  DETAIL?  I'm easy :-)) of the
 information that PostgreSQL generates at either INFO: or DETAIL:
 levels.
 
 /* [EMAIL PROTECTED]/dba2 vacdb=*/ vacuum verbose analyze vacuum_requests;
 INFO:  vacuuming public.vacuum_requests
 INFO:  index vacuum_requests_pkey now contains 2449 row versions in 64 pages
 DETAIL:  3 index pages have been deleted, 3 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  index vr_priority now contains 0 row versions in 19 pages
 DETAIL:  16 index pages have been deleted, 16 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  vacuum_requests: found 0 removable, 2449 nonremovable row versions 
 in 65 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 There were 2809 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  vacuuming pg_toast.pg_toast_95167460
 INFO:  index pg_toast_95167460_index now contains 0 row versions in 1 pages
 DETAIL:  0 index pages have been deleted, 0 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  pg_toast_95167460: found 0 removable, 0 nonremovable row versions in 
 0 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 There were 0 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  analyzing public.vacuum_requests
 INFO:  vacuum_requests: 65 pages, 2449 rows sampled, 2449 estimated total 
 rows
 VACUUM
 
-- 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] autovacuum logging, part deux.

2006-05-04 Thread Rod Taylor
On Thu, 2006-05-04 at 11:25 -0500, Larry Rosenman wrote:
 Rod Taylor wrote:
  I don't know about anyone else, but the only time I look at that mess
  is to find poor tuple/table or tuple/index ratios and other
  indications that vacuum isn't working as well as it should be.
  
  How about this instead:
  
  Log when the actual autovacuum_vacuum_scale_factor (dead space cleaned
  up) was more than 2 times the autovacuum_vacuum_scale_factor listed in
  postgresql.conf. This means autovacuum isn't keeping up to what you
  want it to.
  
  Another interesting case would be a large amount of empty space in the
  index or table (say 3x autovacuum_vacuum_scale_factor). This may
  indicate unnecessary bloat and something to fix.
  
  Aside from that, the raw numbers don't really interest me.
 
 
 Does anyone think we should have a stats view for the last vacuum stats
 for each table?

This would actually suit me better, as it would be trivial to plug into a
monitoring system with home-brew per-table thresholds at that point.
-- 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Is a SERIAL column a black box, or not?

2006-04-30 Thread Rod Taylor
On Sat, 2006-04-29 at 23:15 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  Do both. Return SERIAL to being a macro and implement the SQL IDENTITY
  construct as the black box version.
 
 Doesn't SQL IDENTITY have a number of properties that are significantly
 different from serial/nextval?  I wasn't really volunteering to
 implement a large new feature to make this happen ;-)

Yes. Including a few really nice properties and a really ugly
workaround.

I didn't mean to imply that you should write it. I just meant that the
spec already has an automatic sequence generator which is black-box.

If SERIAL is going to be kept long term, then it should be the macro
version so it doesn't appear too duplicated.

 Also, I'm not sure how black boxy it can be without buying right back
 into the pg_dump problems.  pg_dump has to be able to see what's inside,
 I think.

Not sure which pg_dump problem you're referring to. A fully black-box
generator would completely hide the column default and sequence. pg_dump
and users can create and modify foreign keys without knowledge of the
trigger implementation; the same would be true here.

For the spec, the ugly workaround is OVERRIDING SYSTEM VALUE, which
allows a table owner to override the GENERATED ALWAYS designation --
essentially the same as a trigger bypass switch for bulk data loads.

-- 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Is a SERIAL column a black box, or not?

2006-04-29 Thread Rod Taylor
On Sat, 2006-04-29 at 17:54 -0400, Tom Lane wrote:
 In some recent activity on the patches list about responding to bug #2073,
 http://archives.postgresql.org/pgsql-bugs/2005-11/msg00303.php
 we've been discussing various possible tweaks to the behavior of dropping
 or modifying a serial column.  The hacks involved with SERIAL seem to me
 to be getting uglier and uglier, and I think it's time to take a step
 back and consider what we really want SERIAL to act like.
 
 It seems to me there are two basic philosophies at war here:
 
 1. A serial column is a black box that you're not supposed to muck with
 the innards of.


 2. A serial declaration is just a macro for setting up a sequence and a
 column default expression.  This was the original viewpoint and indeed is
 still what it says in the documentation:

 Comments, other opinions?

Do both. Return SERIAL to being a macro and implement the SQL IDENTITY
construct as the black box version.

CREATE TABLE foo (bar integer PRIMARY KEY GENERATED BY DEFAULT
AS IDENTITY);

INSERT ... RETURNS needs to be implemented before SERIAL can become a
black box. Until that time we will still need some knowledge of the
sequence involved.
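
For context, the macro view of SERIAL is roughly the following expansion
(sketch only; the OWNED BY dependency link arrived in a later release):

CREATE TABLE t (id serial);

-- is approximately shorthand for:
CREATE SEQUENCE t_id_seq;
CREATE TABLE t (id integer NOT NULL DEFAULT nextval('t_id_seq'));
ALTER SEQUENCE t_id_seq OWNED BY t.id;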

-- 


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Left joins and inheritance (table partitioning)

2006-04-04 Thread Rod Taylor
I've recently been playing with table partitioning limitations, turning
over a large volume of data in inherited structures in a live environment,
and have run into a couple of snags in the planner.

The first is that LEFT JOIN will always do a sequential scan on all
inherited tables.

The second is that IN (1,4,6) works very differently than IN (SELECT id
FROM tab) when tab contains the values 1, 4, and 6. I'm not surprised
a straight left join failed, but I was surprised that IN failed to use
an index with enable_seqscan = off.


My fallback plan is to simply create a view and replace it to point to
the correct data segment when changes occur.



BEGIN;

CREATE TABLE key (keyword_id serial PRIMARY KEY);
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;
INSERT INTO key DEFAULT VALUES;

CREATE TABLE key2 (keyword_id integer PRIMARY KEY);
INSERT INTO key2 VALUES (1);
INSERT INTO key2 VALUES (6);
INSERT INTO key2 VALUES (8);
ANALYZE key2;


CREATE TABLE foo 
( keyword_id integer PRIMARY KEY
 REFERENCES key);

CREATE TABLE foo2
( PRIMARY KEY (keyword_id)
, FOREIGN KEY (keyword_id) REFERENCES KEY
) INHERITS (foo);

INSERT INTO foo2 VALUES (1);
INSERT INTO foo2 VALUES (2);
INSERT INTO foo2 VALUES (3);
INSERT INTO foo2 VALUES (4);
INSERT INTO foo2 VALUES (5);
INSERT INTO foo2 VALUES (6);
INSERT INTO foo2 VALUES (7);
INSERT INTO foo2 VALUES (8);
INSERT INTO foo2 VALUES (9);

SET enable_seqscan = off;

EXPLAIN
SELECT * FROM key2 LEFT JOIN foo USING (keyword_id) ;

 QUERY PLAN
-----------------------------------------------------------------------------
 Hash Left Join  (cost=20073.50..20191.74 rows=6 width=4)
   Hash Cond: ("outer".keyword_id = "inner".keyword_id)
   ->  Index Scan using key2_pkey on key2  (cost=0.00..3.03 rows=3 width=4)
   ->  Hash  (cost=20062.80..20062.80 rows=4280 width=4)
         ->  Append  (cost=1.00..20062.80 rows=4280 width=4)
               ->  Seq Scan on foo  (cost=1.00..10031.40 rows=2140 width=4)
               ->  Seq Scan on foo2 foo  (cost=1.00..10031.40 rows=2140 width=4)
(7 rows)


EXPLAIN
SELECT *
  FROM key2
LEFT JOIN (SELECT keyword_id
 FROM foo
WHERE keyword_id IN (SELECT keyword_id FROM key2)
  ) AS tab USING (keyword_id) ;

QUERY PLAN
-----------------------------------------------------------------------------
 Merge Left Join  (cost=20087.38..20090.46 rows=3 width=4)
   Merge Cond: ("outer".keyword_id = "inner".keyword_id)
   ->  Index Scan using key2_pkey on key2  (cost=0.00..3.03 rows=3 width=4)
   ->  Sort  (cost=20087.38..20087.39 rows=6 width=4)
         Sort Key: public.foo.keyword_id
         ->  Hash IN Join  (cost=10003.04..20087.30 rows=6 width=4)
               Hash Cond: ("outer".keyword_id = "inner".keyword_id)
               ->  Append  (cost=1.00..20062.80 rows=4280 width=4)
                     ->  Seq Scan on foo  (cost=1.00..10031.40 rows=2140 width=4)
                     ->  Seq Scan on foo2 foo  (cost=1.00..10031.40 rows=2140 width=4)
               ->  Hash  (cost=3.03..3.03 rows=3 width=4)
                     ->  Index Scan using key2_pkey on key2  (cost=0.00..3.03 rows=3 width=4)
(12 rows)

EXPLAIN
SELECT *
  FROM key2
LEFT JOIN (SELECT keyword_id
 FROM foo
WHERE keyword_id IN (1,6,8)
  ) AS tab USING (keyword_id) ;

   QUERY PLAN

 Merge Left Join  (cost=22.08..25.16 rows=3 width=4)
   Merge Cond: ("outer".keyword_id = "inner".keyword_id)
   ->  Index Scan using key2_pkey on key2  (cost=0.00..3.03 rows=3 width=4)
   ->  Sort  (cost=22.08..22.09 rows=6 width=4)
         Sort Key: public.foo.keyword_id
         ->  Append  (cost=3.01..22.00 rows=6 width=4)
               ->  Bitmap Heap Scan on foo  (cost=3.01..9.50 rows=3 width=4)
                     Recheck Cond: ((keyword_id = 1) OR (keyword_id = 6) OR (keyword_id = 8))
                     ->  BitmapOr  (cost=3.01..3.01 rows=3 width=0)
                           ->  Bitmap Index Scan on foo_pkey  (cost=0.00..1.00 rows=1 width=0)
                                 Index Cond: (keyword_id = 1)
                           ->  Bitmap Index Scan on foo_pkey  (cost=0.00..1.00 rows=1 width=0)
                                 Index Cond: (keyword_id = 6)
                           ->  Bitmap Index Scan on foo_pkey  (cost=0.00..1.00 rows=1 width=0)
                                 Index Cond: (keyword_id = 8)
               ->  Bitmap Heap Scan on foo2 foo  (cost=6.01..12.50 rows=3 width=4)
 

Re: [HACKERS] Left joins and inheritance (table partitioning)

2006-04-04 Thread Rod Taylor
On Tue, 2006-04-04 at 23:50 -0400, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  I've recently been playing with table partitioning limitations. Turning
  over a large volume of data in inherited structures in a live
  environment, and have run into a couple of snags in the planner.
 
  The first is that LEFT JOIN will always do a sequential scan on all
  inherited tables.
 
 Try it in CVS HEAD.

Indeed, much better.

Sorry about the noise.

-- 


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Did this work in earlier version of Postgres?

2006-03-23 Thread Rod Taylor
On Thu, 2006-03-23 at 17:31 -0600, Tony Caduto wrote:
 I could have swore that this worked in earlier releases of Postgresql 
 i.e. 7.4.
 
 CREATE TABLE public.test
 (
 junk double NOT NULL,
 CONSTRAINT junk_pkey PRIMARY KEY (junk)
 )WITHOUT OIDS;
 
 Now it gives a error that type double does not exist.

CREATE DOMAIN double AS float8;

There, now the type exists ;)

-- 


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Did this work in earlier version of Postgres?

2006-03-23 Thread Rod Taylor
On Thu, 2006-03-23 at 16:05 -0800, Joshua D. Drake wrote:
 Rod Taylor wrote:
  On Thu, 2006-03-23 at 17:31 -0600, Tony Caduto wrote:
  I could have swore that this worked in earlier releases of Postgresql 
  i.e. 7.4.
 
  CREATE TABLE public.test
  (
  junk double NOT NULL,
  CONSTRAINT junk_pkey PRIMARY KEY (junk)
  )WITHOUT OIDS;
 
  Now it gives a error that type double does not exist.
  
  CREATE DOMAIN double AS float8;
  
  There, now the type exists ;)
 
 That's a little too perl for me ;)

I suppose it depends on the goal. If it is an application that is to be
supported on more than one database, defining types and other things for
a given DB type (PostgreSQL) is easier than injecting a large number of
SWITCH statements into the code.

-- 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Did this work in earlier version of Postgres?

2006-03-23 Thread Rod Taylor
On Thu, 2006-03-23 at 16:41 -0800, Joshua D. Drake wrote:
  Now it gives a error that type double does not exist.
  CREATE DOMAIN double AS float8;
 
  There, now the type exists ;)
  That's a little too perl for me ;)
  
  I suppose it depends on the goal. If it is an application that is to be
  supported on more than one database, defining types and other things for
  a given DB type (PostgreSQL) is easier than injecting a large number of
  SWITCH statements into the code.
 \
 
 Why in the world would you build an application for anything except 
 PostgreSQL?

To prove that, as unbelievable as it sounds, it is possible to do such a
thing? Don't worry, such a thing would not get into a production
environment.

-- 


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] PostgreSQL Anniversary Proposals -- Important Update

2006-03-18 Thread Rod Taylor
On Fri, 2006-03-17 at 22:03 -0500, Tom Lane wrote:
 Josh Berkus josh@agliodbs.com writes:
  -- There are only 13 days left to submit a proposal.  Please do so.  We'd 
  rather not be forced into a last-minute rush to evaluate all of the stuff 
  in April.  Remember this is a family event so you don't have to have all 
  of your materials together before you send something.  Heck, if you have 
  an idea for a talk you'd really, really, really like to see and can't give 
  it, send it anyway.  We may be able to find a speaker.
 
 Speaking of which, I've been trying to think of a talk proposal and am
 not coming up with anything that seems terribly sexy.  I've talked a
 couple times about the planner and am afraid people would be bored by
 that again.  I'd be willing to hold forth on almost any part of the
 backend system design (a bold claim, but with three months to prepare
 I figure I can back it up...).  What would people like to hear about?

This will, presumably, be a very PostgreSQL-friendly group, so a sales
pitch isn't really required.

How about the opposite? Tom Lane's list of areas where PostgreSQL does a
poor job, with a detailed explanation of how each design decision or
limitation came about, as well as what can (or cannot) be done to fix it.

I know there are a large number of items on your personal TODO and
CANNOTDO lists that have either had very brief or no discussion in the
mailing lists. Usage patterns that PostgreSQL simply does not handle
well for not-so-obvious reasons and how to either work around those
limitations as a user or changes that could be made to fix them.

One example might be a 'self-aggregating' structure. Start with one
entry per minute in a table indexed by time. After 2 weeks pass, the
per-minute data is aggregated: the single entry at the start of the day is
updated with the aggregate value, and the other entries for the day are
removed. I believe this can cause significant index bloat since it results
in a few entries per page in the index.

Using 2 structures via inheritance, with one holding the per-minute data
and one holding the per-day aggregates, is much better.
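
A sketch of the two-structure layout (names invented for illustration):

CREATE TABLE readings (ts timestamptz, value numeric);      -- per-day aggregates
CREATE TABLE readings_minute () INHERITS (readings);        -- raw per-minute rows

-- periodically roll old per-minute data up into the parent:
INSERT INTO readings
SELECT date_trunc('day', ts), sum(value)
  FROM readings_minute
 WHERE ts < now() - interval '2 weeks'
 GROUP BY 1;

DELETE FROM readings_minute WHERE ts < now() - interval '2 weeks';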


In short, tell us why the hammer of PostgreSQL makes a bad screwdriver.


-- 


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

