date:20110223

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread PostgreSQL - Hans-Jürgen Schönig


On Feb 23, 2011, at 2:58 AM, Robert Haas wrote:

 2011/2/22 PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at:
 how does it work? we try to find suitable statistics for an arbitrary length 
 list of conditions so that the planner can use it directly rather than 
 multiplying all the selectivities. this should make estimates a lot more 
 precise.
 the current approach can be extended to work with expressions and well as 
 straight conditions.
 
 /me prepares to go down in flames.
 
 Personally, I think the first thing we ought to do is add a real, bona
 fide planner hint to override the selectivity calculation manually,
 maybe something like this:
 
 WHERE (x  5 AND y = 1) SELECTIVITY (0.1);


i thought there was an agreement that we don't want planner hints?
as tom pointed out - many broken queries come out of some query generator where 
even the design to make the design is broken by design.
personally i like query generators as long as other people use them ... telling 
people that this is the wrong way to go is actually financing my holiday next 
week ... ;).  in general - hibernate and stuff like that is a no-go.

personally i like the type of planner hints oleg and teodor came up with - i 
think we should do more of those hooks they are using but hiding it in some 
syntax is not a good idea.
it does not change the query and it still gives a lot of room to toy around. it 
looks like a compromise.

however, oleg's contrib module does not fix the core problem of cross column 
statistics because a hint is usually static but you want flexible selectivity.

regards,

hans

--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread PostgreSQL - Hans-Jürgen Schönig

 Those are real problems, but I still want it.  The last time I hit
 this problem I spent two days redesigning my schema and adding
 triggers all over the place to make things work.  If I had been
 dealing with a 30TB database instead of a 300MB database I would have
 been royally up a creek.
 
 To put that another way, it's true that some people can't adjust their
 queries, but also some people can.  It's true that nonstandard stuff
 sucks, but queries that don't work suck, too.  And as for better
 solutions, how many major release cycles do we expect people to wait
 for them?  Even one major release cycle is an eternity when you're
 trying to get the application working before your company runs out of
 money, and this particular problem has had a lot of cycles expended on
 it without producing anything very tangible (proposed patch, which
 like you I can't spare a lot of cycles to look at just now, possibly
 excepted).



cheapest and easiest solution if you run into this: add fake functions which 
the planner cannot estimate properly.
use OR to artificially prop up estimates or use AND to artificially lower them. 
there is actually no need to redesign the schema to get around it but it is 
such an ugly solution that it does not even deserve to be called ugly ...
however, fast and reliable way to get around it.

regards,

hans

--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Synchronous standbys?

2011-02-23 Thread Tatsuo Ishii

Reading documents coming with Simon's patches, I'm a little bit
confused by the idea of synchronous standbys.

In the sgmls doc, The commit wait will last until the first reply
from any standby. Multiple standby servers allow increased
availability and possibly increase performance as well.

So in my understanding there could be one or more synchronous
standbys. However in his mail:

If the current synchronous standby dies then one of the other
standbys will take over.

It seems there's only one synchronous standby allowed at the same
time.

Does anybody know which one is correct?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Synchronous standby

2011-02-23 Thread Tatsuo Ishii

In 9.1, we will be able to have synchrnous replication. Also we have
one standby server chosen by primary to be the synchronous standby
(still I'm not sure this is correct or not as stated in another mail).

Is there anyway to know which is the synchronous standby? IMO this is
important for users because that one is likely the least behind to
primary and will be chosen to promoto in case of primary dying in most
cases.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Binary in/out for aclitem

2011-02-23 Thread rsmogura


On Tue, 22 Feb 2011 20:20:39 -0500, Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:

On Tue, Feb 22, 2011 at 5:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
It'd be more future-proof than this patch, but I'm still 
unconvinced

about the use-case.


Do we want to intentionally make binary format a second-class 
citizen?


Well, it's not exactly a first-class citizen; compare for instance 
the
amount of verbiage in the docs about text I/O formats versus the 
amount
about binary formats.  But my question isn't about that; it's about 
why

aclitem should be considered a first-class citizen.  It makes me
uncomfortable that client apps are looking at it at all, because any
that do are bound to get broken in the future, even assuming that 
they
get the right answers today.  I wonder how many such clients are up 
to
speed for per-column privileges and non-constant default privileges 
for

instance.  And sepgsql is going to cut them off at the knees.

regards, tom lane


Technically, at eye glance, I didn't seen in sepgsql modifications to 
acl.h. So, I think, aclitem will be unaffected. In any way sepgsql needs 
some way to present access rights to administrator it may use own model, 
or aclitem, too.


JDBC, and other applications may use aclitem to get just information 
about who has what access. I think psql does this in same manner as 
JDBC, by calling select from pg_class. But if user, through psql, JDBC 
or other driver. will invoke select * from pg_class it will fail with 
no binary output, because it is plain user query.


Currently proposed binary output has space for 4 more privs. Am I 
right?


One thing I realized, I do not pass flag if grant target is group or 
user.


Regards,
Radek


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python quoting functions

2011-02-23 Thread Jan Urbański

On 22/02/11 22:48, Peter Eisentraut wrote:
 Committed this, with two changes:  Changed some things around with the
 way const char * is propagated.  Just casting it away is not nice.  Also
 dropped the error tests in the _quote.sql regression test.  This
 generates three different wordings of error messages from Python with
 2.6, 3.1, and 3.2, which I don't care to maintain.  Maybe one day we'll
 have a better solution for this.

Thanks.

One thing: you removed the conditional pfree from PLy_quote_ident, which
makes this function leak memory if the actual quoting took place, no? Is
that leak too small to worry about?

Cheers,
Jan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL FDW update

2011-02-23 Thread Shigeru HANADA


On Tue, 22 Feb 2011 11:33:25 -0500
Robert Haas robertmh...@gmail.com wrote:
 Is anyone actually working on a new version of this patch sufficiently
 rapidly that we can expect a new version in the next day or two?
 
 If not, I think we mark this one Returned with Feedback and revisit it for 
 9.2.

I'm working on it.

Fixes for new FDW API have been done, but there are some problems in
SQL generation codes, such as SELECT clause optimization (omitting
unused column from SELECT clause).  It would take a while, but I'll
post revised version of the patch tomorrow.

Regards,
--
Shigeru Hanada



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How to extract a value from a record using attnum or attname?

2011-02-23 Thread Alvaro Herrera

Excerpts from Kevin Grittner's message of mar feb 22 20:29:26 -0300 2011:
 Andrew Dunstan and...@dunslane.net wrote:
  
  Have you performance tested it? Scanning pg_index for index
  columns for each row strikes me as likely to be unpleasant.
  
 I haven't, yet.  I had rather assumed that the index info for a
 relation would have a high probability of being cached during
 execution of an AFTER trigger for that relation, so I think we're
 talking RAM access here.  It didn't seem sane to try to create an
 HTAB for this and worry about invalidation of it, etc.  If there's a
 faster way to get to the info without going to such extremes, I'd be
 happy to hear them.  (At least I avoided building and parsing a
 query to get at it.)

I think it'd be better to use RelationGetIndexList (which gets the index
list from relcache) and fetch the index tuples from syscache; see
relationHasPrimaryKey for sample code.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread Robert Haas

On Wed, Feb 23, 2011 at 12:50 AM, Nathan Boley npbo...@gmail.com wrote:
 Personally, I think the first thing we ought to do is add a real, bona
 fide planner hint to override the selectivity calculation manually,
 maybe something like this:

 WHERE (x  5 AND y = 1) SELECTIVITY (0.1);

 If you're going to go that far, why not just collect statistics on
 that specific predicate?

 ie,  ANALYZE SELECTIVITY ON tablename (x, y) WHERE (x  5 AND y = 1);

 Then it won't fall subject to all of the pitfalls that Tom outlines below.

 Selectivities are easy to estimate if we know the predicate. They only
 become hard when they have to work for every possible predicate.

Fair point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread Robert Haas

2011/2/23 PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at:
 Those are real problems, but I still want it.  The last time I hit
 this problem I spent two days redesigning my schema and adding
 triggers all over the place to make things work.  If I had been
 dealing with a 30TB database instead of a 300MB database I would have
 been royally up a creek.

 To put that another way, it's true that some people can't adjust their
 queries, but also some people can.  It's true that nonstandard stuff
 sucks, but queries that don't work suck, too.  And as for better
 solutions, how many major release cycles do we expect people to wait
 for them?  Even one major release cycle is an eternity when you're
 trying to get the application working before your company runs out of
 money, and this particular problem has had a lot of cycles expended on
 it without producing anything very tangible (proposed patch, which
 like you I can't spare a lot of cycles to look at just now, possibly
 excepted).

 cheapest and easiest solution if you run into this: add fake functions 
 which the planner cannot estimate properly.
 use OR to artificially prop up estimates or use AND to artificially lower 
 them. there is actually no need to redesign the schema to get around it but 
 it is such an ugly solution that it does not even deserve to be called ugly 
 ...
 however, fast and reliable way to get around it.

We couldn't possibly design a hint mechanism that would be uglier or
less future-proof than this workaround (which, by the way, I'll keep
in mind for the next time I get bitten by this).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread PostgreSQL - Hans-Jürgen Schönig

 
 
 cheapest and easiest solution if you run into this: add fake functions 
 which the planner cannot estimate properly.
 use OR to artificially prop up estimates or use AND to artificially lower 
 them. there is actually no need to redesign the schema to get around it but 
 it is such an ugly solution that it does not even deserve to be called 
 ugly ...
 however, fast and reliable way to get around it.
 
 We couldn't possibly design a hint mechanism that would be uglier or
 less future-proof than this workaround (which, by the way, I'll keep
 in mind for the next time I get bitten by this).
 
 -- 
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company
 


i think the main issue is: what we do is ugly because of despair and a lack of 
alternative ... what you proposed is ugly by design ;).
overall: the workaround will win the ugliness contest, however ;).

many thanks,

hans

--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread PostgreSQL - Hans-Jürgen Schönig


On Feb 23, 2011, at 3:46 PM, Robert Haas wrote:

 On Wed, Feb 23, 2011 at 12:50 AM, Nathan Boley npbo...@gmail.com wrote:
 Personally, I think the first thing we ought to do is add a real, bona
 fide planner hint to override the selectivity calculation manually,
 maybe something like this:
 
 WHERE (x  5 AND y = 1) SELECTIVITY (0.1);
 
 If you're going to go that far, why not just collect statistics on
 that specific predicate?
 
 ie,  ANALYZE SELECTIVITY ON tablename (x, y) WHERE (x  5 AND y = 1);
 
 Then it won't fall subject to all of the pitfalls that Tom outlines below.
 
 Selectivities are easy to estimate if we know the predicate. They only
 become hard when they have to work for every possible predicate.
 
 Fair point.
 
 -- 
 Robert Haas


basically we got the idea of allowing expressions in cross column stuff. i 
think this can be very useful. it would fix the problem of a query like that:

SELECT * FROM table WHERE cos(field) = some_number;

this takes a constant fraction of the table which is usually plain wrong as 
well (and the error tends to multiply inside the plan).
i am just not sure if i have understood all corner cases of that already.
ultimate goal: get it right for join estimates (this is why a syntax extension 
is definitely needed - you cannot track all of them automatically).

many thanks,

hans

--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] WIP: collect frequency statistics for arrays

2011-02-23 Thread Alexander Korotkov

WIP patch of statistics collection for arrays is attached. It generally
copies statistics collection for tsvector, but there are following
differencies:
1) Default comparison, hash and equality function for element data type is
used (from corresponding default operator classes).
2) Operators @ and  don't takes care about element occurence count in
array, i.e. '{1}':int[] @ '{1,1}':int[] and so on. That's why statistics
collection and selectivity estimation functions takes care about uniqueness
counting of array element.
3) array_typanalyze collects frequency of null element into separate value
(like maximum and minimum frequencies in ts_typanalyze). Currently it is not
used in selectivity estimation, but it can be useful in future.

Also I've faced with following problems:
1) Do selectivity estimation for ANY and ALL keywords seems not so easy as
for operators because their selectivity is estimating inside planner. So
it's required to modify planner to do selectivity estimation for these
keywords. Probably I'm missing something.
2) I didn't implement selectivity estimation for column @ const
and column == const cases. The problem of column @ const case is that
we need to estimate frequency of occurence of any element not in const. We
can try to collect statistics of frequency of all elements which is not in
most common elements based on assumption of their independent occurence. But
I'm not sure that this statistic will be precise enough. column == const
case have also another problem. @ and  operators don't takes care about
element occurence count and order while == operator require exact match.
That's why statistics for @ and  operators can be applied to == very
approximately.
 3) I need to test selectivity estimation for arrays. But it's hard to
understand which distributions is typical for arrays. For example, we know
that data in tsvector is based on natural language data, so we can assume
something about data distribution in tsvector. But we don't know almost
nothing about data in arrays because it can hold any data (tsvector also can
holds any data, but it using for non nutural language data is out of
purpose).

--
With best regards,
Alexander Korotkov.


arrayanalyze-0.1.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Peter Geoghegan

On 23 February 2011 04:36, Greg Stark gsst...@mit.edu wrote:
 This is only true for server encodings. In a client library I think
 you lose on this and do have to deal with it. I'm not sure what client
 encodings we do support that aren't ascii-supersets though, it's
 possible none of them generate quote characters this way.

I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
supersets. The absence of by far the most popular non-ASCII superset
encoding, UTF-16, as a client encoding indicated that to me. It isn't
byte oriented, and Postgres is.

 I'm a bit surprised libpqxx isn't using binary mode internally though.
 This would at least avoid the problems with encoding. However I'm not
 sure things like the array binary format are really stable and
 portable enough to really use from a client library. Some datatypes
 might be dependent on the server ABI (floats -- I'm looking at you) so
 that might make it difficult or impossible.

Yes, that question is addressed here:
http://pqxx.org/development/libpqxx/wiki/BinaryTransfers .

-- 
Regards,
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread Robert Haas

2011/2/23 PostgreSQL - Hans-Jürgen Schönig postg...@cybertec.at:
 i thought there was an agreement that we don't want planner hints?

Well, I want them.  I think some other people do, too.  Whether those
people are more numerous than than the people who don't want them, and
how much that matters either way, is another question.  I don't want
to have to use them very often, but I like to have an out when I get
desperate.

 as tom pointed out - many broken queries come out of some query generator 
 where even the design to make the design is broken by design.
 personally i like query generators as long as other people use them ... 
 telling people that this is the wrong way to go is actually financing my 
 holiday next week ... ;).  in general - hibernate and stuff like that is a 
 no-go.

 personally i like the type of planner hints oleg and teodor came up with - i 
 think we should do more of those hooks they are using but hiding it in some 
 syntax is not a good idea.
 it does not change the query and it still gives a lot of room to toy around. 
 it looks like a compromise.

 however, oleg's contrib module does not fix the core problem of cross column 
 statistics because a hint is usually static but you want flexible selectivity.

IIRC, what Teodor and Oleg did was a contrib module that excluded a
certain index from consideration based on a GUC.  That to me is a
little more hacky than just wiring the selectivity estimate.  You're
going to need to set that just before each query that needs it, and
reset it afterwards, so it's actually worse than just decorating the
queries, IMHO.  Also, I haven't run into any actual problems in the
field that would be solved by this approach, though I am sure others
have.  IME, most bad query plans are caused by either incorrect
estimates of selectivity, or wrongheaded notions about what's likely
to be cached.  If we could find a way, automated or manual, of
providing the planner some better information about the facts of life
in those areas, I think we'd be way better off.  I'm open to ideas
about what the best way to do that is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Dunstan




On 02/23/2011 10:09 AM, Peter Geoghegan wrote:

On 23 February 2011 04:36, Greg Starkgsst...@mit.edu  wrote:

This is only true for server encodings. In a client library I think
you lose on this and do have to deal with it. I'm not sure what client
encodings we do support that aren't ascii-supersets though, it's
possible none of them generate quote characters this way.

I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
supersets. The absence of by far the most popular non-ASCII superset
encoding, UTF-16, as a client encoding indicated that to me. It isn't
byte oriented, and Postgres is.



They are not. It's precisely because they are not that they are not 
allowed as server encodings.



cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Binary in/out for aclitem

2011-02-23 Thread Tom Lane

rsmogura rsmog...@softperience.eu writes:
  On Tue, 22 Feb 2011 20:20:39 -0500, Tom Lane wrote:
 ...  But my question isn't about that; it's about 
 why aclitem should be considered a first-class citizen.  It makes me
 uncomfortable that client apps are looking at it at all, because any
 that do are bound to get broken in the future, even assuming that 
 they get the right answers today.  I wonder how many such clients are up 
 to speed for per-column privileges and non-constant default privileges 
 for instance.  And sepgsql is going to cut them off at the knees.

  Technically, at eye glance, I didn't seen in sepgsql modifications to 
  acl.h. So, I think, aclitem will be unaffected. In any way sepgsql needs 
  some way to present access rights to administrator it may use own model, 
  or aclitem, too.

You're missing the point, which is that the current internal
representation of aclitem could change drastically to support future
feature improvements in the area of privileges.  It has already changed
significantly in the past (we didn't use to have WITH GRANT OPTION).
If we had to add a field, for instance, a binary representation would
simply be broken, as clients would have difficulty telling how to
interpret it as soon as there was more than one possible format.
Text representations are typically a bit more extensible.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Heikki Linnakangas


On 23.02.2011 17:16, Andrew Dunstan wrote:

On 02/23/2011 10:09 AM, Peter Geoghegan wrote:

On 23 February 2011 04:36, Greg Starkgsst...@mit.edu wrote:

This is only true for server encodings. In a client library I think
you lose on this and do have to deal with it. I'm not sure what client
encodings we do support that aren't ascii-supersets though, it's
possible none of them generate quote characters this way.

I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
supersets. The absence of by far the most popular non-ASCII superset
encoding, UTF-16, as a client encoding indicated that to me. It isn't
byte oriented, and Postgres is.


They are not. It's precisely because they are not that they are not
allowed as server encodings.


To be precise, they are all ASCII supersets in the sense that a valid 
7-bit ASCII string is valid and means the same thing in all of the 
client-only encodings as well. The difference between supported 
server-encodings and those that are only supported as client_encoding is 
whether *all* bytes in a multi-byte character have the high bit set. All 
server-encodings have that property, and we rely on it in the backend. 
In the supported client-only encodings, the *first* byte of a multi-byte 
character is guaranteed to have the high bit set, but the subsequent 
bytes are not.


Even that more loose property isn't true for UTF-16, which is why we 
don't support it even as a client-only encoding.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Greg Stark

On Wed, Feb 23, 2011 at 3:16 PM, Andrew Dunstan and...@dunslane.net wrote:
 On 02/23/2011 10:09 AM, Peter Geoghegan wrote:

 On 23 February 2011 04:36, Greg Starkgsst...@mit.edu  wrote:

 This is only true for server encodings. In a client library I think
 you lose on this and do have to deal with it. I'm not sure what client
 encodings we do support that aren't ascii-supersets though, it's
 possible none of them generate quote characters this way.

 I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
 supersets. The absence of by far the most popular non-ASCII superset
 encoding, UTF-16, as a client encoding indicated that to me. It isn't
 byte oriented, and Postgres is.


 They are not. It's precisely because they are not that they are not allowed
 as server encodings.


Well just as an example, BIG5 is a 16-bit encoding where the first
byte always has the high bit set. The second byte can't be a ' or 
because those aren't in the allowable range for the second byte. So it
might be safe anyways. However \ is in the allowable range so I'm not
sure.

In the case of BIG5 ascii characters are included inline so any byte
with the high bit *not* set that isn't the second byte of a two-byte
sequence is assumed to be ascii. So an ascii parser would work fine
modulo the problem above with backslashes.

But this is just a special case. Wikipedia implies it's also true for
shift-JIS but there's no guarantee it would work for other client
encodings.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Tom Lane

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 23.02.2011 17:16, Andrew Dunstan wrote:
 On 02/23/2011 10:09 AM, Peter Geoghegan wrote:
 I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
 supersets.

 They are not. It's precisely because they are not that they are not
 allowed as server encodings.

 In the supported client-only encodings, the *first* byte of a multi-byte 
 character is guaranteed to have the high bit set, but the subsequent 
 bytes are not.

And to be even more specific: the problem with the client-only encodings
is that some of them allow the byte values corresponding to \ and
other escaping-critical characters to appear as non-first bytes in a
multibyte character.  This is nasty because you have to be aware of the
encoding to do escaping correctly and not break the data.  And as soon
as the server and client don't agree on what the encoding is, you have
the potential for SQL-injection security holes, not just confused data.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Merlin Moncure

On Tue, Feb 22, 2011 at 10:16 PM, Peter Geoghegan
peter.geoghega...@gmail.com wrote:
 I'm investigating the possibility of developing a utility function for
 our C++ client library, libpqxx, that produces array literals that can
 be used in prepared statements. This problem appears to be a bit of a
 tar pit, so I'm hoping that someone can help me out. My goal is to
 produce a template function that accepts arbitrarily nested standard
 library containers, that contain at the most nested level
 constants/literals of some type that can be fed into a stream, such as
 an int or a std::string.

 I'm aware that I cannot assume that types are delimited by a single
 quote, even for built-in types. I thought that I would put the onus on
 the client to specify the correct delimiter, by checking pg_type
 themselves if necessary, but default to ',' . Is this a reasonable
 approach?

 Escaping/quoting individual elements seems tricky. I have produced a
 generic and superficially well behaved implementation by using double
 quotes for constants. However, I have now opened the door to malicious
 parties injecting multiple array elements where only one is allowed,
 or causing malformed array literal errors by simply including a double
 quote of their own. It's not clear where the responsibility should
 rest for escaping constants/ensuring that constants don't contain
 double quotes. Can someone suggest a better approach? I can't very
 well use single quotes, because they are escaped/doubled up when we
 pass the array literal to something similar to PQexecPrepared(), and
 they shouldn't be - strings end up looking like this: 'has errant
 single quotes on either side'.

You can send nested arrays safely.  You just have to be very formal
about escaping *everything* both as you get it and as it goes into the
container.  This is what postgres does on the backend as it sends
arrays out the door in text.  It might be instructive to see what the
server does in terms of escaping.  Note that the way this works it's
not impossible to see 128+ consecutive backslashes when dealing with
arrays of composites.

 Since Postgres only supports encodings that are ASCII supersets, I
 don't believe that I have to consider encoding - only my clients do.

 Can someone please point me in the direction of an established client
 library/driver where all corner cases are covered, or at least enough
 of them to produce a net gain in usefulness? There may well be
 additional subtleties that have not occurred to me.

yes: libpqtypes.  it manages everything in binary.  i've been thinking
for a while that libpqtypes could be wrapped with variadic templates
or other c++ trickery.  Because libpqtypes does everything in binary,
it completely sidesteps all the escaping nastiness.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Dunstan




On 02/23/2011 10:22 AM, Heikki Linnakangas wrote:

On 23.02.2011 17:16, Andrew Dunstan wrote:

On 02/23/2011 10:09 AM, Peter Geoghegan wrote:

On 23 February 2011 04:36, Greg Starkgsst...@mit.edu wrote:

This is only true for server encodings. In a client library I think
you lose on this and do have to deal with it. I'm not sure what client
encodings we do support that aren't ascii-supersets though, it's
possible none of them generate quote characters this way.

I'm pretty sure all of the client encodings Tatsuo mentions are ASCII
supersets. The absence of by far the most popular non-ASCII superset
encoding, UTF-16, as a client encoding indicated that to me. It isn't
byte oriented, and Postgres is.


They are not. It's precisely because they are not that they are not
allowed as server encodings.


To be precise, they are all ASCII supersets in the sense that a valid 
7-bit ASCII string is valid and means the same thing in all of the 
client-only encodings as well. The difference between supported 
server-encodings and those that are only supported as client_encoding 
is whether *all* bytes in a multi-byte character have the high bit 
set. All server-encodings have that property, and we rely on it in the 
backend. In the supported client-only encodings, the *first* byte of a 
multi-byte character is guaranteed to have the high bit set, but the 
subsequent bytes are not.


Yes, that's a better explanation.



Even that more loose property isn't true for UTF-16, which is why we 
don't support it even as a client-only encoding.


The fact that UTF-16 uses nul bytes would make it particularly hard to 
handle.


There might be value in having a UTF-16 aware version of libpq that 
would translate strings into UTF-8 on the way to the server and to 
UTF-16 on the way back to the client.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Kenneth Marshall

On Wed, Feb 23, 2011 at 09:34:06AM -0600, Merlin Moncure wrote:
 On Tue, Feb 22, 2011 at 10:16 PM, Peter Geoghegan
 peter.geoghega...@gmail.com wrote:
  I'm investigating the possibility of developing a utility function for
  our C++ client library, libpqxx, that produces array literals that can
  be used in prepared statements. This problem appears to be a bit of a
  tar pit, so I'm hoping that someone can help me out. My goal is to
  produce a template function that accepts arbitrarily nested standard
  library containers, that contain at the most nested level
  constants/literals of some type that can be fed into a stream, such as
  an int or a std::string.
 
  I'm aware that I cannot assume that types are delimited by a single
  quote, even for built-in types. I thought that I would put the onus on
  the client to specify the correct delimiter, by checking pg_type
  themselves if necessary, but default to ',' . Is this a reasonable
  approach?
 
  Escaping/quoting individual elements seems tricky. I have produced a
  generic and superficially well behaved implementation by using double
  quotes for constants. However, I have now opened the door to malicious
  parties injecting multiple array elements where only one is allowed,
  or causing malformed array literal errors by simply including a double
  quote of their own. It's not clear where the responsibility should
  rest for escaping constants/ensuring that constants don't contain
  double quotes. Can someone suggest a better approach? I can't very
  well use single quotes, because they are escaped/doubled up when we
  pass the array literal to something similar to PQexecPrepared(), and
  they shouldn't be - strings end up looking like this: 'has errant
  single quotes on either side'.
 
 You can send nested arrays safely.  You just have to be very formal
 about escaping *everything* both as you get it and as it goes into the
 container.  This is what postgres does on the backend as it sends
 arrays out the door in text.  It might be instructive to see what the
 server does in terms of escaping.  Note that the way this works it's
 not impossible to see 128+ consecutive backslashes when dealing with
 arrays of composites.
 
  Since Postgres only supports encodings that are ASCII supersets, I
  don't believe that I have to consider encoding - only my clients do.
 
  Can someone please point me in the direction of an established client
  library/driver where all corner cases are covered, or at least enough
  of them to produce a net gain in usefulness? There may well be
  additional subtleties that have not occurred to me.
 
 yes: libpqtypes.  it manages everything in binary.  i've been thinking
 for a while that libpqtypes could be wrapped with variadic templates
 or other c++ trickery.  Because libpqtypes does everything in binary,
 it completely sidesteps all the escaping nastiness.
 
 merlin
 

Avoiding the escaping by using binary parameter transmission is
the best method. Shameless plug: libpqtypes is great!
I hope that it can be eventually included in the core distribution.
It is not uncommon to get It's an add-on package??? and avoidance
of pieces outside of the standard dist regardless of its value.

Regards,
Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pl/python quoting functions

2011-02-23 Thread Peter Eisentraut

On ons, 2011-02-23 at 11:26 +0100, Jan Urbański wrote:
 One thing: you removed the conditional pfree from PLy_quote_ident,
 which makes this function leak memory if the actual quoting took
 place, no? Is that leak too small to worry about?

Many functions in PostgreSQL leak memory in this way.  It's not worth
worrying about, and certainly not worth violating a called function's
API for it.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Binary in/out for aclitem

2011-02-23 Thread Radosław Smogura

Tom Lane t...@sss.pgh.pa.us Wednesday 23 February 2011 16:19:27
 rsmogura rsmog...@softperience.eu writes:
   On Tue, 22 Feb 2011 20:20:39 -0500, Tom Lane wrote:
  ...  But my question isn't about that; it's about
  why aclitem should be considered a first-class citizen.  It makes me
  uncomfortable that client apps are looking at it at all, because any
  that do are bound to get broken in the future, even assuming that
  they get the right answers today.  I wonder how many such clients are up
  to speed for per-column privileges and non-constant default privileges
  for instance.  And sepgsql is going to cut them off at the knees.
  
   Technically, at eye glance, I didn't seen in sepgsql modifications to
   acl.h. So, I think, aclitem will be unaffected. In any way sepgsql needs
   some way to present access rights to administrator it may use own model,
   or aclitem, too.
 
 You're missing the point, which is that the current internal
 representation of aclitem could change drastically to support future
 feature improvements in the area of privileges.  It has already changed
 significantly in the past (we didn't use to have WITH GRANT OPTION).
 If we had to add a field, for instance, a binary representation would
 simply be broken, as clients would have difficulty telling how to
 interpret it as soon as there was more than one possible format.
 Text representations are typically a bit more extensible.
 
   regards, tom lane
I removed from patch this (think like currently not needed, but it is enaught 
to put in doc)

Each privilige has idividual number P from 1 to n. and it is represented by 
setted P-th bit. First n-th bits (in network bit order) represents normal 
priv, next n-th bits represents grant option of privs. This chain is encoded 
as n*2 bit number rounded up to full 8 bits, with minimal length 32 bit.

I was thinking about adding number of all privs to each ACL item, removed as 
this could be deducted from PG version, where 1st 7-bit represents version, 
last 8-th bit will represent if grant part has been added.

---
In any way binary output should be available, if we have binary mode. I know 
that text is more extensible, we may in contrast to above packed version, 
describes acl privs as byte array elements from represented setted priv (same 
as text).

Fallback solution is to just recall aclin/aclout with StringInfo.

Regards,
Radek.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Binary in/out for aclitem

2011-02-23 Thread Radosław Smogura

Tom Lane t...@sss.pgh.pa.us Wednesday 23 February 2011 16:19:27
 rsmogura rsmog...@softperience.eu writes:
   On Tue, 22 Feb 2011 20:20:39 -0500, Tom Lane wrote:
  ...  But my question isn't about that; it's about
  why aclitem should be considered a first-class citizen.  It makes me
  uncomfortable that client apps are looking at it at all, because any
  that do are bound to get broken in the future, even assuming that
  they get the right answers today.  I wonder how many such clients are up
  to speed for per-column privileges and non-constant default privileges
  for instance.  And sepgsql is going to cut them off at the knees.
  
   Technically, at eye glance, I didn't seen in sepgsql modifications to
   acl.h. So, I think, aclitem will be unaffected. In any way sepgsql needs
   some way to present access rights to administrator it may use own model,
   or aclitem, too.
 
 You're missing the point, which is that the current internal
 representation of aclitem could change drastically to support future
 feature improvements in the area of privileges.  It has already changed
 significantly in the past (we didn't use to have WITH GRANT OPTION).
 If we had to add a field, for instance, a binary representation would
 simply be broken, as clients would have difficulty telling how to
 interpret it as soon as there was more than one possible format.
 Text representations are typically a bit more extensible.
 
   regards, tom lane

Actully, You litlle messed in my head. So in prev post we don't need to send 
information if grant option has been set, currently in text mode no grant 
options means ACL_NO_RIGHTS, and in binary same may be achived be settig there 
0. 

But version field may be usefull to validate this and future calls, and 
provide backward compatibility (if newer client will send less bits then rest 
of bits will be set to 0).

I think about splitting privs chain to two numbers, it may be easier to 
implement this and parse if number of privs will extend 32...

In addition I may add support for possible, future representation, where given 
privilige may be yes, no, undefined (like in Windows).

Regrads,
Radek

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Synchronous standbys?

2011-02-23 Thread Jaime Casanova

On Wed, Feb 23, 2011 at 3:42 AM, Tatsuo Ishii is...@postgresql.org wrote:

 It seems there's only one synchronous standby allowed at the same
 time.

 Does anybody know which one is correct?

there could be only one standby at the same time...
in the original patch there could be several synchronous standby
servers and the primary was going to wait until the first one of them
to answer, but that was removed and replaced by a list of possible
synch standby servers and the first that connects is the one the
primary will wait for. because right now it's a simple list the first
one will always be the synch standby until it's removed from the list
or phisically.

the other comment in the doc should be removed

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How to extract a value from a record using attnum or attname?

2011-02-23 Thread Kevin Grittner

Alvaro Herrera alvhe...@commandprompt.com wrote:
 
 I think it'd be better to use RelationGetIndexList (which gets the
 index list from relcache) and fetch the index tuples from
 syscache; see relationHasPrimaryKey for sample code.
 
Thanks.  Patch done that way attached.  Will get it into tomorrow's
system testing here.
 
-Kevin
*** a/src/backend/utils/adt/trigfuncs.c
--- b/src/backend/utils/adt/trigfuncs.c
***
*** 13,21 
   */
  #include postgres.h
  
! #include access/htup.h
  #include commands/trigger.h
  #include utils/builtins.h
  
  
  /*
--- 13,23 
   */
  #include postgres.h
  
! #include executor/spi.h
! #include commands/async.h
  #include commands/trigger.h
  #include utils/builtins.h
+ #include utils/syscache.h
  
  
  /*
***
*** 93,95  suppress_redundant_updates_trigger(PG_FUNCTION_ARGS)
--- 95,252 
  
return PointerGetDatum(rettuple);
  }
+ 
+ 
+ /*
+  * Copy from s (for source) to r (for result), wrapping with q (quote)
+  * characters and doubling any quote characters found.
+  */
+ static char *
+ strcpy_quoted(char *r, const char *s, const char q)
+ {
+   *r++ = q;
+   while (*s)
+   {
+   if (*s == q)
+   *r++ = q;
+   *r++ = *s;
+   s++;
+   }
+   *r++ = q;
+   return r;
+ }
+ 
+ /*
+  * triggered_change_notification
+  *
+  * This trigger function will send a notification of data modification with
+  * primary key values.The channel will be tcn unless the trigger is
+  * created with a parameter, in which case that parameter will be used.
+  */
+ Datum
+ triggered_change_notification(PG_FUNCTION_ARGS)
+ {
+   TriggerData *trigdata = (TriggerData *) fcinfo-context;
+   Trigger*trigger;
+   int nargs;
+   HeapTuple   trigtuple;
+   Relationrel;
+   TupleDesc   tupdesc;
+   char   *channel;
+   charoperation;
+   charpayload[200];
+   char   *p;
+   boolfoundPK;
+ 
+   List   *indexoidlist;
+   ListCell   *indexoidscan;
+ 
+   /* make sure it's called as a trigger */
+   if (!CALLED_AS_TRIGGER(fcinfo))
+   ereport(ERROR,
+   
(errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
+   errmsg(triggered_change_notification: must be called as 
trigger)));
+ 
+   /* and that it's called after the change */
+   if (!TRIGGER_FIRED_AFTER(trigdata-tg_event))
+   ereport(ERROR,
+   
(errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
+errmsg(triggered_change_notification: must be 
called after the change)));
+ 
+   /* and that it's called for each row */
+   if (!TRIGGER_FIRED_FOR_ROW(trigdata-tg_event))
+   ereport(ERROR,
+   
(errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
+errmsg(triggered_change_notification: must be 
called for each row)));
+ 
+   if (TRIGGER_FIRED_BY_INSERT(trigdata-tg_event))
+   operation = 'I';
+   else if (TRIGGER_FIRED_BY_UPDATE(trigdata-tg_event))
+   operation = 'U';
+   else if (TRIGGER_FIRED_BY_DELETE(trigdata-tg_event))
+   operation = 'D';
+   else
+   {
+   elog(ERROR, triggered_change_notification: trigger fired by 
unrecognized operation);
+   operation = 'X';/* silence compiler warning */
+   }
+ 
+   trigger = trigdata-tg_trigger;
+   nargs = trigger-tgnargs;
+   if (nargs  1)
+   ereport(ERROR,
+   
(errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
+errmsg(triggered_change_notification: must 
not be called with more than one parameter)));
+ 
+   if (nargs == 0)
+   channel = tcn;
+   else
+   channel = trigger-tgargs[0];
+ 
+   /* get tuple data */
+   trigtuple = trigdata-tg_trigtuple;
+   rel = trigdata-tg_relation;
+   tupdesc = rel-rd_att;
+ 
+   foundPK = false;
+ 
+   /*
+* Get the list of index OIDs for the table from the relcache, and look 
up
+* each one in the pg_index syscache until we find one marked primary 
key
+* (hopefully there isn't more than one such).
+*/
+   indexoidlist = RelationGetIndexList(rel);
+ 
+   foreach(indexoidscan, indexoidlist)
+   {
+   Oid indexoid = lfirst_oid(indexoidscan);
+   HeapTuple   indexTuple;
+   Form_pg_index index;
+ 
+   indexTuple = SearchSysCache1(INDEXRELID, 
ObjectIdGetDatum(indexoid));
+   if (!HeapTupleIsValid(indexTuple))  /* should not 
happen */
+   elog(ERROR, cache lookup failed for index %u, 
indexoid);
+

Re: [HACKERS] Synchronous standby

2011-02-23 Thread Jaime Casanova

On Wed, Feb 23, 2011 at 3:49 AM, Tatsuo Ishii is...@postgresql.org wrote:
 In 9.1, we will be able to have synchrnous replication. Also we have
 one standby server chosen by primary to be the synchronous standby
 (still I'm not sure this is correct or not as stated in another mail).


yes, it is. a list of possible synch standbys and one of them chosen
to be the one

 Is there anyway to know which is the synchronous standby? IMO this is
 important for users because that one is likely the least behind to
 primary and will be chosen to promoto in case of primary dying in most
 cases.

i guess, we can put a new column in pg_stat_replication stating the
type of the replication (synch or asynch).
but that is surely a different patch...

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Dunstan




On 02/23/2011 10:40 AM, Kenneth Marshall wrote:



Can someone please point me in the direction of an established client
library/driver where all corner cases are covered, or at least enough
of them to produce a net gain in usefulness? There may well be
additional subtleties that have not occurred to me.

yes: libpqtypes.  it manages everything in binary.  i've been thinking
for a while that libpqtypes could be wrapped with variadic templates
or other c++ trickery.  Because libpqtypes does everything in binary,
it completely sidesteps all the escaping nastiness.


Avoiding the escaping by using binary parameter transmission is
the best method. Shameless plug: libpqtypes is great!
I hope that it can be eventually included in the core distribution.
It is not uncommon to get It's an add-on package??? and avoidance
of pieces outside of the standard dist regardless of its value.





Binary mode had serious limitations, such as portability.

We do need some support in libpq for constructing and deconstructing 
arrays (and probably for composites too, although that will be harder, I 
suspect).


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fix for Index Advisor related hooks

2011-02-23 Thread Gurjeet Singh

On Fri, Feb 18, 2011 at 1:36 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 On 18.02.2011 17:02, Gurjeet Singh wrote:

  Another use case of the Index Advisor is to be switched on for a few hours
 while the application runs, and gather the recommendations for the whole
 run. We'll need good performance that case too.


 How exactly does that work? I would imagine that you log all the different
 SQL statements and how often they're run during that period. Similar to
 pgFouine, for example. And only then you run the index advisor on the
 collected SQL statements.


The Index Advisor produces recommendations for every running query on the
fly and stores them in a table. After the application run is over, these
recommendations can be viewed in the table and analyzed to pick the indexes
that provide the most benefit.

Regards,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

Re: [HACKERS] disposition of remaining patches

2011-02-23 Thread Robert Haas

On Fri, Feb 18, 2011 at 5:47 PM, Robert Haas robertmh...@gmail.com wrote:
 The CommitFest application currently reflects 17 remaining patches for
 CommitFest 2011-01.

Now we're down to 12.  As usual, the last few patches take the longest...

 1. Change pg_last_xlog_receive_location not to move backwards.  We
 don't have complete consensus on what to do here.  If we can agree on
 a way forward, I think we can finish this one up pretty quickly.  It's
 partially being held up by #2.

No change.

 2. Synchronous replication.  Splitting up this patch has allowed some
 progress to be made here, but there is a lot left to do, and I fear
 that trying to hash out the design issues at this late date is not
 going to lead to a great final product.  The proposed timeout to make
 the server boot out clients that don't seem to be responding is not
 worth committing, as it will only work when the server isn't
 generating WAL, which can't be presumed to be the normal state of
 affairs.  The patch to avoid ever letting the WAL sender status go
 backward from catchup to streaming was committed without discussion,
 and needs to be reverted for reasons discussed on that thread.  An
 updated version of the main patch has yet to be posted.

This has gotten a bunch of review, on several different threads.  I
assume Simon will publish an update when he gets back to his
keyboard...

 3, 4, 5. SQL/MED.  Tom has picked up the main FDW API patch, which I
 expect means it'll go in.  I am not so sure about the FDW patches,
 though: in particular, based on Heikki's comments, the postgresql_fdw
 patch seems to be badly in need of some more work.  The file_fdw patch
 may be in better shape (I'm not 100% sure), but it needs the encoding
 fix patch Itagaki Takahiro recently proposed.  For this to be
 worthwhile, we presumably need to get at least one FDW committed along
 with the API patch.

The core and file_fdw patches are in; postgresql_fdw is being reworked
by the author.

 6. Writeable CTEs.  Tom said he'd look at this one.
 7. contrib/btree_gist KNN.  Needs updating as a result of the
 extensions patch.  This ball is really in Teodor and Oleg's court.

No change on these.

 8, 9, 10, 11, 12, 13, 14.  PL/python patches.  I believe Peter was
 working on these, but I haven't seen any updates in a while.

Peter committed two of these seven, leaving five to be addressed.

 15. Fix snapshot taking inconsistencies.  Tom said he'd look at this one.

No change on this one.

 16. synchronized snapshots.  Alvaro is working on this one.

Lots of discussion of this one, but current status is not clear to me.
 Alvaro, are you working on this actively?

 17. determining client_encoding from client locale.  This is Peter's
 patch.  Peter, are you planning to commit this?

Peter committed this one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] disposition of remaining patches

2011-02-23 Thread Alvaro Herrera

Excerpts from Robert Haas's message of mié feb 23 14:54:02 -0300 2011:
 On Fri, Feb 18, 2011 at 5:47 PM, Robert Haas robertmh...@gmail.com wrote:

  16. synchronized snapshots.  Alvaro is working on this one.
 
 Lots of discussion of this one, but current status is not clear to me.
  Alvaro, are you working on this actively?

I am.  I'm not sure if it's still reasonable to get into 9.1, given that
it needs to be rewritten from almost completely from scratch.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] disposition of remaining patches

2011-02-23 Thread Robert Haas

On Wed, Feb 23, 2011 at 1:05 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of mié feb 23 14:54:02 -0300 2011:
 On Fri, Feb 18, 2011 at 5:47 PM, Robert Haas robertmh...@gmail.com wrote:

  16. synchronized snapshots.  Alvaro is working on this one.

 Lots of discussion of this one, but current status is not clear to me.
  Alvaro, are you working on this actively?

 I am.  I'm not sure if it's still reasonable to get into 9.1, given that
 it needs to be rewritten from almost completely from scratch.

Well, if it gets punted, I won't be too sad, since the pg_dump patch
to actually use this functionality for something useful already got
pushed off.  If you can commit something in a timely fashion that is
also high quality, great, but if not, I don't see it as a
show-stopper.  The highest priorities IMO are writeable CTEs and
synchronous replication.  I doubt that there would be majority support
for prolonging this on the basis of any other single patch, though I
might be wrong about that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Binary in/out for aclitem

2011-02-23 Thread Radosław Smogura

Tom Lane t...@sss.pgh.pa.us Wednesday 23 February 2011 16:19:27
 rsmogura rsmog...@softperience.eu writes:
   On Tue, 22 Feb 2011 20:20:39 -0500, Tom Lane wrote:
  ...  But my question isn't about that; it's about
  why aclitem should be considered a first-class citizen.  It makes me
  uncomfortable that client apps are looking at it at all, because any
  that do are bound to get broken in the future, even assuming that
  they get the right answers today.  I wonder how many such clients are up
  to speed for per-column privileges and non-constant default privileges
  for instance.  And sepgsql is going to cut them off at the knees.
  
   Technically, at eye glance, I didn't seen in sepgsql modifications to
   acl.h. So, I think, aclitem will be unaffected. In any way sepgsql needs
   some way to present access rights to administrator it may use own model,
   or aclitem, too.
 
 You're missing the point, which is that the current internal
 representation of aclitem could change drastically to support future
 feature improvements in the area of privileges.  It has already changed
 significantly in the past (we didn't use to have WITH GRANT OPTION).
 If we had to add a field, for instance, a binary representation would
 simply be broken, as clients would have difficulty telling how to
 interpret it as soon as there was more than one possible format.
 Text representations are typically a bit more extensible.
 
   regards, tom lane

Here is extended version, has version field (N_ACL_RIGHTS*2) and reserved 
mask, as well definition is more general then def of PGSQL. In any way it 
require that rights mades bit array.

Still I tested only aclitemsend.

Btw, Is it possible and needed to add group byte, indicating that grantee is 
group or user?

Regards,
Radek
diff --git a/.gitignore b/.gitignore
index 1be11e8..0d594f9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -17,3 +17,5 @@ objfiles.txt
 /GNUmakefile
 /config.log
 /config.status
+/nbproject/private/
+/nbproject
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 691ba3b..c25c0fd 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -33,6 +33,7 @@
 #include utils/lsyscache.h
 #include utils/memutils.h
 #include utils/syscache.h
+#include libpq/pqformat.h
 
 
 typedef struct
@@ -78,6 +79,10 @@ static void putid(char *p, const char *s);
 static Acl *allocacl(int n);
 static void check_acl(const Acl *acl);
 static const char *aclparse(const char *s, AclItem *aip);
+
+/** Assigns default grantor and send warning. */
+static void aclitem_assign_default_grantor(AclItem *aip);
+
 static bool aclitem_match(const AclItem *a1, const AclItem *a2);
 static int	aclitemComparator(const void *arg1, const void *arg2);
 static void check_circularity(const Acl *old_acl, const AclItem *mod_aip,
@@ -209,6 +214,14 @@ putid(char *p, const char *s)
 	*p = '\0';
 }
 
+/** Assigns default grantor and send warning. */
+void aclitem_assign_default_grantor(AclItem *aip) {
+aip-ai_grantor = BOOTSTRAP_SUPERUSERID;
+ereport(WARNING,
+(errcode(ERRCODE_INVALID_GRANTOR),
+ errmsg(defaulting grantor to user ID %u,
+BOOTSTRAP_SUPERUSERID)));
+}
 /*
  * aclparse
  *		Consumes and parses an ACL specification of the form:
@@ -343,11 +356,7 @@ aclparse(const char *s, AclItem *aip)
 	}
 	else
 	{
-		aip-ai_grantor = BOOTSTRAP_SUPERUSERID;
-		ereport(WARNING,
-(errcode(ERRCODE_INVALID_GRANTOR),
- errmsg(defaulting grantor to user ID %u,
-		BOOTSTRAP_SUPERUSERID)));
+aclitem_assign_default_grantor(aip);
 	}
 
 	ACLITEM_SET_PRIVS_GOPTIONS(*aip, privs, goption);
@@ -643,6 +652,163 @@ aclitemout(PG_FUNCTION_ARGS)
 
 	PG_RETURN_CSTRING(out);
 }
+/** Do binary read of aclitem. Input format is same as {@link aclitem_recv}, but
+ * special algorithm is used to determine grantee's and grantor's OID. The reason
+ * is to keep backward information compatiblity with text mode - typical
+ * client (which gets instructions from user)
+ * may be much more interested in sending grantee and grantors name then
+ * OID. Detailed rule is as follow:br/
+ * If message has no name and names' length then
+ * use passed OIDs (message may be truncated, we accept this, 
+ * but both, two last fields must be not present).br/
+ * If grantee's name len or grantor's name len is {@code -1} then use respecitve
+ * OIDs.br/
+ * If name length is not {@code -1} then find OID for given part, and
+ * ensure that respective OID is {@code 0} or is equal to found OID./br
+ * If grantor's OID is {@code 0} and grantor's name lenght is {@code -1} or 
+ * truncated then assign superuser as grantor.
+ */
+Datum
+aclitemrecv(PG_FUNCTION_ARGS) {
+StringInfo	buf = (StringInfo) PG_GETARG_POINTER(0);
+AclItem*aip;
+intgRawLen;
+char   *gVal = NULL;
+int4gValLen;
+OidgOid;
+int2   numberOfAcls;
+int4   mask;

Re: [HACKERS] disposition of remaining patches

2011-02-23 Thread Alvaro Herrera

Excerpts from Robert Haas's message of mié feb 23 15:14:04 -0300 2011:
 On Wed, Feb 23, 2011 at 1:05 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
  Excerpts from Robert Haas's message of mié feb 23 14:54:02 -0300 2011:
  On Fri, Feb 18, 2011 at 5:47 PM, Robert Haas robertmh...@gmail.com wrote:
 
   16. synchronized snapshots.  Alvaro is working on this one.
 
  Lots of discussion of this one, but current status is not clear to me.
   Alvaro, are you working on this actively?
 
  I am.  I'm not sure if it's still reasonable to get into 9.1, given that
  it needs to be rewritten from almost completely from scratch.
 
 Well, if it gets punted, I won't be too sad, since the pg_dump patch
 to actually use this functionality for something useful already got
 pushed off.

Oh, I thought that patch was committed which is why I was in a bit of a
hurry.  I will mark this one returned with feedback too, then.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How to extract a value from a record using attnum or attname?

2011-02-23 Thread Alvaro Herrera

Excerpts from Kevin Grittner's message of mié feb 23 13:43:19 -0300 2011:
 Alvaro Herrera alvhe...@commandprompt.com wrote:
  
  I think it'd be better to use RelationGetIndexList (which gets the
  index list from relcache) and fetch the index tuples from
  syscache; see relationHasPrimaryKey for sample code.
  
 Thanks.  Patch done that way attached.  Will get it into tomorrow's
 system testing here.

Why not use quote_identifier and quote_literal_cstr instead of this new
strcpy thing?  Also, you don't really need spi.h do you?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] disposition of remaining patches

2011-02-23 Thread Robert Haas

On Wed, Feb 23, 2011 at 1:34 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of mié feb 23 15:14:04 -0300 2011:
 On Wed, Feb 23, 2011 at 1:05 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
  Excerpts from Robert Haas's message of mié feb 23 14:54:02 -0300 2011:
  On Fri, Feb 18, 2011 at 5:47 PM, Robert Haas robertmh...@gmail.com 
  wrote:
 
   16. synchronized snapshots.  Alvaro is working on this one.
 
  Lots of discussion of this one, but current status is not clear to me.
   Alvaro, are you working on this actively?
 
  I am.  I'm not sure if it's still reasonable to get into 9.1, given that
  it needs to be rewritten from almost completely from scratch.

 Well, if it gets punted, I won't be too sad, since the pg_dump patch
 to actually use this functionality for something useful already got
 pushed off.

 Oh, I thought that patch was committed which is why I was in a bit of a
 hurry.  I will mark this one returned with feedback too, then.

No, the directory archive format patch was committed, but the parallel
pg_dump one got pushed off.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How to extract a value from a record using attnum or attname?

2011-02-23 Thread Kevin Grittner

Alvaro Herrera alvhe...@commandprompt.com wrote:
 
 Why not use quote_identifier and quote_literal_cstr instead of
 this new strcpy thing?
 
We've got various types of software that will be parsing these
payloads, and it's a little easier to parse if the quoting is
unconditional.  If that's a barrier to acceptance we could use
the functions which quote conditionally and adjust our regular
expressions.
 
Probably one reason we had a bias toward quoting is that every
single application table name we use has at least on uppercase
letter and about 95% of our column names do.
 
 Also, you don't really need spi.h do you?
 
It's using these functions:
 
SPI_getrelname
SPI_fname
SPI_getvalue
 
If there's a better way to get the info, I'm game. 
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How to extract a value from a record using attnum or attname?

2011-02-23 Thread Alvaro Herrera

Excerpts from Kevin Grittner's message of mié feb 23 16:20:16 -0300 2011:
 Alvaro Herrera alvhe...@commandprompt.com wrote:
  
  Why not use quote_identifier and quote_literal_cstr instead of
  this new strcpy thing?
  
 We've got various types of software that will be parsing these
 payloads, and it's a little easier to parse if the quoting is
 unconditional.  If that's a barrier to acceptance we could use
 the functions which quote conditionally and adjust our regular
 expressions.

No strong opinion on this, really, but your strcpy should use a
StringInfo buffer instead of the char[200].  That's going to bite
someone.

 Probably one reason we had a bias toward quoting is that every
 single application table name we use has at least on uppercase
 letter and about 95% of our column names do.

Makes sense.

  Also, you don't really need spi.h do you?
  
 It's using these functions:
  
 SPI_getrelname
 SPI_fname
 SPI_getvalue
  
 If there's a better way to get the info, I'm game. 

I think you could get away without the first two (in particular get rid
of the memleak with SPI_getrelname), but the last one would require
something more involved.  No strong opinion, I just failed to see those
calls in there.

Is this intended for 9.1?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Binary in/out for aclitem

2011-02-23 Thread Alvaro Herrera

Excerpts from Radosław Smogura's message of mié feb 23 15:18:22 -0300 2011:

 Btw, Is it possible and needed to add group byte, indicating that grantee is 
 group or user?

There are no groups or users, only roles.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] disposition of remaining patches

2011-02-23 Thread Greg Smith


Robert Haas wrote:

2. Synchronous replication.  Splitting up this patch has allowed some


This has gotten a bunch of review, on several different threads.  I
assume Simon will publish an update when he gets back to his
keyboard...
  


That was the idea.  If anyone has any serious concerns about the current 
patch, please don't hold off just because you know Simon is away for a 
bit.  We've been trying to keep that from impacting community progress 
too badly this week.


On top of 4 listed reviewers I know Dan Farina is poking at the last 
update, so we may see one more larger report on top of what's already 
shown up.  And Jaime keeps kicking the tires too.  What Simon was hoping 
is that a week of others looking at this would produce enough feedback 
that it might be possible to sweep the remaining issues up soon after 
he's back.  It looks to me like that's about when everything else that's 
still open will probably settle too.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How to extract a value from a record using attnum or attname?

2011-02-23 Thread Robert Haas

On Wed, Feb 23, 2011 at 2:48 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Is this intended for 9.1?

Kevin already expressed his intention to add this to the first 9.2CF.
It's far too late to BEGIN discussing new features for 9.1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Chernow




Binary mode had serious limitations, such as portability.



What are the other limitations?

As far as portability is concerned, we are using it on many different 
operating systems and architectures without issue.  Even our most recent 
bump to 9.0.1 and 9.0.3 was flawless in regard to libpq/libpqtypes.



We do need some support in libpq for constructing and deconstructing
arrays (and probably for composites too, although that will be harder, I
suspect).



[sigh...]

--
Andrew Chernow
eSilo, LLC
global backup
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Peter Geoghegan

On 23 February 2011 15:34, Merlin Moncure mmonc...@gmail.com wrote:
 You can send nested arrays safely.  You just have to be very formal
 about escaping *everything* both as you get it and as it goes into the
 container.  This is what postgres does on the backend as it sends
 arrays out the door in text.  It might be instructive to see what the
 server does in terms of escaping.  Note that the way this works it's
 not impossible to see 128+ consecutive backslashes when dealing with
 arrays of composites.

Sounds tedious.

 yes: libpqtypes.  it manages everything in binary.  i've been thinking
 for a while that libpqtypes could be wrapped with variadic templates
 or other c++ trickery.  Because libpqtypes does everything in binary,
 it completely sidesteps all the escaping nastiness.

The fact that libpqtypes does everything in binary mode is
interesting, but doesn't really help me.

Variadic template support is still quite patchy, and I don't think
that it is of particular use here. My proof-of-concept implementation
uses recursive template instantiation and type traits, and just uses
C++98 features. I've attached it for your information. I might be able
to use partial template specialisation to support regular arrays too.
That hasn't been a priority, because C++ generally discourages their
use, and because it's trickier. Arrays don't know their own size,
and I want to provide a uniform, simple interface. On the other hand,
I've seen interesting things done with template specialisation on
static integral values, such as the size of arrays on the stack, so
perhaps it's possible to support arrays while having a uniform
interface.

To be clear: I don't want to take responsibility for correctly
escaping the array literal. The user has a responsibility to use a
prepared statement/explicit escaping to do that, just as they do with
a regular text value, for example. There is no additional threat of a
traditional SQL injection attack, because we cannot break out of the
array literal itself. However, within the array literal, it is
currently possible to break out of a constant/value literal using a
double quote, to perhaps inject additional values (more than
intended), or to cause malformed array literal errors. Sure, I could
write my own function to escape the constant which is wary of double
quotes, but that would have many of the same challenges as writing a
general purpose drop-in replacement for PQescapeStringConn(). It might
be just as misguided.

-- 
Regards,
Peter Geoghegan
#include sstream
#include iostream
#include string
#include vector
#include deque
#include list
#include set

#include pqxx/pqxx

using namespace std;
using namespace pqxx;

struct true_type
{
};

struct false_type
{
};

templatetypename T
struct is_container:public false_type
{
};

templatetypename T, typename U
struct is_containervectorT, U :public true_type
{
};

templatetypename T, typename U
struct is_containerdequeT, U :public true_type
{
};

templatetypename T, typename U
struct is_containerlistT, U :public true_type
{
};

templatetypename T, typename U, typename V
struct is_containersetT, U, V :public true_type
{
};

templatetypename T
void do_pg_empty_element(const T, stringstream, false_type)
{
	// do nothing; there is no element/constant
}

templatetypename container
void do_pg_empty_element(const container elm, stringstream append, true_type)
{
	typedef typename container::value_type contained_type;
	// empty inner array
	append  {;
	do_pg_empty_element(contained_type(), append, is_containercontained_type());
	append  };
}

// terminating condition: individual elements
templatetypename T
void do_pg_array_element(const T elm, stringstream append, false_type)
{
	append  \  elm  \;
}

// append outer part of nested array
templatetypename container
void do_pg_array_element(const container cnt, stringstream append, true_type)
{
	typedef typename container::const_iterator it;
	typedef typename container::value_type contained_type;

	
	it last = cnt.end();
	if(!cnt.empty())
	{
		// there is no neater way of determining if 
		// an element is the last that works across
		// std lib containers
		--last;
	}
	else
	{
		do_pg_empty_element(cnt, append, true_type());
		return;
	}

	append  {;

	for(it i = cnt.begin(); i != cnt.end(); ++i)
	{	
		do_pg_array_element(*i, append, is_containercontained_type());
		if(i != last)
			append  , ;
	}
	append  };
}

templatetypename container
string to_pg_array_constr(const container cnt)
{
	stringstream result;
	do_pg_array_element(cnt, result, true_type());
	return result.str();
}

int main()
{
	vectorstring a;
	listvectorint  b;


	for(int i = 0; i  10; ++i)
	{
		a.push_back(Peter's vector);	
		vectorint sss;
		sss.push_back(0);
		sss.push_back(1);
		sss.push_back(2);
		b.push_front(sss);			
	}
	cout  a:   to_pg_array_constr(a)  endl  endl;
	cout  b:   to_pg_array_constr(b)  endl  endl;

	connection conn(dbname=postgres);

	work test(conn, test);

	conn.prepare(unnest, SELECT

Re: [HACKERS] How to extract a value from a record using attnum or attname?

2011-02-23 Thread Alvaro Herrera

Excerpts from Robert Haas's message of mié feb 23 17:03:23 -0300 2011:
 On Wed, Feb 23, 2011 at 2:48 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
  Is this intended for 9.1?
 
 Kevin already expressed his intention to add this to the first 9.2CF.
 It's far too late to BEGIN discussing new features for 9.1.

Yeah, I see that now.  I'll go review some other patch then.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Review: Fix snapshot taking inconsistencies

2011-02-23 Thread Alvaro Herrera

Excerpts from Marko Tiikkaja's message of sáb ene 15 17:30:14 -0300 2011:
 On 2010-10-21 3:32 PM +0200, Marko Tiikkaja wrote:
  patch
 
 Here's the patch rebased against the master.  No code changes since the 
 last patch I sent.

Having a look at this.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Chernow


On 2/23/2011 3:06 PM, Peter Geoghegan wrote:

On 23 February 2011 15:34, Merlin Moncuremmonc...@gmail.com  wrote:

You can send nested arrays safely.  You just have to be very formal
about escaping *everything* both as you get it and as it goes into the
container.  This is what postgres does on the backend as it sends
arrays out the door in text.  It might be instructive to see what the
server does in terms of escaping.  Note that the way this works it's
not impossible to see 128+ consecutive backslashes when dealing with
arrays of composites.


Sounds tedious.



It is tedious, which is one reason why libpqtypes went binary.  There 
are some compelling performance reasons as well that affect both client 
and server.


libpqtypes was originally developed to serve a very particular need and 
wasn't aiming to be general purpose.  That came about along the way 
trying to solve the problem.  Personally, PQexec is dead to me as well 
as text results from a C/C++ app.  I see no advantage over libpqtypes in 
that context.


Unless I am missing your ultimate goal, you'd probably get what you want 
by wrapping libpqtypes.


--
Andrew Chernow
eSilo, LLC
global backup
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How to extract a value from a record using attnum or attname?

2011-02-23 Thread Kevin Grittner

Alvaro Herrera alvhe...@commandprompt.com wrote:
 Excerpts from Kevin Grittner's message:
 
 No strong opinion on this, really, but your strcpy should use a
 StringInfo buffer instead of the char[200].  That's going to bite
 someone.
 
Yeah, this was thrown together in a bit of a hurry because of
development deadlines here, so I cut a few corners based on
knowledge of our particular implementation details.  I know that's
something to change for general acceptance.
 
 It's using these functions:
  
 SPI_getrelname
 SPI_fname
 SPI_getvalue
  
 If there's a better way to get the info, I'm game. 
 
 I think you could get away without the first two (in particular
 get rid of the memleak with SPI_getrelname), but the last one
 would require something more involved.  No strong opinion, I just
 failed to see those calls in there.
 
I thought the trigger would be running in a context which would make
that leak immaterial.  I thought the general advice in such cases is
to *not* do retail freeing of space, but to let it get cleaned up
through release of the memory context.  I'll take another look at
memory contexts around triggers.
 
 Is this intended for 9.1?
 
Definitely not.  I added it to the first 9.2 CF, as mentioned in
earlier posts. I was going to hold off posting until the beta was
wrapped, but it seemed reasonable to post the patch in response to
Dimitri's post.  I wasn't intending to suggest it was ready for
general usage; I was mainly just putting it out there to see if
anyone was interested enough in it that I should polish it up for a
proper submission.  For instance, there are no docs or regression
tests yet.
 
Anyway, I certainly appreciate the pointers, because we have to push
something out to production along these lines in a couple months to
stay on track with the organization's Annual Plan, which we need to
provide to the legislature and are judged against when they
authorize funding each year.  I think your advice will bring this
feature from it works to it works really well.  :-)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Dunstan




On 02/23/2011 02:21 PM, Andrew Chernow wrote:



Binary mode had serious limitations, such as portability.



What are the other limitations?

As far as portability is concerned, we are using it on many different 
operating systems and architectures without issue.  Even our most 
recent bump to 9.0.1 and 9.0.3 was flawless in regard to 
libpq/libpqtypes.



It's probably fine if you can control both ends. But there is no 
guarantee of portability, nor does it seem likely to me there ever will 
be, so I don't find your assertion terribly useful. The fact that it 
hasn't broken for you doesn't mean it can't or won't be.


The other downside I see is that binary protocols are often a lot harder 
to debug, but maybe that's just me.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Kenneth Marshall

On Wed, Feb 23, 2011 at 03:34:45PM -0500, Andrew Chernow wrote:
 On 2/23/2011 3:06 PM, Peter Geoghegan wrote:
 On 23 February 2011 15:34, Merlin Moncuremmonc...@gmail.com  wrote:
 You can send nested arrays safely.  You just have to be very formal
 about escaping *everything* both as you get it and as it goes into the
 container.  This is what postgres does on the backend as it sends
 arrays out the door in text.  It might be instructive to see what the
 server does in terms of escaping.  Note that the way this works it's
 not impossible to see 128+ consecutive backslashes when dealing with
 arrays of composites.

 Sounds tedious.


 It is tedious, which is one reason why libpqtypes went binary.  There are 
 some compelling performance reasons as well that affect both client and 
 server.

 libpqtypes was originally developed to serve a very particular need and 
 wasn't aiming to be general purpose.  That came about along the way trying 
 to solve the problem.  Personally, PQexec is dead to me as well as text 
 results from a C/C++ app.  I see no advantage over libpqtypes in that 
 context.

 Unless I am missing your ultimate goal, you'd probably get what you want by 
 wrapping libpqtypes.


The performance is one of the big reasons to use binary parameters.
Converting/packing/transmitting/unpacking/converting use a lot of
CPU resources on both the server and the client in addition to 
the larger communication resources needed by the text-based methods.

Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Correctly producing array literals for prepared statements

2011-02-23 Thread Andrew Chernow






It's probably fine if you can control both ends. But there is no
guarantee of portability, nor does it seem likely to me there ever will
be, so I don't find your assertion terribly useful. The fact that it
hasn't broken for you doesn't mean it can't or won't be.



All true.  If you change the protocol, libpqtypes needs to be adjusted. 
 I think that is a very fair statement.  It already toggles on server 
version around a few changes in the past ... like the money data type. 
So far, since 8.1, the number of changes to the binary protocol has put 
me to sleep :)



The other downside I see is that binary protocols are often a lot harder
to debug, but maybe that's just me.



Also very true.  However, libpqtypes addresses this by abstracting the 
end user from the binary transformation or preparation.  Instead, users 
are presented with a printf/scanf style interface.  PQexecf(conn, 
select %int4 + %int4, 4, 4)  is pretty far removed from the underlying 
byte swapping, parallel array setup for PQexecParams and other 
nastiness.  But yes, the maintainer of the library must deal with 
protocol changes and provide backward compatibility.


--
Andrew Chernow
eSilo, LLC
global backup
http://www.esilo.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Binary in/out for aclitem

2011-02-23 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from RadosÅaw Smogura's message of miÃ© feb 23 15:18:22 -0300 2011:
 Btw, Is it possible and needed to add group byte, indicating that grantee is 
 group or user?

 There are no groups or users, only roles.

Even if there were, this is not part of the value of an aclitem.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Binary in/out for aclitem

2011-02-23 Thread Tom Lane

=?utf-8?q?Rados=C5=82aw_Smogura?= rsmog...@softperience.eu writes:
 Here is extended version, has version field (N_ACL_RIGHTS*2) and reserved 
 mask, as well definition is more general then def of PGSQL. In any way it 
 require that rights mades bit array.

You're going in quite the wrong direction here.  The consensus as I
understood it was that we should just use the text representation in
binary mode too, rather than inventing a separate representation that's
going to put a whole new set of constraints on what can happen to the
internal representation.  The proposal you have here has no redeeming
social value whatever, because nobody cares about the I/O efficiency
for aclitem (and even if anyone did, you've made no case that this would
actually be more efficient to use on the client side).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Possible substitute for PostmasterIsAlive polling loops

2011-02-23 Thread Tom Lane

We've touched a few times on trying to get rid of the
sleep-awhile-and-check-for-something-to-do loops in PG's auxiliary
processes, mainly to satisfy people who complain about CPU power
consumption when idle.  I can see how most of the something-to-do
checks can be reimplemented using latches, but up to now there
didn't seem to be a good way to get rid of waking up every so often
to check if the postmaster was still there.  So it got my attention
when someone mentioned this Linux syscall on a Red Hat mailing list:

NAME
   prctl - operations on a process

SYNOPSIS
   #include sys/prctl.h

   int prctl(int option, unsigned long arg2, unsigned long arg3,
 unsigned long arg4, unsigned long arg5);

...
   The first argument can be:
...

   PR_SET_PDEATHSIG (since Linux 2.1.57)
  Set  the  parent  process death signal of the calling process to
  arg2 (either a signal value in the  range  1..maxsig,  or  0  to
  clear).   This  is  the signal that the calling process will get
  when its parent dies.  This value is cleared for the child of  a
  fork(2).


IOW, at least on Linux, you *can* arrange to get a signal when your
parent process dies.

Not sure how ugly it'd be to use this call when available and a time
delay when not, but it's something to think about.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread Bruce Momjian

PostgreSQL - Hans-J?rgen Sch?nig wrote:
  Those are real problems, but I still want it.  The last time I hit
  this problem I spent two days redesigning my schema and adding
  triggers all over the place to make things work.  If I had been
  dealing with a 30TB database instead of a 300MB database I would have
  been royally up a creek.
 
  To put that another way, it's true that some people can't adjust their
  queries, but also some people can.  It's true that nonstandard stuff
  sucks, but queries that don't work suck, too.  And as for better
  solutions, how many major release cycles do we expect people to wait
  for them?  Even one major release cycle is an eternity when you're
  trying to get the application working before your company runs out of
  money, and this particular problem has had a lot of cycles expended on
  it without producing anything very tangible (proposed patch, which
  like you I can't spare a lot of cycles to look at just now, possibly
  excepted).
 
 
 
 cheapest and easiest solution if you run into this: add fake functions
 which the planner cannot estimate properly.  use OR to artificially
 prop up estimates or use AND to artificially lower them. there is
 actually no need to redesign the schema to get around it but it is such
 an ugly solution that it does not even deserve to be called ugly ...
 however, fast and reliable way to get around it.

I agree that is super-ugly and we do need to address the cross-column
statistics better.  I personally like the 2-D histogram idea:

http://archives.postgresql.org/pgsql-hackers/2010-12/msg00913.php

--
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread Bruce Momjian

Robert Haas wrote:
 2011/2/23 PostgreSQL - Hans-J?rgen Sch?nig postg...@cybertec.at:
  i thought there was an agreement that we don't want planner hints?
 
 Well, I want them.  I think some other people do, too.  Whether those
 people are more numerous than than the people who don't want them, and
 how much that matters either way, is another question.  I don't want
 to have to use them very often, but I like to have an out when I get
 desperate.
 
  as tom pointed out - many broken queries come out of some query generator 
  where even the design to make the design is broken by design.
  personally i like query generators as long as other people use them ... 
  telling people that this is the wrong way to go is actually financing my 
  holiday next week ... ;). ?in general - hibernate and stuff like that is a 
  no-go.
 
  personally i like the type of planner hints oleg and teodor came up with - 
  i think we should do more of those hooks they are using but hiding it in 
  some syntax is not a good idea.
  it does not change the query and it still gives a lot of room to toy 
  around. it looks like a compromise.
 
  however, oleg's contrib module does not fix the core problem of cross 
  column statistics because a hint is usually static but you want flexible 
  selectivity.
 
 IIRC, what Teodor and Oleg did was a contrib module that excluded a
 certain index from consideration based on a GUC.  That to me is a
 little more hacky than just wiring the selectivity estimate.  You're
 going to need to set that just before each query that needs it, and
 reset it afterwards, so it's actually worse than just decorating the
 queries, IMHO.  Also, I haven't run into any actual problems in the
 field that would be solved by this approach, though I am sure others
 have.  IME, most bad query plans are caused by either incorrect
 estimates of selectivity, or wrongheaded notions about what's likely
 to be cached.  If we could find a way, automated or manual, of
 providing the planner some better information about the facts of life
 in those areas, I think we'd be way better off.  I'm open to ideas
 about what the best way to do that is.

For me the key is finding a way to get that information to the planner
so all queries can benefit, not just the queries we decorate.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Review: Fix snapshot taking inconsistencies

2011-02-23 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Marko Tiikkaja's message of sÃ¡b ene 15 17:30:14 -0300 2011:
 On 2010-10-21 3:32 PM +0200, Marko Tiikkaja wrote:
 patch
 
 Here's the patch rebased against the master.  No code changes since the 
 last patch I sent.

 Having a look at this.

My recollection is that this was pretty tightly coupled to the wCTE
patch.  I had been intending to review them together, and have just
now come up for air enough to start doing that.  Do you really want
to review this one separately?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Review: Fix snapshot taking inconsistencies

2011-02-23 Thread Marko Tiikkaja


On 2011-02-24 12:39 AM, Tom Lane wrote:

My recollection is that this was pretty tightly coupled to the wCTE
patch.


It was, but isn't anymore.  Now it's just a bugfix.


Regards,
Marko Tiikkaja

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Patch: add GiST support for BOX @ POINT queries

2011-02-23 Thread Andrew Tipton

While playing around with the BOX and POINT datatypes, I was surprised to
note that BOX @ POINT (and likewise POINT @ BOX) queries were not using
the GiST index I had created on the BOX column.  The attached patch adds a
new strategy @(BOX,POINT) to the box_ops opclass.  Internally,
gist_box_consistent simply transforms the POINT into its corresponding BOX.

This is my first Postgres patch, and I wasn't able to figure out how to go
about creating a regression test for this change.  (All existing tests do
pass, but none of them seem to specifically test index behaviour.)

I know it is quite late in the CommitFest, should I add this to CF-Next?

-Andrew
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
index 86a5d90..a2c6cb6 100644
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*** gist_box_consistent(PG_FUNCTION_ARGS)
*** 96,101 
--- 96,113 
  	if (DatumGetBoxP(entry-key) == NULL || query == NULL)
  		PG_RETURN_BOOL(FALSE);
  
+ 	if (strategy == 27)
+ 	{
+ 	/* Convert BOX @ POINT to the equivalent BOX @ BOX query */
+ 	Point *q_point = PG_GETARG_POINT_P(1);
+ 	BOX q_box;
+ 
+ 		q_box.low = *q_point;
+ 		q_box.high = *q_point;
+ 		query = q_box;
+ 		strategy = 7;   /* Strategy number for BOX @ BOX */
+ 	}
+ 
  	/*
  	 * if entry is not leaf, use rtree_internal_consistent, else use
  	 * gist_box_leaf_consistent
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
index aabb900..eb03255 100644
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*** DATA(insert (	2593   603 603 11 s 2573 7
*** 595,600 
--- 595,601 
  DATA(insert (	2593   603 603 12 s 2572 783 0 ));
  DATA(insert (	2593   603 603 13 s 2863 783 0 ));
  DATA(insert (	2593   603 603 14 s 2862 783 0 ));
+ DATA(insert (	2593   603 600 27 s 433 783 0 ));
  
  /*
   * gist point_ops

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: add GiST support for BOX @ POINT queries

2011-02-23 Thread Kevin Grittner

Andrew Tipton andrew.t.tip...@gmail.com wrote:
 
 should I add this to CF-Next?
 
Yes.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Review: Fix snapshot taking inconsistencies

2011-02-23 Thread Tom Lane

Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes:
 On 2011-02-24 12:39 AM, Tom Lane wrote:
 My recollection is that this was pretty tightly coupled to the wCTE
 patch.

 It was, but isn't anymore.  Now it's just a bugfix.

The connection is the question of where to do CommandCounterIncrement
between successive DML WITH operations in a single command.  Right
offhand, I don't see any CommandCounterIncrement in the wCTE patch,
so I'm sort of wondering whether the case works at all...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Review: Fix snapshot taking inconsistencies

2011-02-23 Thread Alvaro Herrera

Excerpts from Tom Lane's message of mié feb 23 19:39:23 -0300 2011:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Excerpts from Marko Tiikkaja's message of sáb ene 15 17:30:14 -0300 2011:
  On 2010-10-21 3:32 PM +0200, Marko Tiikkaja wrote:
  patch
  
  Here's the patch rebased against the master.  No code changes since the 
  last patch I sent.
 
  Having a look at this.
 
 My recollection is that this was pretty tightly coupled to the wCTE
 patch.  I had been intending to review them together, and have just
 now come up for air enough to start doing that.  Do you really want
 to review this one separately?

Dunno.  If you're gonna pick it up I guess my time is best spent
elsewhere.  There was some restructuring in code in postgres.c to be
done near this patch, which wasn't attacked at all by Marko AFAICS.
Maybe I should be looking at that instead.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Review: Fix snapshot taking inconsistencies

2011-02-23 Thread Marko Tiikkaja


On 2011-02-24 2:31 AM, Tom Lane wrote:

Marko Tiikkajamarko.tiikk...@cs.helsinki.fi  writes:

On 2011-02-24 12:39 AM, Tom Lane wrote:

My recollection is that this was pretty tightly coupled to the wCTE
patch.



It was, but isn't anymore.  Now it's just a bugfix.


The connection is the question of where to do CommandCounterIncrement
between successive DML WITH operations in a single command.


.. what?  We decided *not* to do any CommandCounterIncrements between 
DML WITHs.



Regards,
Marko Tiikkaja

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Review: Fix snapshot taking inconsistencies

2011-02-23 Thread Tom Lane

Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Tom Lane's message of miÃ© feb 23 19:39:23 -0300 2011:
 My recollection is that this was pretty tightly coupled to the wCTE
 patch.  I had been intending to review them together, and have just
 now come up for air enough to start doing that.  Do you really want
 to review this one separately?

 Dunno.  If you're gonna pick it up I guess my time is best spent
 elsewhere.  There was some restructuring in code in postgres.c to be
 done near this patch, which wasn't attacked at all by Marko AFAICS.
 Maybe I should be looking at that instead.

Well, Marko claims they're independent, so if you feel it fits into
what you're doing you're welcome to it.  But I was planning to deal
with Marko's two patches as soon as the FDW dust settled, and it
seems to be settled.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Review: Fix snapshot taking inconsistencies

2011-02-23 Thread Marko Tiikkaja


On 2011-02-24 2:35 AM, Alvaro Herrera wrote:

 There was some restructuring in code in postgres.c to be
done near this patch, which wasn't attacked at all by Marko AFAICS.
Maybe I should be looking at that instead.


I don't feel at all comfortable doing the restructuring you guys have 
been talking about.



Regards,
Marko Tiikkaja

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread Josh Berkus


 Personally, I think the first thing we ought to do is add a real, bona
 fide planner hint to override the selectivity calculation manually,
 maybe something like this:
 
 WHERE (x  5 AND y = 1) SELECTIVITY (0.1);
 
 Then, having provided a method for the DBA to extinguish the raging
 flames of searing agony which are consuming them while a crocodile
 chews off their leg and their boss asks them why they didn't use
 Oracle, we can continue bikeshedding about the best way of fixing this
 problem in a more user-transparent fashion.

Is there some way we can do that without adding the selectivity hint to
the query itself?  That's the biggest issue with hints.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Synchronous standby

2011-02-23 Thread Tatsuo Ishii

 On Wed, Feb 23, 2011 at 3:49 AM, Tatsuo Ishii is...@postgresql.org wrote:
 In 9.1, we will be able to have synchrnous replication. Also we have
 one standby server chosen by primary to be the synchronous standby
 (still I'm not sure this is correct or not as stated in another mail).

 
 yes, it is. a list of possible synch standbys and one of them chosen
 to be the one
 
 Is there anyway to know which is the synchronous standby? IMO this is
 important for users because that one is likely the least behind to
 primary and will be chosen to promoto in case of primary dying in most
 cases.
 
 i guess, we can put a new column in pg_stat_replication stating the
 type of the replication (synch or asynch).
 but that is surely a different patch...

Ok, I will write a patch unless someone comes up.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread Robert Haas

On Wed, Feb 23, 2011 at 8:09 PM, Josh Berkus j...@agliodbs.com wrote:
 Personally, I think the first thing we ought to do is add a real, bona
 fide planner hint to override the selectivity calculation manually,
 maybe something like this:

 WHERE (x  5 AND y = 1) SELECTIVITY (0.1);

 Then, having provided a method for the DBA to extinguish the raging
 flames of searing agony which are consuming them while a crocodile
 chews off their leg and their boss asks them why they didn't use
 Oracle, we can continue bikeshedding about the best way of fixing this
 problem in a more user-transparent fashion.

 Is there some way we can do that without adding the selectivity hint to
 the query itself?  That's the biggest issue with hints.

I've been mulling this issue over a bit more - Nathan Boley raised a
similar point upthread.  I think it's useful to consider some concrete
cases which can occur.

1. Default estimate.  The planner tends to estimate that the
selectivity of something = something is 0.005, and that the
selectivity of something != something is 0.995, when it doesn't
know any better.  This estimate often sucks.  Sometimes it sucks
because it's too high, other times because it's too low, and of course
sometimes it is close enough for government work.

2. One special customer.  Suppose we have a database that contains
lots and lots of people and associates different attributes to those
people, including customer_id.  We put all of our employees in the
table too, and assign them customer_id = 1, since the record with
customer.id = 1 represents us.  I've built this kind of system for
several different employers over the years.  Turns out, the subset of
the person table with customer_id = 1 looks very different, in terms
of the MCVs on the remaining columns and the distribution of the
values otherwise, than the records with customer_id != 1.  I'm sure
this problem comes up in different forms in other domains; this is
just where I've seen it the most.

3. The mostly-redundant condition.  Something like creation_date 
'some timestamp' AND active.  Turns out, most of the not active stuff
is also... old.  A variant of this is creation_date  'some timestamp'
AND customer_id = 1, which overlaps #2.  For extra fun the creation
date and customer_id may be in different tables, with some
intermediate join muddying the waters.

4. The condition that's redundant except when it isn't.  The classic
example here is WHERE zipcode = constant AND state = constant.
Most of the time, the selectivity of the two clauses together is much
higher than the product of their individually selectivities; you might
as well ignore the second part altogether.  But if some numbskull user
enters a state that doesn't match the zipcode, then suddenly it
matters a lot - the selectivity drops to zero when the second part is
added.

5. The bitfield.  Conditions like (x  64) != 0.  I know disk is
cheap, but people keep doing this.

There are probably some others I'm missing, too.  That's just off the
top of my head.  Now here are some possible approaches to fixing it:

A. Decorate the query.  This would often be useful for case #1, and
some instances of #3 and #5.  It's useless for #2 and #4.

B. Specify a particular predicate and the selectivity thereof.  Like,
whenever you see (x  64) = 0, assume the selectivity is 0.5.  Upon
reflection, this seems pretty terrible in every respect.  Unless you
only ever issue an extremely limited range of queries, you're going to
be hardwiring a lot of selectivities.  I think this really only
handles case #5 well, and maybe some instances of case #1.

C. Specify an expression and gather statistics on it as if it were a
column: i.e. ALTER TABLE tab ADD VIRTUAL STATISTICS COLUMN x  64.
This is pretty good.  It is pretty much ideal for #2 and also handles
#5 and some cases of #3 and #1 well.  You could even make it handle
some instances of #4 if you made the virtual column ROW(state,
zipcode) and rewrote the query as a row comparison.

D. N x N implicativeness matrix.  Record for each pair of attributes
the extent to which a given value for A implies a value for B, and
derate the selectivity multipliers based on this information.  This is
an idea of Heikki's.  It seemed good to me when he proposed it, and I
think he proposed it in regards to #4, but I'm not sure we really ever
figured out how to make it work.

E. Given a set of columns (A1, .., An), collect MCVs and make a
histogram for ROW(A1, ..., An), and then use it to handle cases like
#4.  This is similar to C and is intended to handle the zipcode
problem, but it's not as flexible (because you are only specifying
columns, not expressions).  However, it's intended to work without
rewriting the state/zipcode comparisons as a rowcompare.

If you want to take the above as in any way an exhaustive survey of
the landscape (which it isn't), C seems like a standout, maybe
augmented by the making the planner able to notice that A1 = x1 AND A2
= x2 is equivalent to

Re: [HACKERS] Possible substitute for PostmasterIsAlive polling loops

2011-02-23 Thread Robert Haas

On Wed, Feb 23, 2011 at 4:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 IOW, at least on Linux, you *can* arrange to get a signal when your
 parent process dies.

That's pretty cool.

 Not sure how ugly it'd be to use this call when available and a time
 delay when not, but it's something to think about.

Yeah.  It may be worth thinking about whether we want to use the
postmaster-pipe trick someone was proposing.  That might be more
portable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Synchronous standbys?

2011-02-23 Thread Tatsuo Ishii

 there could be only one standby at the same time...
 in the original patch there could be several synchronous standby
 servers and the primary was going to wait until the first one of them
 to answer, but that was removed and replaced by a list of possible
 synch standby servers and the first that connects is the one the
 primary will wait for. because right now it's a simple list the first
 one will always be the synch standby until it's removed from the list
 or phisically.
 
 the other comment in the doc should be removed

Thanks for the info!
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread Bruce Momjian

Robert Haas wrote:
 If you want to take the above as in any way an exhaustive survey of
 the landscape (which it isn't), C seems like a standout, maybe
 augmented by the making the planner able to notice that A1 = x1 AND A2
 = x2 is equivalent to (A1,A2) = (x1, x2) so you don't have to rewrite
 queries as much.
 
 I don't really know how to handle the join selectivity problem.  I am
 not convinced that there is a better solution to that than decorating
 the query.  After all the join selectivity depends not only on the
 join clause itself, but also on what you've filtered out of each table
 in the meantime.

Thinking some more, I think another downside to the decorate the query
idea is that many queries use constants that are supplied only at
runtime, so there would be no way to hard-code a selectivity value into
a query when you don't know the value.  Could a selectivity function
handle that?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread Robert Haas

On Wed, Feb 23, 2011 at 10:30 PM, Bruce Momjian br...@momjian.us wrote:
 Robert Haas wrote:
 If you want to take the above as in any way an exhaustive survey of
 the landscape (which it isn't), C seems like a standout, maybe
 augmented by the making the planner able to notice that A1 = x1 AND A2
 = x2 is equivalent to (A1,A2) = (x1, x2) so you don't have to rewrite
 queries as much.

 I don't really know how to handle the join selectivity problem.  I am
 not convinced that there is a better solution to that than decorating
 the query.  After all the join selectivity depends not only on the
 join clause itself, but also on what you've filtered out of each table
 in the meantime.

 Thinking some more, I think another downside to the decorate the query
 idea is that many queries use constants that are supplied only at
 runtime, so there would be no way to hard-code a selectivity value into
 a query when you don't know the value.  Could a selectivity function
 handle that?

Beats me.  What do you have in mind?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: cross column correlation ...

2011-02-23 Thread PostgreSQL - Hans-Jürgen Schönig

On Feb 24, 2011, at 2:09 AM, Josh Berkus wrote:

 
 Personally, I think the first thing we ought to do is add a real, bona
 fide planner hint to override the selectivity calculation manually,
 maybe something like this:
 
 WHERE (x  5 AND y = 1) SELECTIVITY (0.1);
 
 Then, having provided a method for the DBA to extinguish the raging
 flames of searing agony which are consuming them while a crocodile
 chews off their leg and their boss asks them why they didn't use
 Oracle, we can continue bikeshedding about the best way of fixing this
 problem in a more user-transparent fashion.
 
 Is there some way we can do that without adding the selectivity hint to
 the query itself?  That's the biggest issue with hints.
 



well, you could hide this hint in the system table - say; instead of decorating 
the query you could store the decoration in some system relation ... but, if 
you get it right, you call this decoration histogram ;).
i think the patch with a multi-dim histogram is good (i have seen something 
similar for PostGIS).
what is still needed in our patch is a.) multi-dim sampling (no idea how to get 
it right) and b.) investigating how to deal with joins and expressions (e.g. 
cos(id) ).
hints into the right direction are highly welcome.

many thanks,

hans


--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

2011-02-23 Thread Daniel Farina

On Fri, Feb 18, 2011 at 4:06 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Well, good news all round.

 v17 implements what I believe to be the final set of features for sync
 rep. This one I'm actually fairly happy with. It can be enjoyed best at
 DEBUG3.

I've been messing with this patch and am wondering if this behavior is expected:

I've been frobbing the server around (I was messing around with the
syncrep feature, but do not know if this is related just yet), and
came upon a case I do not expect: it would appear that prior to
establishing a connection to do streaming replication, the startup
process (which is recovering) is very slowly catching up (or so it
would be indicated by txid_current_snapshot()) and eating up enormous
amounts of memory, such as 6GB at a time in RES,  monotonically
increasing. Furthermore, the incrementation of the txid_snapshot is
very slow, and it doesn't seem like I'm coming close to making full
use of my resources: cpu and block devices are not very busy.  There
may have been a brief spurt of pgbench activity that would generate
such WAL traffic to replay.

I have not done a hard shutdown to my knowledge, and the server does
allow me to query relatively quickly as a standby.

Looks like I'm about to hit 7+GB. Is there something I'm missing?

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Sync Rep v17

2011-02-23 Thread Daniel Farina

On Wed, Feb 23, 2011 at 10:39 PM, Daniel Farina dan...@heroku.com wrote:
 On Fri, Feb 18, 2011 at 4:06 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Well, good news all round.

 v17 implements what I believe to be the final set of features for sync
 rep. This one I'm actually fairly happy with. It can be enjoyed best at
 DEBUG3.

 I've been messing with this patch and am wondering if this behavior is 
 expected:

 I've been frobbing the server around (I was messing around with the
 syncrep feature, but do not know if this is related just yet), and
 came upon a case I do not expect: it would appear that prior to
 establishing a connection to do streaming replication, the startup
 process (which is recovering) is very slowly catching up (or so it
 would be indicated by txid_current_snapshot()) and eating up enormous
 amounts of memory, such as 6GB at a time in RES,  monotonically
 increasing. Furthermore, the incrementation of the txid_snapshot is
 very slow, and it doesn't seem like I'm coming close to making full
 use of my resources: cpu and block devices are not very busy.  There
 may have been a brief spurt of pgbench activity that would generate
 such WAL traffic to replay.

 I have not done a hard shutdown to my knowledge, and the server does
 allow me to query relatively quickly as a standby.

Oh, yes, this reproduces past shutdowns/startups, and there's quite a
few txids before I catch up. I'm also comfortable poking around with
gdb (I have already recompiled with debugging symbols and
optimizations off and was poking around, especially at
MemoryContextStats(TopMemoryContext), but was not rewarded.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

76 matches

Mail list logo