Re: [HACKERS] Notes on implementing URI syntax for libpq

2011-11-28 Thread Greg Smith

On 11/24/2011 05:21 AM, Alvaro Herrera wrote:

A coworker also suggested using a different designator:

postgresqli:///path/to/socket:5433/database
postgresqli://:5433/database


This is not unprecedented.  An example is how CUPS handles this problem 
when connecting printers using URIs:  
http://www.cups.org/documentation.php/network.html where you might see 
this for the usual port:


lpd://ip-address-or-hostname/queue

And this for AppSocket AKA JetDirect:

socket://ip-address-or-hostname

I am certainly not going to defend printing setup with CUPS as a model 
worth emulating, just noting the similarity here.  I think we'll save 
miles of user headaches if there's only one designator.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Displaying accumulated autovacuum cost

2011-11-28 Thread Greg Smith

On 11/25/2011 08:39 AM, Alvaro Herrera wrote:

I was about to commit this when I noticed that the avg values may not be
all that representative of reality after all; consider that it's
computed across the whole duration of the vacuuming operation, including
the index scans ... it'd be possibly useful to keep separate timings for
the heap scan (which is likely to use I/O more quickly) from index
scans.  That way you can tune for the max, not a possibly misleading
average.  That's a much larger change though, so I'm not going to get
into it.


Yes, that's one of the interesting additional things to track.  At the 
point I was writing this originally, ideas like single-pass VACUUM 
were still on the table; maybe they still are.  It didn't seem worthwhile to 
try to get any more detailed than the average when the underlying work 
might still be whacked around.  Just having some useful numbers on both 
costs and speeds for the first time was such an improvement that I didn't 
want to get too far lost chasing perfection here, at the risk of not 
getting anything done.




One funny thing in the test I did was that the buffer count might add to
a much larger amount than total disk pages:


Good catch, I didn't think about the ramifications of multi-pass work 
here.  I'll revisit that and see if I can do a bit better before the 
final 9.2 CommitFest; I'm not done with this area yet.  I kind of needed 
this basic logging patch in place before it would be easy to experiment with 
behavior changes and see what they do, though.





Re: [HACKERS] Displaying accumulated autovacuum cost

2011-11-28 Thread Greg Smith

On 11/25/2011 05:10 PM, Robert Haas wrote:

This patch adds a count of the number of buffers dirtied to VACUUM,
but it strikes me that it would be useful to add similar tracking to
pgBufferUsage.  Attached is a patch for that.  You can see the new
counters through pg_stat_statements or with EXPLAIN (ANALYZE,
BUFFERS).


This looks pretty useful to me.  I just threw it into the current 
CommitFest, on the basis that there are already so many other things trying 
to whack around pg_stat_statements right now that we might as well keep them 
together.  Let me clear my queue of patches submitted on time that I need to 
do something with (re-review Scott Mead's pg_stat_activity change, 
respond to Jeff Janes on relation free space) and I'll take a quick spin 
on this one.




Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Tomas Vondra
On 28 November 2011, 8:54, Greg Smith wrote:
 -Have one of the PostgreSQL background processes keep track of a time
 estimate on its own, only periodically pausing to sync against the real
 time.  Then most calls to gettimeofday() can use that value instead.  I
 was thinking of that idea for slightly longer running things though; I
 doubt that can be made accurate enough to test instrument buffer

What about random sampling, i.e. measuring just 5% of the events or
something like that? Sure, it's not exact, but it significantly reduces the
overhead. And it might be a config parameter, so the user could decide how
precise the results need to be, and even take the speed of the clocks into account.
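Tomas's sampling idea is easy to sketch. This is an illustrative model only, not code from any patch; `SAMPLE_RATE` stands in for the hypothetical config parameter, and `timed_read` for an instrumented buffer read:

```python
import random
import time

SAMPLE_RATE = 0.05  # hypothetical config knob: fraction of events to time

def timed_read(do_read, stats):
    """Perform an I/O operation, timing only a random sample of calls."""
    sample = random.random() < SAMPLE_RATE
    if sample:
        start = time.perf_counter()
    result = do_read()
    if sample:
        stats["sampled_time"] += time.perf_counter() - start
        stats["sampled_calls"] += 1
    stats["total_calls"] += 1
    return result

def estimated_total_time(stats):
    """Scale sampled time up by the inverse of the observed sample fraction."""
    if stats["sampled_calls"] == 0:
        return 0.0
    return stats["sampled_time"] * stats["total_calls"] / stats["sampled_calls"]

random.seed(42)  # deterministic demo
stats = {"total_calls": 0, "sampled_calls": 0, "sampled_time": 0.0}
for _ in range(1000):
    timed_read(lambda: 42, stats)  # stand-in for a real buffer read
print(stats["sampled_calls"], "of", stats["total_calls"], "calls timed")
```

Only the sampled calls pay the cost of two clock reads; the accuracy of the estimate then scales with the sample rate, which is exactly the trade-off being described.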

 Something more ambitious like the v$ stuff would also take care of what
 you're doing here; I'm not sure that what you've done helps build it
 though.  Please don't take that personally.  Part of one of my own
 instrumentation patches recently was rejected out of hand for the same
 reason, just not being general enough.

Yes, that'd be a significant improvement. The wait-event stuff is very
useful and changes tuning significantly.

Tomas




Re: [HACKERS] Avoiding shutdown checkpoint at failover

2011-11-28 Thread Greg Smith

On 11/13/2011 12:13 AM, Simon Riggs wrote:

On Tue, Nov 1, 2011 at 12:11 PM, Simon Riggs si...@2ndquadrant.com  wrote:


When I say skip the shutdown checkpoint, I mean remove it from the
critical path of required actions at the end of recovery. We can still
have a normal checkpoint kicked off at that time, but that no longer
needs to be on the critical path.

Any problems foreseen? If not, looks like a quick patch.

Patch attached for discussion/review.


This one was missed for the November CF; it was submitted in time but not added 
to the app until just now.  Given that Tom already voiced some specific 
things to consider (a detailed review of all WAL replay activities), I 
added it to the January one instead.  If anyone has been looking for a 
reason to study WAL replay, by all means knock yourself out before then.





Re: [HACKERS] splitting plpython into smaller parts

2011-11-28 Thread Greg Smith

On 11/13/2011 09:45 AM, Jan Urbański wrote:
The first one factors out some boilerplate related to executing SPI 
functions in subtransactions (an idea borrowed from pltcl.c).


While I haven't looked at the code, this seems worthwhile from the 
description.


The second one is the actual split. plpython.c has been split into 11 
separate files and one header. 


Could you comment a bit more about what the goal of this is?  We don't 
have a reviewer for this patch yet, and I think part of the reason is 
because it's not really obvious what it's supposed to be doing, and why 
that's useful.






[HACKERS] CommitFest 2011-11 Post-Tryptophan Progress Report

2011-11-28 Thread Greg Smith
With plenty of people in the US returning to work on Monday after a 
modest break last week, I wanted to put out an update on the active 
CommitFest. I just sent out several pleas for help with specific patches 
where only a limited chunk of the community is really in a good position 
to test the code. Excluding those for the moment, here are some notes on 
the submitted patches that not only don't have a reviewer already, but 
where I don't know who might take them on yet:


Command Triggers by Dimitri Fontaine. This is a pretty big patch (3366 
lines) that won't be easy to slog through. I know Dimitri is still 
making some progress on this just by thinking more about the code, but 
that's not going to keep up for much longer without outside feedback.


type privileges by Peter Eisentraut. Another big patch (2537 lines), 
and it needs both some code review and searching for security issues in 
the new feature.


Both type privileges and Command Triggers are likely to slip well 
into December if we don't get a major reviewer working on them soon. I 
just don't know who that might be in either case.


cursor calling with named parameters by Yeb Havinga. This is a 
resubmission of a patch from the last CommitFest, so the previous 
feedback is around to give an idea of what should be re-tested in the new 
version. At 702 lines including docs and tests, I'd call this a medium-sized 
patch to work on. I'd say it's a decent candidate for someone who 
picked up a patch, got done early, and is enough of a glutton to want to 
do a second patch.


splitting plpython into smaller parts by Jan Urbański. Just asked for 
clarification on this one because I'm not quite sure what to do with it.


unite recovery.conf and postgresql.conf: Since this has a lot more 
consensus issues than code ones, I'm going to look into this more 
myself. The things I submitted actually change around the potential ways 
to satisfy all the use cases here, too.


Configuration include directory and includeifexists in configuration 
file by me. These are not too hard to look at, and given that this all 
started as hacking on Magnus's code, he may get to it eventually anyway. 
I'm not too worried about these two right now.






Re: [HACKERS] Measuring relation free space

2011-11-28 Thread Greg Smith

On 11/25/2011 04:42 PM, Jeff Janes wrote:

It reports space that is free exclusively for updates as being free.
In other words, it considers space free even if it is reserved against
inserts in deference to fillfactor.  This is in contrast to
pg_freespace, which only reports space available for inserts as being
available.  I think this is reasonable behavior, but it is subtle and
should perhaps be documented.


Ah, that's right, this is why I first wandered down this specific path.  
Ignoring fillfactor seems to have even more downsides as I see it.  It 
certainly deserves a doc improvement, as well as fixing the description 
of the value so it's clearly a ratio rather than a true percentage.



(Is it common to use fill factors other
than the default in the first place?  Do we assume that people using
fillfactor are sophisticated enough not to shoot themselves in the
foot?)


It's not common, and I think anyone who sets fillfactor themselves would 
understand the downside.  The bigger risk is people who inherit designs 
from others that use this feature, without the new person understanding 
it.  If using this feature calls attention to a problem there that 
prompts an investigation, I'd see that as a good thing, rather than a 
foot shot.



Unless I am missing something, all indexes are handled via a procedure
designed for BTree indices, GetBTRelationFreeSpace.  I don't know
that the ultimate behavior of this is wrong, but it seems unusual.  If
I get some more time, I will try to explore what is actually going on
when called on other types of indexes.


This I think I'll punt back toward Jaime, as well as asking "did you 
have a plan for TOAST here?"





Re: [HACKERS] patch for type privileges

2011-11-28 Thread Yeb Havinga

On 2011-11-15 21:50, Peter Eisentraut wrote:

Patch attached.


I cannot get the patch to apply; this is the output of patch -p1 
--dry-run on HEAD.


patching file src/include/catalog/pg_type.h
Hunk #1 succeeded at 217 (offset 1 line).
Hunk #2 succeeded at 234 (offset 1 line).
Hunk #3 succeeded at 264 (offset 1 line).
Hunk #4 succeeded at 281 (offset 1 line).
Hunk #5 FAILED at 370.
Hunk #6 FAILED at 631.
2 out of 6 hunks FAILED -- saving rejects to file 
src/include/catalog/pg_type.h.rej


I was unable to find a rev to apply the patch to for some testing: this 
one didn't work either.


commit 4429f6a9e3e12bb4af6e3677fbc78cd80f160252
Author: Heikki Linnakangas heikki.linnakan...@iki.fi
Date:   Thu Nov 3 13:16:28 2011 +0200
Support range data types.

and that's strange since the git log of pg_type.h shows a commit from April 
before that.


regards,
Yeb Havinga




Re: [HACKERS] Notes on implementing URI syntax for libpq

2011-11-28 Thread Alexander Shulgin

Excerpts from Greg Smith's message of Mon Nov 28 10:08:42 +0200 2011:
 
 On 11/24/2011 05:21 AM, Alvaro Herrera wrote:
  A coworker also suggested using a different designator:
 
  postgresqli:///path/to/socket:5433/database
  postgresqli://:5433/database
 
 This is not unprecedented.  An example is how CUPS handles this problem 
 when connecting printers using URIs:  
 http://www.cups.org/documentation.php/network.html where you might see 
 this for the usual port:
 
 lpd://ip-address-or-hostname/queue
 
 And this for AppSocket AKA JetDirect:
 
 socket://ip-address-or-hostname
 
 I am certainly not going to defend printing setup with CUPS as a model 
 worth emulating, just noting the similarity here.  I think we'll save 
 miles of user headaches if there's only one designator.

I'm not a big fan of using a different designator for local socket connections 
either, especially if they differ so little (the added 'i' might be too hard to 
spot, more so if displayed using a proportional font.)  Not to mention that 
printers and compatibility don't go together that often, in my (probably way 
too limited) experience. ;-)

As was suggested downthread, postgresql://[:port]/[mydb] should work 
perfectly for this purpose, since it's just a matter of allowing an empty 
host/addr in the URI.  So, using the default port: postgresql:///mydb (notice 
the similarity with the local-filesystem URI scheme: file:///).

Speaking of JDBC, the postgresql:///mydb notation may be abbreviated as 
postgresql:mydb, which is not unreasonable to support in psql too.
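For illustration only (this uses Python's generic urllib.parse, not the libpq code): a standard URI parser already handles an empty authority component, which is what makes the empty-host forms above workable:

```python
from urllib.parse import urlsplit

# Proposed forms from the thread; an empty authority component is valid
# URI syntax, so "no host" can simply mean "use the local default socket".
for uri in ("postgresql://example.com:5433/mydb",  # TCP host and port
            "postgresql://:5433/mydb",             # empty host, explicit port
            "postgresql:///mydb"):                 # empty host, default port
    parts = urlsplit(uri)
    host = parts.hostname or "(empty: local socket)"
    print(parts.scheme, host, parts.port, parts.path)
```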

--
Regards,
Alex



Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2011-11-28 Thread Shigeru Hanada
Hi Kaigai-san,

(2011/11/20 2:42), Kohei KaiGai wrote:
 I'm still reviewing your patch, so this comment is not comprehensive, sorry.

Thanks for the review!

 I'm not sure whether the logic of is_foreign_expr() is appropriate.
 It checks the oid of the function within FuncExpr and OpExpr to disallow
 pushing down user-defined functions.
 However, if a user-defined operator is implemented on top of built-in functions
 with a different meaning, is it really suitable to push down?
 E.g., it is possible to define an '=' operator using int4ne, even though
 that is quite nonsensical.
 So, I'd like to suggest checking the oid of the operator: whether it is
 built-in, or not.
 
 On the other hand, this hard-wired restriction may damage the purpose of
 this module, which is to enable handling a query on multiple nodes in parallel.
 I'm not sure whether it is the right design for is_foreign_expr() to return false
 when the supplied expr contains mutable functions.
 
 Probably there are two perspectives here. One wants to make sure functions
 work the same way on all nodes. The other wants to distribute query processing
 as much as we can. There is a trade-off between these two perspectives.
 So, how about adding a GUC variable to control the criteria for
 pushing down?

I agree that allowing users to control which functions/operators should be
pushed down is useful, but a GUC seems too coarse a unit for switching
behavior.  Routine Mapping, a mechanism defined in the SQL/MED
standard, would be the answer to this issue.  It can be used to map a
local routine (a procedure or a function) to something on a foreign
server.  It is like a user mapping, but it has a mapping name.  Probably it
would have these attributes:

pg_catalog.pg_routine_mapping
rmname  name
rmprocidregproc
rmserverid  oid
rmfdwoptionstext[]

If we have routine mappings, FDW authors can provide default mappings
during extension installation, and users can customize them.  Maybe FDWs
will want to push down only functions/operators which have routine
mapping entries, so providing a common routine which returns the mapping
information of a given function/operator, say GetRoutineMapping(procid,
serverid), would be useful.
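To make the shape of this concrete, here is a toy in-memory model of the proposed catalog and the suggested GetRoutineMapping() helper. The column names follow the sketch above; the sample rows and lookup semantics are invented purely for illustration:

```python
# Hypothetical rows of the proposed pg_routine_mapping catalog:
# (rmname, rmprocid, rmserverid, rmfdwoptions)
ROUTINE_MAPPINGS = [
    ("int4eq_remote", "int4eq", 1, ["remote_name=int4eq"]),
    ("textcat_remote", "textcat", 1, ["remote_name=textcat"]),
]

def get_routine_mapping(procid, serverid):
    """Model of the suggested GetRoutineMapping(procid, serverid) helper:
    return the FDW options for a routine mapped on a given server, or
    None when no mapping exists."""
    for _rmname, rmprocid, rmserverid, rmfdwoptions in ROUTINE_MAPPINGS:
        if rmprocid == procid and rmserverid == serverid:
            return rmfdwoptions
    return None

def is_pushdown_safe(procid, serverid):
    """An FDW that only pushes down mapped routines would test this."""
    return get_routine_mapping(procid, serverid) is not None

print(is_pushdown_safe("int4eq", 1))   # mapped, so it may be pushed down
print(is_pushdown_safe("int4ne", 1))   # unmapped, so evaluate locally
```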

Unfortunately we don't have it at the moment, so I'll fix pgsql_fdw so that
it pushes down only built-in operators, including scalar-array operators.

Regards,
-- 
Shigeru Hanada



Re: [HACKERS] splitting plpython into smaller parts

2011-11-28 Thread Jan Urbański

On 28/11/11 11:00, Greg Smith wrote:

On 11/13/2011 09:45 AM, Jan Urbański wrote:

The second one is the actual split. plpython.c has been split into 11
separate files and one header.


Could you comment a bit more about what the goal of this is? We don't
have a reviewer for this patch yet, and I think part of the reason is
because it's not really obvious what it's supposed to be doing, and why
that's useful.


The idea is to split plpython.c (an almost 5k-line file) into 
something more modular.


It's been floated around here:

http://postgresql.1045698.n5.nabble.com/Large-C-files-tt4766446.html#a4773493

and I think at other occasions, too.

The patch introduces no behavioural changes; it's only shuffling code 
around. The only goal is to improve maintainability.


I guess the reviewer could verify that a) I haven't botched the split 
and it all still compiles and works, and b) the choice of which modules 
were defined is correct.


Cheers,
Jan



Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2011-11-28 Thread Shigeru Hanada
Hi Fujita-san,

(2011/11/25 17:27), Etsuro Fujita wrote:
 I'm still reviewing, so the following is not everything.  I'm sorry.
 estimate_costs() has been implemented to ask a remote postgres server
 for the result of EXPLAIN for a remote query to get its costs such as
 startup_cost and total_cost.  I think this approach is the most accurate
 way to get its costs.  However, I think it would be rather costly.  And
 I'm afraid that it might work only for pgsql_fdw.

Indeed.  In addition, this approach assumes that the cost factors of the target
PG server are the same as the local ones.  pgsql_fdw might have to carry cost
factors as FDW options of the foreign server.

  Because, even if we
 are able to obtain such cost information by EXPLAINing a remote query
 at a remote server running a DBMS other than postgres, it might
 be difficult to incorporate that cost information into the postgres
 cost model, due to a possible inconsistency: the cost
 information provided by the EXPLAIN command in the other DBMS might have
 different meanings (or different scales) from that provided by the
 EXPLAIN command in postgres.

Yes, so implementing cost estimation accurately for other DBMSs would be
very difficult, but AFAIS row estimation is the most important factor,
so a reasonable row count and a relatively high startup cost would produce
a not-so-bad plan.

   So, I think it might be better to estimate
 such costs by pgsql_fdw itself, without EXPLAINing, on the assumption that
 a remote postgres server has the same abilities for query optimization,
 which is less costly and widely applicable to other DBMSs, while it,
 of course, only works once we have statistics and/or index information
 for foreign tables.  But AFAIK we eventually want to have those, so I'd
 like to propose using the proposed approach until that time.

Knowledge of foreign indexes also provides information about sort order.
The planner would be able to consider a merge join without a local sort with
such information.  Without foreign indexes, we have to enumerate possible sort
keys with a brute-force approach for the same result, as mentioned by
Itagaki-san before.

Regards,
-- 
Shigeru Hanada



Re: [HACKERS] Disable OpenSSL compression

2011-11-28 Thread Magnus Hagander
On Thu, Nov 17, 2011 at 10:11, Albe Laurenz laurenz.a...@wien.gv.at wrote:
 I wrote:
 Here it is. I'll add it to the November commitfest.

 Here is the second version.

 I realized that it is better to set the option on the SSL object
 and not on the SSL context so that it is possible to change it
 per connection.

 I also improved the documentation.

Thanks, applied.

I changed the documentation around a bit further - I think it's easier
to read now.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Patch: Perl xsubpp

2011-11-28 Thread Andrew Dunstan



On 11/27/2011 10:30 PM, Mr. Aaron W. Swenson wrote:

On Sun, Nov 27, 2011 at 12:12:41PM -0800, David E. Wheeler wrote:

On Nov 27, 2011, at 6:11 AM, Andrew Dunstan wrote:


Has this been backpatched as well?

It has been to 9.1.

There may be a simple workaround, but it's non-obvious. I think it should be 
back-patched all the way.

Best,

David

That's my vote, too. It's preventing users of all versions from compiling
against ExtUtils-ParseXS-3.20.0.




OK, it's done.

cheers

andrew



Re: [HACKERS] psql line number reporting from stdin

2011-11-28 Thread Robert Haas
On Sat, Nov 26, 2011 at 3:36 PM, Peter Eisentraut pete...@gmx.net wrote:
 There is a long-standing oddity in psql that running

 psql -f foo.sql

 returns error messages with file name and line number, like

 psql:foo.sql:1: ERROR:  syntax error at or near "foo"

 but running

 psql < foo.sql does not.  I suggest we change the latter to print

 psql:stdin:1: ERROR:  syntax error at or near "foo"

 Other examples for the use of the spelling stdin in this context
 include gcc and slonik.

 Error messages printed in interactive mode will not be affected, of
 course.

 Patch attached.

Seems like a good idea to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] odbc_fdw

2011-11-28 Thread Robert Haas
On Sat, Nov 26, 2011 at 7:20 AM, Florian Schwendener fschw...@hsr.ch wrote:
 Hi there!

 I built the current PostgreSQL 9.1.1 sources under Ubuntu 11.04 (in a VMware
 under Win7).
 I followed the steps in this guide:
 www.thegeekstuff.com/2009/04/linux-postgresql-install-and-configure-from-source

 It seems to work (I can run the server and connect to it with PgAdmin).

 Now I'd like to integrate the ODBC_FDW extension in my installation. Sadly,
 the make file throws errors (no target named ... specified).
 Is there any way to do this in a simple way?

 I'm a linux newbie, by the way...

 Thank you for your help!

You might want to try this question on pgsql-general or pgsql-novice
rather than here; this is a list for discussing the development of
PostgreSQL itself.

I think you'll find that no one can help you much based on the
information you've provided here; you'll need to say exactly what you
did and exactly what error message you got.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] psql line number reporting from stdin

2011-11-28 Thread Gurjeet Singh
On Mon, Nov 28, 2011 at 8:10 AM, Robert Haas robertmh...@gmail.com wrote:

 On Sat, Nov 26, 2011 at 3:36 PM, Peter Eisentraut pete...@gmx.net wrote:
  There is a long-standing oddity in psql that running
 
  psql -f foo.sql
 
  returns error messages with file name and line number, like
 
  psql:foo.sql:1: ERROR:  syntax error at or near "foo"
 
  but running
 
  psql < foo.sql does not.  I suggest we change the latter to print
 
  psql:stdin:1: ERROR:  syntax error at or near "foo"
 
  Other examples for the use of the spelling stdin in this context
  include gcc and slonik.
 
  Error messages printed in interactive mode will not be affected, of
  course.
 
  Patch attached.

 Seems like a good idea to me.


Naysayers can always make a case for backwards compatibility, or for not
breaking scripts written with the existing behaviour in mind. Do our
docs have anything to say about scripts executed from stdin?

-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Robert Haas
On Mon, Nov 28, 2011 at 2:54 AM, Greg Smith g...@2ndquadrant.com wrote:
 The real problem with this whole area is that we know there are
 systems floating around where the amount of time taken to grab timestamps
 like this is just terrible.

Assuming the feature is off by default (and I can't imagine we'd
consider anything else), I don't see why that should be cause for
concern.  If the instrumentation creates too much system load, then
don't use it: simple as that.  A more interesting question is "how
much load does this feature create even when it's turned off?"

The other big problem for a patch of this sort is that it would bloat
the stats file.  I think we really need to come up with a more
scalable alternative to the current system, but I haven't looked at the
current system in enough detail to have a clear feeling about what
such an alternative would look like.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] psql line number reporting from stdin

2011-11-28 Thread Robert Haas
On Mon, Nov 28, 2011 at 8:49 AM, Gurjeet Singh singh.gurj...@gmail.com wrote:
 Naysayers can always make a case for backwards-compatibility, or not
 breaking the scripts written with the existing behaviour in mind.

I'm having a hard time imagining how this could break anything.  What
scenario did you have in mind?

 Do our
 docs have anything to say about scripts executed from stdin?

If they do, we can always update them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2011-11-28 Thread Robert Haas
2011/11/28 Shigeru Hanada shigeru.han...@gmail.com:
 I agree that allowing users to control which function/operator should be
 pushed down is useful, but GUC seems too large as unit of switching
 behavior.  Routine Mapping, a mechanism which is defined in SQL/MED
 standard, would be the answer for this issue.  It can be used to map a
 local routine (a procedure or a function) to something on a foreign
 server.  It is like user mapping, but it has mapping name.  Probably it
 would have these attributes:

 pg_catalog.pg_routine_mapping
    rmname              name
    rmprocid            regproc
    rmserverid          oid
    rmfdwoptions        text[]

 If we have routine mapping, FDW authors can provide default mappings
 within extension installation, and users can customize them.  Maybe FDWs
 will want to push down only functions/operators which have routine
 mapping entries, so providing common routine which returns mapping
 information of given function/operator, say GetRoutineMapping(procid,
 serverid), is useful.

 Unfortunately we don't have it at the moment, I'll fix pgsql_fdw so that
 it pushes down only built-in operators, including scalar-array operators.

One difficulty here is that even very simple operators don't
necessarily mean the same thing on both sides.  In my last job we had
a Microsoft SQL database where string equality was case insensitive,
and a PostgreSQL database where it wasn't.
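The hazard is easy to model. A sketch, assuming only that the remote collation compares text case-insensitively (as in the SQL Server anecdote) while the local side does not; even a plain equality predicate then selects different rows depending on where it runs:

```python
def eq_local(a: str, b: str) -> bool:
    """PostgreSQL-style text equality: case sensitive."""
    return a == b

def eq_remote(a: str, b: str) -> bool:
    """Equality under a hypothetical case-insensitive remote collation."""
    return a.casefold() == b.casefold()

rows = ["Foo", "foo", "FOO", "bar"]

# The "same" predicate col = 'foo' filters differently on each side,
# so pushing it down silently changes the query result.
local_match = [r for r in rows if eq_local(r, "foo")]
remote_match = [r for r in rows if eq_remote(r, "foo")]
print(local_match)    # evaluated locally
print(remote_match)   # evaluated remotely
```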

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Prep object creation hooks, and related sepgsql updates

2011-11-28 Thread Kohei KaiGai
2011/11/27 Dimitri Fontaine dimi...@2ndquadrant.fr:
 And, it seems to me the current proposal for command triggers
 does not support firing triggers on creation of databases, although
 the permission checks require the Oid of the source database, which also
 has not appeared in the pg_database catalog.

 I have to have a look at what forbids us to add support for the create
 database command here.  It seems to be just another branch of the switch
 in standard_ProcessUtility().

I'm not sure what arguments shall be provided to the guest of command
triggers. And I'd like to mention that its specification should not depend
on the details of particular modules, according to the previous discussion.

After the long discussion, one consensus is that the prep-creation hook shall
be deployed around the existing DAC checks and take arguments that are
also referenced by the existing DAC, such as the oid of the source database on
creation of databases.
I also checked the security models of DAC and MAC, and I concluded that
most of them take common information to make their decisions.

It is hard to answer whether we can implement sepgsql on the
upcoming command trigger feature without information about where its
hooks shall be deployed and what arguments are provided. :-(

 I don't think schemaname+objectname fails to be unique, so I don't think
 you need another kind of Oid in BEFORE creation triggers here.

 The pg_seclabel and pg_shseclabel catalogs need an OID to assign a security label
 to a particular database object, so the label provider (sepgsql) must know
 the Oid of the target object at assignment time.

 Yes, and you need to refer to things you did in the BEFORE trigger from
 the AFTER trigger, I'm just offering you a way to do that.  Then if you
 need the Oid in the AFTER trigger, of course you have it.

How does it pass an opaque private value initialized in the BEFORE trigger to the
AFTER trigger? I checked your patch, however, and it seems to me it does
not have a mechanism to deliver something between BEFORE and AFTER.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp



Re: [HACKERS] Prep object creation hooks, and related sepgsql updates

2011-11-28 Thread Dimitri Fontaine
Kohei KaiGai kai...@kaigai.gr.jp writes:
 How does it pass an opaque private value initialized in the BEFORE trigger to
 the AFTER trigger? I checked your patch, however, and it seems to me it does
 not have a mechanism to deliver something between BEFORE and AFTER.

Right, there's no such facility provided in there.  But it seems to me
that your extension could attach some shared memory or use other
mechanisms to handle that on its own?

I'm not trying to force you into using command triggers, just to
determine if that's a road we are able to take.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Greg Stark
On Nov 28, 2011 8:55 AM, Greg Smith g...@2ndquadrant.com wrote:

 On 11/27/2011 04:39 PM, Ants Aasma wrote:

 On the AMD I saw about 3% performance drop with timing enabled. On the
 Intel machine I couldn't measure any statistically significant change.


 Oh no, it's party pooper time again.  Sorry I have to be the one to do it
this round.  The real problem with this whole area is that we know there
are systems floating around where the amount of time taken to grab
timestamps like this is just terrible.

I believe that on most systems with modern Linux kernels, gettimeofday and
its ilk will be a vsyscall and nearly as fast as a regular function call.


 I recall a patch similar to this one was submitted by Greg Stark some
time ago.  It used the info for different reasons--to try and figure out
whether reads were cached or not--but I believe it withered rather than
being implemented mainly because it ran into the same fundamental
roadblocks here.  My memory could be wrong here, there were also concerns
about what the data would be used for.

I speculated about doing that but never did. I had an experimental patch
using mincore to do what you describe but it wasn't intended for production
code I think. The only real patch was to use getrusage which I still intend
to commit but it doesn't tell you the time spent in I/O -- though it does
tell you the sys time which should be similar.


[HACKERS] Add minor version to v3 protocol to allow changes without breaking backwards compatibility

2011-11-28 Thread Mikko Tiihonen

Hi,

As discussed few days ago it would be beneficial if we could change the v3 
backend-client protocol without breaking backwards compatibility.

Here is a working patch that exports a supported_binary_minor constant and
a binary_minor session variable that can be used by clients to enable newer
features.


I also added an example usage where the array encoding for constant size 
elements is optimized if binary_minor version is new enough.

I have coded the client-side support for binary_minor in the JDBC driver
and tested that it works. But let's first discuss whether this is an
acceptable path forward.

On 11/25/2011 02:20 AM, Oliver Jowett wrote:
 Re list vs. always-incrementing minor version, you could just use an
 integer and set bits to represent features, which would keep it simple
 but also let clients be more selective about which features they
 implement (you could support feature 21 and 23 without supporting 22)

I decided not to use feature flags because, once features start to depend on
each other, multiple negotiation round trips are needed before the final
feature set can be known.
If, in your example above, feature 23 depends on feature 22 on the server
side but the client only requests 21 and 23, the server must reply that the
combination 21,23 is reduced to just 21. And if the client cannot support 21
without 23, the final feature set will contain neither 21 nor 23.


-Mikko
commit d7ef5f1aef0941ec4b931f24745b65d77e7511e4
Author: Mikko Tiihonen mikko.tiiho...@nitor.fi
Date:   Sun Nov 27 14:12:59 2011 +0200

Define binary_minor variable to control binary protocol minor version

diff --git a/src/backend/utils/adt/version.c b/src/backend/utils/adt/version.c
index 6ea5bd2..33ed4d3 100644
--- a/src/backend/utils/adt/version.c
+++ b/src/backend/utils/adt/version.c
@@ -16,6 +16,7 @@
 
 #include "utils/builtins.h"
 
+int binary_minor = 0;
 
 Datum
 pgsql_version(PG_FUNCTION_ARGS)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index da7b6d4..67e80f1 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -64,6 +64,7 @@
 #include "storage/predicate.h"
 #include "tcop/tcopprot.h"
 #include "tsearch/ts_cache.h"
+#include "utils/binaryminorversion.h"
 #include "utils/builtins.h"
 #include "utils/bytea.h"
 #include "utils/guc_tables.h"
@@ -2053,6 +2054,18 @@ static struct config_int ConfigureNamesInt[] =
 	},
 
 	{
+		{"binary_minor", PGC_USERSET, CLIENT_CONN_LOCALE,
+			gettext_noop("Sets the binary protocol minor version "
+						 "that controls enabling of newer features."),
+			gettext_noop("The default value is 0 which uses the binary "
+						 "protocol features as specified in postgres 9.1 release.")
+		},
+		&binary_minor,
+		0, 0, SUPPORTED_BINARY_MINOR_VERSION,
+		NULL, NULL, NULL
+	},
+
+	{
 		{"log_min_duration_statement", PGC_SUSET, LOGGING_WHEN,
 			gettext_noop("Sets the minimum execution time above which "
 						 "statements will be logged."),
diff --git a/src/include/utils/binaryminorversion.h b/src/include/utils/binaryminorversion.h
new file mode 100644
index 000..40ba970
--- /dev/null
+++ b/src/include/utils/binaryminorversion.h
@@ -0,0 +1,32 @@
+/*-
+ *
+ * binaryminorversion.h
+ *   Binary protocol encoding minor version number for PostgreSQL.
+ *
+ * The catalog version number is used to flag incompatible changes in
+ * the PostgreSQL v3 binary protocol.  Whenever anyone changes the
+ * format of binary protocol, or modifies the standard types in such a
+ * way that an updated client wouldn't work with an old database
+ * (or vice versa), the binary protocol encoding version number
+ * should be changed.
+ *
+ * The point of this feature is to provide a way to allow introducing
+ * small new features to the binary encoding of types or the actual
+ * v3 protocol while still allowing full backward compatibility with
+ * old clients. The client can be an application using postgres,
+ * any tool using COPY BINARY or another postgres backend using 
+ * replication (TODO: does this affect replication?).
+ *
+ * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/binaryminorversion.h
+ *
+ *-
+ */
+#ifndef BINARYMINORVERSION_H
+#define BINARYMINORVERSION_H
+
+#define SUPPORTED_BINARY_MINOR_VERSION 1
+
+#endif
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 47a1412..c201d66 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -461,6 +461,8 @@ extern Datum 

Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Tomas Vondra
On 28 November 2011, 15:40, Greg Stark wrote:
 On Nov 28, 2011 8:55 AM, Greg Smith g...@2ndquadrant.com wrote:

 On 11/27/2011 04:39 PM, Ants Aasma wrote:

 On the AMD I saw about 3% performance drop with timing enabled. On the
 Intel machine I couldn't measure any statistically significant change.


 Oh no, it's party pooper time again.  Sorry I have to be the one to do
 it
 this round.  The real problem with this whole area is that we know there
 are systems floating around where the amount of time taken to grab
 timestamps like this is just terrible.

 I believe that on most systems with modern Linux kernels, gettimeofday and
 its ilk will be a vsyscall and nearly as fast as a regular function call.

AFAIK a vsyscall should be faster than a regular syscall. It does not need
to switch to kernel space at all; it just reads the data from a shared
page. The problem is that this is Linux-specific - for example, FreeBSD
does not have vsyscall at all (it's actually one of the Linux-isms
mentioned here: http://wiki.freebsd.org/AvoidingLinuxisms).

There's also something called the vDSO, which (among other things) uses
vsyscall if available, or else the best implementation available. So there
are platforms that do not provide vsyscall, and in that case it'd be just
as slow as a regular syscall :(

I wouldn't expect a patch that works fine on Linux but not on other
platforms to be accepted, unless there's a compile-time configure switch
(--with-timings) that'd allow disabling it.

Another option would be to reimplement the vsyscall even on platforms that
don't provide it. The principle is actually quite simple - allocate shared
memory, store the current time there, and update it whenever a clock
interrupt happens. This is basically what Greg suggested in one of the
previous posts, where "regularly" means "on every interrupt". Greg was
worried about the precision, but this should be just fine, I guess. It's
the precision you get on Linux, anyway ...

 I recall a patch similar to this one was submitted by Greg Stark some
 time ago.  It used the info for different reasons--to try and figure out
 whether reads were cached or not--but I believe it withered rather than
 being implemented mainly because it ran into the same fundamental
 roadblocks here.  My memory could be wrong here, there were also concerns
 about what the data would be used for.

The difficulty of distinguishing whether reads were cached or not is the
price we pay for using the filesystem cache instead of managing our own.
I'm not sure this can be solved just by measuring latency - with spinners
it's quite easy, since the differences are rather huge (and it's not
difficult to derive that even from the pgbench log). But with SSDs,
multiple tablespaces on different storage, etc., it gets much harder.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Nov 28, 2011 at 2:54 AM, Greg Smith g...@2ndquadrant.com wrote:
 The real problem with this whole area is that we know there are
 systems floating around where the amount of time taken to grab timestamps
 like this is just terrible.

 Assuming the feature is off by default (and I can't imagine we'd
 consider anything else), I don't see why that should be cause for
 concern.  If the instrumentation creates too much system load, then
 don't use it: simple as that.  A more interesting question is "how
 much load does this feature create even when it's turned off?".

Right.  I see that the code already has a switch to skip the
gettimeofday calls, so the objection is only problematic if the added
overhead is significant even with the switch off.  I would worry mainly
about the added time/space to deal with the extra stats counters.

 The other big problem for a patch of this sort is that it would bloat
 the stats file.

Yes.  Which begs the question of why we need to measure this per-table.
I would think per-tablespace would be sufficient.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Prep object creation hooks, and related sepgsql updates

2011-11-28 Thread Kohei KaiGai
2011/11/28 Dimitri Fontaine dimi...@2ndquadrant.fr:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 How does it inherit an opaque private initialized at BEFORE trigger to
 AFTER trigger? I checked your patch, however, it seems to me it does
 not have a mechanism to deliver something between BEFORE and AFTER.

 Right, there's no such facility provided in there.  But it seems to me
 that your extension could attach some shared memory or use other
 mechanisms to handle that on its own?

Hmm. If the extension side manages the contextual information by itself,
that seems feasible to me, although it is a bit of a bother.
I came up with a similar idea: take control in ProcessUtility_hook,
save the necessary contextual information in an automatic variable, then
invoke the original ProcessUtility_hook, and later reference that
contextual information from object_access_hook.

For example, we don't want to apply permission checks on new relations
constructed with make_new_heap. It is invoked by CLUSTER, VACUUM, and
ALTER TABLE, so we can skip permission checks when the saved command tag
indicates one of these commands is currently running.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql line number reporting from stdin

2011-11-28 Thread Gurjeet Singh
On Mon, Nov 28, 2011 at 8:55 AM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Nov 28, 2011 at 8:49 AM, Gurjeet Singh singh.gurj...@gmail.com
 wrote:
  Naysayers can always make a case


Should've added that I'm not one of them :)

+1 from me on the improvement.


 for backwards-compatibility, or not
  breaking the scripts written with the existing behaviour in mind.

 I'm having a hard time imagining how this could break anything.  What
 scenario did you have in mind?


Probably parsing the lines that start with 'ERROR' to report that there
were errors in the script.



  Do our
  docs have anything to say about scripts executed from stdin?

 If they do, we can always update them.


At the cost of breaking existing scripts (which I am not sure is the case).

Regards,
-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Patch: Perl xsubpp

2011-11-28 Thread David E. Wheeler
On Nov 28, 2011, at 4:56 AM, Andrew Dunstan wrote:

 OK, it's done.

Andrew++ Thanks!

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql line number reporting from stdin

2011-11-28 Thread Alvaro Herrera

Excerpts from Peter Eisentraut's message of sáb nov 26 17:36:15 -0300 2011:
 There is a long-standing oddity in psql that running
 
 psql -f foo.sql
 
 returns error messages with file name and line number, like
 
psql:foo.sql:1: ERROR:  syntax error at or near "foo"
 
 but running
 
psql < foo.sql does not.  I suggest we change the latter to print
 
psql:stdin:1: ERROR:  syntax error at or near "foo"

Not that I have ever used psql in this way, but this format is compatible
with Vim quickfix whereas the old one is not (not sure what Emacs
people would call this).  Presumably, this being useless with stdin as
a file name is the reason this wasn't implemented in the first place.

+1 on the change.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Martijn van Oosterhout
On Sun, Nov 27, 2011 at 11:54:38PM -0800, Greg Smith wrote:
 On 11/27/2011 04:39 PM, Ants Aasma wrote:
 On the AMD I saw about 3% performance drop with timing enabled. On the
 Intel machine I couldn't measure any statistically significant change.
 
 Oh no, it's party pooper time again.  Sorry I have to be the one to
 do it this round.  The real problem with this whole area is that we
 know there are systems floating around where the amount of time
 taken to grab timestamps like this is just terrible.  I've been
 annoyed enough by that problem to spend some time digging into why
 that is--seems to be a bunch of trivia around the multiple ways to
 collect time info on x86 systems--and after this CommitFest is over

Something good to know: in Linux the file
/sys/devices/system/clocksource/clocksource0/current_clocksource
lists the current clock source, and
/sys/devices/system/clocksource/clocksource0/available_clocksource
lists the available clock sources. By echoing a name into
current_clocksource you can switch them. That way you may be able to
quantify the effects on a single machine.

Learned the hard way while tracking clock-skew on a multicore system. 
The hpet may not be the fastest (that would be the cpu timer), but it's
the fastest (IME) that gives guaranteed monotonic time.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH] Replace a long chain of if's in eval_const_expressions_mutator by a switch()

2011-11-28 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On Friday, November 18, 2011 10:14:22 PM Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  Replacing the if chain with if; else if; ... resulted in a small
  speedup. Replacing it with a switch() resulted in a bigger one.

 Cool, but this patch is impossible to validate by eye.  Could you
 resubmit a version that doesn't reindent unchanged code?  Leave it
 for pgindent to clean that up later.

 Sure. It was just too confusing reading the code without reindenting.
 Btw, I found git diff/show/blame -w very useful to view reindented code.

Applied, thanks.  Bruce, would you mind running pgindent on
just src/backend/optimizer/util/clauses.c ?  That seems safer than
reindenting manually.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding repeated snapshot computation

2011-11-28 Thread Gurjeet Singh
On Sat, Nov 26, 2011 at 6:51 PM, Andres Freund and...@anarazel.de wrote:

 On Saturday, November 26, 2011 11:39:23 PM Robert Haas wrote:
  On Sat, Nov 26, 2011 at 5:28 PM, Andres Freund and...@anarazel.de
 wrote:
   On Saturday, November 26, 2011 09:52:17 PM Tom Lane wrote:
   I'd just as soon keep the fields in a logical order.
  
   Btw, I don't think the new order is necessarily worse than the old one.
 
  You forget to attach the benchmark results.
 
  My impression is that cache lines on modern hardware are 64 or 128
  *bytes*, in which case this wouldn't matter a bit.
 
  But testing is even better than guessing.
 Being prodded like that I ran a very quick benchmark on my workstation.
 Unfortunately that means I cannot work during the time which is why I kept
 it
 rather short...

 That machine has 2 E5520@2.27GHz which means 2(sockets) * 4(cores) *
 2(threads) and 24GB of ram.

 Data was initialized with: pgbench -h /tmp/ --unlogged-tables -i -s 20
 pgbench


 pgbench -h /tmp/ pgbench -S -j 16 -c 16 -T 60

 origsnap:   92825.743958 93145.110901 93389.915474 93175.482351
 reordersnap:93560.183272 93613.333494 93495.263012 93523.368489

 pgbench -h /tmp/ pgbench -S -j 32 -c 32 -T 60

 origsnap:   81846.743329 81545.175672 81702.755226 81576.576435
 81228.154119 81546.047708 81421.436262
 reordersnap:81823.479196 81787.784508 81820.242145 81790.263415
 81762.421592 81496.333144 81732.088876

 At that point I noticed I had accidentally run with a nearly stock
 config...
 An even shorter run with a more appropriate config yielded:


 pgbench -h /tmp/ pgbench -S -j 32 -c 32 -T 20

 origsnap:   102234.664020 102003.449741 102119.509053 101722.410387
 101973.651318 102056.440561
 reordersnap:103444.877879 103385.08 103302.318923 103372.659486
 103330.157612 103313.833821



 Looks like a win to me. Even on this comparatively small machine.


This may not be necessary, but can you please share the modified config you
used for the last run?

I tabulated your last results to make it more readable, and added columns
to show the improvement.

origsnap          reordersnap       diff           %age improvement
-------------------------------------------------------------------
102234.664020     103444.877879     1210.213859    1.1837607827
102003.449741     103385.08         1382.439067    1.3552865815
102119.509053     103302.318923     1182.80987     1.1582604352
101722.410387     103372.659486     1650.249099    1.6223063263
101973.651318     103330.157612     1356.506294    1.3302517626
102056.440561     103313.833821     1257.39326     1.2320567454

That looks like a win to me too. We're getting a little over 1% improvement
for free!

Maybe submitting this patch to the commitfest might help get some serious
consideration from a reviewer.

Regards,
-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] patch for type privileges

2011-11-28 Thread Merlin Moncure
On Tue, Nov 15, 2011 at 2:23 PM, Peter Eisentraut pete...@gmx.net wrote:
 The basics here are mainly informed by the SQL standard.  One thing from
 there I did not implement is checking for permission of a type used in
 CAST (foo AS type).  This would be doable but relatively complicated,
 and in practice someone who is not supposed to be able to use the type
 wouldn't be able to create the cast or the underlying cast function
 anyway for lack of access to the type.

I'm not quite following that: with your patch are you or are you not
prohibited from utilizing casts?  In other words, if you didn't have
USAGE priv, what would happen if you tried this:

CREATE VIEW v AS SELECT null::restricted_type::text; ?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Prep object creation hooks, and related sepgsql updates

2011-11-28 Thread Dimitri Fontaine
Kohei KaiGai kai...@kaigai.gr.jp writes:
 I came up with a similar idea: take control in ProcessUtility_hook,
 save the necessary contextual information in an automatic variable, then
 invoke the original ProcessUtility_hook, and later reference that
 contextual information from object_access_hook.

In this case that would be an INSTEAD OF trigger, from which you can
call the original command with EXECUTE. You just have to protect
yourself against infinite recursion, but that's doable. See attached
example.

 For example, we don't want to apply permission checks on new relations
 constructed with make_new_heap. It is invoked by CLUSTER, VACUUM, and
 ALTER TABLE, so we can skip permission checks when the saved command tag
 indicates one of these commands is currently running.

CREATE TRIGGER se_permission_checks
INSTEAD OF COMMAND ALTER TABLE
   EXECUTE PROCEDURE se_permission_checks_alter_table();

In this INSTEAD OF trigger, protect against recursion, EXECUTE the
original ALTER TABLE statement which is given to you as a parameter,
enable the command trigger again.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



cmdtrigger.ext.sql
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] odbc_fdw

2011-11-28 Thread Stefan Keller
Hi Robert

2011/11/28 Robert Haas robertmh...@gmail.com wrote:
 You might want to try this question on pgsql-general or pgsql-novice
 rather than here; this is a list for discussing the development of
 PostgreSQL itself.

Thanks for the hint.

It was actually my advice to post this question here. A quick search
retrieves mostly (unanswered) postings which report problems compiling
the ODBC_FDW extension (e.g. [1]). My posting to pgsql-general is still
orphaned [2]. I also talked to some colleagues on the steering
committee of PGConf.DE recently, and they confirmed that Foreign Data
Wrappers (FDW) are probably not yet stable.

So, I hope we finally find some FDW users and developers over there
at pgsql-general :-)

Yours, Stefan

[1] http://postgresql.1045698.n5.nabble.com/Problem-with-odbc-fdw-td4908875.html
[2] 
http://postgresql.1045698.n5.nabble.com/Integration-of-PostgresSQL-and-MongoDB-Any-Foreign-Data-Wrappers-SQL-MED-td4900771.html


2011/11/28 Robert Haas robertmh...@gmail.com:
 On Sat, Nov 26, 2011 at 7:20 AM, Florian Schwendener fschw...@hsr.ch wrote:
 Hi there!

 I built the current PostgreSQL 9.1.1 sources under Ubuntu 11.04 (in a VMware
 under Win7).
 I followed the steps in this guide:
 www.thegeekstuff.com/2009/04/linux-postgresql-install-and-configure-from-source

 It seems to work (I can run the server and connect to it with PgAdmin).

 Now I'd like to integrate the ODBC_FDW extension in my installation. Sadly,
 the make file throws errors (no target named ... specified).
 Is there any way to do this in a simple way?

 I'm a linux newbie, by the way...

 Thank you for your help!

 You might want to try this question on pgsql-general or pgsql-novice
 rather than here; this is a list for discussing the development of
 PostgreSQL itself.

 I think you'll find that no one can help you much based on the
 information you've provided here; you'll need to say exactly what you
 did and exactly what error message you got.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Dimitri Fontaine
Tomas Vondra t...@fuzzy.cz writes:
 Another option would be to reimplement the vsyscall, even on platforms
 that don't provide it. The principle is actually quite simple - allocate a
 shared memory, store there a current time and update it whenever a clock
 interrupt happens. This is basically what Greg suggested in one of the
 previous posts, where regularly means on every interrupt. Greg was
 worried about the precision, but this should be just fine I guess. It's
 the precision you get on Linux, anyway ...

That sounds good for other interesting things, which entails being able
to have timing information attached to the XID sequence.  If we go this
way, how far are we from having a ticker in PostgreSQL?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade automatic testing

2011-11-28 Thread Andrew Dunstan



On 11/27/2011 06:17 PM, Tom Lane wrote:

Peter Eisentraut pete...@gmx.net writes:

I've committed it now, and some buildfarm members are failing with lack
of shared memory, semaphores, or disk space.  Don't know what to do with
that or why so many are failing like that.  We could create a way to
omit the test if it becomes a problem.

I believe the issue is that those BF members have kernel settings that
only support running one postmaster at a time.  The way you've got this
set up, it launches a new private postmaster during a make installcheck;
which is not only problematic from a resource consumption standpoint,
but seems to me to violate the spirit of make installcheck, because
what it's testing is not the installed postmaster but a local instance.

Can you confine the test to only occur in make check mode, not make
installcheck, please?




Another thing that's annoying about this is that it doesn't give you any 
idea of how it's failing if there's a database difference. All we get is:


   Files 
/home/pgrunner/bf/root/HEAD/pgsql.3188/contrib/pg_upgrade/tmp_check/dump1.sql 
and 
/home/pgrunner/bf/root/HEAD/pgsql.3188/contrib/pg_upgrade/tmp_check/dump2.sql 
differ


See 
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=frogmouth&dt=2011-11-28%2019%3A30%3A03 
for an example. For buildfarm purposes this is pretty low grade info, ISTM.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange nbtree corruption report

2011-11-28 Thread Alvaro Herrera

Excerpts from Tom Lane's message of mar nov 22 01:14:33 -0300 2011:
 Alvaro Herrera alvhe...@alvh.no-ip.org writes:
  We got a very strange nbtree corruption report some time ago.  This was
  a btree index on a very high-churn table -- entries are updated and
  deleted very quickly, so the index grows very large and also shrinks
  quickly (AFAICT this is a work queue of sorts).
 
  The most strange thing of all is that there was this error:
 
  ERROR:  left link changed unexpectedly in block 3378 of index index_name 
  CONTEXT:  automatic vacuum of table table_name
 
  This was reported not once, but several dozens of times, by each new
  autovacuum worker that tried to vacuum the table.
 
  As far as I can see, there is just no way for this to happen ... much
  less happen repeatedly.
 
 It's not hard to believe that that would happen repeatedly given a
 corrupted set of sibling links, eg deletable page A links left to page
 B, which links right to C, which links right to A.  The question is how
 the index got into such a state.  A dropped update during a page split
 would explain it (ie, B used to be A's left sibling, then at some point
 B got split into B and C, but A's left-link never got updated on disk).
 I wonder how reliable their disk+filesystem is ...

While summarizing this, a (relatively) frequent problem with unique
indexes came to my mind: there would be a UNIQUE index but when the
admin tries to rebuild it for whatever reason, duplicate values are
found.  We've seen dozens of reports of this kind of problem (in the
pgsql lists I mean -- I don't think we've seen this problem in this
customer's servers).  I wonder if it's related, because it seems pretty
much the same mechanism: sometimes, a btree index insert would be
randomly forgotten (its page write lost in vacuum, so to speak), and
thus when the second heap item comes along, there's no entry in the
index and the insert completes, and there you have your duplicate value.

I wonder if it would be worthwhile to build some sort of tool to scan
the heap and ensure that there are index entries for all heap items,
just to test the hypothesis.  Not that this would enlighten on the
source of the actual problem.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Replace a long chain of if's in eval_const_expressions_mutator by a switch()

2011-11-28 Thread Bruce Momjian
Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  On Friday, November 18, 2011 10:14:22 PM Tom Lane wrote:
  Andres Freund and...@anarazel.de writes:
   Replacing the if chain with if; else if; ... resulted in a small
   speedup. Replacing it with a switch() resulted in a bigger one.
 
  Cool, but this patch is impossible to validate by eye.  Could you
  resubmit a version that doesn't reindent unchanged code?  Leave it
  for pgindent to clean that up later.
 
  Sure. It was just too confusing reading the code without reindenting.
  Btw, I found git diff/show/blame -w very useful to view reindented code.
 
 Applied, thanks.  Bruce, would you mind running pgindent on
 just src/backend/optimizer/util/clauses.c ?  That seems safer than
 reindenting manually.

Done.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql line number reporting from stdin

2011-11-28 Thread Nathan Wagner

On Sat, 26 Nov 2011 22:36:15 +0200, Peter Eisentraut wrote:

There is a long-standing oddity in psql that running

psql -f foo.sql

returns error messages with file name and line number, like

psql:foo.sql:1: ERROR:  syntax error at or near "foo"

but running

psql < foo.sql does not.  I suggest we change the latter to print

psql:stdin:1: ERROR:  syntax error at or near "foo"

Other examples for the use of the spelling stdin in this context
include gcc and slonik.

Error messages printed in interactive mode will not be affected, of
course.

Patch attached.


No issue with the change itself, but the docs claim
that

"the variant using the shell's input redirection is
(in theory) guaranteed to yield exactly the same output
you would have received had you entered everything by hand."

--
nw



Re: [HACKERS] strange nbtree corruption report

2011-11-28 Thread Jim Nasby
On Nov 28, 2011, at 3:44 PM, Alvaro Herrera wrote:
 I wonder if it would be worthwhile to build some sort of tool to scan
 the heap and ensure that there are index entries for all heap items,
 just to test the hypothesis.  Not that this would enlighten on the
 source of the actual problem.

There was a project to create a sanity-checker for Postgres databases... did 
that ever go anywhere? It seems like this would be a good addition for that 
tool, if it exists.

In either case, I am all for better capabilities to detect data problems (I'm 
looking at you, block checksum project! ;)
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] strange nbtree corruption report

2011-11-28 Thread Bruce Momjian
Jim Nasby wrote:
 On Nov 28, 2011, at 3:44 PM, Alvaro Herrera wrote:
  I wonder if it would be worthwhile to build some sort of tool to scan
  the heap and ensure that there are index entries for all heap items,
  just to test the hypothesis.  Not that this would enlighten on the
  source of the actual problem.
 
 There was a project to create a sanity-checker for Postgres databases...
 did that ever go anywhere? It seems like this would be a good addition
 for that tool, if it exists.

Not that I know of.

--
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Jim Nasby
On Nov 28, 2011, at 9:29 AM, Tomas Vondra wrote:
 I recall a patch similar to this one was submitted by Greg Stark some
 time ago.  It used the info for different reasons--to try and figure out
 whether reads were cached or not--but I believe it withered rather than
 being implemented mainly because it ran into the same fundamental
 roadblocks here.  My memory could be wrong here, there were also concerns
 about what the data would be used for.
 
 The difficulty when distinguishing whether the reads were cached or not is
 the price we pay for using filesystem cache instead of managing our own.
 Not sure if this can be solved just by measuring the latency - with
 spinners it's quite easy, the differences are rather huge (and it's not
 difficult to derive that even from pgbench log). But with SSDs, multiple
 tablespaces on different storage, etc. it gets much harder.

True, but every use case for this information I can think of ultimately only 
cares about how long it took to perform some kind of IO; it doesn't *really* 
care about whether it was cached. So in that context, we don't really care if 
SSDs are fast enough that they look like cache, because that means they're 
performing (essentially) the same as cache.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Tomas Vondra
On 28.11.2011 22:32, Dimitri Fontaine wrote:
 Tomas Vondra t...@fuzzy.cz writes:
 Another option would be to reimplement the vsyscall, even on platforms
 that don't provide it. The principle is actually quite simple - allocate a
 shared memory, store there a current time and update it whenever a clock
 interrupt happens. This is basically what Greg suggested in one of the
 previous posts, where regularly means on every interrupt. Greg was
 worried about the precision, but this should be just fine I guess. It's
 the precision you get on Linux, anyway ...
 
 That sounds good for other interesting things, which entails being able
 to have timing information attached to the XID sequence.  If we go this
 way, how far are we from having a ticker in PostgreSQL?

I'm not sure. On Linux/x86 this is already done, but my knowledge of
kernel development is rather limited, especially when it comes to other
OSes and platforms. E.g. I'm not sure why it's not available in FreeBSD
on x86, I guess it's rather "we don't want it" than "it's not possible".


In Linux sources, the most interesting pieces are probably these:

1) arch/x86/include/asm/vgtod.h - that's the shared memory structure

2) arch/x86/kernel/vsyscall_64.c - this is how the memory is read
   (do_vgettimeofday)

3) arch/x86/kernel/vsyscall_64.c - this is how the memory is updated
   (update_vsyscall)

4) kernel/time/timekeeping.c - do_settimeofday (calls update_vsyscall)

5) drivers/rtc/class.c (and other) RTC drivers call do_settimeofday


Tomas



Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Tomas Vondra
On 29.11.2011 02:14, Jim Nasby wrote:
 On Nov 28, 2011, at 9:29 AM, Tomas Vondra wrote:
 I recall a patch similar to this one was submitted by Greg
 Stark some
 time ago.  It used the info for different reasons--to try and
 figure out whether reads were cached or not--but I believe it
 withered rather than being implemented mainly because it ran into
 the same fundamental roadblocks here.  My memory could be wrong
 here, there were also concerns about what the data would be used
 for.
 
 The difficulty when distinguishing whether the reads were cached or
 not is the price we pay for using filesystem cache instead of
 managing our own. Not sure if this can be solved just by measuring
 the latency - with spinners it's quite easy, the differences are
 rather huge (and it's not difficult to derive that even from
 pgbench log). But with SSDs, multiple tablespaces on different
 storage, etc. it gets much harder.
 
 True, but every use case for this information I can think of
 ultimately only cares about how long it took to perform some kind of
 IO; it doesn't *really* care about whether it was cached. So in that
 context, we don't really care if SSDs are fast enough that they look
 like cache, because that means they're performing (essentially) the
 same as cache.

Yup, that's right. The wait times are generally much more interesting
than the cached/not cached ratio.

Tomas



Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 That sounds good for other interesting things, which entails being able
 to have timing information attached to the XID sequence.  If we go this
 way, how far are we from having a ticker in PostgreSQL?

Those of us who are trying to get rid of idle-state process wakeups will
protest any such thing with vigor.

regards, tom lane



Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Ants Aasma
Sorry for taking so long to respond, had a pretty busy day at work. Anyway...

On Mon, Nov 28, 2011 at 9:54 AM, Greg Smith g...@2ndquadrant.com wrote:
 Oh no, it's party pooper time again.  Sorry I have to be the one to do it
 this round.  The real problem with this whole area is that we know there are
 systems floating around where the amount of time taken to grab timestamps
 like this is just terrible.  I've been annoyed enough by that problem to
 spend some time digging into why that is--seems to be a bunch of trivia
 around the multiple ways to collect time info on x86 systems--and after this
 CommitFest is over I was already hoping to dig through my notes and start
 quantifying that more.  So you can't really prove the overhead of this
 approach is acceptable just by showing two examples; we need to find one of
 the really terrible clocks and test there to get a real feel for the
 worst-case.

Sure, I know that the timing calls might be awfully slow. That's why I turned
it off by default. I saw that track_functions was already using this, so I
figured it was ok to have it potentially run very slowly.

 -Document the underlying problem and known workarounds, provide a way to
 test how bad the overhead is, and just throw our hands up and say "sorry,
 you just can't instrument like this if someone has a slow system."

Some documentation about potential problems would definitely be good.
Same goes for a test tool. ISTM that fast, accurate timing is just not
possible on all supported platforms. That doesn't seem like a good enough
justification to refuse implementing something useful for the majority of
platforms that do support it, as long as it doesn't cause regressions for
those that don't, or add significant code complexity.

 -Have one of the PostgreSQL background processes keep track of a time
 estimate on its own, only periodically pausing to sync against the real
 time.  Then most calls to gettimeofday() can use that value instead.  I was
 thinking of that idea for slightly longer running things though; I doubt
 that can be made accurate enough to test instrument buffer

This would limit it to those cases where hundreds of milliseconds of jitter
or more don't bother all that much.

 And while I hate to kick off massive bike-shedding in your direction, I'm
 also afraid this area--collecting stats about how long individual operations
 take--will need a much wider ranging approach than just looking at the
 buffer cache ones.  If you step back and ask "what do people expect here?",
 there's a pretty large number who really want something like Oracle's
 v$session_wait  and v$system_event interface for finding the underlying
 source of slow things.  There's enough demand for that that EnterpriseDB has
 even done some work in this area too; what I've been told about it suggests
 the code isn't a great fit for contribution to community PostgreSQL though.
  Like I said, this area is really messy and hard to get right.

Yeah, something like that should probably be something to strive for. I'll
ponder a bit more about resource and latency tracking in general. Maybe the
question here should be about the cost/benefit ratio of having some utility
now vs. maintaining/deprecating the user-visible interface when a more
general framework turns up.

 Something more ambitious like the v$ stuff would also take care of what
 you're doing here; I'm not sure that what you've done helps built it though.
  Please don't take that personally.  Part of one of my own instrumentation
 patches recently was rejected out of hand for the same reason, just not
 being general enough.

No problem, I understand that half-way solutions can be more trouble than
they're worth. I actually built this to help with performance testing an
application and thought it would be an interesting experience to try to
give the community back something.

On Mon, Nov 28, 2011 at 4:40 PM, Greg Stark st...@mit.edu wrote:
 I believe on most systems on modern Linux kernels gettimeofday and its ilk
 will be a vsyscall and nearly as fast as a regular function call.

clock_gettime() is implemented as a vDSO since 2.6.23. gettimeofday() has
been user context callable since before git shows any history (2.6.12).

On Mon, Nov 28, 2011 at 5:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The other big problem for a patch of this sort is that it would bloat
 the stats file.

 Yes.  Which begs the question of why we need to measure this per-table.
 I would think per-tablespace would be sufficient.

Yeah, I figured that this is something that should be discussed. I
implemented per-table collection because I thought it might be useful for
tools to pick up and show a quick overview on which tables are causing the
most IO overhead for queries.

On Mon, Nov 28, 2011 at 8:10 PM, Martijn van Oosterhout
klep...@svana.org wrote:
 Something good to know: in Linux the file
 /sys/devices/system/clocksource/clocksource0/current_clocksource
 lists the current clock source, and
 

Re: [HACKERS] CommitFest 2011-11 Post-Tryptophan Progress Report

2011-11-28 Thread Andres Freund
Hi Greg,

On Monday, November 28, 2011 11:13:52 AM Greg Smith wrote:
 With plenty of people in the US returning to work on Monday after a
 modest break last week, I wanted to put out an updated on the active
 CommitFest. I just sent out several pleas for help with specific patches
 where only a limited chunk of the community is really in a good position
 to test the code. Excluding those for the moment, here are some notes on
 the submitted patches that not only don't have a reviewer already, I
 don't know who might take them on yet:
 
 Command Triggers by Dimitri Fontaine. This is a pretty big patch (3366
 lines) that won't be easy to slug through. I know Dimitri is still
 making some progress on this just by thinking more about the code, but
 that's not going to keep up for much longer without outside feedback.

 Both type privileges and Command Triggers are likely to slip well
 into December if we don't get a major reviewer working on them soon. I
 just don't know who that might be in either case.
Unless a major reviewer cries, I will - although not a major reviewer - take a 
stab Thursday if that's fine with you.


Andres



Re: [HACKERS] strange nbtree corruption report

2011-11-28 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 I wonder if it would be worthwhile to build some sort of tool to scan
 the heap and ensure that there are index entries for all heap items,
 just to test the hypothesis.  Not that this would enlighten on the
 source of the actual problem.

Seems like the hypothesis could be proven or disproven just by counting
the heap and index entries while the DB is otherwise idle.  It used to
be that VACUUM VERBOSE was sufficient for that sort of cross-check ...
but I'm not totally sure what push-ups are required nowadays to prevent
it from deciding that it's smarter than you are so it needn't scan the
whole table.  Is VACUUM FREEZE VERBOSE still trustworthy for this?

regards, tom lane



Re: [HACKERS] Avoiding repeated snapshot computation

2011-11-28 Thread Andres Freund
Hi Gurjeet,

On Monday, November 28, 2011 08:55:28 PM Gurjeet Singh wrote:
 On Sat, Nov 26, 2011 at 6:51 PM, Andres Freund and...@anarazel.de wrote:
  On Saturday, November 26, 2011 11:39:23 PM Robert Haas wrote:
   On Sat, Nov 26, 2011 at 5:28 PM, Andres Freund and...@anarazel.de
  
  wrote:
On Saturday, November 26, 2011 09:52:17 PM Tom Lane wrote:
I'd just as soon keep the fields in a logical order.

Btw, I don't think the new order is necessarily worse than the old
one.
   
   You forget to attach the benchmark results.
   
   My impression is that cache lines on modern hardware are 64 or 128
   *bytes*, in which case this wouldn't matter a bit.
   
   But testing is even better than guessing.
  
  Being prodded like that I ran a very quick benchmark on my workstation.
  Unfortunately that means I cannot work during that time, which is why I
  kept it rather short...
  
  That machine has 2 E5520@2.27GHz which means 2(sockets) * 4(cores) *
  2(threads) and 24GB of ram.
  
  Data was initialized with: pgbench -h /tmp/ --unlogged-tables -i -s 20
  pgbench
  
  
  pgbench -h /tmp/ pgbench -S -j 16 -c 16 -T 60
  ...
  Looks like a win to me. Even on this comparatively small machine.
 This may not be necessary, but can you please share the modified config you
 used for the last run.
Can't right now because I don't have access from here, but I will do so. I 
don't think it's particularly interesting though.

 I tabulated your last results to make it more readable, and added columns
 to show the improvement.
 
 origsnap       reordersnap     diff          %age improvement
 -------------------------------------------------------------
 102234.66402   103444.877879   1210.213859   1.1837607827
 102003.449741  103385.888808   1382.439067   1.3552865815
 102119.509053  103302.318923   1182.80987    1.1582604352
 101722.410387  103372.659486   1650.249099   1.6223063263
 101973.651318  103330.157612   1356.506294   1.3302517626
 102056.440561  103313.833821   1257.39326    1.2320567454
 
 That looks like a win to me too. We're getting a little over 1% improvement
 for free!
At least if we can convince Tom that such a change would be ok ;)

I would like to see somebody running a benchmark on a machine with higher 
concurrency...

 Maybe submitting this patch to the commitfest might help get some serious
 consideration from a reviewer.
It wasn't actually meant as an actual patch but as a comment, but well ;). Will 
add it.

Andres



Re: [HACKERS] CommitFest 2011-11 Post-Tryptophan Progress Report

2011-11-28 Thread Greg Smith

On 11/28/2011 05:38 PM, Andres Freund wrote:
Unless a major reviewer cries, I will - although not a major reviewer - take a
stab Thursday if that's fine with you.


I intended to use that term to suggest the review job itself could 
become a major chunk of work, not to try and classify reviewers.  Bad 
wording choice.  If you could take a look at Command Triggers later this 
week I'm sure Dimitri would appreciate your feedback.




Re: [HACKERS] CommitFest 2011-11 Post-Tryptophan Progress Report

2011-11-28 Thread Andres Freund
On Tuesday, November 29, 2011 02:45:50 AM Greg Smith wrote:
 On 11/28/2011 05:38 PM, Andres Freund wrote:
  Unless a major reviewer cries, I will - although not a major reviewer
  - take a
  stab thursday if thats fine with you.
 
 I intended to use that term to suggest the review job itself could
 become a major chunk of work, not to try and classify reviewers.  Bad
 wording choice.  If you could take a look at Command Triggers later this
 week I'm sure Dimitri would appreciate your feedback.
Well. I don't think it's that bad a wording... Imo the patch will soon need 
input from a committer with experience in the area to solidify the direction...

Thanks,

Andres



Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Greg Smith

On 11/28/2011 05:51 AM, Robert Haas wrote:

On Mon, Nov 28, 2011 at 2:54 AM, Greg Smith g...@2ndquadrant.com wrote:

The real problem with this whole area is that we know there are
systems floating around where the amount of time taken to grab timestamps
like this is just terrible.

Assuming the feature is off by default (and I can't imagine we'd
consider anything else), I don't see why that should be cause for
concern.  If the instrumentation creates too much system load, then
don't use it: simple as that.


It's not quite that simple though.  Releasing a performance measurement 
feature that itself can perform terribly under undocumented conditions 
has a wider downside than that.


Consider that people aren't going to turn it on until they are already 
overloaded.  If that has the potential to completely tank performance, 
we better make sure that area is at least explored usefully first; the 
minimum diligence should be to document that fact and make suggestions 
for avoiding or testing it.


Instrumentation that can itself become a performance problem is an 
advocacy problem waiting to happen.  As I write this I'm picturing such 
an encounter resulting in an angry blog post, about how this proves 
PostgreSQL isn't usable for serious systems because someone sees massive 
overhead turning this on.  Right now the primary exposure to this class 
of issue is EXPLAIN ANALYZE.  When I was working on my book, I went out 
of my way to find a worst case for that[1], and that turned out to be a 
query that went from 7.994ms to 69.837ms when instrumented.  I've been 
meaning to investigate what was up there since finding that one.  The 
fact that we already have one such problem bit exposed already worries 
me; I'd really prefer not to have two.


[1] (Dell Store 2 schema, query was SELECT count(*) FROM customers;)



Re: [HACKERS] Patch: add timing of buffer I/O requests

2011-11-28 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 On 11/28/2011 05:51 AM, Robert Haas wrote:
 Assuming the feature is off by default (and I can't imagine we'd
 consider anything else), I don't see why that should be cause for
 concern.  If the instrumentation creates too much system load, then
 don't use it: simple as that.

 It's not quite that simple though.  Releasing a performance measurement 
 feature that itself can perform terribly under undocumented conditions 
 has a wider downside than that.

Yeah, that's a good point, and the machines on which this would suck
are exactly the ones where EXPLAIN ANALYZE creates very large overhead.
We don't seem to see a lot of complaints about that anymore, but we do
still see some ... and yes, it's documented that EXPLAIN ANALYZE can add
significant overhead, but that doesn't stop the questions.

 Instrumentation that can itself become a performance problem is an 
 advocacy problem waiting to happen.  As I write this I'm picturing such 
 an encounter resulting in an angry blog post, about how this proves 
 PostgreSQL isn't usable for serious systems because someone sees massive 
 overhead turning this on.

Of course, the rejoinder could be that if you see that, you're not
testing on serious hardware.  But still, I take your point.

 Right now the primary exposure to this class 
 of issue is EXPLAIN ANALYZE.  When I was working on my book, I went out 
 of my way to find a worst case for that[1],
 [1] (Dell Store 2 schema, query was SELECT count(*) FROM customers;)

That's pretty meaningless without saying what sort of clock hardware
was on the machine...

regards, tom lane



Re: [HACKERS] psql setenv command

2011-11-28 Thread Josh Kupershmidt
On Sat, Nov 26, 2011 at 11:02 AM, Andrew Dunstan and...@dunslane.net wrote:
 Also, should the malloc() of newval just use pg_malloc() instead?

 Yes, also done.

This bit is unnecessary, since pg_malloc() takes care of the error handling:

+   if (!newval)
+   {
+       psql_error("out of memory\n");
+       exit(EXIT_FAILURE);
+   }


Also, the help output for setenv bleeds over an 80-character terminal,
and it seems the rest of the help outputs try to stay under this
limit. And an OCD nitpick: most of the psql-ref.sgml examples show
'testdb' at the prompt, how about we follow along.

 Other than those small gripes, the patch looks fine. Attached is an
updated version to save you some keystrokes.

Josh
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 01f57c4..f97929b 100644
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*** lo_import 152801
*** 2242,2247 
--- 2242,2265 
  
  
  <varlistentry>
+ <term><literal>\setenv [ <replaceable class="parameter">name</replaceable> [ <replaceable class="parameter">value</replaceable> ] ]</literal></term>
+ 
+ <listitem>
+ <para>
+ Sets the environment variable <replaceable
+ class="parameter">name</replaceable> to <replaceable
+ class="parameter">value</replaceable>, or if the
+ <replaceable class="parameter">value</replaceable> is
+ not supplied, unsets the environment variable. Example:
+ <programlisting>
+ testdb=&gt; <userinput>\setenv PAGER less</userinput>
+ testdb=&gt; <userinput>\setenv LESS -imx4F</userinput>
+ </programlisting>
+ </para>
+ </listitem>
+   </varlistentry>
+ 
+   <varlistentry>
  <term><literal>\sf[+] <replaceable class="parameter">function_description</replaceable> </literal></term>
  
  listitem
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9cc73be..e4b4eaa 100644
*** a/src/bin/psql/command.c
--- b/src/bin/psql/command.c
*** exec_command(const char *cmd,
*** 1110,1115 
--- 1110,1160 
  		free(opt0);
  	}
  
+ 
+ 	/* \setenv -- set environment command */
+ 	else if (strcmp(cmd, "setenv") == 0)
+ 	{
+ 		char	   *envvar = psql_scan_slash_option(scan_state,
+   OT_NORMAL, NULL, false);
+ 		char	   *envval = psql_scan_slash_option(scan_state,
+   OT_NORMAL, NULL, false);
+ 
+ 		if (!envvar)
+ 		{
+ 			psql_error("\\%s: missing required argument\n", cmd);
+ 			success = false;
+ 		}
+ 		else if (strchr(envvar,'=') != NULL)
+ 		{
+ 			psql_error("\\%s: environment variable name must not contain '='\n",
+ 	   cmd);
+ 			success = false;
+ 		}
+ 		else if (!envval)
+ 		{
+ 			/* No argument - unset the environment variable */
+ 			unsetenv(envvar);
+ 			success = true;
+ 		}
+ 		else
+ 		{
+ 			/* Set variable to the value of the next argument */
+ 			int len = strlen(envvar) + strlen(envval) + 1;
+ 			char	   *newval = pg_malloc(len + 1);
+ 
+ 			snprintf(newval, len+1, "%s=%s", envvar, envval);
+ 			putenv(newval);
+ 			success = true;
+ 			/*
+ 			 * Do not free newval here, it will screw up the environment
+ 			 * if you do. See putenv man page for details. That means we
+ 			 * leak a bit of memory here, but not enough to worry about.
+ 			 */
+ 		}
+ 		free(envvar);
+ 		free(envval);
+ 	}
+ 
  	/* \sf -- show a function's source code */
  	else if (strcmp(cmd, "sf") == 0 || strcmp(cmd, "sf+") == 0)
  	{
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 4649e94..13d8bf6 100644
*** a/src/bin/psql/help.c
--- b/src/bin/psql/help.c
*** slashUsage(unsigned short int pager)
*** 158,164 
  {
  	FILE	   *output;
  
! 	output = PageOutput(93, pager);
  
  	/* if you add/remove a line here, change the row count above */
  
--- 158,164 
  {
  	FILE	   *output;
  
! 	output = PageOutput(94, pager);
  
  	/* if you add/remove a line here, change the row count above */
  
*** slashUsage(unsigned short int pager)
*** 257,262 
--- 257,263 
  
  	fprintf(output, _("Operating System\n"));
  	fprintf(output, _("  \\cd [DIR]  change the current working directory\n"));
+ 	fprintf(output, _("  \\setenv NAME [VALUE]   set or unset environment variable\n"));
  	fprintf(output, _("  \\timing [on|off]   toggle timing of commands (currently %s)\n"),
  			ON(pset.timing));
  	fprintf(output, _("  \\! [COMMAND]   execute command in shell or start interactive shell\n"));



[HACKERS] Allow pg_dumpall to use dumpmem.c functions, simplify exit code

2011-11-28 Thread Bruce Momjian
Bruce Momjian wrote:
  I was wondering if it wouldn't make more sense to have pg_dumpall supply
  its own version of exit_horribly to avoid separate pg_malloc and
  pg_strdup ... but then those routines are so tiny that it hardly makes a
  difference.
  
  Another thing I wondered when seeing the original commit is the fact
  that the old code passed the AH to exit_horribly in some places, whereas
  the new one simply uses NULL.
...
 
 I am thinking we should just get rid of the whole AH passing.
 
 I have always felt the pg_dump code is overly complex, and this is
 confirming my suspicion.

I have developed the attached patch which accomplishes this.  I was also
able to move the write_msg function into dumputils.c (which is linked to
pg_dumpall), which allows pg_dumpall to use the new dumpmem.c functions,
and I removed its private ones.

FYI, I found write_msg() was a useless valist trampoline so I removed
the trampoline code and renamed _write_msg() to write_msg().  I also
modified the MSVC code.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile
new file mode 100644
index 4e8e421..09101d5
*** a/src/bin/pg_dump/Makefile
--- b/src/bin/pg_dump/Makefile
*** pg_dump: pg_dump.o common.o pg_dump_sort
*** 35,42 
  pg_restore: pg_restore.o $(OBJS) $(KEYWRDOBJS) | submake-libpq submake-libpgport
  	$(CC) $(CFLAGS) pg_restore.o $(KEYWRDOBJS) $(OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
  
! pg_dumpall: pg_dumpall.o dumputils.o $(KEYWRDOBJS) | submake-libpq submake-libpgport
! 	$(CC) $(CFLAGS) pg_dumpall.o dumputils.o $(KEYWRDOBJS) $(WIN32RES) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
  
  install: all installdirs
  	$(INSTALL_PROGRAM) pg_dump$(X) '$(DESTDIR)$(bindir)'/pg_dump$(X)
--- 35,42 
  pg_restore: pg_restore.o $(OBJS) $(KEYWRDOBJS) | submake-libpq submake-libpgport
  	$(CC) $(CFLAGS) pg_restore.o $(KEYWRDOBJS) $(OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
  
! pg_dumpall: pg_dumpall.o dumputils.o dumpmem.o $(KEYWRDOBJS) | submake-libpq submake-libpgport
! 	$(CC) $(CFLAGS) pg_dumpall.o dumputils.o dumpmem.o $(KEYWRDOBJS) $(WIN32RES) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
  
  install: all installdirs
  	$(INSTALL_PROGRAM) pg_dump$(X) '$(DESTDIR)$(bindir)'/pg_dump$(X)
diff --git a/src/bin/pg_dump/dumpmem.c b/src/bin/pg_dump/dumpmem.c
new file mode 100644
index a50f4f5..a71e217
*** a/src/bin/pg_dump/dumpmem.c
--- b/src/bin/pg_dump/dumpmem.c
***
*** 14,20 
   *-
   */
  #include "postgres_fe.h"
! #include "pg_backup.h"
  #include "dumpmem.h"
  
  #include <ctype.h>
--- 14,20 
   *-
   */
  #include "postgres_fe.h"
! #include "dumputils.h"
  #include "dumpmem.h"
  
  #include <ctype.h>
*** pg_strdup(const char *string)
*** 32,41 
  	char	   *tmp;
  
  	if (!string)
! 		exit_horribly(NULL, NULL, "cannot duplicate null pointer\n");
  	tmp = strdup(string);
  	if (!tmp)
! 		exit_horribly(NULL, NULL, "out of memory\n");
  	return tmp;
  }
  
--- 32,41 
  	char	   *tmp;
  
  	if (!string)
! 		exit_horribly(NULL, "cannot duplicate null pointer\n");
  	tmp = strdup(string);
  	if (!tmp)
! 		exit_horribly(NULL, "out of memory\n");
  	return tmp;
  }
  
*** pg_malloc(size_t size)
*** 46,52 
  
  	tmp = malloc(size);
  	if (!tmp)
! 		exit_horribly(NULL, NULL, "out of memory\n");
  	return tmp;
  }
  
--- 46,52 
  
  	tmp = malloc(size);
  	if (!tmp)
! 		exit_horribly(NULL, "out of memory\n");
  	return tmp;
  }
  
*** pg_calloc(size_t nmemb, size_t size)
*** 57,63 
  
  	tmp = calloc(nmemb, size);
  	if (!tmp)
! 		exit_horribly(NULL, NULL, _("out of memory\n"));
  	return tmp;
  }
  
--- 57,63 
  
  	tmp = calloc(nmemb, size);
  	if (!tmp)
! 		exit_horribly(NULL, _("out of memory\n"));
  	return tmp;
  }
  
*** pg_realloc(void *ptr, size_t size)
*** 68,73 
  
  	tmp = realloc(ptr, size);
  	if (!tmp)
! 		exit_horribly(NULL, NULL, _("out of memory\n"));
  	return tmp;
  }
--- 68,73 
  
  	tmp = realloc(ptr, size);
  	if (!tmp)
! 		exit_horribly(NULL, _("out of memory\n"));
  	return tmp;
  }
diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
new file mode 100644
index 92b9d28..39601e6
*** a/src/bin/pg_dump/dumputils.c
--- b/src/bin/pg_dump/dumputils.c
***
*** 23,28 
--- 23,29 
  
  
  int			quote_all_identifiers = 0;
+ const char *progname;
  
  
  #define supports_grant_options(version) ((version) >= 70400)
*** emitShSecLabels(PGconn *conn, PGresult *
*** 1211,1213 
--- 1212,1244 
  appendPQExpBuffer(buffer, ";\n");
  	}
  }
+ 
+ 
+ void
+ write_msg(const char 

Re: [HACKERS] DOMAINs and CASTs

2011-11-28 Thread Bruce Momjian
David Fetter wrote:
 On Mon, Jun 13, 2011 at 03:39:39AM -0500, Jaime Casanova wrote:
  On Mon, Jun 6, 2011 at 6:36 AM, Peter Eisentraut pete...@gmx.net wrote:
   On tis, 2011-05-17 at 14:11 -0500, Jaime Casanova wrote:
   On Tue, May 17, 2011 at 12:19 PM, Robert Haas robertmh...@gmail.com 
   wrote:
   
The more controversial question is what to do if someone tries to
create such a cast anyway.  We could just ignore that as we do now, or
we could throw a NOTICE, WARNING, or ERROR.
  
   IMHO, not being an error per se but an implementation limitation i
   would prefer to send a WARNING
  
  Implementation limitations are normally reported as errors.  I don't see
   why it should be different here.
  
  
  ok, patch reports an error... do we want to backpatch this? if we want
  to do so maybe we can backpatch as a warning
 
 Minor clarification attached.

What happened to this patch for casts on domains from June?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 03da168..a29c13c 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -1517,6 +1517,17 @@ CreateCast(CreateCastStmt *stmt)
 errmsg("target data type %s is a pseudo-type",
TypeNameToString(stmt->targettype))));
 
+   /* no domains allowed */
+   if (sourcetyptype == TYPTYPE_DOMAIN)
+   ereport(ERROR,
+   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+errmsg("casts from domains are not implemented yet")));
+
+   if (targettyptype == TYPTYPE_DOMAIN)
+   ereport(ERROR,
+   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+errmsg("casts to domains are not implemented yet")));
+
/* Permission check */
if (!pg_type_ownercheck(sourcetypeid, GetUserId())
 && !pg_type_ownercheck(targettypeid, GetUserId()))
@@ -1672,11 +1683,13 @@ CreateCast(CreateCastStmt *stmt)
 * etc. would have to be modified to look through domains to their
 * base types.
 */
+#ifdef NOT_USED
if (sourcetyptype == TYPTYPE_DOMAIN ||
targettyptype == TYPTYPE_DOMAIN)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
 errmsg("domain data types must not be marked binary-compatible")));
+#endif
}
 
/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-11-28 Thread Bruce Momjian
Alexander Korotkov wrote:
 Version of patch with few more comments and some fixes.

Where are we on the idea of better statistics for arrays?




Re: [HACKERS] perltidy

2011-11-28 Thread Bruce Momjian
Peter Eisentraut wrote:
 How much do we care about applying perltidy, as described in
 src/tools/msvc/README, everywhere?  I just ran it across the entire
 tree, using
 
 perltidy -b -bl -nsfs -naws -l=100 -ole=unix **/*.pl **/*.pm
 
 and it generated 6531 lines of (unified) diff, of which 357 are in
 src/tools/msvc/ itself.  So clearly it's not being applied very
 consistently.
 
 Given how easily this appears to work and how we're sneakily expanding
 the use of Perl, I think we ought to add this to the standard pgindent
 routine.

I have moved the Perl indentation command-line docs into the pgindent
README and it will be run as part of the pgindent checklist.  Applied
patch attached.

diff --git a/src/tools/msvc/README b/src/tools/msvc/README
new file mode 100644
index 58e266e..b8dd488
*** a/src/tools/msvc/README
--- b/src/tools/msvc/README
*** the libpq frontend library. For more inf
*** 9,20 
  chapter Installation on Windows.
  
  
- Notes about code indention
- --
- If the perl code is modified, use perltidy on it since pgindent won't
- touch perl code. Use the following commandline:
- perltidy -b -bl -nsfs -naws -l=100 -ole=unix *.pl *.pm
- 
  Notes about Visual Studio Express
  -
  To build PostgreSQL using Visual Studio Express, the Platform SDK
--- 9,14 
diff --git a/src/tools/pgindent/README b/src/tools/pgindent/README
new file mode 100644
index 7504650..d88c201
*** a/src/tools/pgindent/README
--- b/src/tools/pgindent/README
*** This can format all PostgreSQL *.c and *
*** 37,42 
--- 37,46 
  	gmake -C contrib install
  	gmake installcheck-world
  
+ 8) Indent the Perl MSVC code:
+ 
+ 	cd src/tools/msvc
+ 	perltidy -b -bl -nsfs -naws -l=100 -ole=unix *.pl *.pm
  
  ---
  



Re: [HACKERS] Avoiding repeated snapshot computation

2011-11-28 Thread Kevin Grittner
Andres Freund  wrote:
 
 I would like to see somebody running a benchmark on a machine with
 higher concurrency...
 
Yeah, me too.  I don't find it at all hard to believe that we will
see significant performance benefit by changing the PGPROC structure
so that all parts of it can be accessed through one cache line rather
than two.  The fact that we saw improvement when changing it down to
two, even though there is extra indirection in some places, seems
like pretty solid evidence on the face of it.
 
I went in to the office on Sunday to try to get a few hours of
benchmarks for this on our new monster machine, but the DBA
responsible for putting it into production was on it first, loading
it with production data.  At this point it will get really hard to
schedule any more of the 20-hour runs that were the basis of most of
my recent numbers, but I may be able to shut down production use for
a two or three hour window on a weekend to run an abbreviated set,
focusing on the higher concurrency levels.  Ideally that would be on
top of the other pending performance patches.
 
Based on my earlier rounds of benchmarking, I expect that this
approach will show the greatest benefit at the higher concurrency
levels, where cache lines are most stressed.
 
-Kevin



Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-11-28 Thread Nathan Boley
 Version of patch with few more comments and some fixes.

 Where are we on the idea of better statistics for arrays?

I need to finish reviewing the patch: I'll try to send in something
tomorrow night. So far it looks good.

Best,
Nathan



[HACKERS] to_date() marked stable?

2011-11-28 Thread Bruce Momjian
I was wondering why we mark to_date() as stable and not immutable?  It
seems to have been set to stable in 2006 with this patch:

http://archives.postgresql.org/pgsql-committers/2006-11/msg00264.php

Also, mark to_date() and to_char(interval) as stable; although these
appear not to depend on any GUC variables as of CVS HEAD, that seems a
property unlikely to survive future improvements.  It seems best to mark
all the formatting functions stable and be done with it.

Are there people using to_date in indexes or partition functions where
changing it to immutable would be useful?  The code seems complete
enough now that we should consider an optimization here.




Re: [HACKERS] to_date() marked stable?

2011-11-28 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 I was wondering why we mark to_date() as stable and not immutable?

Do you really want to guarantee that it isn't, and never will be,
affected by timezone, datestyle, lc_time, etc?  In particular it seems
likely that somebody will eventually complain that since to_char can
output localized month/day names according to lc_time, to_date should be
able to read them.

 Are there people using to_date in indexes or partition functions where
 changing it to immutable would be useful?

By definition, there are not, and I don't recall many complaints from
people trying to.  On the other hand, if we mark it immutable and then
in future wish to go back to allowing environment dependencies, we will
have to risk breaking working applications.

regards, tom lane



Re: [HACKERS] to_date() marked stable?

2011-11-28 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  I was wondering why we mark to_date() as stable and not immutable?
 
 Do you really want to guarantee that it isn't, and never will be,
 affected by timezone, datestyle, lc_time, etc?  In particular it seems
 likely that somebody will eventually complain that since to_char can
 output localized month/day names according to lc_time, to_date should be
 able to read them.
 
  Are there people using to_date in indexes or partition functions where
  changing it to immutable would be useful?
 
 By definition, there are not, and I don't recall many complaints from
 people trying to.  On the other hand, if we mark it immutable and then
 in future wish to go back to allowing environment dependencies, we will
 have to risk breaking working applications.

OK  --- without user requests, it seems pointless to make a change here.




Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2011-11-28 Thread Etsuro Fujita
(2011/11/28 20:50), Shigeru Hanada wrote:
 (2011/11/25 17:27), Etsuro Fujita wrote:
So, I think it might be better to estimate
 such costs by pgsql_fdw itself without EXPLAINing on the assumption that
 a remote postgres server has the same abilities for query optimization,
 which is less costly and widely applicable to the other DBMSs, while it,
 of course, only works once we have statistics and/or index information
 for foreign tables.  But AFAIK we eventually want to have those, so I'd
 like to propose to use the proposed approach until that time.
 
 Knowledge of foreign indexes also provide information of sort order.
 Planner will be able to consider merge join without local sort with such
 information.  Without foreign index, we have to enumerate possible sort
 keys with a brute-force approach for the same result, as mentioned by
 Itagaki-san before.

Yes, with the knowledge of foreign indexes, I think we can take the
approach of thinking multiple plans for a foreign table; the cheapest
unordered plan and the cheapest plan with a given sort order.  In
addition, it would be also possible to support
nestloop-with-inner-foreign-indexscans on a foreign table as pointed out
as future work by Tom Lane at PGCon 2011[1].

[1] http://www.pgcon.org/2011/schedule/attachments/188_Planner%20talk.pdf

Best regards,
Etsuro Fujita



Re: [HACKERS] Misleading CREATE TABLE error

2011-11-28 Thread Peter Eisentraut
On ons, 2011-11-09 at 12:00 -0500, Robert Haas wrote:
 On Tue, Nov 8, 2011 at 4:49 PM, Thom Brown t...@linux.com wrote:
  I found the following error message misleading:
 
  test=# create table cows2 (LIKE cows);
  ERROR:  inherited relation "cows" is not a table
  STATEMENT:  create table cows2 (LIKE cows);
 
  I'm not trying to inherit a relation, I'm trying to base a table on
  it.  As it happens, "cows" is a foreign table, which *is* a table,
  just not a regular table.  It might be useful to add support to clone
  foreign tables into regular tables, the use-case being that you may
  wish to import all the data locally into a table of the same
  structure.  But the gripe here is the suggestion that the relation
  would have been inherited, which would actually be achieved using
  INHERITS.
 
 Interesting.  I agree that there's no obvious reason why that
 shouldn't be allowed to work.  Could be useful with views, too.

I recently came across a situation where LIKE with a composite type
might have been useful.





[HACKERS] psql editor temp file extension

2011-11-28 Thread Peter Eisentraut
There was a question recently on a user list about how to configure an
editor to handle psql's editor temp files as SQL files.  While this is
doable with an advanced editor by reverse-engineering the file pattern
that psql uses, it would be easier if psql just added an extension
".sql".  Since both uses (\e and \ef) are actually SQL files, this
should not be a problem.  So I propose the attached patch.

diff --git i/src/bin/psql/command.c w/src/bin/psql/command.c
index 9cc73be..f885c1d 100644
--- i/src/bin/psql/command.c
+++ w/src/bin/psql/command.c
@@ -1879,10 +1879,10 @@ do_edit(const char *filename_arg, PQExpBuffer query_buf,
 		 */
 #endif
 #ifndef WIN32
-		snprintf(fnametmp, sizeof(fnametmp), "%s%spsql.edit.%d", tmpdir,
+		snprintf(fnametmp, sizeof(fnametmp), "%s%spsql.edit.%d.sql", tmpdir,
 				 "/", (int) getpid());
 #else
-		snprintf(fnametmp, sizeof(fnametmp), "%s%spsql.edit.%d", tmpdir,
+		snprintf(fnametmp, sizeof(fnametmp), "%s%spsql.edit.%d.sql", tmpdir,
 			"" /* trailing separator already present */ , (int) getpid());
 #endif
 



Re: [HACKERS] psql editor temp file extension

2011-11-28 Thread Pavel Stehule
2011/11/29 Peter Eisentraut pete...@gmx.net:
 There was a question recently on a user list about how to configure an
 editor to handle psql's editor temp files as SQL files.  While this is
 doable with an advanced editor by reverse-engineering the file pattern
 than psql uses, it would be easier if psql just added an extension
 .sql.  Since both uses (\e and \ef) are actually SQL files, this
 should not be a problem.  So I propose the attached patch.

+1

Pavel








Re: [HACKERS] Avoiding repeated snapshot computation

2011-11-28 Thread Pavan Deolasee
On Tue, Nov 29, 2011 at 8:30 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Andres Freund  wrote:

 I would like to see somebody running a benchmark on a machine with
 higher concurrency...

 Yeah, me too.  I don't find it at all hard to believe that we will
 see significant performance benefit by changing the PGPROC structure
 so that all parts of it can be accessed through one cache line rather
 than two.  The fact that we saw improvement when changing it down to
 two, even though there is extra indirection in some places, seems
 like pretty solid evidence on the face of it.


I think there is a fundamental difference between the PGPROC patch that
we did and the rearranging of SnapshotData members that Andres has
proposed. The PGPROC structure is a shared-memory area, very heavily
accessed by GetSnapshotData (and some others). There was a linear
increase in accesses as the number of clients went up. The SnapshotData
structure is mostly in backend-local memory (unless we do
something like what I suggested in the OP) and hence fitting it in a
single cache line may or may not have much impact. We don't even
guarantee that it starts on a cache-line boundary, which means in most
cases it will still be spread across multiple cache lines.
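[Editor's note: the cache-line point above can be made concrete with a short standalone sketch, not PostgreSQL code. C11 alignas can force a struct of hot fields onto a cache-line boundary; the 64-byte line size and the field names here are illustrative assumptions:]

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE 64			/* assumed line size; typical on x86, not guaranteed */

/* Hot snapshot-like fields packed together and aligned so that, when the
 * object itself is placed on an aligned boundary, they share one line.
 * Field names are illustrative, not the real SnapshotData layout. */
typedef struct
{
	alignas(CACHE_LINE) uint32_t xmin;
	uint32_t	xmax;
	uint32_t	xcnt;
	uint32_t	xip_head;		/* in-progress XID bookkeeping would follow */
} HotSnapshot;
```

Note that the alignment request only helps if the allocator honors it: a plain malloc() gives no alignment beyond max_align_t, which is exactly Pavan's point about the struct not being guaranteed to start on a cache-line boundary; an aligned allocator (or careful shared-memory placement) would be needed.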

Thanks,
Pavan
-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com



Re: [HACKERS] splitting plpython into smaller parts

2011-11-28 Thread Peter Eisentraut
On mån, 2011-11-28 at 02:00 -0800, Greg Smith wrote:
 On 11/13/2011 09:45 AM, Jan Urbański wrote:
  The first one factors out some bolerplate related to executing SPI 
  functions in subtransactions (and idea borrowed from pltcl.c).
 
 While I haven't looked at the code, this seems worthwhile from the 
 description.
 
  The second one is the actual split. plpython.c has been split into 11 
  separate files and one header. 
 
 Could you comment a bit more about what the goal of this is?  We don't 
 have a reviewer for this patch yet, and I think part of the reason is 
 because it's not really obvious what it's supposed to be doing, and why 
 that's useful.

I will look into it.




Re: [HACKERS] plpython SPI cursors

2011-11-28 Thread Peter Eisentraut
On lör, 2011-11-26 at 11:21 -0500, Steve Singer wrote:
 I've looked over the revised version of the patch and it now seems
 fine.
 
 Ready for committer. 

I can take it from here.




Re: [HACKERS] patch for type privileges

2011-11-28 Thread Peter Eisentraut
On mån, 2011-11-28 at 11:41 +0100, Yeb Havinga wrote:
 On 2011-11-15 21:50, Peter Eisentraut wrote:
  Patch attached.
 
 I cannot get the patch to apply, this is the output of patch -p1 
 --dry-run on HEAD.
 
 patching file src/include/catalog/pg_type.h
 Hunk #1 succeeded at 217 (offset 1 line).
 Hunk #2 succeeded at 234 (offset 1 line).
 Hunk #3 succeeded at 264 (offset 1 line).
 Hunk #4 succeeded at 281 (offset 1 line).
 Hunk #5 FAILED at 370.
 Hunk #6 FAILED at 631.
 2 out of 6 hunks FAILED -- saving rejects to file 
 src/include/catalog/pg_type.h.rej

I need to remerge it against concurrent range type activity.





Re: [HACKERS] Avoiding repeated snapshot computation

2011-11-28 Thread Pavan Deolasee
On Sun, Nov 27, 2011 at 12:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavan Deolasee pavan.deola...@gmail.com writes:
 On Sat, Nov 26, 2011 at 10:43 PM, Robert Haas robertmh...@gmail.com wrote:
 Furthermore, it's
 hard to understand how this could be a net improvement in the general
 case, because now both B and F are copying everything twice (once to
 the shared area and one to backend-local memory) instead of just once
 (to backend-local memory) and C and D are sleeping (yikes!).

 Yes, B and F pay a price of double copy. But I think it can be a net
 saving because C and D (and many more hopefully) don't need to
 recompute the snapshot again by going over a potentially large
 ProcArray.

 Like Robert, I'm finding it hard to believe that this isn't going to be
 a net pessimization in as many or more cases as it'd be a win.  Also,
 didn't we just get rid of most of the issue of going over a large
 ProcArray with the move of hot members to a separate array?


Yeah, separating the PGPROC members has helped a lot. But that does
not reduce the number of GetSnapshotData calls. It only makes each
call less expensive. As I said, I had seen a 15-20% improvement with a
patch that was not even slightly tuned, so I am optimistic about the
potential of the approach.

 In the end, getting a snapshot is exactly about copying information out
 of shared memory.  Perhaps it would be more productive to think about
 how we could further modify the ProcArray representation to reduce the
 impedance mismatch between it and the snapshot representation, rather
 than introducing a second shared-memory copy of the same information.


I think that a good idea. We need a representation that needs minimum
processing to derive the snapshot. One idea is to maintain a bitmap of
currently running transactions in shared memory. The size of the
bitmap has to be significantly larger than the ProcArray itself
because there could be gaps. But even then since its just a bit per
XID, we can have a bitmap to represent a window of say 16K or 32K
consecutive XIDs. We can also have a summary bitmap to quickly find if
there is at least one running transaction in any given chunk. If there
are long running transactions which don't fit in the bitmap, we can
track them separately in an array. Snapshot is then just a copy of
this bitmap along with some additional information.
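[Editor's note: a rough standalone sketch of the two-level bitmap idea described above; every name and size here is illustrative, not a proposal for the actual shared-memory layout. One detail bit per XID in a 16K-XID window, plus one summary bit per 64-XID word, so a scan can skip whole words with no running transaction:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical two-level bitmap over a window of 16K consecutive XIDs. */
#define XID_WINDOW	16384
#define WORDS		(XID_WINDOW / 64)

typedef struct
{
	uint64_t	detail[WORDS];			/* bit set => that XID is running */
	uint64_t	summary[WORDS / 64];	/* bit set => that detail word is nonzero */
} XidBitmap;

/* Mark the XID at offset 'off' in the window as running. */
static void
xid_set(XidBitmap *b, uint32_t off)
{
	uint32_t	w = off / 64;

	b->detail[w] |= UINT64_C(1) << (off % 64);
	b->summary[w / 64] |= UINT64_C(1) << (w % 64);
}

/* Mark it as no longer running, clearing the summary bit when the word empties. */
static void
xid_clear(XidBitmap *b, uint32_t off)
{
	uint32_t	w = off / 64;

	b->detail[w] &= ~(UINT64_C(1) << (off % 64));
	if (b->detail[w] == 0)
		b->summary[w / 64] &= ~(UINT64_C(1) << (w % 64));
}

static bool
xid_running(const XidBitmap *b, uint32_t off)
{
	return (b->detail[off / 64] >> (off % 64)) & 1;
}
```

Under this scheme, taking a snapshot would be little more than a memcpy of the window plus the out-of-window long-runner array, though keeping the bits consistent under concurrent commits is exactly the hard part this sketch glosses over.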

Will try to think more about it and see if I can test this idea. But
any comments welcome.

Thanks,
Pavan

