[HACKERS] Query plan optimization for CHECK NO INHERIT and single table?

2012-08-30 Thread Matthias
Hey,

I tried out the new CHECK NO INHERIT feature for inherited tables.
There seems to be an opportunity to generate slightly better query
plans sometimes. E.g. when I do

SELECT * FROM base WHERE partition_id = 3

and there exists only one child table for which partition_id = 3 is
true I guess the query plan could just do a seq/index/whatever scan on
that table. Right now the query plan has an intermediate Append
node. This seems only useful if the results of multiple child tables
would need to be included.

-Matthias


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HEAD crashes on windows when doing VACUUM ANALYZE

2012-08-30 Thread Albe Laurenz
Matthias wrote:
 when running VACUUM ANALYZE on my database built on win32-x86 from
 yesterday's git checkout I always get this at some point during VACUUM
 ANALYZE:
 
 LOG:  server process (PID 5880) was terminated by exception 0xC094
 DETAIL:  Failed process was running: VACUUM VERBOSE ANALYZE
 HINT:  See C include file ntstatus.h for a description of the
 hexadecimal value.
 LOG:  terminating any other active server processes
 
 I am not sure if it's useful to report it here, but I thought I'd do
 it anyway :)

That seems to be STATUS_INTEGER_DIVIDE_BY_ZERO.

Does it only happen with a certain table?
Are you sure there is no data corruption?
A stack trace would help:
http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_Postg
reSQL_backend_on_Windows

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Skip checkpoint on promoting from streaming replication

2012-08-30 Thread Kyotaro HORIGUCHI
Hello, sorry for long absense.

 At first I was unhappy that you'd removed the restriction that
 timelines only change on a shutdown checkpoint. But the reality is
 timelines change at any point in the WAL stream - the only way to tell
 between end of WAL and a timeline change is by looking for later
 timelines.

 Yes, I felt uncomfortable with that point. The overlooking map
of timeline evolution on WAL stream seems obscure, and it should
have been made clear to do this. I couldn't show the clear map
for this CF.

 The rest of the logic seems OK, but its a big thing we're doing here
 and so it will take longer yet. Putting all the theory into comments
 in code would certainly help here.

Ok, I agreed.

 I don't have much else to say on this right now. I'm not committing
 anything on this now since I'm about to go on holiday, but I'll be
 looking at this when I get back.

Have a nice holyday.

 For now, I'm going to mark this as Returned With Feedback, but please
 don't be discouraged by that.

I think we have enough time to think about that yet, and I
believe this will be worth doing.

Thank you.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Chronic performance issue with Replication Failover and FSM.

2012-08-30 Thread Daniel Farina
On Tue, Mar 13, 2012 at 4:53 PM, Josh Berkus j...@agliodbs.com wrote:
 4. On a high-UPDATE workload, this means that the replica assumes tables
 have no free space until it starts to build a new FSM or autovacuum
 kicks in on some of the tables, much later on.

 5. If your hosting is such that you fail over a lot (such as on AWS),
 then this causes cumulative table bloat which can only be cured by a
 VACUUM FULL.

I'd like to revive this thread.  Like other people, I thought this was
not a huge problem -- or least maybe not directly from the mechanism
proposed -- but sometimes it's a pretty enormous one, and I've started
to notice it.  I did a bug report here
(http://archives.postgresql.org/pgsql-bugs/2012-08/msg00108.php, plots
in http://archives.postgresql.org/pgsql-performance/2012-08/msg00181.php),
but just today we promoted another system via streaming replication to
pick up the planner fix in 9.1.5 (did you know: that planner bug seems
to make GIN FTS indexes un-used in non-exotic cases, and one goes to
seqscan?), and then a 40MB GIN index bloated to two gigs on a 1.5GB
table over the course of maybe six hours.

In addition, the thread on pgsql-performance that has the plot I
linked to indicates someone having the same problem with 8.3 after a
warm-standby promotion.

So I think there are some devils at work here, and I am not even sure
if they are hard to reproduce -- yet, people use standby promotion
(unfollow) on Heroku all the time and we have not been plagued
mightily by support issues involving such incredible bloating, so
there's something about the access pattern.  In my two cases, there is
a significant number of UPDATEs vs actual number of INSERTs/DELETES of
records (the ratio is probably 1+ to 1), even though neither of
these would be close to what one could consider a large or even
medium-sized database in terms of TPS or database size.  In fact, the
latter system bloated even though it comfortably fits entirely in
memory.

-- 
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes complete

2012-08-30 Thread Magnus Hagander
On Thu, Aug 30, 2012 at 5:20 AM, Peter Eisentraut pete...@gmx.net wrote:
 On Wed, 2012-08-29 at 22:23 -0400, Alvaro Herrera wrote:
Where are we on building the development docs more frequently?
  
   Still waiting for details on how it works to set that up on the
   buildfarm client.
 
  Where are we on this?

 Waiting on Andrew.

 As far as I can see, we need to update the machine to release 4.7, and
 then install a skip file or something like that.  Andrew, can you
 please explain how is that to be used?  I don't see it documented
 anywhere.

 Why does this need to be tied into the build farm?  Someone can surely
 set up a script that just runs the docs build at every check-in, like it
 used to work.  What's being proposed now just sounds like a lot of
 complication for little or no actual gain -- net loss in fact.

It doesn't have to. The gain is that we don't then end up conflicting
with the buildfarm on when we run. Unlike the old server, we try to
avoid running too many concurrent builds because it might kill other
services on the same box - which happened quite frequently on the old
one. In fact, it happened so frequently on the old one that people
stopped caring...

We also get the buildfarms functionality for collecting logs and
reporting them. Just some more we don't have to build a new set of
scripts for.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use of systable_beginscan_ordered in event trigger patch

2012-08-30 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 I find $SUBJECT fairly scary, because systable_beginscan_ordered() is
 dependent on having a working, non-corrupt index.  If you are trying
 to run the backend with ignore_system_indexes so that you can rebuild
 corrupt indexes, uses of systable_beginscan_ordered() represent places
 where you can't turn that off, and are entirely at the mercy of the
 indexes being good.

Ooops. Didn't see that, thanks for noticing!

 Or maybe we should disable event triggers altogether in standalone mode?

+1

 I can think of plenty of ways that a broken event trigger could cause
 enough havoc that you'd wish there was a way to suppress it, at least
 for long enough to drop it again.

I fail to see how enabling Event Triggers in standalone mode would help
you get out of the situation that lead you there. It's a last resort
facility where you want the bare PostgreSQL behavior, I think. Now that
you mention it.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multi-master pgbench?

2012-08-30 Thread Dimitri Fontaine
Tatsuo Ishii is...@postgresql.org writes:
 Does it support extended query? Does it support V3 protocol?

Yes.

It also has a proxy mode where it captures the queries sent by the
client along with think times and outputs that in the session format it
reads from its setup, which is very useful. Using that capability you
can easily have Tsung replay your existing pgbench script.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] _USE_32BIT_TIME_T Patch

2012-08-30 Thread Owais Khan
Hello,

We are getting crash while using plperl on Win32 as ActiveState perl(Win32)
uses 32-bit time_t structures. So, We have to compile DB Server's code also
with 32-bit time_t structure.

Patch is adding _USE_32BIT_TIME_T in preprocessor definitions in case
platform is Windows-32 for all project files.

Thanks  Regards,

Owais.


MSBuildProject_PG_Patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] _USE_32BIT_TIME_T Patch

2012-08-30 Thread Dave Page
On Thu, Aug 30, 2012 at 6:34 AM, Owais Khan owais.k...@enterprisedb.com wrote:
 Hello,

 We are getting crash while using plperl on Win32 as ActiveState perl(Win32)
 uses 32-bit time_t structures. So, We have to compile DB Server's code also
 with 32-bit time_t structure.

 Patch is adding _USE_32BIT_TIME_T in preprocessor definitions in case
 platform is Windows-32 for all project files.

For additional background info, we did originally define this macro
for compatibility with third party code:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=22867ab9867a145b676f906b98f491c4496a70da

however it got removed here for some reason:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=cd004067742ee16ee63e55abfb4acbd5f09fbaab

The bottom line is, without it, pl/perl will crash with modern
versions of ActiveState Perl on Win32 (Windows users cannot use
Strawberry Perl as it doesn't contain the shared library we need).

This should definitely go in 9.2, and ideally the earlier branches
that didn't have it defined as well (this has been reported in the
past for 9.1 - for example;
http://archives.postgresql.org/pgsql-bugs/2012-04/msg00054.php) -
though I'm a little worried that adding it there may cause other
existing addons to require recompilation.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-08-30 Thread David Fetter
On Wed, Aug 29, 2012 at 06:39:37PM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On the more general issue, I continue to see minimal risk of harm
  in allowing things like LPAD() to implicitly cast the first
  argument to text.
 
 Well, I see your point about LPAD(), but the problem is how to tell
 the difference between a harmless cast omission and an actual
 mistake that the user will be very grateful if we point out.  If we
 allow implicit casts to text in the general case in
 function/operator calls, we are definitely going to re-introduce a
 lot of room for mistakes.
 
 Upthread you were complaining about how we'd reject calls even when
 there was only one possible interpretation.  I wonder whether
 there'd be any value in taking that literally: that is, allow use of
 assignment rules when there is, in fact, exactly one function with
 the right number of parameters visible in the search path.

+1 for this.

 The main downside I can see is that code that used to work is likely
 to stop working as soon as someone creates a potential overloading
 situation.  Worse, the error message could be pretty confusing,
 since if you had been successfully calling f(smallint) with f(42),
 you'd get f(integer) does not exist, not something like f() is
 ambiguous, after adding f(float8) to the mix.  This seems related
 to the confusing changes in regression test cases that I got in my
 experiments yesterday.  This may be sufficient reason to reject the
 idea, since the very last thing we need in this area is any
 degradation in the relevance of the error messages.

With the ANY* parameters introduced in the past few versions, there's
a lot less incentive to create this problem.  The trick here is
documenting the ANY* parameters in enough places to make sure that
incentive is reduced.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-08-30 Thread Robert Haas
On Wed, Aug 29, 2012 at 6:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, I see your point about LPAD(), but the problem is how to tell
 the difference between a harmless cast omission and an actual mistake
 that the user will be very grateful if we point out.  If we allow
 implicit casts to text in the general case in function/operator calls,
 we are definitely going to re-introduce a lot of room for mistakes.

I concede that point.  :-)

 Upthread you were complaining about how we'd reject calls even when
 there was only one possible interpretation.  I wonder whether there'd be
 any value in taking that literally: that is, allow use of assignment
 rules when there is, in fact, exactly one function with the right number
 of parameters visible in the search path.  This would solve the LPAD()
 problem (at least as stated), and probably many other practical cases
 too, since I admit your point that an awful lot of users do not use
 function overloading.  The max() example I mentioned earlier would not
 get broken since there's more than one max(), and in general it seems
 likely that cases where there's a real risk would involve overloaded
 names.

That's an interesting idea.  I like it.

 The main downside I can see is that code that used to work is likely
 to stop working as soon as someone creates a potential overloading
 situation.  Worse, the error message could be pretty confusing, since
 if you had been successfully calling f(smallint) with f(42), you'd get
 f(integer) does not exist, not something like f() is ambiguous,
 after adding f(float8) to the mix.  This seems related to the confusing
 changes in regression test cases that I got in my experiments yesterday.

One thought I had when looking at those messages was that, in some
ways, the new messages were actually less confusing than the old
messages. I mean, if you try to call f(42) and you get f(integer) does
not exist, ok, you'll probably figure out that the issue is with the
argument type, since you most likely know that an f of some type does
in fact exist.  But it would be even more clear if the error message
said, ok, so there is an f, but I'm not going to call it because the
argument types don't match closely enough.  The distinction would be
even more useful if the function happens to be called snuffleupagus
rather than f, because then when you call snufleupagus(42.0), it'll
tell you i know nothing about a function by that name whereas when
you call snuffleupagus(42) it'll tell you i know about a function by
that name, but not with those argument types.  I've certainly
encountered this confusion before whilst debugging my own and other
people's databases: is it giving me that error because the function
doesn't exist, or because of an argument type mismatch?

 This may be sufficient reason to reject the idea, since the very last
 thing we need in this area is any degradation in the relevance of the
 error messages.

 ... as long as I work for a company that helps
 people migrate from other database systems, I'm not going to be able
 to stop caring about this issue even in cases where I don't personally
 think implicit casting is a good idea, because other people who are
 not me have tens of thousands of lines of procedural code written for
 those other systems and if you tell them they've got to go through and
 add hundreds or thousands of casts before they can migrate, it tends
 to turn them off.  Maybe there's no perfect solution to that problem,
 but the status quo is definitely not perfect either.

 Meh.  I tend to think that a better solution to those folks' problem is
 a package of add-on casts that they could install for use with their
 legacy code; not dumbing down the system's error detection capability
 for everyone.  Peter's original try at re-adding implicit text casts
 in that way didn't work very well IIRC, but maybe we could try harder.

Well, the big problem that you run into is that when you add casts,
you tend to create situations that the type system thinks are
ambiguous.  A particular example of this is textanycat, anytextcat,
and plain old textcat.  If you start adding casts, the system can get
confused about which one it's supposed to call in which situation.
The frustrating thing is that we don't really care.  The only reason
why there are three different operators in the first place is because
we want to make sure that everything someone does will match one of
them.  But then if something matches two of them, we error out
unnecessarily.

It would be nice to have a way to say among this group of functions,
we don't care or perhaps among this group of functions, here is a
preference ordering; in case of doubt, pick the one with the highest
preference.  But in some sense I feel that that isn't really solving
the problem, because the only reason those extra functions exist in
the first place is to work around the fact that sometimes the system
doesn't perform typecasts in situations where we wish it did.  It's
almost 

Re: [HACKERS] Getting rid of cheap-startup-cost paths earlier

2012-08-30 Thread Bruce Momjian
On Tue, May 22, 2012 at 08:29:48AM -0400, Robert Haas wrote:
 On Tue, May 22, 2012 at 1:50 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Currently, the planner keeps paths that appear to win on the grounds of
  either cheapest startup cost or cheapest total cost.  It suddenly struck
  me that in many simple cases (viz, those with no LIMIT, EXISTS, cursor
  fast-start preference, etc) we could know a-priori that cheapest startup
  cost is not going to be interesting, and hence immediately discard any
  path that doesn't win on total cost.
 
  This would require some additional logic to detect whether the case
  applies, as well as extra complexity in add_path.  So it's possible
  that it wouldn't be worthwhile overall.  Still, it seems like it might
  be a useful idea to investigate.
 
  Thoughts?
 
 Yeah, I think we should investigate that.  Presumably you could easily
 have a situation where one part of the tree is under a LIMIT or EXISTS
 and therefore needs to preserve fast-start plans but the rest of the
 (potentially large) tree isn't, so we need something fairly
 fine-grained, I think.  Maybe we could add a flag to each RelOptInfo
 indicating whether fast-start plans should be kept, or something like
 that.

Is this a TODO?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wal_buffers

2012-08-30 Thread Robert Haas
On Wed, Aug 29, 2012 at 10:25 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 19 February 2012 05:24, Robert Haas robertmh...@gmail.com wrote:
 I have attached tps scatterplots.  The obvious conclusion appears to
 be that, with only 16MB of wal_buffers, the buffer wraps around with
 some regularity: we can't insert more WAL because the buffer we need
 to use still contains WAL that hasn't yet been fsync'd, leading to
 long stalls.  More buffer space ameliorates the problem.

 Incidentally, I wondered if we could further improve group commit
 performance by implementing commit_delay with a WaitLatch call, and
 setting the latch in the event of WAL buffers wraparound (or rather, a
 queued wraparound request - a segment switch needs WALWriteLock, which
 the group commit leader holds for a relatively long time during the
 delay). I'm not really sure how significant a win this might be,
 though. There could be other types of contention, which could be
 considerably more significant. I'll try and take a look at it next
 week.

I have a feeling that one of the big bottlenecks here is that we force
an immediate fsync when we reach the end of a segment.  I think it was
originally done that way to keep the code simple, and it does
accomplish that, but it's not so hot for performance.  More generally,
I think we really need to split WALWriteLock into two locks, one to
protect the write position and the other to protect the flush
position.  I think we're often ending up with a write (which is
usually fast) waiting for a flush (which is often much slower) when in
fact those things ought to be able to happen in parallel.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query plan optimization for CHECK NO INHERIT and single table?

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 2:00 AM, Matthias nitrogen...@gmail.com wrote:
 Hey,

 I tried out the new CHECK NO INHERIT feature for inherited tables.
 There seems to be an opportunity to generate slightly better query
 plans sometimes. E.g. when I do

 SELECT * FROM base WHERE partition_id = 3

 and there exists only one child table for which partition_id = 3 is
 true I guess the query plan could just do a seq/index/whatever scan on
 that table. Right now the query plan has an intermediate Append
 node. This seems only useful if the results of multiple child tables
 would need to be included.

I think it's needed to translate between the child's set of attribute
numbers and their parent's set of attribute numbers - that is, the
child could have the columns in a different order, or could have extra
ones.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Chronic performance issue with Replication Failover and FSM.

2012-08-30 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 but just today we promoted another system via streaming replication to
 pick up the planner fix in 9.1.5 (did you know: that planner bug seems
 to make GIN FTS indexes un-used in non-exotic cases, and one goes to
 seqscan?), and then a 40MB GIN index bloated to two gigs on a 1.5GB
 table over the course of maybe six hours.

I think this is probably unrelated to what Josh was griping about:
even granting that the system forgot any free space that had been
available within the original 40MB, that couldn't in itself lead
to eating more than another 40MB, no?  My guess is something is
broken about the oldest-xmin-horizon mechanism, such that VACUUM
is failing to recover space.  Can you put together a self-contained
test case that exhibits similar growth?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] rows modified in current transaction

2012-08-30 Thread Miroslav Šimulčík
Hi,

is there any way to check if row have already been modified by the current
transaction? I tried condition txid_current() = xmin, but there is problem
with the savepoints. After every savepoint rows are getting higher xmin
values, but txid_current() remains the same.

Regards,
Miroslav Simulcik


Re: [HACKERS] Minor comment fixups

2012-08-30 Thread Robert Haas
On Mon, Aug 27, 2012 at 11:21 PM, Jeff Davis pg...@j-davis.com wrote:
 I noticed a couple comments that look wrong to me. Patch attached.

Thanks, committed.  But I updated the parenthesized comment in the
first fix instead of removing it.  Let me know if you see an issue
with that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Joe Abbate
Hello hackers,

I've been testing Pyrseas against 9.2rc1.  A test that does a CREATE
OPERATOR is giving a small difference.  Specifically, the test issues
the statement:

CREATE OPERATOR + (PROCEDURE = upper, RIGHTARG = text);

Pyrseas then queries the pg_operator catalog to map the procedure for
output.  Selecting oprcode in 8.4, 9.0, and 9.1, always returns 'upper',
but in 9.2rc1, the value is 'pg_catalog.upper'.  It does not matter
whether pg_catalog is on the search_path or not.

I looked over the release notes but I couldn't find any reference to
this possible change in behavior.  I'd like to confirm whether the
schema-qualified procedure name is a permanent change or an unintended
side effect of something else.

Thanks.

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Event Triggers reduced, v1

2012-08-30 Thread Robert Haas
On Mon, Aug 27, 2012 at 7:52 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 I've been reviewing your changes and here's a very small patch with some
 details I would have spelled out differently. See what you think, I
 mostly needed to edit some code to get back in shape :)

I guess I don't particularly like either of these changes.  The first
one is mostly harmless, but I don't really see why it's any better,
and it does have the downside of traversing the string twice (once for
strlen and a second time in str_toupper) instead of just once.  It
also makes a line wider than 80 columns, which is a bit ugly.  In the
second hunk, the point is that we never have to do CreateCommandTag()
here at all unless either casserts are enabled or EventCacheLookup
returns a non-empty list.  That means that in a non-assert-enabled
build, we get to skip that work altogether in the presumably-common
case where there are no relevant event triggers.  Your proposed change
would avoid doing it twice when asserts are disabled, but the cost
would be that we'd have to do it once when asserts were disabled even
if no event triggers exist.  I don't think that's a good trade-off.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hunspell and tsearch2 ?

2012-08-30 Thread Robert Haas
On Mon, Aug 27, 2012 at 8:31 AM, Dirk Lutzebäck
dirk.lutzeba...@thinkproject.com wrote:
 we have issues with compound words in tsearch2 using the german (ispell)
 dictionary. This has been discussed before but there is no real solution
 using the recommended german dictionary at
 http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2 (convert old
 openoffice dict file to ispell suitable for tsearch):

 # select ts_lexize('german_ispell', 'vollklimatisiert');
  ts_lexize
 
  {vollklimatisiert}
 (1 row)

 This should return atleast

  {vollklimatisiert, voll, klimatisiert}


 The issue with compound words in ispell has been addressed in hunspell. But
 this has not been integrated fully to tsearch2 (according to the
 documentation).

Just out of curiosity, which part of the documentation are you looking
at?  The only mention of hunspell I see in the documentation is a
mention that we apparently support their dictionary-file format.

 Are there any plans to fully integrate hunspell into tsearch2? What is
 needed to do this? What is the functional delta which is missing? Maybe we
 can help...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - random sampling of transaction written into log

2012-08-30 Thread Robert Haas
On Sun, Aug 26, 2012 at 1:04 PM, Tomas Vondra t...@fuzzy.cz wrote:
 Attached is an improved patch, with a call to rand() replaced with
 getrand().

 I was thinking about the counter but I'm not really sure how to handle
 cases like 39% - I'm not sure a plain (counter % 100  37) is not a
 good sampling, because it always keeps continuous sequences of
 transactions. Maybe there's a clever way to use a counter, but let's
 stick to a getrand() unless we can prove is't causing issues. Especially
 considering that a lot of data won't be be written at all with low
 sampling rates.

I like this patch, and I think sticking with a random number is a good
idea.  But I have two suggestions.  Number one, I think the sampling
rate should be stored as a float, not an integer, because I can easily
imagine wanting a sampling rate that is not an integer percentage -
especially, one that is less than one percent, like half a percent or
a tenth of a percent.  Also, I suggest that the command-line option
should be a long option rather than a single character option.  That
will be more mnemonic and avoid using up too many single letter
options, of which there is a limited supply.  So to sample every
hundredth result, you could do something like this:

pgbench --latency-sample-rate 0.01

Another option I personally think would be useful is an option to
record only those latencies that are above some minimum bound, like
this:

pgbench --latency-only-if-more-than $MICROSECONDS

The problem with recording all the latencies is that it tends to have
a material impact on throughput.  Your patch should address that for
the case where you just want to characterize the latency, but it would
also be nice to have a way of recording the outliers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-08-30 Thread Robert Haas
On Fri, Aug 24, 2012 at 6:36 PM, Tomas Vondra t...@fuzzy.cz wrote:
 attached is a patch that improves performance when dropping multiple
 tables within a transaction. Instead of scanning the shared buffers for
 each table separately, the patch removes this and evicts all the tables
 in a single pass through shared buffers.

 Our system creates a lot of working tables (even 100.000) and we need
 to perform garbage collection (dropping obsolete tables) regularly. This
 often took ~ 1 hour, because we're using big AWS instances with lots of
 RAM (which tends to be slower than RAM on bare hw). After applying this
 patch and dropping tables in groups of 100, the gc runs in less than 4
 minutes (i.e. a 15x speed-up).

 This is not likely to improve usual performance, but for systems like
 ours, this patch is a significant improvement.

Seems pretty reasonable.  But instead of duplicating so much code,
couldn't we find a way to use replace DropRelFileNodeAllBuffers with
DropRelFileNodeAllBuffersList?  Surely anyone who was planning to call
the first one could instead call the second one with a count of one
and a pointer to the address of the data they were planning to pass.
I'd probably swap the order of arguments to
DropRelFileNodeAllBuffersList as well.  We could do something similar
with smgrdounlink/smgrdounlinkall so that, again, only one copy of the
code is needed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - aggregation of info written into log

2012-08-30 Thread Robert Haas
On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra t...@fuzzy.cz wrote:
 This patch is a bit less polished (and more complex) than the other
 pgbench patch I've sent a while back, and I'm not sure how to handle the
 Windows branch. That needs to be fixed during the commit fest.

What's the problem with the Windows branch?

Could you un-cuddle your brances to follow the PostgreSQL style, omit
braces around single-statement blocks, and remove the spurious
whitespace changes?

Instead of having both use_log_agg and naggseconds, I think you can
get by with just having a single variable that is zero if aggregation
is not in use and is otherwise the aggregation period.  On a related
note, you can't specify -A without an associated value, so there is no
point in documenting a default.  As with your other patch, I suggest
using a long option name like --latency-aggregate-interval and then
making the name of the variable in the code match the option name,
with s/-/_/g, for clarity.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows modified in current transaction

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 10:36 AM, Miroslav Šimulčík
simulcik.m...@gmail.com wrote:
 is there any way to check if row have already been modified by the current
 transaction? I tried condition txid_current() = xmin, but there is problem
 with the savepoints. After every savepoint rows are getting higher xmin
 values, but txid_current() remains the same.

It sounds like you're looking for a function that will give an array
of all XIDs for the current transcation, rather than just the XID of
the current sub-transaction.  I don't think we currently expose that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows modified in current transaction

2012-08-30 Thread Andres Freund
On Thursday, August 30, 2012 06:06:59 PM Robert Haas wrote:
 On Thu, Aug 30, 2012 at 10:36 AM, Miroslav Šimulčík
 
 simulcik.m...@gmail.com wrote:
  is there any way to check if row have already been modified by the
  current transaction? I tried condition txid_current() = xmin, but there
  is problem with the savepoints. After every savepoint rows are getting
  higher xmin values, but txid_current() remains the same.
 
 It sounds like you're looking for a function that will give an array
 of all XIDs for the current transcation, rather than just the XID of
 the current sub-transaction.  I don't think we currently expose that.
txid_current_snapshot(), txis_visible_in_snapshot() may work.

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] splitting *_desc routines

2012-08-30 Thread Andres Freund
On Wednesday, August 29, 2012 10:06:16 PM Tom Lane wrote:
 Alvaro Herrera alvhe...@alvh.no-ip.org writes:
  I looked at Andres' patch and the general idea is rather horrible: it
  links all backend files into the output executable.  This is so that the
  *_desc functions can be used from their respective object files, which
  obviously have a lot of requirements for backend infrastructure.
 
 Check.
I said it was a preliminary hack though ;). Especially the way I assembled the 
object files... 
The xlogdump utility itself is equally crappy atm, it was just a demonstration 
which suited me enough for debugging... But it really doesn't need that much 
more.

 An alternative thing that might be worth considering before you go all
 in on this is whether the xlogdump functionality shouldn't just be part
 of the regular server executable, ie you'd call it with
 
   postgres --xlogdump arguments
 
 and the only extra code needed is two lines for another redirect in
 main.c.  We've already done that for things like --describe-config.
 This'd likely be a lot less work than getting the _desc routines to
 be operable standalone ...
It definitely would be simpler. It doesn't seem nice to pile more and more 
utilities into the main postgres binary though.

Note the ugliness some the testing tools in src/backend go through just to 
link to a few files... Yuck.

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows modified in current transaction

2012-08-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 30, 2012 at 10:36 AM, Miroslav Šimulčík
 simulcik.m...@gmail.com wrote:
 is there any way to check if row have already been modified by the current
 transaction? I tried condition txid_current() = xmin, but there is problem
 with the savepoints. After every savepoint rows are getting higher xmin
 values, but txid_current() remains the same.

 It sounds like you're looking for a function that will give an array
 of all XIDs for the current transcation, rather than just the XID of
 the current sub-transaction.  I don't think we currently expose that.

IIRC,  txid_current() actually reflects the current *top* transaction,
which is why rows inserted by subtransactions aren't matching it.
But yeah, there's no exported way to identify all the XIDs belonging
to the current transaction.

A larger problem with the above is that txid isn't an XID anyway,
so the comparisons would fail altogether once the XID epoch becomes
more than zero.  So if we did want to support this, it would be a
lot more useful to provide something along the lines of
xid_belongs_to_current_transaction(xid) returns bool
than to expose the XIDs as such.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows modified in current transaction

2012-08-30 Thread Andres Freund
On Thursday, August 30, 2012 06:09:43 PM Andres Freund wrote:
 On Thursday, August 30, 2012 06:06:59 PM Robert Haas wrote:
  On Thu, Aug 30, 2012 at 10:36 AM, Miroslav Šimulčík
  
  simulcik.m...@gmail.com wrote:
   is there any way to check if row have already been modified by the
   current transaction? I tried condition txid_current() = xmin, but there
   is problem with the savepoints. After every savepoint rows are getting
   higher xmin values, but txid_current() remains the same.
  
  It sounds like you're looking for a function that will give an array
  of all XIDs for the current transcation, rather than just the XID of
  the current sub-transaction.  I don't think we currently expose that.
 
 txid_current_snapshot(), txis_visible_in_snapshot() may work.
No, it obviously cannot. No idea what I thought... txid_visible_in_snapshot is 
fine, but txid_current_snapshot() obviously will return a snapshot containing 
that sees everything the current transaction does.

You possibly could calculate a difference between a txid_current_snapshot() 
taken at the beginning of a repeatable read transaction and the current one, 
but thats too ugly.

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Tom Lane
Joe Abbate j...@freedomcircle.com writes:
 Hello hackers,
 I've been testing Pyrseas against 9.2rc1.  A test that does a CREATE
 OPERATOR is giving a small difference.  Specifically, the test issues
 the statement:

 CREATE OPERATOR + (PROCEDURE = upper, RIGHTARG = text);

 Pyrseas then queries the pg_operator catalog to map the procedure for
 output.  Selecting oprcode in 8.4, 9.0, and 9.1, always returns 'upper',
 but in 9.2rc1, the value is 'pg_catalog.upper'.  It does not matter
 whether pg_catalog is on the search_path or not.

The reason for the difference is that in 9.2 there's more than one
pg_catalog.upper():

regression=# \df upper
  List of functions
   Schema   | Name  | Result data type | Argument data types |  Type  
+---+--+-+
 pg_catalog | upper | anyelement   | anyrange| normal
 pg_catalog | upper | text | text| normal
(2 rows)

whereas prior versions had only upper(text).  The regproc output
function isn't allowed to print argument types, which is what would be
needed to disambiguate altogether, but it does what it can to warn you
that upper alone is not a unique name by schema-qualifying the name.

This is not new behavior in 9.2, it just happens to occur for upper()
when it did not before.  If you're expecting regproc to always return
unique unqualified names then your code is broken anyway, since such
a requirement cannot be met when the function is overloaded.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HEAD crashes on windows when doing VACUUM ANALYZE

2012-08-30 Thread Andres Freund
On Thursday, August 30, 2012 06:50:13 PM Matthias wrote:
 2012/8/30 Albe Laurenz laurenz.a...@wien.gv.at:
  Matthias wrote:
  when running VACUUM ANALYZE on my database built on win32-x86 from
  yesterday's git checkout I always get this at some point during VACUUM
  ANALYZE:
  
  LOG:  server process (PID 5880) was terminated by exception 0xC094
  DETAIL:  Failed process was running: VACUUM VERBOSE ANALYZE
  HINT:  See C include file ntstatus.h for a description of the
  hexadecimal value.
  LOG:  terminating any other active server processes
  
  I am not sure if it's useful to report it here, but I thought I'd do
  it anyway :)
  
  That seems to be STATUS_INTEGER_DIVIDE_BY_ZERO.
  
  Does it only happen with a certain table?
  Are you sure there is no data corruption?
  A stack trace would help:
  http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_Postg
  reSQL_backend_on_Windows
 
 Attached the debugger.
 
 It crashes in rangetypes_typeanalyze.c at line 186:
 
 delta = (non_empty_cnt - 1) / (num_hist - 1);
 
 According to the debugger num_hist = 1, so it divides by zero.
Its curious though that the SIGFPE isn't properly cought though. That would 
only lead to a different error, but ...
Postgres does install a handler for it. What happens if you run SELECT 
-2147483648/-1;?

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HEAD crashes on windows when doing VACUUM ANALYZE

2012-08-30 Thread Matthias
2012/8/30 Albe Laurenz laurenz.a...@wien.gv.at:
 Matthias wrote:
 when running VACUUM ANALYZE on my database built on win32-x86 from
 yesterday's git checkout I always get this at some point during VACUUM
 ANALYZE:

 LOG:  server process (PID 5880) was terminated by exception 0xC094
 DETAIL:  Failed process was running: VACUUM VERBOSE ANALYZE
 HINT:  See C include file ntstatus.h for a description of the
 hexadecimal value.
 LOG:  terminating any other active server processes

 I am not sure if it's useful to report it here, but I thought I'd do
 it anyway :)

 That seems to be STATUS_INTEGER_DIVIDE_BY_ZERO.

 Does it only happen with a certain table?
 Are you sure there is no data corruption?
 A stack trace would help:
 http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_Postg
 reSQL_backend_on_Windows

Attached the debugger.

It crashes in rangetypes_typeanalyze.c at line 186:

delta = (non_empty_cnt - 1) / (num_hist - 1);

According to the debugger num_hist = 1, so it divides by zero. I guess
this is due to the new statistics collection for range types?

-Matthias


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Joe Abbate
Hello Tom,

On 30/08/12 12:27, Tom Lane wrote:
 The reason for the difference is that in 9.2 there's more than one
 pg_catalog.upper():
 
 regression=# \df upper
   List of functions
Schema   | Name  | Result data type | Argument data types |  Type  
 +---+--+-+
  pg_catalog | upper | anyelement   | anyrange| normal
  pg_catalog | upper | text | text| normal
 (2 rows)
 
 whereas prior versions had only upper(text).  The regproc output
 function isn't allowed to print argument types, which is what would be
 needed to disambiguate altogether, but it does what it can to warn you
 that upper alone is not a unique name by schema-qualifying the name.
 
 This is not new behavior in 9.2, it just happens to occur for upper()
 when it did not before.  If you're expecting regproc to always return
 unique unqualified names then your code is broken anyway, since such
 a requirement cannot be met when the function is overloaded.

Hmmm ... Well, I'm just doing the same thing as pg_dump, which in 9.2rc1
still outputs the same as before, namely:

SET search_path = public, pg_catalog;

--
-- Name: +; Type: OPERATOR; Schema: public; Owner: -
--

CREATE OPERATOR + (
PROCEDURE = upper,
RIGHTARG = text
);

What's somewhat confusing is that the documentation (and \d pg_operator)
states oprcode (as well as oprrest and oprjoin) are of type 'regproc'
and that it references a pg_proc.oid.  Does the catalog actually store
an OID, i.e., the OID of pg_catalog.upper(text), or something else?

Best regards,

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Robert Haas
On Mon, Aug 20, 2012 at 12:57 PM, Alexander Korotkov
aekorot...@gmail.com wrote:
 I found that code of gistchoose doesn't follow it's logic. Idea of
 gistchoose is that first column penalty is more important than penalty of
 second column. If we meet same penalty values of first column then we choose
 minimal penalty value of second column among them.

Nice catch.  Thanks for the patch, which I have now committed.  But
since these functions were very short on useful comments, I added
some.  Hopefully they're even right, but let me know if you think
otherwise, or have any ideas for further improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Don't allow relative path for copy from file

2012-08-30 Thread Robert Haas
On Thu, Aug 16, 2012 at 2:11 AM, Etsuro Fujita
fujita.ets...@lab.ntt.co.jp wrote:
 Agreed.  I'd like to withdraw the patch sent in the earlier post, and propose 
 to
 update the documentation in the COPY reference page.  Please find attached a
 patch.

I think this is a good idea, but I didn't like the exact wording you
chose, so I committed something a little different.  Let me know
whether it looks OK.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Tom Lane
Joe Abbate j...@freedomcircle.com writes:
 On 30/08/12 12:27, Tom Lane wrote:
 The reason for the difference is that in 9.2 there's more than one
 pg_catalog.upper():

 Hmmm ... Well, I'm just doing the same thing as pg_dump, which in 9.2rc1
 still outputs the same as before, namely:

Well, evidently you're *not* doing the same thing as pg_dump.

A look at pg_dump says that what it does is to cast the column to
regprocedure, and then strip the argument types from that printout.

Perhaps some experimentation would be illuminating:

regression=# select 'upper'::regproc;
ERROR:  more than one function named upper
LINE 1: select 'upper'::regproc;
   ^
regression=# select 'upper(text)'::regprocedure;
 regprocedure 
--
 upper(text)
(1 row)

regression=# select 'upper(text)'::regprocedure::oid;
 oid 
-
 871
(1 row)

regression=# select 871::regprocedure;
 regprocedure 
--
 upper(text)
(1 row)

regression=# select 871::regproc; 
 regproc  
--
 pg_catalog.upper
(1 row)

 What's somewhat confusing is that the documentation (and \d pg_operator)
 states oprcode (as well as oprrest and oprjoin) are of type 'regproc'
 and that it references a pg_proc.oid.  Does the catalog actually store
 an OID, i.e., the OID of pg_catalog.upper(text), or something else?

What's physically in there is an OID (and so the casts above are no-ops
at the representational level).  What we're discussing is the behavior
of the output function for the regproc or regprocedure types.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Aug 20, 2012 at 12:57 PM, Alexander Korotkov
 aekorot...@gmail.com wrote:
 I found that code of gistchoose doesn't follow it's logic. Idea of
 gistchoose is that first column penalty is more important than penalty of
 second column. If we meet same penalty values of first column then we choose
 minimal penalty value of second column among them.

 Nice catch.  Thanks for the patch, which I have now committed.

Should we backpatch that?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HEAD crashes on windows when doing VACUUM ANALYZE

2012-08-30 Thread Heikki Linnakangas

On 30.08.2012 19:50, Matthias wrote:

It crashes in rangetypes_typeanalyze.c at line 186:

 delta = (non_empty_cnt - 1) / (num_hist - 1);

According to the debugger num_hist = 1, so it divides by zero. I guess
this is due to the new statistics collection for range types?


Yep. Fixed, thanks for the report!

I added just a check that the histogram is not created if there are less 
than 2 values in the sample. The corresponding code for scalars also 
checks that there are more than 1 distinct value, so that the histogram 
doesn't consist of all duplicates. We don't currently count the number 
of distinct values for ranges, so that would require a bit more work, 
but I don't think it makes much difference in practice.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] canceling autovacuum task woes

2012-08-30 Thread Robert Haas
On Mon, Aug 13, 2012 at 11:59 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Tue, 2012-07-24 at 16:14 -0400, Robert Haas wrote:
 On Tue, Jul 24, 2012 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Robert Haas robertmh...@gmail.com writes:
  Yeah, you're right.  So you do get the table name.  But you don't get
  the cause, which is what you really need to understand why it's
  happening.  Attached is a patch that adds some more detail.
 
  Uh, what's the added dependency on pgstat.h for?  Looks sane to the
  eyeball otherwise.

 Woops, that was leftovers from some earlier silliness that I (mostly)
 removed before posting.

 New version attached.


 The detail message should have a period at the end.

Oops.  Fixed, sorry it took me so long to get to this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 1:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Aug 20, 2012 at 12:57 PM, Alexander Korotkov
 aekorot...@gmail.com wrote:
 I found that code of gistchoose doesn't follow it's logic. Idea of
 gistchoose is that first column penalty is more important than penalty of
 second column. If we meet same penalty values of first column then we choose
 minimal penalty value of second column among them.

 Nice catch.  Thanks for the patch, which I have now committed.

 Should we backpatch that?

Arguably, yes.  Does the patch look sane to you?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SIGFPE handler is naive

2012-08-30 Thread Robert Haas
On Tue, Aug 14, 2012 at 5:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Noah Misch n...@leadboat.com writes:
 On Tue, Aug 14, 2012 at 08:40:06AM -0400, Robert Haas wrote:
 On Tue, Aug 14, 2012 at 6:50 AM, Greg Stark st...@mit.edu wrote:
 It is possible to check if the signal was synchronous or was sent from
 an external process. You can check siginfo-si_pid to see who sent you
 the signal. I'm not sure checking that and handling it at
 check_for_interrupts if it's asynchronous is the best solution or not
 though.

 If that's portable it might be an option, but I doubt that it is.

 I suspect it is portable.

 Single Unix Spec V2 says (on the sigaction man page)

 The si_code member contains a code identifying the cause of the
 signal. If the value of si_code is less than or equal to 0, then the
 signal was generated by a process and si_pid and si_uid respectively
 indicate the process ID and the real user ID of the sender.

 I'm not sure I would trust checking si_pid alone; it would definitely
 fail on my old HPUX box, where I see that field is union'ed with si_addr
 and so will read as garbage for a locally-sourced SIGFPE.  But it might
 be that checking si_code alone would work reliably.

 I think that rejecting an externally sourced SIGFPE might be worth doing
 if we can distinguish that reliably.

Currently, our signal handlers on all platforms are declared to take
just an int.  We'd need to change that in order to try to do this.
Any thoughts on how we could go about that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HEAD crashes on windows when doing VACUUM ANALYZE

2012-08-30 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On Thursday, August 30, 2012 06:50:13 PM Matthias wrote:
 According to the debugger num_hist = 1, so it divides by zero.

 Its curious though that the SIGFPE isn't properly cought though. That would 
 only lead to a different error, but ...

Not all platforms think an integer divide-by-zero maps to SIGFPE.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] splitting *_desc routines

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 12:14 PM, Andres Freund and...@2ndquadrant.com wrote:
 An alternative thing that might be worth considering before you go all
 in on this is whether the xlogdump functionality shouldn't just be part
 of the regular server executable, ie you'd call it with

   postgres --xlogdump arguments

 and the only extra code needed is two lines for another redirect in
 main.c.  We've already done that for things like --describe-config.
 This'd likely be a lot less work than getting the _desc routines to
 be operable standalone ...
 It definitely would be simpler. It doesn't seem nice to pile more and more
 utilities into the main postgres binary though.

Agreed.  Another advantage of making this standalone is that other
people might then be able to write code that does interesting things
with it.  If we just add the functionality into core then nobody's any
better off than before.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Alexander Korotkov
On Thu, Aug 30, 2012 at 9:11 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Aug 20, 2012 at 12:57 PM, Alexander Korotkov
 aekorot...@gmail.com wrote:
  I found that code of gistchoose doesn't follow it's logic. Idea of
  gistchoose is that first column penalty is more important than penalty of
  second column. If we meet same penalty values of first column then we
 choose
  minimal penalty value of second column among them.

 Nice catch.  Thanks for the patch, which I have now committed.  But
 since these functions were very short on useful comments, I added
 some.  Hopefully they're even right, but let me know if you think
 otherwise, or have any ideas for further improvement.


Thanks! Comments looks good for me.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] [WIP] Performance Improvement by reducing WAL for Update Operation

2012-08-30 Thread Robert Haas
On Fri, Aug 10, 2012 at 1:25 AM, Amit Kapila amit.kap...@huawei.com wrote:
 I think the property that recovery only needs to worry about each
 block individually is one that we want to preserve.  Supporting this
 optimizating only when full_page_writes=off seems ugly,

 I think recovery needs to worry about multiple blocks as well in some cases.
 Please see below case and correct me if I am wrong.
 I think currently also there can be problems in case of full_page_writes=off
 for crash recovery.
 1. Tuple A on page 1 is updated.  The new version, tuple B, is placed on
 page 2.
 2. Page 1 is Partially written to disk.
 3. During recovery, it can so appear that there is no need to update XMAX
 and other related things in Old tuple
as LSN is greater than WAL lsn.
 4. Now also there can be other problems related to tuple visibility.

Well, you're only supposed to turn full_page_writes=off if partial
page writes are impossible on your system.  If you turn off
full_page_writes on a system where partial page writes are impossible,
then you've intentionally broken crash recovery, and you get to keep
both pieces.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: shared session variables

2012-08-30 Thread Robert Haas
On Tue, Aug 14, 2012 at 3:46 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 patch that implements shared client/server session variables

I don't really see what we can do with this that we can't do without this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: shared session variables

2012-08-30 Thread Pavel Stehule
2012/8/30 Robert Haas robertmh...@gmail.com:
 On Tue, Aug 14, 2012 at 3:46 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 patch that implements shared client/server session variables

 I don't really see what we can do with this that we can't do without this.

a motivation for this patch was discussion about parametrised DO
statement - and simple possibility of access to host variables (psql)
variables from server - PL scripts.

It is based on Tom's and Magnus's ideas - it is secure, because only
variables explicitly mentioned in shared namespace are shared.

http://archives.postgresql.org/pgsql-hackers/2012-06/msg01506.php

Regards

Pavel


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Joe Abbate
Hello Tom,

On 30/08/12 13:23, Tom Lane wrote:
 Joe Abbate j...@freedomcircle.com writes:
 Hmmm ... Well, I'm just doing the same thing as pg_dump, which in 9.2rc1
 still outputs the same as before, namely:
 
 Well, evidently you're *not* doing the same thing as pg_dump.

I meant that the Pyrseas dbtoyaml's output is essentially the same as
pg_dump, e.g.,

schema public:
  operator +(NONE, text):
procedure: upper

Therefore, if psql doesn't have problem restoring the operator from the
pg_dump output, neither should yamltodb have problem generating the SQL
to recreate the operator.  The above YAML (with or without the schema
qualification) does generate the correct SQL and pg_operator.oprcode
ends up with the correct OID.  So at least for this test case,
dbtoyam/yamltodb is not broken (but I'll have to do something about the
unittest difference).

 What's physically in there is an OID (and so the casts above are no-ops
 at the representational level).  What we're discussing is the behavior
 of the output function for the regproc or regprocedure types.

Yes, I suspected that an OID was stored.  What I'd still quibble with is
the use of the ambiguous regproc in pg_operator (also pg_type) and the
still-ambiguous schema-qualified proc name.  I guess it's not feasible
(at least, short term), but it'd be preferable to store a raw OID and
let the user cast to regprocedure (or change the 'regproc' to
'regprocedure').

Best regards,

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_operator.oprcode in 9.2rc1

2012-08-30 Thread Tom Lane
Joe Abbate j...@freedomcircle.com writes:
 Yes, I suspected that an OID was stored.  What I'd still quibble with is
 the use of the ambiguous regproc in pg_operator (also pg_type) and the
 still-ambiguous schema-qualified proc name.  I guess it's not feasible
 (at least, short term), but it'd be preferable to store a raw OID and
 let the user cast to regprocedure (or change the 'regproc' to
 'regprocedure').

Yeah, ideally those columns would be regprocedure.  There are
bootstrapping problems involved though with populating the initial
contents of the catalogs during initdb --- the regprocedure input
function doesn't work in that environment.  (It might be possible to
hack something for pg_operator, but the circularity is rather
fundamental for loading pg_type, since the input function would need to
consult pg_type to make sense of argument types.)

In the meantime I'd suggest casting the columns to regprocedure when
querying, if you want unambiguous results.  That's what pg_dump does.
Or you can cast to OID if you like numbers.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)

2012-08-30 Thread Robert Haas
On Thu, Aug 9, 2012 at 1:11 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 I didn't figure it was; my emphasis was because this has been raised
 before and nothing happened for want of a consensus on what
 particular wording should be used, so users were left with no
 guidance.  I don't want this to continue to be a victim of the
 perfect is the enemy of the good syndrome.

So, to get the ball rolling here, I spent some time on this today, and
added a paragraph to the Linux Memory Overcommit section of the
documentation.  I back-patched it back to 9.0.   There were additional
merge on conflicts in 8.4 which I did not bother to resolve.  There
may be room to further improve what I did here; suggestions are
welcome.  I think we probably still need to add something to the
max_connections documentation; I have not done that.

 Also, I am a bit doubtful about the advice on sizing the
 connection pool as applied to small servers:
 surely it's not sane to recommend that a single-processor system
 with one disk should have max_connections = 3.  At least, *I*
 don't think that's sane.

 I'm not sure it's wrong when combined with this: Remember that this
 sweet spot is for the number of connections that are actively
 doing work.  ...  You should always make max_connections a bit
 bigger than the number of connections you enable in your connection
 pool. That way there are always a few slots available for direct
 connections for system maintenance and monitoring.  Where would you
 expect the knee to be for connections concurrently actively doing
 work on a single-core, single-drive system ?

I don't know.  But my experience with our customers is that people are
often forced to set the size of the connection pool far larger than
what that formula would suggest.  Many people are doing
transaction-level pooling, and for those people, they've got to look
at how many multi-statement transactions they've got and think about
what the peak value for that quantity is.  It's still worth using
pooling because it reduces the number of simultaneous connections, but
it's not going to reduce it to the kind of values you're talking
about.  Also, consider that transactions aren't all the same length.
Suppose 90% of your queries execute in 50ms, and 10% execute in 60s.
Even though it's less efficient, you've got to set the connection pool
large enough that at least some of the 50 ms queries can continue to
get processed even if the maximum number of 60s queries that you ever
expect to see in parallel are already running.  This may seem like a
theoretical problem but we have customers who use connection pools to
get the number of simultaneous connections down to, say, 800.  I
guarantee you that these people do not have 200 CPUs and 400 disks,
but they're smart people and they find that smaller pool sizes don't
work.

Sure, we can say, well, the fine print tells you that 2*CPUs+disks is
not REALLY the formula you should use, but it's just so far off what I
see in the field that I have a hard time thinking it's really helping
people to give them that as a starting point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: shared session variables

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 2:18 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2012/8/30 Robert Haas robertmh...@gmail.com:
 On Tue, Aug 14, 2012 at 3:46 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 patch that implements shared client/server session variables

 I don't really see what we can do with this that we can't do without this.

 a motivation for this patch was discussion about parametrised DO
 statement - and simple possibility of access to host variables (psql)
 variables from server - PL scripts.

 It is based on Tom's and Magnus's ideas - it is secure, because only
 variables explicitly mentioned in shared namespace are shared.

Sure, but you could get to the same place by issuing a SET command for
just the particular variable you want to use with DO.  You don't
really need a magic facility for it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proof of concept: auto updatable views

2012-08-30 Thread Robert Haas
On Sun, Aug 12, 2012 at 5:14 PM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 None of this new code kicks in for non-security barrier views, so the
 kinds of plans I posted upthread remain unchanged in that case. But
 now a significant fraction of the patch is code added to handle
 security barrier views. Of course we could simply say that such views
 aren't updatable, but that seems like an annoying limitation if there
 is a feasible way round it.

Maybe it'd be a good idea to split this into two patches: the first
could implement the feature but exclude security_barrier views, and
the second could lift that restriction.

Just a thought.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-08-30 Thread Tomas Vondra
On 30 Srpen 2012, 17:53, Robert Haas wrote:
 On Fri, Aug 24, 2012 at 6:36 PM, Tomas Vondra t...@fuzzy.cz wrote:
 attached is a patch that improves performance when dropping multiple
 tables within a transaction. Instead of scanning the shared buffers for
 each table separately, the patch removes this and evicts all the tables
 in a single pass through shared buffers.

 Our system creates a lot of working tables (even 100.000) and we need
 to perform garbage collection (dropping obsolete tables) regularly. This
 often took ~ 1 hour, because we're using big AWS instances with lots of
 RAM (which tends to be slower than RAM on bare hw). After applying this
 patch and dropping tables in groups of 100, the gc runs in less than 4
 minutes (i.e. a 15x speed-up).

 This is not likely to improve usual performance, but for systems like
 ours, this patch is a significant improvement.

 Seems pretty reasonable.  But instead of duplicating so much code,
 couldn't we find a way to use replace DropRelFileNodeAllBuffers with
 DropRelFileNodeAllBuffersList?  Surely anyone who was planning to call
 the first one could instead call the second one with a count of one
 and a pointer to the address of the data they were planning to pass.
 I'd probably swap the order of arguments to
 DropRelFileNodeAllBuffersList as well.  We could do something similar
 with smgrdounlink/smgrdounlinkall so that, again, only one copy of the
 code is needed.

Yeah, I was thinking about that too, but I simply wasn't sure which is the
best choice so I've sent the raw patch. OTOH these functions are called on
a very limited number of places, so a refactoring like this seems fine.

Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER command reworks

2012-08-30 Thread Robert Haas
On Mon, Aug 13, 2012 at 3:35 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 The attached patch is a refreshed version of ALTER command
 reworks towards the latest tree. Here is few big changes except
 for code integration of the code to rename event triggers.

This seems to have bit-rotted a bit.  Please rebase.

 BTW, I had to adjust between oid of pg_largeobject_metadata
 and pg_largeobject on three points of this patch, like:

 if (classId == LargeObjectRelationId)
 classId = LargeObjectMetadataRelationId;

 When we supported largeobject permission features, we put
 special handling to track dependency of its ownership.
 The pg_depend records oid of pg_largeobject, instead of
 pg_largeobject_metadata. Thus, we cannot use classId of
 ObjectAddress being returned from get_object_address()
 as an argument of heap_open() as is, if it indicates oid of
 pg_largeobject.

 Was it a right decision to track dependency of large object using
 oid of pg_largeobject, instead of pg_largeobject_metadata?
 IIRC, the reason why we used oid of pg_largeobject is backward
 compatibility for applications that tries to reference pg_depend
 with built-in oids.

I think it was a terrible decision and I'm pretty sure I said as much
at the time, but I lost the argument, so I'm inclined to think we're
stuck with continuing to support that kludge.

Some other preliminary comments:

- Surely you need to take AccessExclusiveLock on the object being
renamed, not just ShareUpdateExclusiveLock.
- I don't think it's acceptable to assemble the object-type
object-name already exists message using getObjectDescription();
it's not good for translators.  Use an array of messages, one per
object-type, as we have done in other cases.
- I would like to handle either the RENAME case or the ALTER OWNER
case in one patch, and the other in a follow-on patch.  Can you pick
one to do first and remove everything related to the other one?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - aggregation of info written into log

2012-08-30 Thread Tomas Vondra
On 30 Srpen 2012, 18:02, Robert Haas wrote:
 On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra t...@fuzzy.cz wrote:
 This patch is a bit less polished (and more complex) than the other
 pgbench patch I've sent a while back, and I'm not sure how to handle the
 Windows branch. That needs to be fixed during the commit fest.

 What's the problem with the Windows branch?

Well, there are comments about how timestamp does not work on Windows
(filling 0), and I'm not sure how that affects the patch (e.g. determining
the aggregation interval). I have no Windows workstation available so I
can't actually try that.

 Could you un-cuddle your brances to follow the PostgreSQL style, omit
 braces around single-statement blocks, and remove the spurious
 whitespace changes?

OK, will do.

 Instead of having both use_log_agg and naggseconds, I think you can
 get by with just having a single variable that is zero if aggregation
 is not in use and is otherwise the aggregation period.  On a related
 note, you can't specify -A without an associated value, so there is no
 point in documenting a default.  As with your other patch, I suggest
 using a long option name like --latency-aggregate-interval and then
 making the name of the variable in the code match the option name,
 with s/-/_/g, for clarity.

Why --latency-aggregate-interval? It has nothing to do with latencies,
it's merely number of transactions per interval.

Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 30, 2012 at 1:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Should we backpatch that?

 Arguably, yes.  Does the patch look sane to you?

I was afraid you'd ask that.

[ studies code for awhile ... ]

I think this fixes the bug, but the function could really do with slightly
more invasive code adjustments for clarity.  For instance, we could get
rid of the sum_grow = 1 kluge if we removed the  sum_grow from the
outer for statement (where it's pretty damn ugly/surprising anyway ---
I dislike for loops that look like they're just running a counter when
they're doing other things too) and instead put something like this at
the bottom of the outer loop:

/*
 * If we examined all the columns and there is zero penalty for
 * all of them, we can stop examining tuples; this is a good one
 * to insert the new key into.
 */
if (sum_grow == 0  j == r-rd_att-natts)
break;

I'm not thrilled with the added comments either; they need copy-editing
and they randomly fail to fit in an 80-column window.  (pgindent will
have something to say about that later for some of them, but I think it
doesn't reformat comments that start at the left margin.)

BTW, I think the real issue with the bug is not so much that we're
resetting anything as that we may be initializing which_grow[j] for
the next index column for the first time.  Note that the function's
startup code only bothers to initialize which_grow[0] (although it
does so in a less than legible fashion).  The rest have to get filled
as the loop proceeds.

Do you want to have another go at it, or would you like me to try to
make it better?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to form a self-defined TupleTableSlot

2012-08-30 Thread Robert Haas
On Thu, Jul 26, 2012 at 11:39 PM,  chaoyong.w...@emc.com wrote:
 Here is my task situation:

 I have a TupleTableSlot, with its own TupleDesc. Now I want to extract
 several attributes to form a new TupleTableSlot, how can I define my own
 TupleDesc and the ProjectionInfo?

You might get more helpful advice if you describe more specifically
what you're trying to do.  I mean, here's one way to create a
TupleDesc (from contrib/adminpack) but it may or may not be suitable
for your purposes:

tupdesc = CreateTemplateTupleDesc(2, false);
TupleDescInitEntry(tupdesc, (AttrNumber) 1, starttime,
   TIMESTAMPOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 2, filename,
   TEXTOID, -1, 0);

As ProjectInfo, maybe ExecBuildProjectionInfo?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoiding adjacent checkpoint records

2012-08-30 Thread Robert Haas
On Mon, Aug 13, 2012 at 6:19 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Sat, Jun 9, 2012 at 5:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 So now the standard for my patches is that I must consider what will
 happen if the xlog is deleted?

 When you're messing around with something that affects data integrity, yes.
 The long and the short of it is that this patch does reduce our ability
 to recover from easily-foreseeable disasters.  The problem it was meant
 to solve is not dire enough to justify that, and other fixes are
 possible that don't require any compromises in this dimension.
 So please revert.  We can revisit the original complaint in 9.3.

 This reversion was done, so
 b8b69d89905e04b910bcd   Wed Jun 13, 2012
 reverted:
 18fb9d8d21a28caddb72  Wed Nov 2, 2011.

 However, the corresponding doc changes 43342891861cc2d08de and
 bd2396988a1afbcb6424 were not reverted.

 A simple reversion is probably not the right thing, because the
 original docs seemed rather inadequate.

 I've attached an attempt to fix this.  I also changed WAL shipping
 to WAL archiving, as the reason for setting archive_timeout applies
 to all WAL archiving not just the special case of warm standby.

Committed, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: shared session variables

2012-08-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 30, 2012 at 2:18 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 2012/8/30 Robert Haas robertmh...@gmail.com:
 On Tue, Aug 14, 2012 at 3:46 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 patch that implements shared client/server session variables

 I don't really see what we can do with this that we can't do without this.

 a motivation for this patch was discussion about parametrised DO
 statement - and simple possibility of access to host variables (psql)
 variables from server - PL scripts.
 
 It is based on Tom's and Magnus's ideas - it is secure, because only
 variables explicitly mentioned in shared namespace are shared.

 Sure, but you could get to the same place by issuing a SET command for
 just the particular variable you want to use with DO.  You don't
 really need a magic facility for it.

FWIW, I don't particularly care for this idea either.  It may be less
klugy than the original proposal, but it's still a kluge.  Also, it's
not very sensible to consider extensions of this sort unless we have
ambitions of turning psql into a full-fledged scripting language,
with conditionals and iteration at the very least.  I do not want
to go there.  If you need scripting capability, there are lots of
better tools out there already.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] effective_io_concurrency

2012-08-30 Thread Robert Haas
On Sat, Jul 28, 2012 at 4:09 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 From my attempted reading of the thread posix_fadvise v22, it seems
 like modification of the planner was never discussed, rather than
 being discussed and rejected.  So, is there a reason not to make the
 planner take account of effective_io_concurrency?

Not that I can see.

 But it might be better yet to make ordinary index scans benefit from
 effective_io_concurrency, but even if/when that gets done it would
 probably still be worthwhile to make the planner understand the
 benefit.

That sounds good too, but separate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - random sampling of transaction written into log

2012-08-30 Thread Tomas Vondra
On 30 Srpen 2012, 17:46, Robert Haas wrote:
 On Sun, Aug 26, 2012 at 1:04 PM, Tomas Vondra t...@fuzzy.cz wrote:
 Attached is an improved patch, with a call to rand() replaced with
 getrand().

 I was thinking about the counter but I'm not really sure how to handle
 cases like 39% - I'm not sure a plain (counter % 100  37) is not a
 good sampling, because it always keeps continuous sequences of
 transactions. Maybe there's a clever way to use a counter, but let's
 stick to a getrand() unless we can prove is't causing issues. Especially
 considering that a lot of data won't be be written at all with low
 sampling rates.

 I like this patch, and I think sticking with a random number is a good
 idea.  But I have two suggestions.  Number one, I think the sampling
 rate should be stored as a float, not an integer, because I can easily
 imagine wanting a sampling rate that is not an integer percentage -
 especially, one that is less than one percent, like half a percent or
 a tenth of a percent.  Also, I suggest that the command-line option
 should be a long option rather than a single character option.  That
 will be more mnemonic and avoid using up too many single letter
 options, of which there is a limited supply.  So to sample every
 hundredth result, you could do something like this:

 pgbench --latency-sample-rate 0.01

Right, I was thinking about that too. I'll do that in the next version of
the patch.

 Another option I personally think would be useful is an option to
 record only those latencies that are above some minimum bound, like
 this:

 pgbench --latency-only-if-more-than $MICROSECONDS

 The problem with recording all the latencies is that it tends to have
 a material impact on throughput.  Your patch should address that for
 the case where you just want to characterize the latency, but it would
 also be nice to have a way of recording the outliers.

That sounds like a pretty trivial patch. I've been thinking about yet
another option - histograms (regular or with exponential bins).

What I'm not sure about is which of these options should be allowed at the
same time - to me, combinations like 'sampling + aggregation' don't make
much sense. Maybe except 'latency-only-if-more-than + aggregation'.

Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 3:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 30, 2012 at 1:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Should we backpatch that?

 Arguably, yes.  Does the patch look sane to you?

 I was afraid you'd ask that.

 [ studies code for awhile ... ]

 I think this fixes the bug, but the function could really do with slightly
 more invasive code adjustments for clarity.  For instance, we could get
 rid of the sum_grow = 1 kluge if we removed the  sum_grow from the
 outer for statement (where it's pretty damn ugly/surprising anyway ---
 I dislike for loops that look like they're just running a counter when
 they're doing other things too) and instead put something like this at
 the bottom of the outer loop:

 /*
  * If we examined all the columns and there is zero penalty for
  * all of them, we can stop examining tuples; this is a good one
  * to insert the new key into.
  */
 if (sum_grow == 0  j == r-rd_att-natts)
 break;

 I'm not thrilled with the added comments either; they need copy-editing
 and they randomly fail to fit in an 80-column window.  (pgindent will
 have something to say about that later for some of them, but I think it
 doesn't reformat comments that start at the left margin.)

Sorry.  I must have had my window sized wrong.  At any rate, as far as
the GiST code goes anyway, I'm inclined to think that even mis-spelled
comments are an improvement over the status quo.

 BTW, I think the real issue with the bug is not so much that we're
 resetting anything as that we may be initializing which_grow[j] for
 the next index column for the first time.  Note that the function's
 startup code only bothers to initialize which_grow[0] (although it
 does so in a less than legible fashion).  The rest have to get filled
 as the loop proceeds.

I noticed all that, but didn't feel like putting in the effort to make
it better.  I would have been happy to have someone else pick up the
patch, but as it had been languishing I thought it would be better to
get it committed more or less as it was than to wait for someone to
have time to make it beautiful.  If you want to hack on it more that's
fine with me.  I kind of wonder if we ought to rename the variables,
and maybe turn sum_grow into a boolean.   But I'm not really eager to
go crazy if this is something we have to back-patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I noticed all that, but didn't feel like putting in the effort to make
 it better.  I would have been happy to have someone else pick up the
 patch, but as it had been languishing I thought it would be better to
 get it committed more or less as it was than to wait for someone to
 have time to make it beautiful.  If you want to hack on it more that's
 fine with me.  I kind of wonder if we ought to rename the variables,
 and maybe turn sum_grow into a boolean.   But I'm not really eager to
 go crazy if this is something we have to back-patch.

Yeah, the idea of replacing sum_grow with a boolean just occurred to me
too.  As is, I think the code is making some less-than-portable
assumptions about what will happen if sum_grow overflows; which can
definitely happen, seeing that gistpenalty and its callees intentionally
return infinity in some cases.  I'd rather it didn't attempt to add
column penalties together, and I think there's a case to be made that
not doing so is a back-patchable bug fix.

I'll take a shot at improving this some more.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Joe Abbate
Hello hackers,

In order to test 9.2rc1, I had to build contrib (because Pyrseas uses
some of those modules).  The build instructions
(http://www.postgresql.org/docs/9.2/static/install-procedure.html )
state the way to build everything (contrib + docs, etc.) is

   gmake world

Unfortunately, that failed because the doc build requires jade.  I
managed to build contrib separately, but wanted to point out that jade
is not mentioned in the requirements page
(http://www.postgresql.org/docs/9.2/static/install-requirements.html ).
 In fact, searching for 'jade' returns no results.

As an aside, I installed jade (on Debian) and tried to make world but
got several errors, starting with the following:

jade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .  -d
stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
jade:E: unknown warning type fully-tagged

Best regards,

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 4:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I noticed all that, but didn't feel like putting in the effort to make
 it better.  I would have been happy to have someone else pick up the
 patch, but as it had been languishing I thought it would be better to
 get it committed more or less as it was than to wait for someone to
 have time to make it beautiful.  If you want to hack on it more that's
 fine with me.  I kind of wonder if we ought to rename the variables,
 and maybe turn sum_grow into a boolean.   But I'm not really eager to
 go crazy if this is something we have to back-patch.

 Yeah, the idea of replacing sum_grow with a boolean just occurred to me
 too.  As is, I think the code is making some less-than-portable
 assumptions about what will happen if sum_grow overflows; which can
 definitely happen, seeing that gistpenalty and its callees intentionally
 return infinity in some cases.  I'd rather it didn't attempt to add
 column penalties together, and I think there's a case to be made that
 not doing so is a back-patchable bug fix.

Keep in mind that the worst case outcome is the index quality is worse
than it otherwise would have been, so it's not like
OMG-PostgreSQ-eats-your-data.

 I'll take a shot at improving this some more.

Okey dokey.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] effective_io_concurrency

2012-08-30 Thread Peter Geoghegan
On 30 August 2012 20:28, Robert Haas robertmh...@gmail.com wrote:
 On Sat, Jul 28, 2012 at 4:09 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 But it might be better yet to make ordinary index scans benefit from
 effective_io_concurrency, but even if/when that gets done it would
 probably still be worthwhile to make the planner understand the
 benefit.

 That sounds good too, but separate.

Indeed. The original effective_io_concurrency commit message said:


***SNIP***

(The best way to handle this for plain index scans is still under debate,
so that part is not applied yet --- tgl)


...seems like a pity that this debate never reached a useful conclusion.

Just how helpful is effective_io_concurrency? Did someone produce a
benchmark at some point?

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Bruce Momjian
On Mon, May 28, 2012 at 02:27:15AM +0300, Peter Eisentraut wrote:
 On lör, 2012-05-26 at 15:53 -0400, Tom Lane wrote:
  2. Seeing that this is the first complaint since 9.0, should we decide
  that --disable-shared is no longer worth supporting?  Seems like we
  should either make this case work or remove this switch.
 
 I think the last remaining use was the QNX port, which didn't support
 shared libraries.
 
 We should just remove it now.

--disable-shared removed, with the attached, applied patch.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/configure b/configure
new file mode 100755
index 6a89cca..8f59c93
*** a/configure
--- b/configure
*** LCOV
*** 748,754 
  GCOV
  enable_debug
  enable_rpath
- enable_shared
  default_port
  WANTED_LANGUAGES
  enable_nls
--- 748,753 
*** with_libs
*** 810,816 
  enable_integer_datetimes
  enable_nls
  with_pgport
- enable_shared
  enable_rpath
  enable_spinlocks
  enable_debug
--- 809,814 
*** Optional Features:
*** 1490,1496 
disable 64-bit integer date/time support
--enable-nls[=LANGUAGES]
enable Native Language Support
-   --disable-shareddo not build shared libraries
--disable-rpath do not embed shared library search path in
executables
--disable-spinlocks do not use spinlocks
--- 1488,1493 
*** _ACEOF
*** 2471,2506 
  
  
  
- #
- # Option to disable shared libraries
- #
- 
- 
- # Check whether --enable-shared was given.
- if test ${enable_shared+set} = set; then
-   enableval=$enable_shared;
-   case $enableval in
- yes)
-   :
-   ;;
- no)
-   :
-   ;;
- *)
-   { { $as_echo $as_me:$LINENO: error: no argument expected for --enable-shared option 5
- $as_echo $as_me: error: no argument expected for --enable-shared option 2;}
-{ (exit 1); exit 1; }; }
-   ;;
-   esac
- 
- else
-   enable_shared=yes
- 
- fi
- 
- 
- 
- 
  #
  # '-rpath'-like feature can be disabled
  #
--- 2468,2473 
diff --git a/configure.in b/configure.in
new file mode 100644
index fa48a2b..3acefa1
*** a/configure.in
--- b/configure.in
*** AC_DEFINE_UNQUOTED(DEF_PGPORT_STR, ${de
*** 164,176 
  AC_SUBST(default_port)
  
  #
- # Option to disable shared libraries
- #
- PGAC_ARG_BOOL(enable, shared, yes,
-   [do not build shared libraries])
- AC_SUBST(enable_shared)
- 
- #
  # '-rpath'-like feature can be disabled
  #
  PGAC_ARG_BOOL(enable, rpath, yes,
--- 164,169 
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
new file mode 100644
index 5b43819..1e3b401
*** a/src/Makefile.global.in
--- b/src/Makefile.global.in
*** with_libxml	= @with_libxml@
*** 165,171 
  with_libxslt	= @with_libxslt@
  with_system_tzdata = @with_system_tzdata@
  with_zlib	= @with_zlib@
- enable_shared	= @enable_shared@
  enable_rpath	= @enable_rpath@
  enable_nls	= @enable_nls@
  enable_debug	= @enable_debug@
--- 165,170 
*** endif
*** 397,409 
  # isn't created with the same link flags as libpq, it can't be used.)
  libpq = -L$(libpq_builddir) -lpq
  
- # If doing static linking, shared library dependency info isn't available,
- # so add in the libraries that libpq depends on.
- ifeq ($(enable_shared), no)
- libpq += $(filter -lintl -lssl -lcrypto -lkrb5 -lcrypt, $(LIBS)) \
- 	$(LDAP_LIBS_FE) $(PTHREAD_LIBS)
- endif
- 
  # This macro is for use by client executables (not libraries) that use libpq.
  # We force clients to pull symbols from the non-shared library libpgport
  # rather than pulling some libpgport symbols from libpq just because
--- 396,401 
diff --git a/src/Makefile.shlib b/src/Makefile.shlib
new file mode 100644
index 294d10f..4da2f10
*** a/src/Makefile.shlib
--- b/src/Makefile.shlib
*** LINK.static = $(AR) $(AROPT)
*** 81,100 
  
  ifdef SO_MAJOR_VERSION
  # Default library naming convention used by the majority of platforms
- ifeq ($(enable_shared), yes)
  shlib		= lib$(NAME)$(DLSUFFIX).$(SO_MAJOR_VERSION).$(SO_MINOR_VERSION)
  shlib_major	= lib$(NAME)$(DLSUFFIX).$(SO_MAJOR_VERSION)
  shlib_bare	= lib$(NAME)$(DLSUFFIX)
- endif
  # Testing the soname variable is a reliable way to determine whether a
  # linkable library is being built.
  soname		= $(shlib_major)
  else
  # Naming convention for dynamically loadable modules
- ifeq ($(enable_shared), yes)
  shlib		= $(NAME)$(DLSUFFIX)
  endif
- endif
  stlib		= lib$(NAME).a
  
  ifndef soname
--- 81,96 
*** $(stlib): $(OBJS) | $(SHLIB_PREREQS)
*** 321,327 
  	$(RANLIB) $@
  endif #haslibarule
  
- ifeq ($(enable_shared), yes)
  
  ifeq (,$(filter cygwin win32,$(PORTNAME)))
  ifneq ($(PORTNAME), aix)
--- 317,322 
*** $(stlib): $(shlib) $(DLL_DEFFILE) | 

Re: [HACKERS] Performance patch for Win32

2012-08-30 Thread Bruce Momjian
On Tue, May 29, 2012 at 03:54:59PM -0700, Mark Dilger wrote:
 I was imagining that this would be a trap for linux developers
 who saw nothing wrong with their code until it made it to the
 build/test farm.  That's pretty far down the development
 process.  Of course, it is also a trap in the other direction, for
 Windows developers who use the pattern but do not include
 anything equivalent for the non-Windows execution path.
 
 On the whole, however, your argument in favor of tighter
 patterns might be more convincing than my argument in favor
 of catching bugs sooner.
 
 I will start implementing your suggestion for patch v2.

Any progress on this?

---


 
 ━━━
 From: Tom Lane t...@sss.pgh.pa.us
 To: Mark Dilger markdil...@yahoo.com
 Cc: pgsql-hackers@postgresql.org pgsql-hackers@postgresql.org
 Sent: Tuesday, May 29, 2012 3:42 PM
 Subject: Re: [HACKERS] Performance patch for Win32
 
 Mark Dilger markdil...@yahoo.com writes:
  I am hesitant to write a function like AllocateDirWithFilePattern
  if the pattern is simply ignored on non-Windows.  In my patch,
  the pattern underspecified the files, and the ad-hoc matching code
  applied to all the returned files tightened that up.  But a person
  could just as well overspecify the pattern and then they would get
  different behavior on Windows vs. non-Windows, with fewer
  files returned by FindNextFile() than would have matched the
  ad-hoc pattern.
 
 Well, if you're imagining that we wouldn't need to test carefully on
 both Windows and non-Windows, I think that's a pipe dream.  As an
 example, your proposal of AllocateDirWithFilePrefix would only work
 consistently across platforms if the prefix didn't contain anything
 that Windows thought was a pattern metacharacter.  (This might never
 come up, but I'm not too sure what the metacharacters are on Windows.)
 
 Having said that, I have nothing particularly against the idea of
 specifying a prefix rather than an arbitrary pattern.  I'm just
 saying it'll still need testing.  Also, I wonder how many of the
 potential stat-equivalent operations we'll be unable to optimize
 away with the more restricted definition.  Using a tighter pattern
 on Windows seems basically free (modulo testing) if we accept that
 it's Windows-only.
 
 regards, tom lane
 
 

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] We're not lax enough about maximum time zone offset from UTC

2012-08-30 Thread Bruce Momjian
On Wed, May 30, 2012 at 06:10:12PM -0400, Tom Lane wrote:
 Currently, our datetime input code thinks that any UTC offset of more
 than 14:59:59 either way from Greenwich must be a mistake.  However,
 after seeing Patric Bechtel's recent bug report, I went trolling in the
 Olson timezone files to see what are the largest offsets used there.
 I found three entries that are further out than that:
 
 # ZoneNAMEGMTOFF  RULES   FORMAT  [UNTIL]
 Zone Asia/Manila  -15:56:00 - LMT 1844 Dec 31
 Zone America/Juneau15:02:19 - LMT 1867 Oct 18
 Zone America/Metlakatla15:13:42 - LMT 1867 Oct 18
 
 These are all ancient history of course; it does not appear that any
 zones *currently* use offsets larger than +/- 14 hours, which if memory
 serves is what we considered when we set the existing sanity limit.
 
 However, as pointed out by Patric, if you dump and restore an old
 timestamptz value in one of these zones, it will fail to restore because
 of the sanity check.  I think therefore that we'd better enlarge the
 allowed range to 15:59:59 either way.

Any status on this?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Bruce Momjian
On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
  Yeah, Jeff's experiments indicated that the remaining bottleneck is lock
  management in the server.  What I fixed so far on the pg_dump side
  should be enough to let partial dumps run at reasonable speed even if
  the whole database contains many tables.  But if psql is taking
  AccessShareLock on lots of tables, there's still a problem.
  
  Ok, I modified the part of pg_dump where tremendous number of LOCK
  TABLE are issued. I replace them with single LOCK TABLE with multiple
  tables. With 100k tables LOCK statements took 13 minutes in total, now
  it only takes 3 seconds. Comments?
 
 Shall I commit to master and all supported branches?

Was this applied?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 30, 2012 at 4:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, the idea of replacing sum_grow with a boolean just occurred to me
 too.  As is, I think the code is making some less-than-portable
 assumptions about what will happen if sum_grow overflows; which can
 definitely happen, seeing that gistpenalty and its callees intentionally
 return infinity in some cases.  I'd rather it didn't attempt to add
 column penalties together, and I think there's a case to be made that
 not doing so is a back-patchable bug fix.

 Keep in mind that the worst case outcome is the index quality is worse
 than it otherwise would have been, so it's not like
 OMG-PostgreSQ-eats-your-data.

Agreed, but we've seen plenty of complaining about bloated gist indexes,
and this might be the cause.

 I'll take a shot at improving this some more.

 Okey dokey.

Attached is a revised version of gistchoose(); I've not yet transposed
the changes into gistRelocateBuildBuffersOnSplit().  It looks a lot
more readable to me now.  Objections?

regards, tom lane

diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 7c6ce8495caac1e9d7c99afdb513689de93beea5..48d70d98e25363d8425020be7607f20348a0456a 100644
*** a/src/backend/access/gist/gistutil.c
--- b/src/backend/access/gist/gistutil.c
*** gistgetadjusted(Relation r, IndexTuple o
*** 363,475 
  }
  
  /*
!  * Search a page for the entry with lowest penalty.
   *
!  * The index may have multiple columns, and there's a penalty value for each column.
!  * The penalty associated with a column which appears earlier in the index definition is
!  * strictly more important than the penalty of column which appears later in the index
!  * definition.
   */
  OffsetNumber
  gistchoose(Relation r, Page p, IndexTuple it,	/* it has compressed entry */
  		   GISTSTATE *giststate)
  {
  	OffsetNumber maxoff;
  	OffsetNumber i;
! 	OffsetNumber which;
! 	float		sum_grow,
! which_grow[INDEX_MAX_KEYS];
  	GISTENTRY	entry,
  identry[INDEX_MAX_KEYS];
  	bool		isnull[INDEX_MAX_KEYS];
  
! 	maxoff = PageGetMaxOffsetNumber(p);
! 	*which_grow = -1.0;
! 	which = InvalidOffsetNumber;
! 	sum_grow = 1;
  	gistDeCompressAtt(giststate, r,
  	  it, NULL, (OffsetNumber) 0,
  	  identry, isnull);
  
! 	Assert(maxoff = FirstOffsetNumber);
! 	Assert(!GistPageIsLeaf(p));
  
  	/*
! 	 * Loop over tuples on page.
  	 *
! 	 * We'll exit early if we find an index key that can accommodate the new key with no
! 	 * penalty on any column.  sum_grow is used to track this condition.  Normally, it is the
! 	 * sum of the penalties we've seen for this column so far, which is not a very useful
! 	 * quantity in general because the penalties for each column are only considered
! 	 * independently, but all we really care about is whether or not it's greater than zero.
! 	 * Since penalties can't be negative, the sum of the penalties will be greater than
! 	 * zero if and only if at least one penalty was greater than zero.  To make things just
! 	 * a bit more complicated, we arbitrarily set sum_grow to 1.0 whenever we want to force
! 	 * the at least one more iteration of this outer loop.  Any non-zero value would serve
! 	 * just as well.
  	 */
! 	for (i = FirstOffsetNumber; i = maxoff  sum_grow; i = OffsetNumberNext(i))
  	{
- 		int			j;
  		IndexTuple	itup = (IndexTuple) PageGetItem(p, PageGetItemId(p, i));
  
! 		sum_grow = 0;
  
! 		/* Loop over indexed attribtues. */
  		for (j = 0; j  r-rd_att-natts; j++)
  		{
  			Datum		datum;
  			float		usize;
  			bool		IsNull;
  
  			datum = index_getattr(itup, j + 1, giststate-tupdesc, IsNull);
  			gistdentryinit(giststate, j, entry, datum, r, p, i,
  		   FALSE, IsNull);
  			usize = gistpenalty(giststate, j, entry, IsNull,
  identry[j], isnull[j]);
  
! 			if (which_grow[j]  0 || usize  which_grow[j])
  			{
  /*
!  * We get here in two cases.  First, we may have just discovered that the
!  * current tuple is the best one we've seen so far; that is, for the first
!  * column for which the penalty is not equal to the best tuple seen so far,
!  * this one has a lower penalty than the previously-seen one.  But, when
!  * a new best tuple is found, we must record the best penalty value for
!  * all the remaining columns.  We'll end up here for each remaining index
!  * column in that case, too.
   */
! which = i;
! which_grow[j] = usize;
  if (j  r-rd_att-natts - 1)
! 	which_grow[j + 1] = -1;
! sum_grow += which_grow[j];
  			}
! 			else if (which_grow[j] == usize)
  			{
  /*
!  * The current tuple is exactly as good for this column as the best tuple
!  * seen so far.  The next iteration of this loop will compare the next
!  * column.
   */
- sum_grow += usize;
  			}
  			else
  			{
  /*
!  * The current tuple is worse for this column than the best 

Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Mon, May 28, 2012 at 02:27:15AM +0300, Peter Eisentraut wrote:
 We should just remove it now.

 --disable-shared removed, with the attached, applied patch.

No documentation changes?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] We're not lax enough about maximum time zone offset from UTC

2012-08-30 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Wed, May 30, 2012 at 06:10:12PM -0400, Tom Lane wrote:
 However, as pointed out by Patric, if you dump and restore an old
 timestamptz value in one of these zones, it will fail to restore because
 of the sanity check.  I think therefore that we'd better enlarge the
 allowed range to 15:59:59 either way.

 Any status on this?

Shipped in last week's updates.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Thu, Aug 30, 2012 at 04:50:22PM -0400, Tom Lane wrote:
 No documentation changes?

 I couldn't find any place we document it.  I did:
   grep _shared *.sgml
 and no hits were returned.  Should I search for something else?

It's --enable-shared, not --enable_shared.  I see at least one hit, in
installation.sgml.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 04:57:30PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Thu, Aug 30, 2012 at 04:50:22PM -0400, Tom Lane wrote:
  No documentation changes?
 
  I couldn't find any place we document it.  I did:
  grep _shared *.sgml
  and no hits were returned.  Should I search for something else?
 
 It's --enable-shared, not --enable_shared.  I see at least one hit, in
 installation.sgml.

Oh, got -/_ mixed up. Fixed with attached applied patch.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
new file mode 100644
index c02ed87..81cd550
*** a/doc/src/sgml/installation.sgml
--- b/doc/src/sgml/installation.sgml
*** su - postgres
*** 234,241 
   /para
  
   para
!   If you have problems, run productnamePython/ 2.3 or later's
!   configure using the literal--enable-shared/ flag.  On some
operating systems you don't have to build a shared library, but
you will have to convince the productnamePostgreSQL/ build
system of this.  Consult the filenameMakefile/filename in
--- 234,240 
   /para
  
   para
!   On some
operating systems you don't have to build a shared library, but
you will have to convince the productnamePostgreSQL/ build
system of this.  Consult the filenameMakefile/filename in

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] splitting htup.h

2012-08-30 Thread Alvaro Herrera
Excerpts from Alvaro Herrera's message of mié ago 29 15:13:11 -0400 2012:
 Excerpts from Andres Freund's message of mié ago 29 12:10:17 -0400 2012:
 
   OK, scratch that thought then.  So we seem to be down to choosing a new
   name for what we're going to take out of htup.h.  If you don't like
   heap.h, maybe something like heap_tuple.h?  I'm not terribly excited
   about it either way though.  Any other ideas out there?
  htup_details.h? That doesn't have the sound of fringe details that 
  htup_private.h has.
 
 htup_details.h seems reasonable, thanks.

Committed, with a slight adjustment that I had left out of this patch
but was present in a previous version of it: the function prototypes,
which in the last posted patch were to remain in htup.h, are now in
htup_details.h.  So I removed access/tupdesc.h from htup.h.  I had to
add #include access/htup_details.h to half a dozen more .c files but
that seemed a good tradeoff.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 04:51:56PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
  Ok, I modified the part of pg_dump where tremendous number of LOCK
  TABLE are issued. I replace them with single LOCK TABLE with multiple
  tables. With 100k tables LOCK statements took 13 minutes in total, now
  it only takes 3 seconds. Comments?
 
  Shall I commit to master and all supported branches?
 
  Was this applied?
 
 No, we fixed the server side instead.

Again, thanks.  I knew we fixed the server, but wasn't clear that made
the client changes unnecessary, but I think I now do remember discussion
about that.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Jeff Janes
On Thu, Aug 30, 2012 at 1:18 PM, Joe Abbate j...@freedomcircle.com wrote:
 Hello hackers,

 In order to test 9.2rc1, I had to build contrib (because Pyrseas uses
 some of those modules).  The build instructions
 (http://www.postgresql.org/docs/9.2/static/install-procedure.html )
 state the way to build everything (contrib + docs, etc.) is

gmake world

 Unfortunately, that failed because the doc build requires jade.  I
 managed to build contrib separately, but wanted to point out that jade
 is not mentioned in the requirements page
 (http://www.postgresql.org/docs/9.2/static/install-requirements.html ).

That page should probably point out that there are additional
requirements to build the documentation, and link to the relevant
place that describes those.

  In fact, searching for 'jade' returns no results.

Searching in the 9.2 docs from the web page doesn't work at all, I'm
not sure why.  You can search for jade in 9.1, and then once you
find the page there is a cross link to the 9.2 version of it:

http://www.postgresql.org/docs/9.2/static/docguide-toolsets.html


 As an aside, I installed jade (on Debian) and tried to make world but
 got several errors, starting with the following:

 jade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .  -d
 stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
 jade:E: unknown warning type fully-tagged

I think is probably because you don't have DocBook DTD or some of
the other prerequisites listed in the URL I gave above.  Or maybe you
just need to re-run make maintainer-clean and ./configure after
installing jade.  I've been tripped up by both issues in the past.


Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 05:01:39PM -0400, Bruce Momjian wrote:
 On Thu, Aug 30, 2012 at 04:57:30PM -0400, Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   On Thu, Aug 30, 2012 at 04:50:22PM -0400, Tom Lane wrote:
   No documentation changes?
  
   I couldn't find any place we document it.  I did:
 grep _shared *.sgml
   and no hits were returned.  Should I search for something else?
  
  It's --enable-shared, not --enable_shared.  I see at least one hit, in
  installation.sgml.
 
 Oh, got -/_ mixed up. Fixed with attached applied patch.

Oops, that text is talking about Python's configure, so I put the text
back.  Seemed we had _no_ mention of our own --enable-shared.

---


 
 -- 
   Bruce Momjian  br...@momjian.ushttp://momjian.us
   EnterpriseDB http://enterprisedb.com
 
   + It's impossible for everything to be true. +

 diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
 new file mode 100644
 index c02ed87..81cd550
 *** a/doc/src/sgml/installation.sgml
 --- b/doc/src/sgml/installation.sgml
 *** su - postgres
 *** 234,241 
/para
   
para
 !   If you have problems, run productnamePython/ 2.3 or later's
 !   configure using the literal--enable-shared/ flag.  On some
 operating systems you don't have to build a shared library, but
 you will have to convince the productnamePostgreSQL/ build
 system of this.  Consult the filenameMakefile/filename in
 --- 234,240 
/para
   
para
 !   On some
 operating systems you don't have to build a shared library, but
 you will have to convince the productnamePostgreSQL/ build
 system of this.  Consult the filenameMakefile/filename in

 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
 Ok, I modified the part of pg_dump where tremendous number of LOCK
 TABLE are issued. I replace them with single LOCK TABLE with multiple
 tables. With 100k tables LOCK statements took 13 minutes in total, now
 it only takes 3 seconds. Comments?

 Shall I commit to master and all supported branches?

 Was this applied?

No, we fixed the server side instead.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 04:50:22PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Mon, May 28, 2012 at 02:27:15AM +0300, Peter Eisentraut wrote:
  We should just remove it now.
 
  --disable-shared removed, with the attached, applied patch.
 
 No documentation changes?

I couldn't find any place we document it.  I did:

grep _shared *.sgml

and no hits were returned.  Should I search for something else?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] --disable-shared is entirely broken these days

2012-08-30 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Thu, Aug 30, 2012 at 05:01:39PM -0400, Bruce Momjian wrote:
 Oh, got -/_ mixed up. Fixed with attached applied patch.

 Oops, that text is talking about Python's configure, so I put the text
 back.  Seemed we had _no_ mention of our own --enable-shared.

Yeah, I just realized the same.  Apparently the configure script's
--help output was indeed the only documentation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] We're not lax enough about maximum time zone offset from UTC

2012-08-30 Thread Bruce Momjian
On Thu, Aug 30, 2012 at 04:51:02PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Wed, May 30, 2012 at 06:10:12PM -0400, Tom Lane wrote:
  However, as pointed out by Patric, if you dump and restore an old
  timestamptz value in one of these zones, it will fail to restore because
  of the sanity check.  I think therefore that we'd better enlarge the
  allowed range to 15:59:59 either way.
 
  Any status on this?
 
 Shipped in last week's updates.

Thank you.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Alvaro Herrera
Excerpts from Joe Abbate's message of jue ago 30 16:18:05 -0400 2012:
 Hello hackers,
 
 In order to test 9.2rc1, I had to build contrib (because Pyrseas uses
 some of those modules).  The build instructions
 (http://www.postgresql.org/docs/9.2/static/install-procedure.html )
 state the way to build everything (contrib + docs, etc.) is
 
gmake world
 
 Unfortunately, that failed because the doc build requires jade.  I
 managed to build contrib separately, but wanted to point out that jade
 is not mentioned in the requirements page
 (http://www.postgresql.org/docs/9.2/static/install-requirements.html ).
  In fact, searching for 'jade' returns no results.
 
 As an aside, I installed jade (on Debian) and tried to make world but
 got several errors, starting with the following:
 
 jade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .  -d
 stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
 jade:E: unknown warning type fully-tagged

I'm not sure what's the status of jade, but our Debian instructions to
build docs suggest to use openjade instead:
http://www.postgresql.org/docs/devel/static/docguide-toolsets.html

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Tom Lane
Joe Abbate j...@freedomcircle.com writes:
 As an aside, I installed jade (on Debian) and tried to make world but
 got several errors, starting with the following:

 jade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .  -d
 stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
 jade:E: unknown warning type fully-tagged

FWIW, that suggests that this version of jade is too old.  I'm not sure
that jade per se (as opposed to the successor project openjade) can be
used to build our docs at all --- you should check whether this is
openjade, or really the original project.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Joe Abbate
Hello Jeff,

On 30/08/12 17:05, Jeff Janes wrote:
 I think is probably because you don't have DocBook DTD or some of
 the other prerequisites listed in the URL I gave above.

Indeed.  I was able to build world after invoking the apt-get line in
J.2.3 on that page.  The only adjustment I had to make is to add a
symbolic from /usr/share/sgml/openjade1.3 to /usr/share/sgml/jade
because it was looking for the 'catalog' file in the latter.

Thanks,

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pg default's verbosity?

2012-08-30 Thread Bruce Momjian
On Sun, Jun 17, 2012 at 12:00:20AM -0400, nik9...@gmail.com wrote:
 I've always used -1-f -  file.sql. It is confusing that -1 doesn't warn you 
 when it wont work though. 

This will be fixed in 9.3 with this commit:

commit be690e291d59e8d0c9f4df59abe09f1ff6cc0da9
Author: Robert Haas rh...@postgresql.org
Date:   Thu Aug 9 09:59:45 2012 -0400

Make psql -1  file behave as expected.

Previously, the -1 option was silently ignored.

Also, emit an error if -1 is used in a context where it won't be
respected, to avoid user confusion.

Original patch by Fabien COELHO, but this version is quite different
from the original submission.

---


 
 Sent from my iPhone
 
 On Jun 16, 2012, at 3:42 AM, Fabien COELHO coe...@cri.ensmp.fr wrote:
 
  
  Hello pgdev,
  
  (Second attempt)
  
  I've conducted a statistical study about PostgreSQL use in OSS. One of the 
  result is that quite a few projects have errors in their SQL setup scripts 
  which lead to some statements to be ignored, typically somme ADD 
  CONSTRAINTS which do not change the database schema from a functional point 
  of view, or syntactic errors (typically a mysql syntax...) that
  result in missing tables, but which are not found if the application is not 
  fully tested.
  
  I think that there are two reasons why these errors are not caught by 
  application developers:
  
  (1) the default verbosity is set to notice, which is much to high. The 
  users just get used to seeing a lot of messages on loading an sql script, 
  and to ignore them, so that errors are just hidden in the flow of notices. 
  I think that a better default setting would be warnings, that is messages 
  that require some attention from the developer.
  
  (2) the default behavior of psql on errors is to keep going. Developers of 
  SQL script that are expected to work shoud be advised to:
  - encourage application devs to set ON_ERROR_STOP and/or use a global
transaction in their script.
  - provide a simple/short option to do that from the command line
basically that could be an enhanced -1, NOT restricted
to -f but that would work on standard input as well.
  
sh psql -1 -f setup.sql # -1 does work here
sh psql -1  setup.sql # -1 does not apply to stdin stuff...
  
  
  So I would suggest the following todos:
  
  1 - change the default verbosity to warning.
  
  2 - change -1 to work on stdin as well instead of being ignored,
 or provide another option that would do that.
  
  -- 
  Fabien Coelho - coe...@cri.ensmp.fr
  
  -- 
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 3:17 PM, Tomas Vondra t...@fuzzy.cz wrote:
 On 30 Srpen 2012, 17:53, Robert Haas wrote:
 On Fri, Aug 24, 2012 at 6:36 PM, Tomas Vondra t...@fuzzy.cz wrote:
 attached is a patch that improves performance when dropping multiple
 tables within a transaction. Instead of scanning the shared buffers for
 each table separately, the patch removes this and evicts all the tables
 in a single pass through shared buffers.

 Our system creates a lot of working tables (even 100.000) and we need
 to perform garbage collection (dropping obsolete tables) regularly. This
 often took ~ 1 hour, because we're using big AWS instances with lots of
 RAM (which tends to be slower than RAM on bare hw). After applying this
 patch and dropping tables in groups of 100, the gc runs in less than 4
 minutes (i.e. a 15x speed-up).

 This is not likely to improve usual performance, but for systems like
 ours, this patch is a significant improvement.

 Seems pretty reasonable.  But instead of duplicating so much code,
 couldn't we find a way to use replace DropRelFileNodeAllBuffers with
 DropRelFileNodeAllBuffersList?  Surely anyone who was planning to call
 the first one could instead call the second one with a count of one
 and a pointer to the address of the data they were planning to pass.
 I'd probably swap the order of arguments to
 DropRelFileNodeAllBuffersList as well.  We could do something similar
 with smgrdounlink/smgrdounlinkall so that, again, only one copy of the
 code is needed.

 Yeah, I was thinking about that too, but I simply wasn't sure which is the
 best choice so I've sent the raw patch. OTOH these functions are called on
 a very limited number of places, so a refactoring like this seems fine.

If there are enough call sites then DropRelFileNodeAllBuffers could
become a one-line function that simply calls
DropRelFileNodeAllBuffersList(1, arg).  But we should avoid
duplicating the code.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - aggregation of info written into log

2012-08-30 Thread Tomas Vondra
On 30 Srpen 2012, 23:47, Robert Haas wrote:
 On Thu, Aug 30, 2012 at 3:25 PM, Tomas Vondra t...@fuzzy.cz wrote:
 On 30 Srpen 2012, 18:02, Robert Haas wrote:
 On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra t...@fuzzy.cz wrote:
 This patch is a bit less polished (and more complex) than the other
 pgbench patch I've sent a while back, and I'm not sure how to handle
 the
 Windows branch. That needs to be fixed during the commit fest.

 What's the problem with the Windows branch?

 Well, there are comments about how timestamp does not work on Windows
 (filling 0), and I'm not sure how that affects the patch (e.g.
 determining
 the aggregation interval). I have no Windows workstation available so I
 can't actually try that.

 Hmm.  That seems like it might be something we need to fix first...

 Could you un-cuddle your brances to follow the PostgreSQL style, omit
 braces around single-statement blocks, and remove the spurious
 whitespace changes?

 OK, will do.

 Instead of having both use_log_agg and naggseconds, I think you can
 get by with just having a single variable that is zero if aggregation
 is not in use and is otherwise the aggregation period.  On a related
 note, you can't specify -A without an associated value, so there is no
 point in documenting a default.  As with your other patch, I suggest
 using a long option name like --latency-aggregate-interval and then
 making the name of the variable in the code match the option name,
 with s/-/_/g, for clarity.

 Why --latency-aggregate-interval? It has nothing to do with latencies,
 it's merely number of transactions per interval.

 Oh, I thought it was modifying the behavior of -l.

It does, but AFAIK the -l means logging. I suppose
--aggregate-interval would be a good option name, I don't see a reason
to put there the additional word when there are other aggregated values
(e.g. num of transactions).

T.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq compression

2012-08-30 Thread Bruce Momjian
On Sun, Jun 17, 2012 at 11:45:54PM +0800, Magnus Hagander wrote:
 On Sun, Jun 17, 2012 at 11:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Magnus Hagander mag...@hagander.net writes:
  Is there a reason why we don't have a parameter on the client
  mirroring ssl_ciphers?
 
  Dunno, do we need one?  I am not sure what the cipher negotiation process
  looks like or which side has the freedom to choose.
 
 I haven't looked into the details, but it seems reasonable that
 *either* side should be able to at least define a list of ciphers it
 *doens't* want to talk with.
 
 Do we need it - well, it makes sense for the client to be able to say
 I won't trust 56-bit encryption before it sends over the password,
 imo..
 
 
  That, or just have DEFAULT as being the default (which in current
  openssl means ALL:!aNULL:!eNULL.
 
  If our default isn't the same as the underlying default, I have to
  question why not.
 
 Yeah, that's exaclty what I'm questioning here..
 
   But are you sure this ! notation will work with
  all openssl versions?
 
 Uh. We have the ! notation in our default *now*. What openssl also
 supports is the text DEFAULT, which is currently the equivalent of
 ALL!aNULL!eNULL. The question, which is valid of course, should be
 if DEFAULT works with all openssl versions.
 
 It would seem reasonable it does, but I haven't investigated.

Do we want to change our ssl_ciphers default to 'DEFAULT'?  Currently it
is 'ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH'.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Joe Abbate
On 30/08/12 17:36, Tom Lane wrote:
 FWIW, that suggests that this version of jade is too old.  I'm not sure
 that jade per se (as opposed to the successor project openjade) can be
 used to build our docs at all --- you should check whether this is
 openjade, or really the original project.

It was the old jade.  After I installed openjade, as suggested by Alvaro
and Jeff Janes, and re-ran ./configure the invocation line changed to
use openjade.

Joe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - random sampling of transaction written into log

2012-08-30 Thread Tomas Vondra
On 30 Srpen 2012, 23:44, Robert Haas wrote:
 On Thu, Aug 30, 2012 at 3:48 PM, Tomas Vondra t...@fuzzy.cz wrote:
 That sounds like a pretty trivial patch. I've been thinking about yet
 another option - histograms (regular or with exponential bins).

 I thought about that, too, but I think high-outliers is a lot more
 useful.  At least for the kinds of things I worry about.

OK, let's fix the current patches first. We can add more options later.


 What I'm not sure about is which of these options should be allowed at
 the
 same time - to me, combinations like 'sampling + aggregation' don't make
 much sense. Maybe except 'latency-only-if-more-than + aggregation'.

 Maybe, but perhaps not even.  I don't have a problem with saying that
 the user is allowed to pick at most one method of reducing the output
 volume.  I think we could say: either sample, or take high outliers,
 or aggregate.  If we want to make some of those work in combination,
 fine, but I don't think it's absolutely required.  What WILL be
 required is a clear error message telling you what you did wrong if
 you use an unsupported combination.

I'll allow a single option - we can enable combinations that make sense
later.


 BTW, I think that using any of these options should automatically
 enable -l, rather than requiring it to be specified separately.

Good idea.

Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix for gistchoose

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 4:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 30, 2012 at 4:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, the idea of replacing sum_grow with a boolean just occurred to me
 too.  As is, I think the code is making some less-than-portable
 assumptions about what will happen if sum_grow overflows; which can
 definitely happen, seeing that gistpenalty and its callees intentionally
 return infinity in some cases.  I'd rather it didn't attempt to add
 column penalties together, and I think there's a case to be made that
 not doing so is a back-patchable bug fix.

 Keep in mind that the worst case outcome is the index quality is worse
 than it otherwise would have been, so it's not like
 OMG-PostgreSQ-eats-your-data.

 Agreed, but we've seen plenty of complaining about bloated gist indexes,
 and this might be the cause.

More likely one of several, but sure.

 I'll take a shot at improving this some more.

 Okey dokey.

 Attached is a revised version of gistchoose(); I've not yet transposed
 the changes into gistRelocateBuildBuffersOnSplit().  It looks a lot
 more readable to me now.  Objections?

Looks good to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - aggregation of info written into log

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 3:25 PM, Tomas Vondra t...@fuzzy.cz wrote:
 On 30 Srpen 2012, 18:02, Robert Haas wrote:
 On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra t...@fuzzy.cz wrote:
 This patch is a bit less polished (and more complex) than the other
 pgbench patch I've sent a while back, and I'm not sure how to handle the
 Windows branch. That needs to be fixed during the commit fest.

 What's the problem with the Windows branch?

 Well, there are comments about how timestamp does not work on Windows
 (filling 0), and I'm not sure how that affects the patch (e.g. determining
 the aggregation interval). I have no Windows workstation available so I
 can't actually try that.

Hmm.  That seems like it might be something we need to fix first...

 Could you un-cuddle your brances to follow the PostgreSQL style, omit
 braces around single-statement blocks, and remove the spurious
 whitespace changes?

 OK, will do.

 Instead of having both use_log_agg and naggseconds, I think you can
 get by with just having a single variable that is zero if aggregation
 is not in use and is otherwise the aggregation period.  On a related
 note, you can't specify -A without an associated value, so there is no
 point in documenting a default.  As with your other patch, I suggest
 using a long option name like --latency-aggregate-interval and then
 making the name of the variable in the code match the option name,
 with s/-/_/g, for clarity.

 Why --latency-aggregate-interval? It has nothing to do with latencies,
 it's merely number of transactions per interval.

Oh, I thought it was modifying the behavior of -l.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 4:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
 Ok, I modified the part of pg_dump where tremendous number of LOCK
 TABLE are issued. I replace them with single LOCK TABLE with multiple
 tables. With 100k tables LOCK statements took 13 minutes in total, now
 it only takes 3 seconds. Comments?

 Shall I commit to master and all supported branches?

 Was this applied?

 No, we fixed the server side instead.

But only for 9.2, right?  So people running back branches are still screwed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: pgbench - random sampling of transaction written into log

2012-08-30 Thread Robert Haas
On Thu, Aug 30, 2012 at 3:48 PM, Tomas Vondra t...@fuzzy.cz wrote:
 That sounds like a pretty trivial patch. I've been thinking about yet
 another option - histograms (regular or with exponential bins).

I thought about that, too, but I think high-outliers is a lot more
useful.  At least for the kinds of things I worry about.

 What I'm not sure about is which of these options should be allowed at the
 same time - to me, combinations like 'sampling + aggregation' don't make
 much sense. Maybe except 'latency-only-if-more-than + aggregation'.

Maybe, but perhaps not even.  I don't have a problem with saying that
the user is allowed to pick at most one method of reducing the output
volume.  I think we could say: either sample, or take high outliers,
or aggregate.  If we want to make some of those work in combination,
fine, but I don't think it's absolutely required.  What WILL be
required is a clear error message telling you what you did wrong if
you use an unsupported combination.

BTW, I think that using any of these options should automatically
enable -l, rather than requiring it to be specified separately.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Jeff Janes
On Thu, Aug 30, 2012 at 2:05 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Thu, Aug 30, 2012 at 1:18 PM, Joe Abbate j...@freedomcircle.com wrote:

gmake world

 Unfortunately, that failed because the doc build requires jade.  I
 managed to build contrib separately, but wanted to point out that jade
 is not mentioned in the requirements page
 (http://www.postgresql.org/docs/9.2/static/install-requirements.html ).

 That page should probably point out that there are additional
 requirements to build the documentation, and link to the relevant
 place that describes those.

patch to do that attached.

Cheers,

Jeff


doc_require_doc_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Jaime Casanova
On Thu, Aug 30, 2012 at 4:58 PM, Joe Abbate j...@freedomcircle.com wrote:
 On 30/08/12 17:36, Tom Lane wrote:
 FWIW, that suggests that this version of jade is too old.  I'm not sure
 that jade per se (as opposed to the successor project openjade) can be
 used to build our docs at all --- you should check whether this is
 openjade, or really the original project.

 It was the old jade.  After I installed openjade, as suggested by Alvaro
 and Jeff Janes, and re-ran ./configure the invocation line changed to
 use openjade.


so, now the question is: should we accept jade at all in configure? or
should we fail after not finding jade and report why?

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Jaime Casanova
On Thu, Aug 30, 2012 at 7:47 PM, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Thu, Aug 30, 2012 at 4:58 PM, Joe Abbate j...@freedomcircle.com wrote:
 On 30/08/12 17:36, Tom Lane wrote:
 FWIW, that suggests that this version of jade is too old.  I'm not sure
 that jade per se (as opposed to the successor project openjade) can be
 used to build our docs at all --- you should check whether this is
 openjade, or really the original project.

 It was the old jade.  After I installed openjade, as suggested by Alvaro
 and Jeff Janes, and re-ran ./configure the invocation line changed to
 use openjade.


 so, now the question is: should we accept jade at all in configure? or
 should we fail after not finding jade and report why?


the last one should read openjade ;)


-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.2rc1 build requirements

2012-08-30 Thread Peter Eisentraut
On Thu, 2012-08-30 at 17:36 -0400, Tom Lane wrote:
 Joe Abbate j...@freedomcircle.com writes:
  As an aside, I installed jade (on Debian) and tried to make world but
  got several errors, starting with the following:
 
  jade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .  -d
  stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
  jade:E: unknown warning type fully-tagged
 
 FWIW, that suggests that this version of jade is too old.  I'm not sure
 that jade per se (as opposed to the successor project openjade) can be
 used to build our docs at all --- you should check whether this is
 openjade, or really the original project.

This is a bit bizarre, actually.  His problem is that the old version of
jade doesn't understand the -wfully-tagged warning option.  But the
comment in the Makefile says

# -wfully-tagged needed to throw a warning on missing tags
# for older tool chains, 2007-08-31

AFAICT, the desirable effect of all these options together is to warn
about empty start tags, but only openjade supports -wfully-tagged, and
even the most recent version needs it to produce that warning.

Of course there could have been intermediate versions that I don't have
access to right now.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] emacs configuration for new perltidy settings

2012-08-30 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 12:35:26AM +0300, Peter Eisentraut wrote:
 This might be useful for some people.  Here is an emacs configuration
 for perl-mode that is compatible with the new perltidy settings.  Note
 that the default perl-mode settings produce indentation that will be
 completely shredded by the new perltidy settings.
 
 (defun pgsql-perl-style ()
   Perl style adjusted for PostgreSQL project
   (interactive)
   (setq tab-width 4)
   (setq perl-indent-level 4)
   (setq perl-continued-statement-offset 4)
   (setq perl-continued-brace-offset 4)
   (setq perl-brace-offset 0)
   (setq perl-brace-imaginary-offset 0)
   (setq perl-label-offset -2))
 
 (add-hook 'perl-mode-hook
(lambda ()
  (if (string-match postgresql buffer-file-name)
  (pgsql-perl-style

Added to src/tools/editors/emacs.samples;  applied patch attached.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/tools/editors/emacs.samples b/src/tools/editors/emacs.samples
new file mode 100644
index d9cfa2f..c8d8d07
*** a/src/tools/editors/emacs.samples
--- b/src/tools/editors/emacs.samples
***
*** 12,17 
--- 12,19 
  
  ;
  
+ ;;; Mode for C files to match src/tools/pgindent/pgindent formatting
+ 
  ;;; This set is known to work with old versions of emacs
  
  (setq auto-mode-alist
***
*** 80,85 
--- 82,107 
  
  ;
  
+ ;;; Mode for Perl files to match src/tools/pgindent/perltidyrc formatting
+ 
+ (defun pgsql-perl-style ()
+   Perl style adjusted for PostgreSQL project
+   (interactive)
+   (setq tab-width 4)
+   (setq perl-indent-level 4)
+   (setq perl-continued-statement-offset 4)
+   (setq perl-continued-brace-offset 4)
+   (setq perl-brace-offset 0)
+   (setq perl-brace-imaginary-offset 0)
+   (setq perl-label-offset -2))
+ 
+ (add-hook 'perl-mode-hook
+(lambda ()
+  (if (string-match postgresql buffer-file-name)
+  (pgsql-perl-style
+ 
+ ;
+ 
  ;;; To work on the documentation, the following (or a variant, as above)
  ;;; can be helpful.
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >