Re: [HACKERS] Doc patch making firm recommendation for setting the value of commit_delay

2012-11-15 Thread Albe Laurenz
Peter Geoghegan wrote:
 Some of you will be aware that I've tried to formulate a good general
 recommendation for setting the value of commit_delay, in light of the
 improvements made in 9.3. I previously asked for help finding a
 methodology for optimising commit_delay [1]. The documentation related
 to commit_delay still says that we don't know what might work well,
 though I don't think that's true any more.
 
 I found the time to do some benchmarks of my own - Greg Smith made
 available a server that he frequently uses for benchmarks. It was
 previously my observation that half of the raw single-page sync time as
 reported by pg_test_fsync for your wal_sync_method worked best for
 commit_delay. I went so far as to modify pg_test_fsync to output
 average time per op in microseconds for each operation with
 commit_delay in mind, which is a patch that has already been committed
 [2].
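For what it's worth, that rule of thumb reduces to a one-liner; the sketch below is illustrative only (the function name and the 2200 usec figure are invented — the real input is whatever the modified pg_test_fsync reports for your wal_sync_method):

```python
# Rule of thumb from the benchmarks described above: commit_delay works
# best at about half the raw single-page sync time, as reported by
# pg_test_fsync in average microseconds per op for your wal_sync_method.

def suggested_commit_delay(avg_sync_op_us):
    """Half the average per-op sync time, in microseconds (commit_delay's unit)."""
    return avg_sync_op_us // 2

# e.g. if pg_test_fsync reported ~2200 usecs/op for fdatasync (made-up figure):
print(suggested_commit_delay(2200))  # -> 1100
```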

[...]

 Attached is a doc-patch that makes recommendations that are consistent
 with my observations about what works best here. I'd like to see us
 making *some* recommendation - for sympathetic cases, setting
 commit_delay appropriately can make a very large difference to
 transaction throughput. Such sympathetic cases - many small write
 transactions - are something that tends to be seen relatively
 frequently with web applications, which disproportionately use cloud
 hosting. It isn't at all uncommon for these cases to be highly bound
 by their commit rate, and so it is compelling to try to amortize the
 cost of a flush as effectively as possible there. It would be
 unfortunate if no one was even aware that commit_delay is now useful
 for these cases, since the setting allows cloud hosting providers to
 help these cases quite a bit, without having to do something like
 compromise durability, which in general isn't acceptable.
 
 The point of all these benchmarks isn't to show how effective
 commit_delay now is, or can be - we already had that discussion months
 ago, before the alteration to its behaviour was committed. The point
 is to put my proposed doc changes in the appropriate context, so that
 I can build confidence that they're balanced and helpful, by showing
 cases that are not so sympathetic.
 
 Thoughts?

If there is an agreement that half the sync time as reported by
pg_test_fsync is a good value, would it make sense to have initdb test
sync time and preset commit_delay?

Yours,
Laurenz Albe



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] feature proposal - triggers by semantics

2012-11-15 Thread Craig Ringer

On 11/15/2012 03:44 PM, Darren Duncan wrote:
 As they currently exist, triggers always fire based on certain SQL
 syntax used, rather than on the semantics of what is actually going on.

That's not quite right. COPY fires INSERT triggers, despite never using
an explicit INSERT statement. There are probably other examples.
 As a simple example, I'd like to be able to define a trigger like
 AFTER DELETE ON foo FOR EACH ROW and have that trigger be invoked
 not only by a DELETE on foo but also by a TRUNCATE on foo.  So I would
 like to do some auditing action when a row of foo is deleted, no
 matter how it happens.
TRUNCATE triggers already exist.

If you wanted you could have a BEFORE TRUNCATE trigger SELECT each row
and invoke a function on each row, same as a FOR EACH ROW .. DELETE
trigger. So you can already achieve what you want to do with the
currently available features.

 The reason this particular example is important is that
 TRUNCATE is documented as a data-manipulation action semantically
 equivalent to an unqualified DELETE in its effects, primarily.
I'd say we should just change the docs to note that TRUNCATE does not
act on each row individually, so it does not fire DELETE triggers. There
is a TRUNCATE trigger you can use to monitor and control TRUNCATE
behaviour.


 So, I'm partly proposing a specific narrow new feature, TRUNCATE FOR
 EACH ROW, but I'm also proposing the ability to generally define
 triggers based not on the syntax used but the actual action requested.
That already exists. Maybe you're using some old version?

regress=# \h CREATE TRIGGER
Command: CREATE TRIGGER
Description: define a new trigger
Syntax:
CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } {
event [ OR ... ] }
ON table_name
...
where event can be one of:

INSERT
UPDATE [ OF column_name [, ... ] ]
DELETE
TRUNCATE



 A tangential feature request is to provide a runtime config option
 that can cause TRUNCATE to always behave as unqualified DELETE FROM
 regardless of any triggers, as if it were just a syntactic shorthand.
Please, no!

If you want to prevent TRUNCATE, deny the privilege or add a trigger
that aborts the command.

TRUNCATE should do what it says, not magically be changed to do
something else by a GUC.
 Or alternately/also provide extra syntax to TRUNCATE itself where one
 can specify which behavior to have, and both options can be given
 explicitly to override any config option.
What advantage would that have over using DELETE when you want DELETE,
and TRUNCATE when you want TRUNCATE?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Doc patch making firm recommendation for setting the value of commit_delay

2012-11-15 Thread Greg Smith

On 11/15/12 12:19 AM, Albe Laurenz wrote:

If there is an agreement that half the sync time as reported by
pg_test_fsync is a good value, would it make sense to have initdb test
sync time and preset commit_delay?


Peter has validated this against a good range of systems, but it would 
be optimistic to say it's a universal idea.


My main concern with this would be the relatively common practice of 
moving the pg_xlog directory after initdb time.  Sometimes people don't 
know about the option to have initdb move it.  Sometimes the drive to 
hold pg_xlog isn't available when the database is originally created 
yet.  And the camp I fall into (which admittedly is the group who can 
take care of this on their own) will move pg_xlog manually and symlink 
it on their own, because that's what we're used to.


I would rather see this just turn into one of the things a more general 
tuning tool knew how to do, executing against a fully setup system. 
Having a useful implementation of commit_delay and useful docs on it 
seems like enough of a jump forward for one release.  Moving fully into 
auto-tuning before getting more field feedback on how that works out is 
pretty aggressive.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] Doc patch making firm recommendation for setting the value of commit_delay

2012-11-15 Thread Craig Ringer
On 11/15/2012 04:56 PM, Greg Smith wrote:

 I would rather see this just turn into one of the things a more
 general tuning tool knew how to do, executing against a fully setup
 system. Having a useful implementation of commit_delay and useful docs
 on it seems like enough of a jump forward for one release.  Moving
 fully into auto-tuning before getting more field feedback on how that
 works out is pretty aggressive.


It'll also potentially make it harder to get reproducible results in
benchmarking and testing across repeated runs, cause confusion when
someone relocates a DB or changes hardware, and slow down initdb (and
thus testing).

I'd be all for making it part of a "test my hardware and tune my DB"
tool, but not such a fan of doing it at initdb time. Making initdb less
predictable and more complicated sounds like asking for trouble.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] logical changeset generation v3

2012-11-15 Thread Andres Freund
Hi,

On Thursday, November 15, 2012 05:08:26 AM Michael Paquier wrote:
 Looks like cool stuff @-@
 I might be interested in looking at that a bit as I think I will hopefully
 be able to grab some time in the next couple of weeks.
 Are some of those patches already submitted to a CF?

I added the patchset as one entry to the CF this time; it seems to me the patches are 
too hard to judge individually to make them really separately reviewable.

I can split it off there, but really all the complicated stuff is in one patch 
anyway...

Greetings,

Andres




[HACKERS] Timing events WIP v1

2012-11-15 Thread Greg Smith
Attached is a first WIP saving Timing Events via a new hook, grabbed by 
a new pg_timing_events contrib module.  This only implements a small 
portion of the RFP spec I presented earlier this month, and is by no 
means finished code looking for a regular review.  This is just enough 
framework to do something useful while looking for parts that I suspect 
have tricky details.


Right now this saves just a single line of text and inserts the hook 
into the checkpoint code, so output looks like this:


$ psql -c "select * from pg_timing_events" -x
-[ RECORD 1 ]-
seq   | 0
event | check/restartpoint complete: wrote 0 buffers (0.0%); 0 
transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s, 
sync=0.000 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s


Extending this to save the key/value set and most of the other data I 
mentioned before is pretty straightforward.  There is a fair amount of 
debris in here from the pg_stat_statements code this started as.  I've 
left a lot of that here but commented out because it gives examples of 
grabbing things like the user and database I'll need soon.


There's a modest list of things that I've done or am facing soon that 
involve less obvious choices though, and I would appreciate feedback on 
these things in particular:


-This will eventually aggregate data from clients running queries, the 
checkpointer process, and who knows what else in the future.  The most 
challenging part of this so far is deciding how to handle the memory 
management side.  I've tried just dumping everything into 
TopMemoryContext, and that not working as hoped is the biggest known bug 
right now.  I'm not decided on whether to continue doing that but 
resolve the bug; if there is an alternate context that makes more sense 
for a contrib module like this; or if I should fix the size of the data 
and allocate everything at load time.  The size of the data from 
log_lock_waits in particular will be hard to estimate in advance.


-The event queue goes into a simple array accessed in circular fashion. 
After realizing that output needed to navigate in the opposite order of 
element addition, I ended up just putting all the queue navigation code 
directly into the add/output routines.  I'd be happy to use a more 
formal Postgres type here instead--I looked at SHM_QUEUE for 
example--but I haven't found something yet that makes this any simpler.
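For illustration, the add/output pattern described here can be modeled as a small ring buffer (all names invented; the real code works over shared memory, not a Python list):

```python
# Illustration only (invented names): a fixed-size array used as a
# circular queue, where add() overwrites the oldest slot and output()
# walks entries in the opposite order of addition, newest first.

class TimingEventRing:
    def __init__(self, size):
        self.size = size
        self.events = [None] * size
        self.seq = 0              # total events ever added

    def add(self, event):
        self.events[self.seq % self.size] = event
        self.seq += 1

    def output(self):
        n = min(self.seq, self.size)
        newest = self.seq - 1
        # Walk backwards from the newest element, wrapping around.
        return [self.events[(newest - i) % self.size] for i in range(n)]

ring = TimingEventRing(3)
for e in ["a", "b", "c", "d"]:   # "d" overwrites the oldest entry, "a"
    ring.add(e)
print(ring.output())  # -> ['d', 'c', 'b']
```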


-I modeled the hook here on the logging one that went into 9.2.  It's 
defined in its own include file and it gets initialized by the logging 
system.  No strong justification for putting it there, it was just 
similar to the existing hook and that seemed reasonable.  This is in 
some respects an alternate form of logging after all.  The data sent by 
the hook itself needs to be more verbose in order to handle the full 
spec, right now I'm just asking about placement.


-I'm growing increasingly worried about allowing concurrent reads of 
this data (which might be large and take a while to return to the 
client) without blocking insertions.  The way that was split to improve 
pg_stat_statements concurrency doesn't convert naturally to this data. 
The serial number field in here is part of one idea I had.  I might grab 
the header with an exclusive lock for only a moment and lookup the 
serial number of the last record I should display.  Then I could drop to 
a share lock looping over the elements, stopping if I find one with a 
serial number over that.
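A toy model of that serial-number idea (a single mutex stands in for the exclusive/share lock pair, and all names are invented), showing that a reader only returns entries up to the serial noted while the lock was briefly held:

```python
import threading

# Toy model (invented names) of the scheme above: take the "exclusive"
# header lock only long enough to note the serial number of the last
# record to display, then scan without blocking inserters, stopping
# once a record's serial exceeds that cutoff.

class EventLog:
    def __init__(self):
        self.header_lock = threading.Lock()   # stands in for the LWLock
        self.entries = []                     # (serial, event) pairs
        self.next_serial = 0

    def insert(self, event):
        with self.header_lock:
            self.entries.append((self.next_serial, event))
            self.next_serial += 1

    def read_snapshot(self):
        # Brief exclusive section: just look up the last serial to show.
        with self.header_lock:
            cutoff = self.next_serial - 1
        # Scan outside the lock; concurrent inserts append entries with
        # serials beyond the cutoff, which are simply skipped.
        return [ev for serial, ev in list(self.entries) if serial <= cutoff]

log = EventLog()
log.insert("checkpoint complete")
log.insert("lock wait")
print(log.read_snapshot())  # -> ['checkpoint complete', 'lock wait']
```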


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
diff --git a/contrib/pg_timing_events/Makefile b/contrib/pg_timing_events/Makefile
new file mode 100644
index 000..3e7f1ab
--- /dev/null
+++ b/contrib/pg_timing_events/Makefile
@@ -0,0 +1,18 @@
+# contrib/pg_timing_events/Makefile
+
+MODULE_big = pg_timing_events
+OBJS = pg_timing_events.o
+
+EXTENSION = pg_timing_events
+DATA = pg_timing_events--1.0.sql pg_timing_events--unpackaged--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_timing_events
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_timing_events/pg_timing_events--1.0.sql b/contrib/pg_timing_events/pg_timing_events--1.0.sql
new file mode 100644
index 000..ccf6372
--- /dev/null
+++ b/contrib/pg_timing_events/pg_timing_events--1.0.sql
@@ -0,0 +1,43 @@
+/* contrib/pg_timing_events/pg_timing_events--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_timing_events" to load this file. \quit
+
+-- Register functions.
+CREATE FUNCTION pg_timing_events_reset()
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C;
+
+CREATE FUNCTION pg_timing_events(
+   OUT seq int8,
+OUT event text
+--OUT occured timestamp
+
+--OUT userid oid,

Re: [HACKERS] Doc patch making firm recommendation for setting the value of commit_delay

2012-11-15 Thread Magnus Hagander
On Thu, Nov 15, 2012 at 9:56 AM, Greg Smith g...@2ndquadrant.com wrote:
 On 11/15/12 12:19 AM, Albe Laurenz wrote:

 If there is an agreement that half the sync time as reported by
 pg_test_fsync is a good value, would it make sense to have initdb test
 sync time and preset commit_delay?


 Peter has validated this against a good range of systems, but it would be
 optimistic to say it's a universal idea.

 My main concern with this would be the relatively common practice of moving
 the pg_xlog directory after initdb time.  Sometimes people don't know about
 the option to have initdb move it.  Sometimes the drive to hold pg_xlog
 isn't available when the database is originally created yet.  And the camp I
 fall into (which admittedly is the group who can take care of this on their
 own) will move pg_xlog manually and symlink it on their own, because that's
 what we're used to.

An even more common use case for this, I think, is "I installed from a
package that ran initdb for me". I still think manual moving of
pg_xlog is a lot more common than using the initdb option in general.


 I would rather see this just turn into one of the things a more general
 tuning tool knew how to do, executing against a fully setup system. Having a
 useful implementation of commit_delay and useful docs on it seems like
 enough of a jump forward for one release.  Moving fully into auto-tuning
 before getting more field feedback on how that works out is pretty
 aggressive.

+1.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] feature proposal - triggers by semantics

2012-11-15 Thread Michael Paquier
A row-level trigger for TRUNCATE does not really make sense, as it would
mean that TRUNCATE needs to scan each tuple of the table it needs to
interact with to fire its trigger, so it would more or less achieve the
same performance as a plain "DELETE FROM table;".

TRUNCATE is performant because it only removes the tuples, and does not
care about scanning. I am also not sure it is meant for scanning (btw there
are discussions about TRUNCATE these days so I might be missing something).
So, what you are looking for can be simply solved with row-level triggers
on DELETE, so why not use that and avoid reinventing the wheel?
-- 
Michael Paquier
http://michael.otacoo.com


Re: [HACKERS] feature proposal - triggers by semantics

2012-11-15 Thread Hannu Krosing

On 11/15/2012 09:48 AM, Craig Ringer wrote:

If you want to prevent TRUNCATE, deny the privilege or add a trigger
that aborts the command.

You can abort the transaction but not skip the action, as currently it is only
possible to skip in ROW-level triggers.

So I'd modify this request to allow BEFORE EACH STATEMENT triggers
to also be able to silently skip the current action, like BEFORE EACH ROW
triggers can.


Then this request could be satisfied by a simple trigger which
rewrites TRUNCATE into DELETE.

Hannu



TRUNCATE should do what it says, not magically be changed to do
something else by a GUC.

Or alternately/also provide extra syntax to TRUNCATE itself where one
can specify which behavior to have, and both options can be given
explicitly to override any config option.

What advantage would that have over using DELETE when you want DELETE,
and TRUNCATE when you want TRUNCATE?







Re: [HACKERS] Materialized views WIP patch

2012-11-15 Thread Greg Smith

On 11/14/12 6:28 PM, Kevin Grittner wrote:

  - Documentation is incomplete.
  ...
  - There are no regression tests yet.


Do you have any simple test cases you've been using that you could attach? 
With epic new features like this, when things don't work it's hard to 
distinguish between "that just isn't implemented yet" and "the author 
never tested that".  Having some known-good samples you have tested, 
even if they're not proper regression tests, would be helpful for 
establishing that the code baseline works.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] WIP patch for hint bit i/o mitigation

2012-11-15 Thread Amit Kapila
On Thursday, November 15, 2012 2:02 AM Atri Sharma wrote:
On Thu, Nov 15, 2012 at 12:42 AM, Atri Sharma atri.j...@gmail.com wrote:

On Wed, Nov 7, 2012 at 9:47 PM, Merlin Moncure mmonc...@gmail.com wrote:

On Wed, Nov 7, 2012 at 6:01 AM, Amit Kapila amit.kap...@huawei.com wrote:
  Following the sig is a first cut at a patch (written by Atri) that




PFA below the sig the updated patch for the same. It maintains a cache,
cachedVisibilityXid, which holds the result of the clog visibility check;
cachedVisibilityXidStatus holds the commit status of the XID in
cachedVisibilityXid.

In each visibility function (except HeapTupleSatisfiesVacuum()), an
additional check has been added to see whether the commit status of the Xmin
or Xmax of a tuple can be retrieved from the cache.



 

1.  From your explanation and code, it is quite clear that it will
certainly give performance benefits in the scenarios mentioned by you.

   I can validate the performance numbers once again and do the code
review for this patch during CF-3.

   However, I am just not very sure whether the use case is a
sufficient one.

   So I would like to ask opinion of other people as well.

 


2.  After this patch, tuple hint bits are not set by SELECT operations on
data populated by one transaction.

  This appears to be good, as it will save many ops (page dirty followed
by flush, clog inquiry).

  Though it is not apparent, we should see whether it can cause
any other impact, for example:

a. Maybe now VACUUM needs to set the hint bits, which may cause more
I/O during VACUUM.

 

   Hackers, any opinion/suggestions about the use case?

With Regards,

Amit Kapila.

 



Re: [HACKERS] Doc patch making firm recommendation for setting the value of commit_delay

2012-11-15 Thread Peter Geoghegan
On 15 November 2012 10:04, Magnus Hagander mag...@hagander.net wrote:
 I would rather see this just turn into one of the things a more general
 tuning tool knew how to do, executing against a fully setup system. Having a
 useful implementation of commit_delay and useful docs on it seems like
 enough of a jump forward for one release.  Moving fully into auto-tuning
 before getting more field feedback on how that works out is pretty
 aggressive.

 +1.

I am inclined to agree. I did attempt to use the instrumentation
macros to have commit_delay set adaptively at runtime, which would
have at least addressed this concern, but that just didn't work as
well as this.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] feature proposal - triggers by semantics

2012-11-15 Thread Craig Ringer
On 11/15/2012 06:25 PM, Hannu Krosing wrote:
 On 11/15/2012 09:48 AM, Craig Ringer wrote:
 If you want to prevent TRUNCATE, deny the privilege or add a trigger
 that aborts the command.
 You can abort the transaction but not skip action as currently it is only
 possible to skip in ROW level triggers.

 So I'd modify this request to allow BEFORE EACH STATEMENT triggers
 to also be able to silently skip current action like BEFORE EACH ROW
 triggers can.

 Then this request would simply be satisfied by a simple trigger which
 rewrites TRUNCATE into DELETE .
That seems sensible to me, too.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Switching timeline over streaming replication

2012-11-15 Thread Heikki Linnakangas
Here's an updated version of this patch, rebased with master, including 
the recent replication timeout changes, and some other cleanup.


On 12.10.2012 09:34, Amit Kapila wrote:

The test is finished from my side.

one more issue:

 ...

./pg_basebackup -P -D ../../data_sub -X fetch -p 2303
pg_basebackup: COPY stream ended before last file was finished


Fixed this.

However, the test scenario you point to here: 
http://archives.postgresql.org/message-id/00a801cda6f3$4aba27b0$e02e7710$@kap...@huawei.com 
still seems to be broken, although I get a different error message now. 
I'll dig into this..


- Heikki


streaming-tli-switch-5.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Doc patch making firm recommendation for setting the value of commit_delay

2012-11-15 Thread Peter Geoghegan
On 15 November 2012 08:56, Greg Smith g...@2ndquadrant.com wrote:
 My main concern with this would be the relatively common practice of moving
 the pg_xlog directory after initdb time.

I probably should have increased the default number of seconds that
pg_test_fsync runs for, in light of the fact that that can make an
appreciable difference when following this method. I'd suggest that a
value of 5 be used there. I think we should just change that, since
everyone will just use the default anyway (or, the more cautious ones
will use a higher value than the default, if anything).

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] Switching timeline over streaming replication

2012-11-15 Thread Heikki Linnakangas

On 10.10.2012 17:26, Amit Kapila wrote:

36. +SendTimeLineHistory(TimeLineHistoryCmd *cmd)
{
..
	if (nread <= 0)
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not read file \"%s\": %m",
+						path)));

FileClose should be done in error case as well.


Hmm, I think you're right. The straightforward fix to just call 
FileClose() before the ereport()s in that function would not be enough, 
though. You might run out of memory in pq_sendbytes(), for example, 
which would throw an error. We could use PG_TRY/CATCH for this, but 
seems like overkill. Perhaps the simplest fix is to use a global 
(static) variable for the fd, and clean it up in WalSndErrorCleanup().


This is a fairly general issue, actually. Looking around, I can see at 
least two similar cases in existing code, with BasicOpenFile, where we 
will leak file descriptors on error:


copy_file: there are several error cases, including out-of-disk space, 
with no attempt to close the fds.


XLogFileInit: again, no attempt to close the file descriptor on failure. 
This is called at checkpoint from the checkpointer process, to 
preallocate new xlog files.


Given that we haven't heard any complaints of anyone running into these, 
these are not a big deal in practice, but in theory at least the 
XLogFileInit leak could lead to serious problems, as it could cause the 
checkpointer to run out of file descriptors.


- Heikki




Re: [HACKERS] Switching timeline over streaming replication

2012-11-15 Thread Heikki Linnakangas

On 15.11.2012 12:44, Heikki Linnakangas wrote:

Here's an updated version of this patch, rebased with master, including
the recent replication timeout changes, and some other cleanup.

On 12.10.2012 09:34, Amit Kapila wrote:

The test is finished from my side.

one more issue:

  ...

./pg_basebackup -P -D ../../data_sub -X fetch -p 2303
pg_basebackup: COPY stream ended before last file was finished


Fixed this.

However, the test scenario you point to here:
http://archives.postgresql.org/message-id/00a801cda6f3$4aba27b0$e02e7710$@kap...@huawei.com
still seems to be broken, although I get a different error message now.
I'll dig into this..


Ok, here's an updated patch again, with that bug fixed.

- Heikki


streaming-tli-switch-6.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Switching timeline over streaming replication

2012-11-15 Thread Amit Kapila
On Thursday, November 15, 2012 6:05 PM Heikki Linnakangas wrote:
 On 15.11.2012 12:44, Heikki Linnakangas wrote:
  Here's an updated version of this patch, rebased with master,
  including the recent replication timeout changes, and some other
 cleanup.
 
  On 12.10.2012 09:34, Amit Kapila wrote:
  The test is finished from myside.
 
  one more issue:
...
  ./pg_basebackup -P -D ../../data_sub -X fetch -p 2303
  pg_basebackup: COPY stream ended before last file was finished
 
  Fixed this.
 
  However, the test scenario you point to here:
  http://archives.postgresql.org/message-id/00a801cda6f3$4aba27b0$e02e77
  10$@kap...@huawei.com still seems to be broken, although I get a
  different error message now.
  I'll dig into this..
 
 Ok, here's an updated patch again, with that bug fixed.

I shall review and test the updated Patch in Commit Fest.

With Regards,
Amit Kapila.





Re: [HACKERS] Hash id in pg_stat_statements

2012-11-15 Thread Magnus Hagander
On Tue, Oct 2, 2012 at 8:22 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 2 October 2012 18:16, Tom Lane t...@sss.pgh.pa.us wrote:
 1. Why isn't something like md5() on the reported query text an equally
 good solution for users who want a query hash?

 Because that does not uniquely identify the entry. The very first
 thing that the docs say on search_path is "Qualified names are tedious
 to write, and it's often best not to wire a particular schema name
 into applications anyway". Presumably, the reason it's best not to
 wire schema names into apps is because it might be useful to modify
 search_path in a way that dynamically made the same queries in some
 application reference what are technically distinct relations. If
 anyone does this, and it seems likely that many do for various
 reasons, they will be out of luck when using some kind of
 pg_stat_statements aggregation.

 This was the behaviour that I intended for pg_stat_statements all
 along, and I think it's better than a solution that matches query
 strings.

 2. If people are going to accumulate stats on queries over a long period
 of time, is a 32-bit hash really good enough for the purpose?  If I'm
 doing the math right, the chance of collision is already greater than 1%
 at 10000 queries, and rises to about 70% for 100000 queries; see
 http://en.wikipedia.org/wiki/Birthday_paradox
 We discussed this issue and decided it was okay for pg_stat_statements's
 internal hash table, but it's not at all clear to me that it's sensible
 to use 32-bit hashes for external accumulation of query stats.

 Well, forgive me for pointing this out, but I did propose that the
 hash be a 64-bit value (which would have necessitated adapting
 hash_any() to produce 64-bit values), but you rejected the proposal. I
 arrived at the same probability for a collision as you did and posted
 it to the list, in discussion shortly after the normalisation stuff
 was committed.

What was the argument for rejecting it? Just that it required
hash_any() to be adapted?

Now that we have a very clear use case where this would help, perhaps
it's time to re-visit this proposal?
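The collision figures under discussion follow from the usual birthday-paradox approximation, p ≈ 1 - exp(-n^2 / 2d) with d = 2^32 buckets; a quick sanity check (illustrative only):

```python
import math

def collision_prob(n, bits=32):
    """Birthday-paradox approximation: chance of at least one collision
    among n items hashed uniformly into 2**bits buckets."""
    d = 2 ** bits
    return 1 - math.exp(-n * n / (2 * d))

# ~1% collision chance around 10,000 queries, ~70% around 100,000,
# matching the figures quoted from Tom's mail for a 32-bit hash.
print(round(collision_prob(10_000), 3))   # -> 0.012
print(round(collision_prob(100_000), 2))  # -> 0.69
```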

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] Hash id in pg_stat_statements

2012-11-15 Thread Peter Geoghegan
On 15 November 2012 13:10, Magnus Hagander mag...@hagander.net wrote:
 Well, forgive me for pointing this out, but I did propose that the
 hash be a 64-bit value (which would have necessitated adapting
 hash_any() to produce 64-bit values), but you rejected the proposal. I
 arrived at the same probability for a collision as you did and posted
 it to the list, in discussion shortly after the normalisation stuff
 was committed.

 What was the argument for rejecting it? Just that it required
 hash_any() to be adapted?

I think so, yes. It just wasn't deemed necessary. Note that hash_any()
has a comment that says something like "if you want to adapt this so
that the Datum returned is a 64-bit integer, the way to do that
is ...". I followed this advice in a revision of the pg_stat_statements
normalisation patch, and preserved compatibility by using a macro,
which Tom objected to.

 Now that we have a very clear use case where this would help, perhaps
 it's time to re-visit this proposal?

Perhaps. I think that the issue of whether or not a 64-bit integer is
necessary is totally orthogonal to the question of whether or not we
should expose the hash. The need to aggregate historical statistics
just doesn't appreciably alter things here, I feel. The number of
discrete queries that an application will execute in a week just isn't
that different from the number that it will ever execute, I suspect.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




[HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-11-15 Thread Amit kapila
On Monday, November 12, 2012 8:23 PM Fujii Masao wrote:
On Fri, Nov 9, 2012 at 3:03 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Thursday, November 08, 2012 10:42 PM Fujii Masao wrote:
 On Thu, Nov 8, 2012 at 5:53 PM, Amit Kapila amit.kap...@huawei.com
 wrote:
  On Thursday, November 08, 2012 2:04 PM Heikki Linnakangas wrote:
  On 19.10.2012 14:42, Amit kapila wrote:
   On Thursday, October 18, 2012 8:49 PM Fujii Masao wrote:

 Are you planning to introduce the timeout mechanism in pg_basebackup
 main process? Or background process? It's useful to implement both.

 By background process, you mean ReceiveXlogStream?
 For both.

 I think for background process, it can be done in a way similar to what we
 have done for walreceiver.

 Yes.

 But I have some doubts for how to do for main process:

 Logic similar to walreceiver's cannot be used in case the network goes down
 while getting other database files from the server.
 The reason is that to receive the data files, PQgetCopyData() is
 called in synchronous mode, so it keeps waiting indefinitely until it
 gets some data.
 In order to solve this issue, I can think of following options:
 1. Making this call also asynchronous (but not sure about the impact of this).

 +1

 Walreceiver already calls PQgetCopyData() asynchronously. ISTM you can
 solve the issue in the similar way to walreceiver's.

 2. In function pqWait, instead of passing hard-code value -1 (i.e. infinite
 wait), we can send some finite time. This time can be received as command
 line argument
 from respective utility and set the same in PGconn structure.

 Yes, I think that we should add something like --conninfo option to
 pg_basebackup
 and pg_receivexlog. We can easily set not only connect_timeout but also 
 sslmode,
 application_name, ... by using such option accepting conninfo string.

I have prepared an attached patch to make pg_basebackup and pg_receivexlog 
non-blocking.
To do so I had to add new command line parameters to pg_basebackup and 
pg_receivexlog;
for now I have added two more command line arguments: 
a. -r  for pg_basebackup and pg_receivexlog to take the receive timeout 
value. The default value for this parameter is 60 sec. 
b. -t  for pg_basebackup and pg_receivexlog to take the initial 
connection timeout value. The default value is infinite wait. 
We can change to accept --conninfo as well. 

I feel that apart from the above, the remaining problem is with the function call PQgetResult():
1. Wherever a query is sent from BaseBackup, it calls the function 
PQgetResult to receive the result of the query. 
As PQgetResult() is a blocking function (it calls pqWait, which can hang), 
if the network goes down before the query is even sent, 
there will not be any result, so it will keep hanging in PQgetResult. 
IMO, it can be solved in the below ways:
a. Create a corresponding non-blocking function. But this function is being 
called from inside some of the 
 other libpq functions (PQexec->PQexecFinish->PQgetResult), so it can be 
a little tricky to solve this way.
b. Add a receive_timeout variable in the PGconn structure and use it in pqWait 
for the timeout whenever it is set.
c. Any other better way?


 BTW, IIRC the walsender has no timeout mechanism during sending
 backup data to pg_basebackup. So it's also useful to implement the
 timeout mechanism for the walsender during backup.


What about using pq_putmessage_noblock()?

I think maybe some more functions also need to be made non-blocking. I am still 
evaluating.

I will upload the attached patch to the commitfest if you don't have any objections.

More Suggestions/Comments?

With Regards,
Amit Kapila.

noblock_basebackup_and_receivexlog.patch
Description: noblock_basebackup_and_receivexlog.patch



Re: [HACKERS] [PATCH] Patch to compute Max LSN of Data Pages

2012-11-15 Thread Robert Haas
On Thu, Nov 15, 2012 at 12:08 AM, Amit Kapila amit.kap...@huawei.com wrote:
 Okay.
 So as Robert and Alvaro suggested to have it separate utility rather than
 having options in pg_resetxlog to print MAX LSN seems to be quite
 appropriate.
 I am planning to update the patch to make it a separate utility as
 pg_computemaxlsn with options same as what I have proposed for pg_resetxlog
 to print MAX LSN.
 So considering it a separate utility there can be 2 options:
 a. have a utility in contrib.
 b. have a utility in bin similar to pg_resetxlog

I guess I'd vote for contrib, but I wouldn't be crushed if it went the
other way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH 08/14] Store the number of subtransactions in xl_running_xacts separately from toplevel xids

2012-11-15 Thread Simon Riggs
On 14 November 2012 22:17, Andres Freund and...@2ndquadrant.com wrote:

 To avoid complicating the logic we store both the toplevel and the subxids in
 ->xip, first the ->xcnt toplevel ones, and then the ->subxcnt subxids.

That looks good, not much change. Will apply in next few days. Please
add me as committer and mark ready.

 Also skip logging any subxids if the snapshot is suboverflowed, they aren't
 useful in that case anyway.

 This allows to make some operations cheaper and it allows faster startup for
 the future logical decoding feature because that doesn't care about
 subtransactions/suboverflow'edness.

...but please don't add extra touches of Andres magic along the way.
Doing that will just slow down patch acceptance and it's not important.
I suggest to keep note of things like that and come back to them
later.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH 08/14] Store the number of subtransactions in xl_running_xacts separately from toplevel xids

2012-11-15 Thread Andres Freund
On 2012-11-15 09:07:23 -0300, Simon Riggs wrote:
 On 14 November 2012 22:17, Andres Freund and...@2ndquadrant.com wrote:

  To avoid complicating the logic we store both the toplevel and the subxids in
  ->xip, first the ->xcnt toplevel ones, and then the ->subxcnt subxids.

 That looks good, not much change. Will apply in next few days. Please
 add me as committer and mark ready.

Cool. Will do.

  Also skip logging any subxids if the snapshot is suboverflowed, they aren't
  useful in that case anyway.

  This allows to make some operations cheaper and it allows faster startup for
  the future logical decoding feature because that doesn't care about
  subtransactions/suboverflow'edness.

 ...but please don't add extra touches of Andres magic along the way.
 Doing that will just slow down patch acceptance and its not important.
 I suggest to keep note of things like that and come back to them
 later.

Which magic are you talking about?

Only two parts changed in comparison to the previous situation. One is
that the following in ProcArrayApplyRecoveryInfo only applies to
toplevel transactions by virtue of ->xcnt now only containing the
toplevel transaction count:
 +   /*
 +    * Remove stale locks, if any.
 +    *
 +    * Locks are always assigned to the toplevel xid so we don't need to care
 +    * about subxcnt/subxids (and by extension not about ->suboverflowed).
 +    */
 StandbyReleaseOldLocks(running->xcnt, running->xids);

Note that there was no code change, just a change in meaning.

The other part is:
 +   /*
 +* Spin over procArray collecting all subxids, but only if there 
 hasn't
 +* been a suboverflow.
 +*/
 +   if (!suboverflowed)

Well, that's something that basically had to be decided either way when
writing the patch...

Greetings,

Andres

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH 02/14] Add support for a generic wal reading facility dubbed XLogReader

2012-11-15 Thread Heikki Linnakangas

On 15.11.2012 03:17, Andres Freund wrote:


Features:
- streaming reading/writing
- filtering
- reassembly of records

Reusing the ReadRecord infrastructure in situations where the code that wants
to do so is not tightly integrated into xlog.c is rather hard and would require
changes to rather integral parts of the recovery code which doesn't seem to be
a good idea.

Missing:
- compressing the stream when removing uninteresting records
- writing out correct CRCs
- separating reader/writer


I'm disappointed to see that there has been no progress on this patch 
since last commitfest. I thought we agreed on the approach I championed 
for here: 
http://archives.postgresql.org/pgsql-hackers/2012-09/msg00636.php. There 
wasn't much work left to finish that, I believe.


Are you going to continue working on this?

- Heikki




Re: [HACKERS] add -Wlogical-op to standard compiler options?

2012-11-15 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 I think it might be worth adding -Wlogical-op to the standard warning
 options (for supported compilers, determined by configure test).

Does that add any new warnings with the current source code, and if
so what?

regards, tom lane




[HACKERS] Re: [PATCH 02/14] Add support for a generic wal reading facility dubbed XLogReader

2012-11-15 Thread Andres Freund
On 2012-11-15 16:22:56 +0200, Heikki Linnakangas wrote:
 On 15.11.2012 03:17, Andres Freund wrote:
 
 Features:
 - streaming reading/writing
 - filtering
 - reassembly of records
 
 Reusing the ReadRecord infrastructure in situations where the code that wants
 to do so is not tightly integrated into xlog.c is rather hard and would 
 require
 changes to rather integral parts of the recovery code which doesn't seem to 
 be
 a good idea.
 
 Missing:
 - compressing the stream when removing uninteresting records
 - writing out correct CRCs
 - separating reader/writer

 I'm disappointed to see that there has been no progress on this patch since
 last commitfest. I thought we agreed on the approach I championed for here:
 http://archives.postgresql.org/pgsql-hackers/2012-09/msg00636.php. There
 wasn't much work left to finish that, I believe.

While I still think my approach is superior at this point I have
accepted that I haven't convinced anybody of that. I plan to port over
what I have submitted to Alvaro's version of your patch.

I have actually started that but I simply couldn't finish it in
time. The approach for porting I took didn't work all that well and I
plan to restart doing that after doing some review work.

 Are you going to continue working on this?

"This" being my version of XLogReader? No. The patch above is unchanged
except some very minor rebasing to recent wal changes by Tom. The reason
it's included in the series is simply that I haven't gotten rid of it yet
and the subsequent patches needed it. I do plan to continue working on a
rebased xlogdump version if nobody beats me to it (please do beat me!).

Ok?

The cover letter said:

 * Add support for a generic wal reading facility dubbed XLogReader
 There's some discussion about what's the best way to implement this in a
 separate CF topic.
 (unchanged)

I should have folded that in into the patch description, sorry.

Greetings,

Andres

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-11-15 Thread Heikki Linnakangas

On 23.10.2012 00:29, Alvaro Herrera wrote:

Here's an updated version of this patch, which also works in
an EXEC_BACKEND environment.  (I haven't tested this at all on Windows,
but I don't see anything that would create a portability problem there.)


Looks good at first glance. Fails on Windows, though:

C:\postgresql\pgsql.sln (default target) (1) -
C:\postgresql\auth_counter.vcxproj (default target) (29) -
(Link target) -
  auth_counter.obj : error LNK2001: unresolved external symbol 
UnBlockSig [C:\postgresql\auth_counter.vcxproj]
  .\Release\auth_counter\auth_counter.dll : fatal error LNK1120: 1 
unresolved externals [C:\postgresql\auth_counter.vcxproj]



C:\postgresql\pgsql.sln (default target) (1) -
C:\postgresql\worker_spi.vcxproj (default target) (77) -
  worker_spi.obj : error LNK2001: unresolved external symbol UnBlockSig 
[C:\postgresql\worker_spi.vcxproj]
  .\Release\worker_spi\worker_spi.dll : fatal error LNK1120: 1 
unresolved externals [C:\postgresql\worker_spi.vcxproj]


Marking UnBlockSig with PGDLLIMPORT fixes that. But I wonder if it's a 
good idea to leave unblocking signals the responsibility of the user 
code in the first place? That seems like the kind of low-level stuff 
that you want to hide from extension writers.


Oh, and this needs docs.

- Heikki




Re: [HACKERS] [PATCH 02/14] Add support for a generic wal reading facility dubbed XLogReader

2012-11-15 Thread Alvaro Herrera
Heikki Linnakangas wrote:
 On 15.11.2012 03:17, Andres Freund wrote:
 
 Features:
 - streaming reading/writing
 - filtering
 - reassembly of records
 
 Reusing the ReadRecord infrastructure in situations where the code that wants
 to do so is not tightly integrated into xlog.c is rather hard and would 
 require
 changes to rather integral parts of the recovery code which doesn't seem to 
 be
 a good idea.
 
 Missing:
 - compressing the stream when removing uninteresting records
 - writing out correct CRCs
 - separating reader/writer
 
 I'm disappointed to see that there has been no progress on this
 patch since last commitfest. I thought we agreed on the approach I
 championed for here:
 http://archives.postgresql.org/pgsql-hackers/2012-09/msg00636.php.
 There wasn't much work left to finish that, I believe.
 
 Are you going to continue working on this?

I worked a bit more on that patch of yours, but I neglected to submit
it.  Did you have something in particular that you wanted changed in it?

-- 
Álvaro Herrera        http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Switching timeline over streaming replication

2012-11-15 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 This is a fairly general issue, actually. Looking around, I can see at 
 least two similar cases in existing code, with BasicOpenFile, where we 
 will leak file descriptors on error:

Um, don't we automatically clean those up during transaction abort?
If we don't, we ought to think about that, not about cluttering calling
code with certain-to-be-inadequate cleanup in error cases.

regards, tom lane




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-15 Thread Amit kapila
On Wednesday, November 14, 2012 12:24 AM Robert Haas wrote:
On Mon, Nov 12, 2012 at 10:59 PM, Amit kapila amit.kap...@huawei.com wrote:

 Uh, no, I don't think that's a good idea.  IMHO, what we should do is:

 1. Read postgresql.conf.auto and remember all the settings we saw.  If we see 
 something funky like an include directive, barf.
 2. Forget the value we remembered for the particular setting being changed.  
 Instead, remember the user-supplied new value for that parameter.
 3. Write a new postgresql.conf.auto based on the information remembered in 
 steps 1 and 2.

Attached patch contains the implementation for the above concept. 
It can be changed to adapt the file write based on GUC variables as described 
by me yesterday, or in some other better way.

Currently, to remember and forget, I have used the below concept:
1. Read postgresql.auto.conf into memory.
2. Parse the GUC_file for the exact location of the changed variable.
3. Update the changed variable in memory at the location found in step 2.
4. Write the postgresql.auto.conf.

Overall changes:
1. include dir in postgresql.conf at time of initdb
2. new built-in function pg_change_conf to change configuration settings
3. write file as per logic described above.

Some more things left are:
1. Transactional capability for the command, so that the rename of the .lock file to 
.auto.conf can be done at commit time.

I am planning to upload the attached patch for this CF.

Suggestions/Comments?
  
With Regards,
Amit Kapila.

pg_change_conf.patch
Description: pg_change_conf.patch



Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-11-15 Thread Alvaro Herrera
Heikki Linnakangas escribió:
 On 23.10.2012 00:29, Alvaro Herrera wrote:
 Here's an updated version of this patch, which also works in
 an EXEC_BACKEND environment.  (I haven't tested this at all on Windows,
 but I don't see anything that would create a portability problem there.)
 
 Looks good at first glance.

Thanks.

 Fails on Windows, though:
 
 C:\postgresql\pgsql.sln (default target) (1) -
 C:\postgresql\auth_counter.vcxproj (default target) (29) -
 (Link target) -
   auth_counter.obj : error LNK2001: unresolved external symbol
 UnBlockSig [C:\postgresql\auth_counter.vcxproj]
   .\Release\auth_counter\auth_counter.dll : fatal error LNK1120: 1
 unresolved externals [C:\postgresql\auth_counter.vcxproj]

Wow.  If that's the only problem it has on Windows, I am extremely
pleased.

Were you able to test the provided test modules?  Only now I realise
that they aren't very friendly because there's a hardcoded database name
in there (alvherre, not the wisest choice I guess), but they should at
least be able to run and not turn into a fork bomb due to being unable
to connect, for instance.

 Marking UnBlockSig with PGDLLIMPORT fixes that. But I wonder if it's
 a good idea to leave unblocking signals the responsibility of the
 user code in the first place? That seems like the kind of low-level
 stuff that you want to hide from extension writers.

Sounds sensible.

I am unsure about the amount of pre-cooked stuff we need to provide.
For instance, do we want some easy way to let the user code run
transactions?

 Oh, and this needs docs.

Hmm, yes it does.

-- 
Álvaro Herrera        http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] binary heap implementation

2012-11-15 Thread Robert Haas
On Wed, Nov 14, 2012 at 8:11 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote:
 Comments? Suggestions?

It could use a run through pgindent.  And the header comments are
separated by a blank line from the functions to which they are
attached, which is not project style.

+   if (heap->size + 1 == heap->space)
+   Assert("binary heap is full");

This doesn't really make any sense.  Assert("binary heap is full")
will never fail, because that expression is a character string which
is, therefore, not NULL.  I think you want Assert(heap->size <
heap->space).  Or you could use if (heap->size + 1 >= heap->space)
elog(LEVEL, message) if there's some reason this needs to be checked
in non-debug builds.

+   {
+   sift_down(heap, i);
+   }

Project style is to omit braces for a single-line body.  This comes up
a few other places as well.

Other than the coding style issues, I think this looks fine.  If you
can fix that up, I believe I could go ahead and commit this, unless
anyone else objects.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH 02/14] Add support for a generic wal reading facility dubbed XLogReader

2012-11-15 Thread Andres Freund
On 2012-11-15 11:50:37 -0300, Alvaro Herrera wrote:
 Heikki Linnakangas wrote:
  On 15.11.2012 03:17, Andres Freund wrote:
  
  Features:
  - streaming reading/writing
  - filtering
  - reassembly of records
  
  Reusing the ReadRecord infrastructure in situations where the code that 
  wants
  to do so is not tightly integrated into xlog.c is rather hard and would 
  require
  changes to rather integral parts of the recovery code which doesn't seem 
  to be
  a good idea.
  
  Missing:
  - compressing the stream when removing uninteresting records
  - writing out correct CRCs
  - separating reader/writer
 
  I'm disappointed to see that there has been no progress on this
  patch since last commitfest. I thought we agreed on the approach I
  championed for here:
  http://archives.postgresql.org/pgsql-hackers/2012-09/msg00636.php.
  There wasn't much work left to finish that, I believe.
 
  Are you going to continue working on this?

 I worked a bit more on that patch of yours, but I neglected to submit
 it.  Did you have something in particular that you wanted changed in it?

Could you push your newest version to your git repository or similar?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Switching timeline over streaming replication

2012-11-15 Thread Heikki Linnakangas

On 15.11.2012 16:55, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

This is a fairly general issue, actually. Looking around, I can see at
least two similar cases in existing code, with BasicOpenFile, where we
will leak file descriptors on error:


Um, don't we automatically clean those up during transaction abort?


Not the ones allocated with PathNameOpenFile or BasicOpenFile. Files 
allocated with AllocateFile() and OpenTemporaryFile() are cleaned up at 
abort.



If we don't, we ought to think about that, not about cluttering calling
code with certain-to-be-inadequate cleanup in error cases.


Agreed. Cleaning up at end-of-xact won't help walsender or other 
non-backend processes, though, because they don't do transactions. But a 
top-level ResourceOwner that's reset in the sigsetjmp() cleanup routine 
would work.


- Heikki




Re: [HACKERS] [PATCH 02/14] Add support for a generic wal reading facility dubbed XLogReader

2012-11-15 Thread Alvaro Herrera
Andres Freund wrote:
 On 2012-11-15 11:50:37 -0300, Alvaro Herrera wrote:

  I worked a bit more on that patch of yours, but I neglected to submit
  it.  Did you have something in particular that you wanted changed in it?
 
 Could you push your newest version to your git repository or similar?

Sadly, I cannot, because I had it on my laptop only and its screen died
this morning (well, actually it doesn't boot at all, so I can't use the
external screen either).  I'm trying to get it fixed as soon as possible
but obviously I have no idea when I will be able to get it back.  Most
likely I will have to go out and buy a 2.5" drive enclosure to get the
valuable stuff out of it.

-- 
Álvaro Herrera        http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] binary heap implementation

2012-11-15 Thread Alvaro Herrera
Robert Haas escribió:
 On Wed, Nov 14, 2012 at 8:11 AM, Abhijit Menon-Sen a...@2ndquadrant.com 
 wrote:
  Comments? Suggestions?
 
 It could use a run through pgindent.  And the header comments are
 separated by a blank line from the functions to which they are
 attached, which is not project style.

Also there are some comments in the .h file which we typically don't
carry.

 Other than the coding style issues, I think this looks fine.  If you
 can fix that up, I believe I could go ahead and commit this, unless
 anyone else objects.

Does this include the changes in nodeMergeappend.c?

-- 
Álvaro Herrera        http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-11-15 Thread Heikki Linnakangas

On 15.11.2012 17:10, Alvaro Herrera wrote:

Heikki Linnakangas escribió:

On 23.10.2012 00:29, Alvaro Herrera wrote:

Here's an updated version of this patch, which also works in
an EXEC_BACKEND environment.  (I haven't tested this at all on Windows,
but I don't see anything that would create a portability problem there.)


Looks good at first glance.


Thanks.


Fails on Windows, though:

C:\postgresql\pgsql.sln (default target) (1) -
C:\postgresql\auth_counter.vcxproj (default target) (29) -
(Link target) -
   auth_counter.obj : error LNK2001: unresolved external symbol
UnBlockSig [C:\postgresql\auth_counter.vcxproj]
   .\Release\auth_counter\auth_counter.dll : fatal error LNK1120: 1
unresolved externals [C:\postgresql\auth_counter.vcxproj]


Wow.  If that's the only problem it has on Windows, I am extremely
pleased.

Were you able to test the provided test modules?


I tested the auth_counter module, seemed to work. It counted all 
connections as successful, though, even when I tried to log in with an 
invalid username/database. Didn't try with an invalid password. And I 
didn't try worker_spi.



I am unsure about the amount of pre-cooked stuff we need to provide.
For instance, do we want some easy way to let the user code run
transactions?


Would be nice, of course. I guess it depends on how much work it would 
be to provide that. But we can leave that for later, once the base patch is in.


- Heikki




Re: [HACKERS] [PATCH] binary heap implementation

2012-11-15 Thread Will Crawford
Assert(!"description of error") is an idiom I've seen more than once,
although I'm not sure I understand why it's not written as Robert says with
the condition in the brackets (or as a print to STDERR followed by abort()
instead).


On 15 November 2012 15:11, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Nov 14, 2012 at 8:11 AM, Abhijit Menon-Sen a...@2ndquadrant.com
 wrote:
  Comments? Suggestions?

 It could use a run through pgindent.  And the header comments are
 separated by a blank line from the functions to which they are
 attached, which is not project style.

  +   if (heap->size + 1 == heap->space)
  +   Assert("binary heap is full");

  This doesn't really make any sense.  Assert("binary heap is full")
  will never fail, because that expression is a character string which
  is, therefore, not NULL.  I think you want Assert(heap->size <
  heap->space).  Or you could use if (heap->size + 1 >= heap->space)
  elog(LEVEL, message) if there's some reason this needs to be checked
  in non-debug builds.

 +   {
 +   sift_down(heap, i);
 +   }

 Project style is to omit braces for a single-line body.  This comes up
 a few other places as well.

 Other than the coding style issues, I think this looks fine.  If you
 can fix that up, I believe I could go ahead and commit this, unless
 anyone else objects.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company





Re: [HACKERS] [PATCH 02/14] Add support for a generic wal reading facility dubbed XLogReader

2012-11-15 Thread Heikki Linnakangas

On 15.11.2012 16:50, Alvaro Herrera wrote:

Heikki Linnakangas wrote:

On 15.11.2012 03:17, Andres Freund wrote:


Features:
- streaming reading/writing
- filtering
- reassembly of records

Reusing the ReadRecord infrastructure in situations where the code that wants
to do so is not tightly integrated into xlog.c is rather hard and would require
changes to rather integral parts of the recovery code which doesn't seem to be
a good idea.

Missing:
- compressing the stream when removing uninteresting records
- writing out correct CRCs
- separating reader/writer


I'm disappointed to see that there has been no progress on this
patch since last commitfest. I thought we agreed on the approach I
championed for here:
http://archives.postgresql.org/pgsql-hackers/2012-09/msg00636.php.
There wasn't much work left to finish that, I believe.

Are you going to continue working on this?


I worked a bit more on that patch of yours, but I neglected to submit
it.  Did you have something in particular that you wanted changed in it?


Off the top of my head, there were two open items with the patch as I 
submitted it:


1. Need to make sure it's easy to compile outside backend code. So that 
it's suitable for using in an xlogdump contrib module, for example.


2. do something about error reporting. In particular, xlogreader.c 
should not call emode_for_corrupt_record(), but we need to provide for 
that functionality somehow. I think I'd prefer xlogreader.c to not 
ereport() on a corrupt record. Instead, it would return an error string 
to the caller, which could then decide what to do with it. Translating 
the messages needs some thought, though.


Other than those, and cleanup of any obsoleted comments etc. and adding 
docs, I think it was good to go.


- Heikki




Re: [HACKERS] WIP patch for hint bit i/o mitigation

2012-11-15 Thread Merlin Moncure
On Thu, Nov 15, 2012 at 4:39 AM, Amit Kapila amit.kap...@huawei.com wrote:
In each visibility function (except HeapTupleSatisfiesVacuum() ), an
 addition check has been added to check if the commit status of Xmin or Xmax
 of a tuple can be retrieved from the cache.



 1.  From your explanation and code, it is quite clear that it will
  certainly give performance benefits in the scenarios mentioned by you.

I can once validate the performance numbers again and do the code
 review for this patch during CF-3.

However I am just not very sure about the use case, such that whether
 it is a sufficient use case.

So I would like to ask opinion of other people as well.

sure.  I'd like to note though that hint bit i/o is a somewhat common
complaint.  it tends to most affect OLAP style workloads.  in
pathological workloads, it can really burn you -- it's not fun when
you are i/o starved via sequential scan.  This can still happen when
sweeping dead records (which this patch doesn't deal with, though
maybe it should).

 2.  After this patch, tuple hint bit is not set by Select operations after
 data populated by one transaction.

   This appears to be good as it will save many ops (page dirty followed
 by flush, clog inquiry).

Technically it does not save clog fetch as transam.c has a very
similar cache mechanism.  However, it does save a page write i/o and a
lock on the page header, as well as a couple of other minor things.
In the best case, the page write is completely masked as the page gets
dirty for other reasons.  I think this is going to become more
important in checksum enabled scenarios.

   Though it is not apparent, we should see whether it can cause
 any other impact due to this:

 a. Maybe now VACUUM needs to set the hint bit, which may cause more
 I/O during VACUUM.

IMNSHO, deferring non-critical i/o from a foreground process to a
background process is generally good.  VACUUM has nice features like
i/o throttling and in-place cancel, so latent management of internal
page optimization flags ideally belongs there.  Also, the longer
you defer such I/O the more opportunity there is for it to be masked
off by some other page-dirtying operation (again, this is more
important in the face of having to log hint bit changes).

There could be some good rebuttal analysis though.

merlin




Re: [HACKERS] [PATCH] binary heap implementation

2012-11-15 Thread Andrew Dunstan


On 11/15/2012 10:11 AM, Robert Haas wrote:


+   {
+   sift_down(heap, i);
+   }

Project style is to omit braces for a single-line body.  This comes up
a few other places as well.



I thought we modified that some years ago, although my memory of it is a 
bit hazy. Personally I think it's a bad rule, at least as stated, in the 
case where there's an else clause with a compound statement. I'd prefer 
to see a rule that says that either both branches of an if/else should 
be compound statements or neither should be. I think the cases where 
only one branch is a compound statement are rather ugly.


cheers

andrew





Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-11-15 Thread Robert Haas
On Tue, Oct 16, 2012 at 4:47 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 The same basic strategy for sizing the tuplesort memtuples array
 also exists in tuplestore. I wonder if we should repeat this there? I
 suppose that that could follow later.

I think it'd be a good idea to either adjust tuplestore as well or
explain in the comments there why the algorithm is different, because
it has a comment which references this comment, and now the logic
won't be in sync in spite of a comment implying that it is.

I didn't initially understand why the proposed heuristic is
safe.  It's clear that we have to keep LACKMEM() from becoming true,
or we'll bail out of this code with elog(ERROR).  It will become true
if we increase the size of the array by a number of elements
which exceeds state->availMem / sizeof(SortTuple), but the actual
number of elements being added is memtupsize * allowedMem / memNowUsed
- memtupsize, which doesn't look closely related.  However, I think I
figured it out.  We know that memNowUsed >= memtupsize *
sizeof(SortTuple); substituting, we're adding no more than memtupsize
* allowedMem / (memtupsize * sizeof(SortTuple)) - memtupsize elements,
which simplifies to allowedMem / sizeof(SortTuple) - memtupsize =
(allowedMem - sizeof(SortTuple) * memtupsize) / sizeof(SortTuple).
Since we know availMem > allowedMem - sizeof(SortTuple) * memtupsize,
that means the number of new elements is less than
state->availMem/sizeof(SortTuple).  Whew.  In the attached version, I
updated the comment to reflect this, and also reversed the order of
the if/else block to put the shorter and more common case first, which
I think is better style.

I'm still not too sure why this part is OK:

/* This may not be our first time through */
if (newmemtupsize <= memtupsize)
return false;

Suppose we enlarge the memtuples array by something other than a
multiple of 2, and then we kick out all of the tuples that are
currently in memory and load a new group with a smaller average size.
ISTM that it could now appear that the memtuples array can be usefully
further enlarged, perhaps by as little as one tuple, and that
this could happen repeatedly.  I think we should either refuse to grow
the array unless we can increase it by at least, say, 10%, or else set
a flag after the final enlargement that acts as a hard stop to any
further growth.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


sortmem_grow-v4.patch
Description: Binary data



Re: [HACKERS] [PATCH] binary heap implementation

2012-11-15 Thread Robert Haas
On Thu, Nov 15, 2012 at 10:21 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Other than the coding style issues, I think this looks fine.  If you
 can fix that up, I believe I could go ahead and commit this, unless
 anyone else objects.

 Does this include the changes in nodeMergeAppend.c?

I think so.  I was going to double-check the performance before
committing, but it doesn't look like there should be a problem.
Coding style changes appear needed in that file as well, however.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] binary heap implementation

2012-11-15 Thread Andres Freund
On 2012-11-14 18:41:12 +0530, Abhijit Menon-Sen wrote:
 There are two or three places in the Postgres source that implement heap
 sort (e.g. tuplesort.c, nodeMergeAppend.c), and it's also needed by the
 BDR code. It seemed reasonable to factor out the functionality.

pg_dump also contains a binary heap implementation if memory serves
right, which makes me wonder whether we should try to make
binaryheap.[ch] clean of backend dependencies... It currently uses
palloc/pfree.

Too bad we don't have a memory allocation routine which easily works in
the backend and frontend... palloc references MemoryContextAlloc and
CurrentMemoryContext so that's not that easy to emulate.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH 02/14] Add support for a generic wal reading facility dubbed XLogReader

2012-11-15 Thread Alvaro Herrera
Heikki Linnakangas wrote:
 On 15.11.2012 16:50, Alvaro Herrera wrote:

 I worked a bit more on that patch of yours, but I neglected to submit
 it.  Did you have something in particular that you wanted changed in it?
 
 Off the top of my head, there were two open items with the patch
 as I submitted it:
 
 1. Need to make sure it's easy to compile outside backend code. So
 that it's suitable for using in an xlogdump contrib module, for
 example.
 
 2. do something about error reporting. In particular, xlogreader.c
 should not call emode_for_corrupt_record(), but we need to provide
 for that functionality somehow. I think I'd prefer xlogreader.c to
 not ereport() on a corrupt record. Instead, it would return an error
 string to the caller, which could then decide what to do with it.
 Translating the messages needs some thought, though.
 
 Other than those, and cleanup of any obsoleted comments etc. and
 adding docs, I think it was good to go.

Thanks.  I was toying with the idea that xlogreader.c should return a
status code to the caller, and additionally an error string; not all
error cases are equal.

Most of what I did (other than general cleanup) was moving some xlog.c
global vars into a private_data struct for xlogreader.c to pass around;
one problem I had was deciding what to do with curFileTLI and
LastFileTLI (IIRC), because they are used outside of the reader module
(they were examined after recovery finished).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WIP patch for hint bit i/o mitigation

2012-11-15 Thread Amit Kapila
On Thursday, November 15, 2012 9:27 PM Merlin Moncure wrote:
 On Thu, Nov 15, 2012 at 4:39 AM, Amit Kapila amit.kap...@huawei.com
 wrote:
 In each visibility function (except HeapTupleSatisfiesVacuum() ), an
  addition check has been added to check if the commit status of Xmin
 or Xmax
  of a tuple can be retrieved from the cache.
 
 
 
  1.  From your explanation and code, it is quite clear that it will
  certainly give performance benefits in the scenarios mentioned by
 you.
 
 I can once validate the performance numbers again and do the
 code
  review for this patch during CF-3.
 
 However I am just not very sure about the use case, such that
 whether
  it is a sufficient use case.
 
 So I would like to ask opinion of other people as well.
 
 sure.  I'd like to note though that hint bit i/o is a somewhat common
 complaint.  it tends to most affect OLAP style workloads.  in
 pathological workloads, it can really burn you -- it's not fun when
 you are i/o starved via sequential scan.  This can still happen when
 sweeping dead records (which this patch doesn't deal with, though
 maybe it should).
 
  2.  After this patch, tuple hint bit is not set by Select operations
 after
  data populated by one transaction.
 
This appears to be good as it will save many ops (page dirty
 followed
  by flush , clog inquiry).
 
 Technically it does not save clog fetch as transam.c has a very
 similar cache mechanism.  However, it does save a page write i/o and a
 lock on the page header, as well as a couple of other minor things.
 In the best case, the page write is completely masked as the page gets
 dirty for other reasons.  I think this is going to become more
 important in checksum enabled scenarios.
 
    Though it is not apparent, we should see whether it can
 cause
  any other impact due to this:
 
  a. Maybe now VACUUM needs to set the hint bit, which may
 cause more
  I/O during Vacuum.
 
 IMNSHO. deferring non-critical i/o from foreground process to
 background process is generally good.  

Yes, with regard to deferring you are right.
However, in this case, when the foreground process has to mark the page
dirty due to a hint bit, the page may already be dirty, so there is no
extra I/O; but when it is done by VACUUM, the page may not be dirty.

Also, due to the points below, doing it in VACUUM may cost more:
a. VACUUM has a ring buffer of fixed size, and if there are many such
pages, VACUUM needs to write out a page to replace the existing page
in the ring.
b. Considering that people sometimes want VACUUM to run when the system
is not busy, the chances of generating more overall I/O in the system
can be higher.

Why can't we avoid setting hint bits in VACUUM?
Is it because it has to be done somewhere, so that hint bits
are properly set?
Or maybe I am missing something trivial?

Though the VACUUM case I am talking about occurs rarely in practice,
the point came to my mind,
so I thought of sharing it with you.

 VACUUM has nice features like
 i/o throttling and in place cancel so latent management of internal
 page optimization flags really belong there ideally.  Also, the longer
 you defer such I/O the more opportunity there is for it to be masked
 off by some other page dirtying operation (again, this is more
 important in the face of having to log hint bit changes).
 
 There could be some good rebuttal analysis though.

With Regards,
Amit Kapila.





Re: [HACKERS] [PATCH] binary heap implementation

2012-11-15 Thread Alvaro Herrera
Andrew Dunstan escribió:
 
 On 11/15/2012 10:11 AM, Robert Haas wrote:
 
 +{
 +sift_down(heap, i);
 +}
 
 Project style is to omit braces for a single-line body.  This comes up
 a few other places as well.
 
 I thought we modified that some years ago, although my memory of it
 is a bit hazy.

No, we only modified pg_indent to not take the braces off, because
of PG_TRY blocks.  But we keep using single statements instead of
compound statements in many places, and there is no hard rule about
it.  To me, using braces where they are not needed is pointless and
ugly.  We're very careful about ensuring that our macro definitions
work nicely with single-statement if/else, for example.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] [PATCH 03/14] Add simple xlogdump tool

2012-11-15 Thread Peter Eisentraut
On 11/14/12 8:17 PM, Andres Freund wrote:
 diff --git a/src/bin/Makefile b/src/bin/Makefile
 index b4dfdba..9992f7a 100644
 --- a/src/bin/Makefile
 +++ b/src/bin/Makefile
 @@ -14,7 +14,7 @@ top_builddir = ../..
  include $(top_builddir)/src/Makefile.global
  
  SUBDIRS = initdb pg_ctl pg_dump \
 - psql scripts pg_config pg_controldata pg_resetxlog pg_basebackup
 + psql scripts pg_config pg_controldata pg_resetxlog pg_basebackup 
 xlogdump

should be pg_xlogdump

  
  ifeq ($(PORTNAME), win32)
  SUBDIRS += pgevent
 diff --git a/src/bin/xlogdump/Makefile b/src/bin/xlogdump/Makefile
 new file mode 100644
 index 000..d54640a
 --- /dev/null
 +++ b/src/bin/xlogdump/Makefile
 @@ -0,0 +1,25 @@
 +#-
 +#
 +# Makefile for src/bin/xlogdump
 +#
 +# Copyright (c) 1998-2012, PostgreSQL Global Development Group
 +#
 +# src/bin/pg_resetxlog/Makefile

fix that

 +#
 +#-
 +
 +PGFILEDESC = xlogdump
 +PGAPPICON=win32
 +
 +subdir = src/bin/xlogdump
 +top_builddir = ../../..
 +include $(top_builddir)/src/Makefile.global
 +
 +OBJS= xlogdump.o \
 +  $(WIN32RES)
 +
 +all: xlogdump
 +
 +
 +xlogdump: $(OBJS) $(shell find ../../backend ../../timezone -name objfiles.txt | xargs cat | tr -s " " "\012" | grep -v /main.o | sed 's/^/..\/..\/..\//')
 +	$(CC) $(CFLAGS) $^ $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)

This looks pretty evil, and there is no documentation about what it is
supposed to do.

Windows build support needs some thought.


 diff --git a/src/bin/xlogdump/xlogdump.c b/src/bin/xlogdump/xlogdump.c
 new file mode 100644
 index 000..0f984e4
 --- /dev/null
 +++ b/src/bin/xlogdump/xlogdump.c
 @@ -0,0 +1,468 @@
 +#include postgres.h
 +
 +#include unistd.h
 +
 +#include access/xlogreader.h
 +#include access/rmgr.h
 +#include miscadmin.h
 +#include storage/ipc.h
 +#include utils/memutils.h
 +#include utils/guc.h
 +
 +#include getopt_long.h
 +
 +/*
 + * needs to be declared because otherwise it's defined in main.c, which we
 + * cannot link from here.
 + */
 +const char *progname = "xlogdump";

Which may be a reason not to link with main.o.  We generally don't want
to hardcode the program name inside the program.

 +static void
 +usage(void)
 +{
 +	printf(_("%s reads/writes postgres transaction logs for debugging.\n\n"),
 +		   progname);
 +	printf(_("Usage:\n"));
 +	printf(_("  %s [OPTION]...\n"), progname);
 +	printf(_("\nOptions:\n"));
 +	printf(_("  -v, --version  output version information, then exit\n"));
 +	printf(_("  -h, --help     show this help, then exit\n"));
 +	printf(_("  -s, --start    from where recptr onwards to read\n"));
 +	printf(_("  -e, --end      up to which recptr to read\n"));
 +	printf(_("  -t, --timeline which timeline do we want to read\n"));
 +	printf(_("  -i, --inpath   from where do we want to read? cwd/pg_xlog is the default\n"));
 +	printf(_("  -o, --output   where to write [start, end]\n"));
 +	printf(_("  -f, --file     wal file to parse\n"));
 +}

Options list should be in alphabetic order (or some other less random
order).  Most of these descriptions are not very intelligible (at least
without additional documentation).

 +
 +int main(int argc, char **argv)
 +{
 + uint32 xlogid;
 + uint32 xrecoff;
 + XLogReaderState *xlogreader_state;
 + XLogDumpPrivateData private;
 + XLogRecPtr from = InvalidXLogRecPtr;
 + XLogRecPtr to = InvalidXLogRecPtr;
 + bool bad_argument = false;
 +
 + static struct option long_options[] = {
 +		{"help", no_argument, NULL, 'h'},
 +		{"version", no_argument, NULL, 'v'},

Standard letters for help and version are ? and V.

 +		{"start", required_argument, NULL, 's'},
 +		{"end", required_argument, NULL, 'e'},
 +		{"timeline", required_argument, NULL, 't'},
 +		{"inpath", required_argument, NULL, 'i'},
 +		{"outpath", required_argument, NULL, 'o'},
 +		{"file", required_argument, NULL, 'f'},
 +		{NULL, 0, NULL, 0}
 + };
 + int c;
 + int option_index;
 +
 +	memset(&private, 0, sizeof(XLogDumpPrivateData));
 +
 +	while ((c = getopt_long(argc, argv, "hvs:e:t:i:o:f:",

This could also be in a less random order.

 +							long_options, &option_index)) != -1)
 + {
 + switch (c)
 + {
 + case 'h':
 + usage();
 + exit(0);
 + break;
 + case 'v':
 +				printf("Version: 0.1\n");
 + exit(0);
 + break;

This should be the PostgreSQL version.


also:

no man page

no nls.mk




Re: [HACKERS] [PATCH] binary heap implementation

2012-11-15 Thread Robert Haas
On Thu, Nov 15, 2012 at 11:22 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2012-11-14 18:41:12 +0530, Abhijit Menon-Sen wrote:
 There are two or three places in the Postgres source that implement heap
 sort (e.g. tuplesort.c, nodeMergeAppend.c), and it's also needed by the
 BDR code. It seemed reasonable to factor out the functionality.

 pg_dump also contains a binary heap implementation if memory serves
 right which makes me wonder whether we should try binaryheap.[ch]
 backend clean... It currently uses palloc/pfree.

 Too bad we don't have a memory allocation routine which easily works in
 the backend and frontend... palloc references MemoryContextAlloc and
 CurrentMemoryContext so that's not that easy to emulate.

I think there's a clear need for a library of code that can be shared
by frontend and backend code.  I don't necessarily want to punt
everything into libpq, though.  What I wonder if we might do is create
something like src/lib/pgguts that contains routines for memory
allocation, error handling, stringinfos (so that you don't have to use
PQexpBuffer in the frontend and StringInfo in the backend), etc. and
make that usable by both front-end and back-end code.  I think that
would be a tremendously nice thing to have.

I don't think we should make it this patch's problem to do that, but
I'd be strongly in favor of such a thing if someone wants to put it
together.  It's been a huge nuisance for me in the past and all
indications are that the work you're doing on LR is just going to
exacerbate that problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH 03/14] Add simple xlogdump tool

2012-11-15 Thread Andres Freund
On 2012-11-15 11:31:55 -0500, Peter Eisentraut wrote:
 On 11/14/12 8:17 PM, Andres Freund wrote:
  diff --git a/src/bin/Makefile b/src/bin/Makefile
  index b4dfdba..9992f7a 100644
  --- a/src/bin/Makefile
  +++ b/src/bin/Makefile
  @@ -14,7 +14,7 @@ top_builddir = ../..
   include $(top_builddir)/src/Makefile.global
 
   SUBDIRS = initdb pg_ctl pg_dump \
  -   psql scripts pg_config pg_controldata pg_resetxlog pg_basebackup
  +   psql scripts pg_config pg_controldata pg_resetxlog pg_basebackup 
  xlogdump

 should be pg_xlogdump

Good point.

 
   ifeq ($(PORTNAME), win32)
   SUBDIRS += pgevent
  diff --git a/src/bin/xlogdump/Makefile b/src/bin/xlogdump/Makefile
  new file mode 100644
  index 000..d54640a
  --- /dev/null
  +++ b/src/bin/xlogdump/Makefile
  @@ -0,0 +1,25 @@
  +#-
  +#
  +# Makefile for src/bin/xlogdump
  +#
  +# Copyright (c) 1998-2012, PostgreSQL Global Development Group
  +#
  +# src/bin/pg_resetxlog/Makefile

 fix that

Ditto.

  +#
  +#-
  +
  +PGFILEDESC = xlogdump
  +PGAPPICON=win32
  +
  +subdir = src/bin/xlogdump
  +top_builddir = ../../..
  +include $(top_builddir)/src/Makefile.global
  +
  +OBJS= xlogdump.o \
  +$(WIN32RES)
  +
  +all: xlogdump
  +
  +
  +xlogdump: $(OBJS) $(shell find ../../backend ../../timezone -name 
  objfiles.txt|xargs cat|tr -s   \012|grep -v /main.o|sed 
  's/^/..\/..\/..\//')
  +   $(CC) $(CFLAGS) $^ $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)

 This looks pretty evil, and there is no documentation about what it is
 supposed to do.

There has been some talk about this before and this clearly isn't an
acceptable solution. The previously stated idea was to split off the
_desc routines so we don't need to link with the whole backend.

Alvaro started to work on that a bit:
http://archives.postgresql.org/message-id/1346268803-sup-9854%40alvh.no-ip.org

(What the above does is simply collect all backend object files, remove
main.o from that list and list them as dependencies.)

 Windows build support needs some thought.

I don't have the slightest clue how the windows build environment works,
is there still a problem if we only link to a very selected list of
backend object files? Or do we need to link them to some external
location?

  diff --git a/src/bin/xlogdump/xlogdump.c b/src/bin/xlogdump/xlogdump.c
  new file mode 100644
  index 000..0f984e4
  --- /dev/null
  +++ b/src/bin/xlogdump/xlogdump.c
  @@ -0,0 +1,468 @@
  +#include postgres.h
  +
  +#include unistd.h
  +
  +#include access/xlogreader.h
  +#include access/rmgr.h
  +#include miscadmin.h
  +#include storage/ipc.h
  +#include utils/memutils.h
  +#include utils/guc.h
  +
  +#include getopt_long.h
  +
  +/*
  + * needs to be declared because otherwise its defined in main.c which we 
  cannot
  + * link from here.
  + */
  +const char *progname = "xlogdump";

 Which may be a reason not to link with main.o.

Well, we're not linking to main.o which causes the problem, but yes,
really fixing this is definitely the goal, but not really possible yet.

  +static void
  +usage(void)
  +{
  +   printf(_(%s reads/writes postgres transaction logs for 
  debugging.\n\n),
  +  progname);
  +   printf(_(Usage:\n));
  +   printf(_(  %s [OPTION]...\n), progname);
  +   printf(_(\nOptions:\n));
  +   printf(_(  -v, --version  output version information, then 
  exit\n));
  +   printf(_(  -h, --help show this help, then exit\n));
  +   printf(_(  -s, --startfrom where recptr onwards to 
  read\n));
  +   printf(_(  -e, --end  up to which recptr to read\n));
  +   printf(_(  -t, --timeline which timeline do we want to 
  read\n));
  +   printf(_(  -i, --inpath   from where do we want to read? 
  cwd/pg_xlog is the default\n));
  +   printf(_(  -o, --output   where to write [start, end]\n));
  +   printf(_(  -f, --file wal file to parse\n));
  +}

 Options list should be in alphabetic order (or some other less random
 order).  Most of these descriptions are not very intelligible (at
 least without additional documentation).

True, it's noticeable that this mostly was a development tool. But it
shouldn't stay that way. There have been several bug reports of late
where a bin/pg_xlogdump would have been very helpful...

 This should be the PostgreSQL version.


 also:

 no man page

 no nls.mk

Will try to provide an actually submittable version once the
xlogreader situation is finalized and the _desc routines are split...

Thanks!

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WIP patch for hint bit i/o mitigation

2012-11-15 Thread Merlin Moncure
On Thu, Nov 15, 2012 at 10:25 AM, Amit Kapila amit.kap...@huawei.com wrote:
 IMNSHO. deferring non-critical i/o from foreground process to
 background process is generally good.

 Yes, in regard of deferring you are right.
 However in this case may be when foreground process has to mark page dirty
 due to hint-bit, it was already dirty so no extra I/O, but when it is done
 by VACUUM, page may not be dirty.

Yeah.  We can try to be smart and set the hint bits in that case.  Not
sure that will work out with checksums having to WAL-log hint bits
though (which by reading the checksum threads seems likely to happen).

 Also due to below points, doing it in VACUUM may cost more:
 a. VACUUM has ring-buffer of fixed size and if such pages are many then
 write of page needs to be done by VACUUM to replace existing page
in ring.

Sure, although in scans we are using a ring buffer as well, so in a
practical sense the results are pretty close.

 b. Considering sometimes people want VACUUM to run when system is not busy,
 the chances of generating more overall I/O in system can be
more.

It's hard to imagine overall i/o load increasing.  However, longer
vacuum times should be considered.  A bigger issue is that we are
missing an early opportunity to set the all-visible bit. VACUUM is
doing:

if (all_visible)
{
	TransactionId xmin;

	if (!(tuple.t_data->t_infomask & HEAP_XMIN_COMMITTED))
	{
		all_visible = false;
		break;
	}

assuming the cache is working and vacuum rolls around after a scan,
you lose the opportunity to set the all_visible flag where previously it
would have been set, thereby dismantling the positive effect of an
index-only scan.  AFAICT, this is the only case where vacuum is
directly interfacing with hint bits.  This could be construed as a
violation of the heapam API?  Maybe if that's an issue we could proxy that
check to a heaptuple/tqual.c maintained function (in the same manner
as HeapTupleSetHintBits) so that the cache bit could be uniformly
checked.

All other *setting* of hint bits is running through SetHintBits(), so
I think we are safe from vacuum point of view.  That's another thing
to test for though.

merlin




Re: [HACKERS] [PATCH] binary heap implementation

2012-11-15 Thread Andres Freund
On 2012-11-15 11:37:16 -0500, Robert Haas wrote:
 On Thu, Nov 15, 2012 at 11:22 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  On 2012-11-14 18:41:12 +0530, Abhijit Menon-Sen wrote:
  There are two or three places in the Postgres source that implement heap
  sort (e.g. tuplesort.c, nodeMergeAppend.c), and it's also needed by the
  BDR code. It seemed reasonable to factor out the functionality.
 
  pg_dump also contains a binary heap implementation if memory serves
  right which makes me wonder whether we should try binaryheap.[ch]
  backend clean... It currently uses palloc/pfree.
 
  Too bad we don't have a memory allocation routine which easily works in
  the backend and frontend... palloc references MemoryContextAlloc and
  CurrentMemoryContext so that's not that easy to emulate.

 I think there's a clear need for a library of code that can be shared
 by frontend and backend code.  I don't necessarily want to punt
 everything into libpq, though.

Agreed.

 What I wonder if we might do is create
 something like src/lib/pgguts that contains routines for memory
 allocation, error handling, stringinfos (so that you don't have to use
 PQexpBuffer in the frontend and StringInfo in the backend), etc. and
 make that usable by both front-end and back-end code.  I think that
 would be a tremendously nice thing to have.

Yes. But it also sounds like quite a bit of work :(

 I don't think we should make it this patch's problem to do that

Me neither. I was thinking about letting the user allocate enough
memory, like:

binaryheap *heap = palloc(binaryheap_size(/* capacity */ 10));
binaryheap_init(heap, 10, comparator);

But that's pretty ugly.

 but
 I'd be strongly in favor of such a thing if someone wants to put it
 together.  It's been a huge nuisance for me in the past and all
 indications are that the work you're doing on LR is just going to
 exacerbate that problem.

I actually hope I won't have to fight with that all that much, but we
will see :/

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH 03/14] Add simple xlogdump tool

2012-11-15 Thread Jeff Janes
On Wed, Nov 14, 2012 at 5:17 PM, Andres Freund and...@2ndquadrant.com wrote:
 ---
  src/bin/Makefile|   2 +-
  src/bin/xlogdump/Makefile   |  25 +++
  src/bin/xlogdump/xlogdump.c | 468 
 
  3 files changed, 494 insertions(+), 1 deletion(-)
  create mode 100644 src/bin/xlogdump/Makefile
  create mode 100644 src/bin/xlogdump/xlogdump.c

Is this intended to be the successor of
https://github.com/snaga/xlogdump which will then be deprecated?

Thanks,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] binary heap implementation

2012-11-15 Thread Robert Haas
On Thu, Nov 15, 2012 at 11:50 AM, Andres Freund and...@2ndquadrant.com wrote:
 Me neither. I was thinking about letting the user allocate enough
 memory like:

 binaryheap *heap = palloc(binaryheap_size(/*capacity*/ 10));
 binaryheap_init(heap, 10, comparator);

 But that's pretty ugly.

Yeah.  I would vote against doing that for now.  We can always uglify
the code later if we decide it's absolutely necessary.

One trick that we could potentially use to make code run in the
frontend and backend is to put it in src/port.  That code gets
compiled twice, once with -DFRONTEND and once without.  So you can
use #ifdef to handle things that need to work different ways.  The
only real problem with that approach is that you end up putting the
code in libpgport where it seems rather out of place.  Still, maybe we
could have a src/framework directory that uses the same trick to
produce a libpgframework that frontend code can use, and a
libpgframework_srv that can be linked into the backend.  That might not
actually be that much work; we'd just need to clone all the existing
src/port logic.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH 03/14] Add simple xlogdump tool

2012-11-15 Thread Andres Freund
On 2012-11-15 09:06:23 -0800, Jeff Janes wrote:
 On Wed, Nov 14, 2012 at 5:17 PM, Andres Freund and...@2ndquadrant.com wrote:
  ---
   src/bin/Makefile|   2 +-
   src/bin/xlogdump/Makefile   |  25 +++
   src/bin/xlogdump/xlogdump.c | 468 
  
   3 files changed, 494 insertions(+), 1 deletion(-)
   create mode 100644 src/bin/xlogdump/Makefile
   create mode 100644 src/bin/xlogdump/xlogdump.c

 Is this intended to be the successor of
 https://github.com/snaga/xlogdump which will then be deprecated?

As-is this is just a development tool which was sorely needed for the
development of this patchset. But yes I think that once ready
(xlogreader infrastructure, *_desc routines split) it should
definitely be able to do most of what the above xlogdump can do and it
should live in bin/. I think mostly some filtering is missing.

That doesn't really deprecate the above though.

Does that answer your question?

Greetings,

Andres

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-11-15 Thread Peter Geoghegan
On 15 November 2012 16:09, Robert Haas robertmh...@gmail.com wrote:
[Lots of complicated commentary on tuplesort variables]
 Whew.  In the attached version, I
 updated the comment to reflect this, and also reversed the order of
 the if/else block to put the shorter and more common case first, which
 I think is better style.

Fine by me.

In conversation with Greg Stark in Prague, he seemed to think that
there was an integer overflow hazard in v3, which is, in terms of the
machine code it will produce, identical to your revision. He pointed
this out to me when we were moving between sessions, and I only
briefly looked over his shoulder, so while I don't want to
misrepresent what he said, I *think* he said that this could overflow:

+   newmemtupsize = memtupsize * allowedMem / memNowUsed;

Having only looked at this properly now, I don't think that that's
actually the case, but I thought that it should be noted. I'm sorry if
it's unfair to Greg to correct him, even though that's something he
didn't formally put on the record, and may have only looked at
briefly. I can see why it might have looked like a concern, since
we're assigning the result of an arithmetic expression involving long
variables to an int, newmemtupsize. The fact of the matter is that
we're already asserting that:

+   Assert(newmemtupsize <= state->memtupsize * 2);

Previously, we'd just have doubled memtupsize anyway. So any
eventuality in which newmemtupsize overflows ought to be logically
impossible.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-11-15 Thread Peter Geoghegan
On 15 November 2012 16:09, Robert Haas robertmh...@gmail.com wrote:
 I'm still not too sure why this part is OK:

 /* This may not be our first time through */
 if (newmemtupsize <= memtupsize)
 return false;

 Suppose we enlarge the memtuples array by something other than a
 multiple of 2, and then we kick out all of the tuples that are
 currently in memory and load a new group with a smaller average size.
 ISTM that it could now appear that the memtuples array can be usefully
 further enlarged, perhaps by as little as one tuple, and that
 this could happen repeatedly.  I think we should either refuse to grow
 the array unless we can increase it by at least, say, 10%, or else set
 a flag after the final enlargement that acts as a hard stop to any
 further growth.

I thought that the flag added to Tuplesortstate in earlier revisions
was a little bit ugly. I can see the concern though, and I suppose
that given the alternative is to add a heuristic, simply using a flag
may be the best way to proceed.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




[HACKERS] another idea for changing global configuration settings from SQL

2012-11-15 Thread Peter Eisentraut
Independent of the discussion of how to edit configuration files from
SQL, I had another idea for how many of the use cases for this could be handled.

We already have the ability to store in pg_db_role_setting configuration
settings for

specific user, specific database
specific user, any database
any user, specific database

The existing infrastructure would also support

any user, any database (= all the time)

All you'd need is to add

ApplySetting(InvalidOid, InvalidOid, relsetting, PGC_S_$SOMETHING);

in postinit.c, and have some SQL command to modify this setting.

The only things you couldn't handle that way are SIGHUP settings, but the
often-cited use cases work_mem, logging, etc. would work.

There would also be the advantage that pg_dumpall would save these settings.

Thoughts?




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-15 Thread Cédric Villemain
Le jeudi 15 novembre 2012 15:48:14, Amit kapila a écrit :
 On Wednesday, November 14, 2012 12:24 AM Robert Haas wrote:
 
 On Mon, Nov 12, 2012 at 10:59 PM, Amit kapila amit.kap...@huawei.com 
wrote:
  Uh, no, I don't think that's a good idea.  IMHO, what we should do is:
  
  1. Read postgresql.conf.auto and remember all the settings we saw.  If we
  see something funky like an include directive, barf. 2. Forget the value
  we remembered for the particular setting being changed.  Instead,
  remember the user-supplied new value for that parameter. 3. Write a new
  postgresql.conf.auto based on the information remembered in steps 1 and
  2.
 
 Attached patch contains the implementation for the above concept.
 It can be changed to adapt the write file based on GUC variables as
 described by me yesterday or in some other better way.
 
 Currently, to remember and forget, I have used the below concept:
 1. Read postgresql.auto.conf into memory.
 2. Parse the GUC_file for the exact location of the changed variable.
 3. Update the changed variable in memory at the location found in step 2.
 4. Write the postgresql.auto.conf.
 
 Overall changes:
 1. include dir in postgresql.conf at time of initdb
 2. new built-in function pg_change_conf to change configuration settings
 3. write file as per logic described above.
 
 Some more things left are:
 1. Transactional capability to command, so that rename of .lock file to
 .auto.conf can be done at commit time.
 
 I am planning to upload the attached for this CF.
 
 Suggestions/Comments?

Yes, sorry to jump here so late. 
* Why don't we use pg_settings ? (i.e. why not materialize the view and use 
it, it should be easier to have something transactional and also serializable 
with probably a DEFERRABLE select pg_reload_config() which mv the configuration 
file at commit time)
* Can I define automatic parameters to be loaded before and/or after the
non-automatic parameters in a convenient way (without editing files at all)?

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation




Re: [HACKERS] another idea for changing global configuration settings from SQL

2012-11-15 Thread Robert Haas
On Thu, Nov 15, 2012 at 12:53 PM, Peter Eisentraut pete...@gmx.net wrote:
 Independent of the discussion of how to edit configuration files from
 SQL, I had another idea how many of the use cases for this could be handled.

 We already have the ability to store in pg_db_role_setting configuration
 settings for

 specific user, specific database
 specific user, any database
 any user, specific database

 The existing infrastructure would also support

 any user, any database (= all the time)

 All you'd need is to add

 ApplySetting(InvalidOid, InvalidOid, relsetting, PGC_S_$SOMETHING);

 in postinit.c, and have some SQL command to modify this setting.

 The only thing you couldn't handle that way are SIGHUP settings, but the
 often-cited use cases work_mem, logging, etc. would work.

 There would also be the advantage that pg_dumpall would save these settings.

 Thoughts?

Personally, I think that would be wonderful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] another idea for changing global configuration settings from SQL

2012-11-15 Thread Cédric Villemain
Le jeudi 15 novembre 2012 18:53:15, Peter Eisentraut a écrit :
 Independent of the discussion of how to edit configuration files from
 SQL, I had another idea how many of the use cases for this could be
 handled.
 
 We already have the ability to store in pg_db_role_setting configuration
 settings for
 
 specific user, specific database
 specific user, any database
 any user, specific database
 
 The existing infrastructure would also support
 
 any user, any database (= all the time)
 
 All you'd need is to add
 
 ApplySetting(InvalidOid, InvalidOid, relsetting, PGC_S_$SOMETHING);
 
 in postinit.c, and have some SQL command to modify this setting.
 
 The only thing you couldn't handle that way are SIGHUP settings, but the
 often-cited use cases work_mem, logging, etc. would work.
 
 There would also be the advantage that pg_dumpall would save these
 settings.
 
 Thoughts?

I like the idea.
-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation




Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-11-15 Thread Robert Haas
On Thu, Nov 15, 2012 at 12:14 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 15 November 2012 16:09, Robert Haas robertmh...@gmail.com wrote:
 [Lots of complicated commentary on tuplesort variables]
 Whew.  In the attached version, I
 updated the comment to reflect this, and also reversed the order of
 the if/else block to put the shorter and more common case first, which
 I think is better style.

 Fine by me.

 In conversation with Greg Stark in Prague, he seemed to think that
 there was an integer overflow hazard in v3, which is, in terms of the
 machine code it will produce, identical to your revision. He pointed
 this out to me when we were moving between sessions, and I only
 briefly looked over his shoulder, so while I don't want to
 misrepresent what he said, I *think* he said that this could overflow:

 +   newmemtupsize = memtupsize * allowedMem / memNowUsed;

Ah, yeah.  I wondered in passing about that but forgot to follow up on
it.  The problem specifically is that the intermediate result
memtupsize * newmemtuples might overflow.  I believe that the old
memtupsize can never be more than 2^26 bytes, because the allocation
limit is 1GB and each SortTuple is 16 bytes.   So if this is done as a
32-bit calculation, we'll potentially overflow if allowedMem is more
than 64 bytes i.e. almost for sure.  If it is done as a 64-bit
calculation we'll potentially overflow if allowedMem is more than 2^38
bytes = 256GB.  The actual declared type is long which I assume is
probably 64 bits at least for most people these days, but maybe not
for everyone, and even 256GB is not necessarily safe.  The highest
value for work_mem I can set here is 2047GB.  So I think there is an
issue here, unless there's something wrong with my analysis.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] pg_ctl reload -o ....

2012-11-15 Thread Jeff Janes
If I want to change a parameter that affects an auxiliary process
(like bgwriter), I can usually get away with doing:

pg_ctl restart -o '--setting=new'

But sometimes I really need to avoid the restart, because it blows
away shared_buffers or for other reasons.

I can do pg_ctl reload, but that ignores the -o option (without
notice).  So to go the reload route, I have to edit postgresql.conf
before doing the reload.  This is quite tedious and error-prone,
especially in a script.

Is there a reason pg_ctl reload shouldn't honor -o?  Is there a
reasonable avenue to get it to do so?

Cheers,

Jeff




Re: [HACKERS] Timing events WIP v1

2012-11-15 Thread Josh Berkus

 Extending this to save the key/value set and most of the other data I
 mentioned before is pretty straightforward. 

Why not use Hstore?  Seriously?

It would require merging Hstore into core, but I don't necessarily see
that as a bad thing.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Materialized views WIP patch

2012-11-15 Thread Josh Berkus
Kevin,

 Attached is a patch that is still WIP but that I think is getting
 pretty close to completion. It is not intended to be the be-all and
 end-all for materialized views, but the minimum useful feature set --
 which is all that I've had time to do for this release. In
 particular, the view is only updated on demand by a complete rebuild.
 For the next release, I hope to build on this base to allow more
 eager and incremental updates, and perhaps a concurrent batch update.

Nice to see this come in!

 1.  CREATE MATERIALIZED VIEW syntax is stolen directly from CREATE
 TABLE AS, with all the same clauses supported. That includes
 declaring a materialized view to be temporary or unlogged.

What use would a temporary matview be?

Unlogged is good.

 2.  MVs don't support inheritance.

In which direction?  Can't inherit, or can't be inherited from?

 3.  MVs can't define foreign keys.
 4.  MVs can't be the target of foreign keys.
 5.  MVs can't have triggers.

Makes sense.

 9.  MVs can't directly be used in a COPY statement, but can be the
 source of data using a SELECT.

Hmmm? I don't understand the reason for this.

 13. pg_class now has a relisvalid column, which is true if an MV is
 truncated or created WITH NO DATA. You can not scan a relation
 flagged as invalid.

What error would a user see?

 14. ALTER MATERIALIZED VIEW is supported for the options that seemed
 to make sense. For example, you can change the tablespace or
 schema, but you cannot add or drop column with ALTER.

How would you change the definition of an MV then?

 16. To get new data into the MV, the command is LOAD MATERIALIZED
 VIEW matview_name. This seemed more descriptive to me than the
 alternatives and avoids declaring any new keywords beyond
 MATERIALIZED. If the MV is flagged as relisvalid == false, this
 will change it to true.

UPDATE MATERIALIZED VIEW was problematic?

Does LOAD automatically TRUNCATE the view before reloading it?  If not,
why not?

 It would be good to have some discussion to try to reach a consensus
 about whether we need to differentiate between *missing* data (where
 a materialized view which has been loaded WITH NO DATA or TRUNCATEd
 and has not been subsequently LOADed) and potentially *stale* data.
 If we don't care to distinguish between a view which generated no
 rows when it ran and one for which the query has not been run, we
 can avoid adding the relisvalid flag, and we could support UNLOGGED
 MVs. Perhaps someone can come up with a better solution to that
 problem.

Hmmm.  I understand the distinction you're making here, but I'm not sure
it actually matters to the user.  MVs, by their nature, always have
potentially stale data.  Being empty (in an inaccurate way) is just one
kind of stale data.

It would be nice for the user to have some way to know that a matview is
empty due to never being LOADed or recently being TRUNCATEd.  However, I
don't think that relisvalid flag -- and preventing scanning the relation
-- is a good solution.  What I'd rather have instead is a timestamp of
when the MV was last LOADed.  If the MV was never loaded (or was
truncated) that timestamp would be NULL.  Such a timestamp would allow
users to construct all kinds of ad-hoc refresh schemes for MVs which
would not be possible without it.

I don't see how this relates to UNLOGGED matviews either way.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] another idea for changing global configuration settings from SQL

2012-11-15 Thread Josh Berkus

 ApplySetting(InvalidOid, InvalidOid, relsetting, PGC_S_$SOMETHING);
 
 in postinit.c, and have some SQL command to modify this setting.
 
 The only thing you couldn't handle that way are SIGHUP settings, but the
 often-cited use cases work_mem, logging, etc. would work.
 
 There would also be the advantage that pg_dumpall would save these settings.

I think this is a great idea.

One caveat: we really, really, really need a system view which allows
DBAs to easily review settings defined for specific users and databases.
 Right now, it requires significant pg_catalog hacking expertise to pull
out user-specific settings.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] WIP patch: add (PRE|POST)PROCESSOR options to COPY

2012-11-15 Thread Robert Haas
On Wed, Nov 14, 2012 at 10:14 PM, Craig Ringer cr...@2ndquadrant.com wrote:
 So?  You're already handing the keys to the kingdom to anybody who can
 control the contents of that command line, even if it's only to point at
 the wrong program.  And one man's unexpected side-effect is another
 man's essential feature, as in my examples above.

 That's true if they're controlling the whole command, not so much if
 they just provide a file name. I'm just worried that people will use it
 without thinking deeply about the consequences, just like they do with
 untrusted client input in SQL injection attacks.

Yeah.  If we're going to do this at all, and I'm not convinced it's
worth the work, I think it's definitely good to support a variant
where we specify exactly the things that will be passed to exec().
There's just too many ways to accidentally shoot yourself in the foot
otherwise.  If we want to have an option that lets people shoot
themselves in the foot, that's fine.  But I think we'd be smart not to
make that the only option.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-11-15 Thread Peter Geoghegan
On 15 November 2012 18:13, Robert Haas robertmh...@gmail.com wrote:
 Ah, yeah.  I wondered in passing about that but forgot to follow up on
 it.  The problem specifically is that the intermediate result
 memtupsize * newmemtuples might overflow.  I believe that the old
 memtupsize can never be more than 2^26 bytes, because the allocation
 limit is 1GB and each SortTuple is 16 bytes.

Do you mean the intermediate result of memtupsize * allowedMem? Oh,
yeah, that could overflow rather easily on a platform where long is
only 32-bit. We're multiplying the entire current allocation size of
the array by the maximum length. I guess the fact that you didn't spot
it made me overconfident. :-)

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-11-15 Thread Robert Haas
On Thu, Nov 15, 2012 at 2:13 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 15 November 2012 18:13, Robert Haas robertmh...@gmail.com wrote:
 Ah, yeah.  I wondered in passing about that but forgot to follow up on
 it.  The problem specifically is that the intermediate result
 memtupsize * newmemtuples might overflow.  I believe that the old
 memtupsize can never be more than 2^26 bytes, because the allocation
 limit is 1GB and each SortTuple is 16 bytes.

 Do you mean the intermediate result of memtupsize * allowedMem?

Yes.

 Oh,
 yeah, that could overflow rather easily on a platform where long is
 only 32-bit.

Or even 64-bit, with really large values of work_mem.

 We're multiplying the entire current allocation size of
 the array by the maximum length. I guess the fact that you didn't spot
 it made me overconfident. :-)

Ha!

So what's next here?  Do you want to work on these issues some more?
Or does Jeff?  I'd like to see this go in, but I'm not sure I have the
bandwidth to do the legwork myself.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_ctl reload -o ....

2012-11-15 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 Is there a reason pg_ctl reload shouldn't honor -o ?

-o means "pass these switches on the postmaster's command line".
If you're not restarting the postmaster, you don't get to change
its command line.

IMO setting stuff on the command line is pretty evil anyway.
Adjusting postgresql.conf is almost always the Better Way.
We have discussions going already about how to make that easier,
so I'm not excited about trying to kluge something in pg_ctl.

regards, tom lane




Re: [HACKERS] WIP patch: add (PRE|POST)PROCESSOR options to COPY

2012-11-15 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Yeah.  If we're going to do this at all, and I'm not convinced it's
 worth the work, I think it's definitely good to support a variant
 where we specify exactly the things that will be passed to exec().
 There's just too many ways to accidentally shoot yourself in the foot
 otherwise.  If we want to have an option that lets people shoot
 themselves in the foot, that's fine.  But I think we'd be smart not to
 make that the only option.

[ shrug... ]  Once again, that will turn this from a ten-line patch
into hundreds of lines (and some more, different, hundreds of lines
for Windows I bet), with a corresponding growth in the opportunities
for bugs, for a benefit that's at best debatable.

The biggest problem this patch has had from the very beginning is
overdesign, and this is more of the same.  Let's please just define the
feature as popen, not fopen, the given string and have done.  You can
put all the warning verbiage you want in the documentation.  (But note
that the server-side version would be superuser-only in any flavor of
the feature.)

regards, tom lane




Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-11-15 Thread Peter Geoghegan
On 15 November 2012 19:16, Robert Haas robertmh...@gmail.com wrote:
 So what's next here?  Do you want to work on these issue some more?
 Or does Jeff?  I'd like to see this go in, but I'm not sure I have the
 bandwidth to do the legwork myself.

I'll take another look. No elegant solution immediately occurs to me, though.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




[HACKERS] pg_trgm partial-match

2012-11-15 Thread Fujii Masao
Hi,

I'd like to propose to extend pg_trgm so that it can compare a partial-match
query key to a GIN index. IOW, I'm thinking to implement the 'comparePartial'
GIN method for pg_trgm.

Currently, when the query key is less than three characters, we cannot use
a GIN index (+ pg_trgm) efficiently, because pg_trgm doesn't support a
partial-match method. In this case, seq scan or index full scan would be
executed, and its response time would be very slow. I'd like to alleviate this
problem.

Note that we cannot do a partial-match if KEEPONLYALNUM is disabled,
i.e., if query key contains multibyte characters. In this case, byte length of
the trigram string might be larger than three, and its CRC is used as a
trigram key instead of the trigram string itself. Because of using CRC, we
cannot do a partial-match. Attached patch extends pg_trgm so that it
compares a partial-match query key only when KEEPONLYALNUM is
enabled.

Attached patch is WIP yet. What I should do next is:

* version up pg_trgm from 1.0 to 1.1, i.e., create pg_trgm--1.1.sql, etc.
* write the regression test

Comments? Review? Objection?

Regards,

-- 
Fujii Masao


trgm_compare_partial_v0.patch
Description: Binary data



Re: [HACKERS] Dumping an Extension's Script

2012-11-15 Thread Robert Haas
On Mon, Nov 12, 2012 at 11:00 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Please find attached to this email an RFC patch implementing the basics
 of the pg_dump --extension-script option. After much discussion around
 the concept of an inline extension, we decided last year that a good
 first step would be pg_dump support for an extension's script.

Do you have a link to the original thread?  I have to confess I don't
remember what the purpose of this was and, heh heh, there are no
documentation changes in the patch itself either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] support for LDAP URLs

2012-11-15 Thread Robert Haas
On Mon, Nov 12, 2012 at 10:38 PM, Peter Eisentraut pete...@gmx.net wrote:
 Here is a patch to support RFC 2255 LDAP URLs in pg_hba.conf.  So,
 instead of, say

 host ... ldap ldapserver=ldap.example.net ldapbasedn="dc=example, dc=net"
 ldapsearchattribute=uid

 you could write

 host ... ldap ldapurl="ldap://ldap.example.net/dc=example,dc=net?uid?sub"

 Apache and probably other software uses the same format, and it's easier
 to have a common format for all such configuration instead of having to
 translate the information provided by the LDAP admin into each
 software's particular configuration spellings.

 I'm using the OpenLDAP-provided URL parsing routine, which means this
 wouldn't be supported on Windows.  But we already support different
 authentication settings on different platforms, so this didn't seem such
 a big problem.

I think this is broadly reasonable, but I'm not sure this part is a good idea:

+#ifdef USE_LDAP
+#ifndef WIN32
+/* We use a deprecated function to keep the codepath the same as win32. */
+#define LDAP_DEPRECATED 1
+#include <ldap.h>
+#else
+#include <winldap.h>
+#endif
+#endif

Presumably if it's deprecated now, it might go away without notice,
and we shouldn't be relying on it to stick around.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Enabling Checksums

2012-11-15 Thread Jeff Davis
On Wed, 2012-11-14 at 21:22 -0500, Robert Haas wrote:
  But there are some practical issues, as Tom points out. Another one is
  that it's harder for external utilities (like pg_basebackup) to verify
  checksums.

 Well, I think the invariant we'd need to maintain is as follows: every
 page for which the checksum fork might be wrong must have an FPI
 following the redo pointer.  So, at the time we advance the redo
 pointer, we need the checksum fork to be up-to-date for all pages for
 which a WAL record was written after the old redo pointer except for
 those for which a WAL record has again been written after the new redo
 pointer.  In other words, the checksum pages we write out don't need
 to be completely accurate; the checksums for any blocks we know will
 get clobbered anyway during replay don't really matter.

The issue about external utilities is a bigger problem than I realized
at first. Originally, I thought that it was just a matter of code to
associate the checksum with the data.

However, an external utility will never see a torn page while the system
is online (after recovery); but it *will* see an inconsistent view of
the checksum and the data if they are issued in separate write() calls.
So, the hazard of storing the checksum in a different place is not
equivalent to the existing hazard of a torn page.

Regards,
Jeff Davis





Re: [HACKERS] feature proposal - triggers by semantics

2012-11-15 Thread Darren Duncan

Craig Ringer wrote:

On 11/15/2012 06:25 PM, Hannu Krosing wrote:

On 11/15/2012 09:48 AM, Craig Ringer wrote:

If you want to prevent TRUNCATE, deny the privilege or add a trigger
that aborts the command.

You can abort the transaction but not skip action as currently it is only
possible to skip in ROW level triggers.

So I'd modify this request to allow BEFORE EACH STATEMENT triggers
to also be able to silently skip current action like BEFORE EACH ROW
triggers can.

Then this request would simply be satisfied by a simple trigger which
rewrites TRUNCATE into DELETE .

That seems sensible to me, too.


To further explain ...

What I'm desiring here with respect to TRUNCATE is that users are allowed to use 
TRUNCATE syntax as an alternative to DELETE (they are GRANTed both) but that any 
triggers defined for DELETE can also be applied to TRUNCATE without too much 
difficulty.  So users can use different syntactic constructs that look similar 
without having to think about whether it is going to be audited.


I understand that ROW level triggers don't exist yet for TRUNCATE (this is 
already clearly documented in the manual) and adding them would mean TRUNCATE 
would do a table scan in their presence.


I still think the syntax of TRUNCATE FOR EACH ROW would be useful, but if no one 
agrees, then I'll just make do with alternative solutions mentioned here.  That 
is, either a statement-level TRUNCATE trigger of some kind, or simply disallow 
TRUNCATE privileges.


Thank you.

-- Darren Duncan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-11-15 Thread Greg Stark
On Thu, Nov 15, 2012 at 7:36 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 15 November 2012 19:16, Robert Haas robertmh...@gmail.com wrote:
 So what's next here?  Do you want to work on these issue some more?
 Or does Jeff?  I'd like to see this go in, but I'm not sure I have the
 bandwidth to do the legwork myself.

 I'll take another look. No elegant solution immediately occurs to me, though.

The overflow was trivial to fix.

The only concern I had was about the behaviour after it did the
special case. I didn't want it to keep doing the math and trying to
grow again a little bit every tuple. I think I was leaning to putting
the magic flag back. The alternative is to only do the one-off grow if
we can grow at least some arbitrary percentage like 10% or something
like that. But it seems like a lot of arithmetic to be doing each time
around for probably no gain.


-- 
greg




Re: [HACKERS] feature proposal - triggers by semantics

2012-11-15 Thread Christopher Browne
On Thu, Nov 15, 2012 at 2:53 PM, Darren Duncan dar...@darrenduncan.net
wrote:
 I still think the syntax of TRUNCATE FOR EACH ROW would be useful, but if
no one agrees...

I'm compelled to disagree.

What was useful about TRUNCATE in the first place was that it quickly
operated against the entire table.

If you want to change that to row-by-row processing, then it is actually a
little bit worse than DELETE FROM some_table, in that it is introducing an
irregularity in language that no longer provides any corresponding benefit
(e.g. that TRUNCATE is fast).

If you want to be certain of doing row-by-row processing, then the better
answer is to:
a) Use DELETE instead in your application, and
b) Put a guard on to prevent using TRUNCATE.  (e.g. - attach triggers that
react to TRUNCATE with "go away, don't bother me!")

I observe that the Slony replication system initially implemented that
'guard' approach when TRUNCATE triggers were first provided, until we came
up with a suitable strategy to capture and replicate the TRUNCATE request.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


Re: [HACKERS] Dumping an Extension's Script

2012-11-15 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Please find attached to this email an RFC patch implementing the basics
 of the pg_dump --extension-script option. After much discussion around
 the concept of an inline extension, we decided last year that a good
 first step would be pg_dump support for an extension's script.

 Do you have a link to the original thread?  I have to confess I don't
 remember what the purpose of this was and, heh heh, there are no
 documentation changes in the patch itself either.

My notes include those links to the original thread:

  http://archives.postgresql.org/message-id/3157.1327298...@sss.pgh.pa.us
  http://archives.postgresql.org/pgsql-hackers/2012-01/msg01311.php
  https://commitfest.postgresql.org/action/patch_view?id=746

I could of course work on documenting the changes prior to the
reviewing, the thing is that I've been taking a different implementation
route towards the pg_dump --extension-script idea we talked about, that
I think is much simpler than anything else.

So I'd like to know if that approach is deemed acceptable by the
Guardians Of The Code before expending any more hours on this…

It basically boils down to this hunk in dumpExtension():

  output "CREATE EXTENSION x WITH … AS $x$"

    /*
     * Have another archive for this extension: this allows us to simply
     * walk the extension's dependencies and use the existing pg_dump code
     * to get the object create statement to be added in the script.
     */
    eout = CreateArchive(NULL, archNull, 0, archModeAppend);

    EH = (ArchiveHandle *) eout;

    /* grab existing connection and remote version information */
    EH->connection = ((ArchiveHandle *) fout)->connection;
    eout->remoteVersion = fout->remoteVersion;

    /* dump all objects for this extension, that have been sorted out in
     * the right order following dependencies etc */
    ...

    /* restore the eout Archive into the local buffer */
    for (te = EH->toc->next; te != EH->toc; te = te->next)
    {
        if (strlen(te->defn) > 0)
            appendPQExpBuffer(q, "%s", te->defn);
    }
    CloseArchive(eout);

  output "$x$;"

What do you think?
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Timing events WIP v1

2012-11-15 Thread Alvaro Herrera
Greg Smith wrote:

 -The event queue into a simple array accessed in circular fashion.
 After realizing that output needed to navigate in the opposite order
 of element addition, I ended up just putting all the queue
 navigation code directly into the add/output routines.  I'd be happy
 to use a more formal Postgres type here instead--I looked at
 SHM_QUEUE for example--but I haven't found something yet that makes
 this any simpler.

SHM_QUEUE is on the death row.  Try a dlist from src/backend/lib/ilist.c

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_ctl reload -o ....

2012-11-15 Thread Jeff Janes
On Thu, Nov 15, 2012 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jeff Janes jeff.ja...@gmail.com writes:
 Is there a reason pg_ctl reload shouldn't honor -o ?

 -o means "pass these switches on the postmaster's command line."
 If you're not restarting the postmaster, you don't get to change
 its command line.

OK.  You see pg_ctl as a wrapper around postmaster, while I was
viewing it as a thing in itself.

 IMO setting stuff on the command line is pretty evil anyway.

I wouldn't do it in a production system, but it is invaluable for
performance testing during development or diagnosis.

Oh well, I guess this is what perl -i -pe is for.

Cheers,

Jeff




[HACKERS] User control over psql error stream

2012-11-15 Thread Karl O. Pinc
Hi,

This patch gives the end user control over psql's
error stream.  This allows a single psql session
to use \o to send both errors and table output
to multiple files.  Useful when capturing test output, etc.

Control is provided via a new estream \pset.  Here's
the docs.

-snip
estream

Controls the output stream(s) used to report error messages. Value 
must be one of: stderr (the default), which sends errors to the 
standard error stream; query, which injects error messages into the 
query result output stream; or both, which sends errors to both output 
streams. Error messages are comprised of errors from psql and notice 
messages and errors from the database server. 
-snip

Against head.

psql-estream.patch   The patch.
psql-estream_test.patch  Adds a regression test to test the patch.

There's a number of problems with psql-estream_test.patch,
the most notable is that it probably won't work on
MS Windows because it uses /dev/null to avoid touching the
host filesystem.   I'm not sure whether this should have
a regression test and if so what the right way is to do it.

Note that psql-estream.patch includes some re-writing of
the docs for the psql \o option that goes slightly beyond
the minimum change required to explain \pset estream's effects.

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index c41593c..695739e 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1800,11 +1800,16 @@ lo_import 152801
 specified, the query output will be reset to the standard output.
 </para>
 
-<para><quote>Query results</quote> includes all tables, command
-responses, and notices obtained from the database server, as
-well as output of various backslash commands that query the
-database (such as <command>\d</command>), but not error
-messages.
+<para><quote>Query results</quote> includes all tables, sql
+command responses and status information, notifications from
+<command>LISTEN</command>, and output of various backslash
+commands that query the database (such as
+<command>\d</command>). <quote>Query results</quote> do not
+include errors from <application>psql</application> or notice
+messages or errors from the database server unless
+<command>\pset estream</command> is used.  Nor do they include
+output of backslash commands which do not query the database
+(such as <command>\h</command>).
 </para>
 
 <tip>
@@ -1863,16 +1868,18 @@ lo_import 152801
 
 <listitem>
 <para>
-This command sets options affecting the output of query result tables.
-<replaceable class="parameter">option</replaceable>
-indicates which option is to be set. The semantics of
-<replaceable class="parameter">value</replaceable> vary depending
-on the selected option.  For some options, omitting <replaceable
-class="parameter">value</replaceable> causes the option to be toggled
-or unset, as described under the particular option.  If no such
-behavior is mentioned, then omitting
-<replaceable class="parameter">value</replaceable> just results in
-the current setting being displayed.
+This command sets options affecting the output of errors and
+query results.  There are extensive options regarding table
+formatting.  <replaceable
+class="parameter">option</replaceable> indicates which option
+is to be set. The semantics of <replaceable
+class="parameter">value</replaceable> vary depending on the
+selected option.  For some options, omitting <replaceable
+class="parameter">value</replaceable> causes the option to be
+toggled or unset, as described under the particular option.
+If no such behavior is mentioned, then omitting <replaceable
+class="parameter">value</replaceable> just results in the
+current setting being displayed.
 </para>
 
 <para>
@@ -1914,6 +1921,24 @@ lo_import 152801
   </varlistentry>
 
   <varlistentry>
+  <term><literal>estream</literal></term>
+  <listitem>
+  <para>
+  Controls the output stream(s) used to report error messages.
+  <replaceable class="parameter">Value</replaceable> must be
+  one of: <literal>stderr</literal> (the default), which sends
+  errors to the standard error stream;
+  <literal>query</literal>, which injects error messages into
+  the query result output stream; or <literal>both</literal>,
+  which sends errors to both output streams.  <quote>Error
+  messages</quote> are comprised of errors from
+  <application>psql</application> and notice messages and
+  errors from the database 

Re: [HACKERS] Further pg_upgrade analysis for many tables

2012-11-15 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Could we use some adaptive mechanism here?  Say we use a list for the
 first ten entries, and if an eleventh one comes in, we create a hash
 table for that one and all subsequent ones.  All future calls would
 have to examine both the list for the first few and then the hash table.

Is it necessary to do so? Do we know for sure that a 10-element hash
table is slower than a 10-element list when only doing key-based
lookups, for the object data type we're interested in here?

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] pg_ctl reload -o ....

2012-11-15 Thread Andrew Dunstan


On 11/15/2012 03:41 PM, Jeff Janes wrote:

On Thu, Nov 15, 2012 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Jeff Janes jeff.ja...@gmail.com writes:

Is there a reason pg_ctl reload shouldn't honor -o ?

-o means "pass these switches on the postmaster's command line."
If you're not restarting the postmaster, you don't get to change
its command line.

OK.  You see pg_ctl as a wrapper around postmaster, while I was
viewing it as a thing in itself.



You're possibly not aware of its history. Before 8.0 it was a shell 
script and its use was seen as rather optional. IIRC there was even some 
debate about whether or not we needed it at all for the original Windows 
port (luckily Bruce and I thought we did, and made it happen.)



cheers

andrew




Re: [HACKERS] tuplesort memory usage: grow_memtuples

2012-11-15 Thread Peter Geoghegan
On 15 November 2012 19:54, Greg Stark st...@mit.edu wrote:
 The only concern I had was about the behaviour after it did the
 special case. I didn't want it to keep doing the math and trying to
 grow again a little bit every tuple. I think I was leaning to putting
 the magic flag back.

The alternative might just be to add a new constant to the
TupSortStatus enum. That might be more logical.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




[HACKERS] Partial-index predicate proofs got dumber in 9.2

2012-11-15 Thread Tom Lane
I looked into the optimization regression reported here:
http://archives.postgresql.org/pgsql-performance/2012-11/msg00140.php

It's easy to reproduce the problem in the regression database:

create index ti on tenk1 (fivethous) where fivethous is not null;
explain select * from int4_tbl, tenk1 where f1 = fivethous;

9.1 and earlier will produce a nestloop-with-inner-indexscan, but 9.2
fails to (and ends up with a much-more-expensive hash join) because it
doesn't think the partial index ti has been proven usable for the query.
However, because the = operator is strict, no row in which fivethous is
null could be relevant to the query result, so we should be able to use
the partial index.

The older code succeeds at this because when it is considering whether
it can use an index for an inner indexscan, it will apply all the join
clauses found by find_clauses_for_join() to the task of proving the index
predicate true.  (It's notable that this includes all join clauses
mentioning the target relation, not only those that can be used with the
index's columns; so we'll be able to prove things true even if the index
predicate involves table columns that aren't in the index proper.)

In 9.2, which has been rewritten to generate inner indexscans bottom up
as parameterized paths, we simply skip generating parameterized paths for
any index that's not marked predOK, and the predOK test is made using
only the restriction clauses for that relation, not join clauses.  Ooops.

We could try to replicate the way the previous code did it, but it'd
probably add significant expense.  What I'm thinking at the moment is that
check_partial_indexes() has been missing a bet all along, because there is
no reason for it to confine its attention to restriction clauses.  We
should just have it include join clauses for the rel in the restriction
list passed to predicate_implied_by().  That is, a join clause can be
assumed true for the purposes of deciding whether an index is usable
*whether or not we ever actually use that join clause with the index*.

This claim is shaky in the presence of outer joins, though.  Consider

select ... from t1 left join t2 on (t1.a = t2.b)

We can use a partial index on t2 that requires t2.b IS NOT NULL, since no
row where b is null will affect the result.  A more interesting case is

select ... from t1 left join t2 on (t1.a = t2.b)
where t2.c > 0;

This is actually going to get simplified to a plain inner join, but even
if it did not, we needn't fetch t2 rows where c is null or negative.
That might result in generating some null-extended join rows that
shouldn't be there, if particular values of t1.a no longer have matches,
but the outer WHERE clause would eliminate such bogus rows anyway.

However, that last conclusion depends on the outer WHERE clause being
strict, which doesn't work for instance with

select ... from t1 left join t2 on (t1.a = t2.b)
where t2.c is null;

We could not use an index having the predicate c is null to scan t2,
else we might eliminate some rows that should produce matches to t1,
allowing bogus null-extended rows to be produced (which would survive
the outer WHERE).

So I'm thinking that the correct heuristic for check_partial_indexes()
is to use restriction clauses plus any join clauses that satisfy
join_clause_is_movable_to().  That should prevent unsafe use of
upper-level join clauses when there are outer joins below them.

Thoughts?

regards, tom lane




Re: [HACKERS] pg_ctl reload -o ....

2012-11-15 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 11/15/2012 03:41 PM, Jeff Janes wrote:
 OK.  You see pg_ctl as a wrapper around postmaster, while I was
 viewing it as a thing in itself.

 You're possibly not aware of its history. Before 8.0 it was a shell 
 script and its use was seen as rather optional.

It's *still* rather optional.  A lot of distros use init scripts that
don't bother with it.  Personally I find it convenient for stopping
the postmaster, but not all that helpful for starting it.

regards, tom lane




Re: [HACKERS] another idea for changing global configuration settings from SQL

2012-11-15 Thread Dimitri Fontaine
Peter Eisentraut pete...@gmx.net writes:
 The existing infrastructure would also support
 any user, any database (= all the time)

 There would also be the advantage that pg_dumpall would save these settings.

 Thoughts?

That's brilliant. +1.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] feature proposal - triggers by semantics

2012-11-15 Thread Dimitri Fontaine
Darren Duncan dar...@darrenduncan.net writes:
 So, I'm partly proposing a specific narrow new feature, TRUNCATE FOR EACH
 ROW

Kevin has been proposing that we consider an alternative approach in
some other cases that I think would work better for you, too. Namely, to
have access to OLD and NEW in FOR EACH STATEMENT triggers, where they
would be relations rather than records.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Hash id in pg_stat_statements

2012-11-15 Thread Dimitri Fontaine
Peter Geoghegan pe...@2ndquadrant.com writes:
 should expose the hash. The need to aggregate historical statistics
 just doesn't appreciably alter things here, I feel. The number of
 discrete queries that an application will execute in a week just isn't
 that different from the number that it will ever execute, I suspect.

Please don't forget that some people won't write the most effective SQL
from the get go and will rewrite problematic queries. Some people will
even rollout new features in their applications, with new queries to
implement them. Or new applications on top of the existing database.

So I don't think that the query list is that static. That said, I don't
have any idea at all about the impact of what I'm saying to your
analysis…

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Further pg_upgrade analysis for many tables

2012-11-15 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Could we use some adaptive mechanism here?  Say we use a list for the
 first ten entries, and if an eleventh one comes in, we create a hash
 table for that one and all subsequent ones.  All future calls would
 have to examine both the list for the first few and then the hash table.

 Is it necessary to do so? Do we know for sure that a 10-element hash
 table is slower than a 10-element list when only doing key-based
 lookups, for the object data type we're interested in here?

Well, we'd want to do some testing to choose the cutover point.
Personally I'd bet on that point being quite a bit higher than ten,
for the case that sequence.c is using where the key being compared is
just an OID.  You can compare a lot of OIDs in the time it takes
dynahash.c to do something.

(I think the above sketch is wrong in detail, btw.  What we should do
once we decide to create a hash table is move all the existing entries
into the hash table, not continue to scan a list for them.  There's a
similar case in the planner for tracking join RelOptInfos.)

regards, tom lane




Re: [HACKERS] add -Wlogical-op to standard compiler options?

2012-11-15 Thread Peter Eisentraut
On 11/15/12 9:40 AM, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
 I think it might be worth adding -Wlogical-op to the standard warning
 options (for supported compilers, determined by configure test).
 
 Does that add any new warnings with the current source code, and if
 so what?

none





Re: [HACKERS] feature proposal - triggers by semantics

2012-11-15 Thread Darren Duncan

Dimitri Fontaine wrote:

Darren Duncan dar...@darrenduncan.net writes:

So, I'm partly proposing a specific narrow new feature, TRUNCATE FOR EACH
ROW


Kevin has been proposing that we consider an alternative approach in
some other cases that I think would work better for you, too. Namely, to
have access to OLD and NEW in FOR EACH STATEMENT triggers, where they
would be relations rather than records.

Regards,


Yes, I believe that would work very well.  In fact, that would provide some
power features.  I look forward to this, probably the best solution.

-- Darren Duncan





Re: [HACKERS] add -Wlogical-op to standard compiler options?

2012-11-15 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On 11/15/12 9:40 AM, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
 I think it might be worth adding -Wlogical-op to the standard warning
 options (for supported compilers, determined by configure test).

 Does that add any new warnings with the current source code, and if
 so what?

 none

No objection from me then.

regards, tom lane




Re: [HACKERS] another idea for changing global configuration settings from SQL

2012-11-15 Thread Magnus Hagander
On Thu, Nov 15, 2012 at 6:53 PM, Peter Eisentraut pete...@gmx.net wrote:
 Independent of the discussion of how to edit configuration files from
 SQL, I had another idea how many of the use cases for this could be handled.

 We already have the ability to store in pg_db_role_setting configuration
 settings for

 specific user, specific database
 specific user, any database
 any user, specific database

 The existing infrastructure would also support

 any user, any database (= all the time)

 All you'd need is to add

 ApplySetting(InvalidOid, InvalidOid, relsetting, PGC_S_$SOMETHING);

 in postinit.c, and have some SQL command to modify this setting.




 The only thing you couldn't handle that way are SIGHUP settings, but the
 often-cited use cases work_mem, logging, etc. would work.

How hard would it be to make it work for SIGHUP? I can see how it
would be impossible to handle things like POSTMASTER, but SIGHUP seems
like it should be doable somehow?


 There would also be the advantage that pg_dumpall would save these settings.

 Thoughts?

I like it. Not as a replacement for the other facility, but as another
way of doing it. And I'd expect it could be the main way for manual
changes, but tools would still need access to the other way of course.

We probably need to enhance pg_settings to tell the user *where* the
setting came from when it's set this way. In fact, we need this
already, since it can be hard to track down...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] Index only scans wiki page

2012-11-15 Thread Peter Geoghegan
On 13 November 2012 01:25, Peter Geoghegan pe...@2ndquadrant.com wrote:
 Attached is a piece I wrote on the feature. That might form the basis
 of a new wiki page.

Since no one else moved on this, I've completely replaced the existing
page with the contents of the user-orientated document I posted. I
don't believe that any of the information that has been removed was of
general interest, since it only existed as a developer-orientated
page. If anyone thinks that I shouldn't have removed everything that
was there, or indeed who would like to change what I've added, well,
it's a wiki.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] autovacuum truncate exclusive lock round two

2012-11-15 Thread Dimitri Fontaine
Jan Wieck janwi...@yahoo.com writes:
 Use this lmgr feature inside count_nondeletable_pages() of vacuumlazy.c to
 periodically check, if there is a conflicting lock request waiting. If not,
 keep going. If there is a waiter, truncate the relation to the point checked
 thus far, release the AccessExclusiveLock, then loop back to where we
 acquire this lock in the first place and continue checking/truncating.

I think that maybe we could just bail out after releasing the
AccessExclusiveLock and trust autovacuum to get back to truncating that
relation later. That would allow removing 2 of the 3 GUCs below:

 autovacuum_truncate_lock_check = 100ms # how frequent to check
# for conflicting locks

This is the one remaining. Could we maybe check for lock conflict after
every move backward a page, or some multiple thereof? The goal would be
to ensure that progress is made, while also being aware of concurrent
activity, ala CHECK_FOR_INTERRUPTS().

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support





Re: [HACKERS] autovacuum truncate exclusive lock round two

2012-11-15 Thread Alvaro Herrera
Dimitri Fontaine wrote:
 Jan Wieck janwi...@yahoo.com writes:
  Use this lmgr feature inside count_nondeletable_pages() of vacuumlazy.c to
  periodically check, if there is a conflicting lock request waiting. If not,
  keep going. If there is a waiter, truncate the relation to the point checked
  thus far, release the AccessExclusiveLock, then loop back to where we
  acquire this lock in the first place and continue checking/truncating.
 
 I think that maybe we could just bail out after releasing the
 AccessExclusiveLock and trust autovacuum to get back to truncating that
 relation later.

That doesn't work, because the truncating code is not reached unless
vacuuming actually took place.  So if you interrupt it, it will just not
get called again later.  Maybe we could have autovacuum somehow invoke
that separately, but that would require that the fact that truncation
was aborted is kept track of somewhere.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




  1   2   >