Re: [HACKERS] Use unique index for longer pathkeys.
On Fri, Jul 25, 2014 at 12:48 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: I think there is one more disadvantage in the way the current patch is done, which is that you need to collect index path keys for all relations irrespective of whether they will be of any use to eliminate useless pathkeys from query_pathkeys. One trivial case that comes to mind is when there are multiple relations involved in the query and the ORDER BY is based on columns of only part of the tables involved in the query. Like this?

select x.a, x.b, y.b from x, y where x.a = y.a order by x.a, x.b;

The equivalence class consists of (x.a=y.a) and (x.b), so the index pathkeys for i_y are (y.a=x.a). As a result, no common primary pathkeys are found.

I think it will find a common pathkey in case you have a unique index on x.a (please see the example below), but currently I am not clear why there is a need for a common index path key in such a case to eliminate useless keys in ORDER BY; why can't we do it based on an individual table's path keys?

Example:

create table t (a int not null, b int not null, c int, d text);
create unique index i_t_pkey on t(a, b);
insert into t (select a % 10, a / 10, a, 't' from generate_series(0, 10) a);
analyze;
create table t1 (a int not null, b int not null, c int, d text);
create unique index i_t1_pkey_1 on t1(a);
create unique index i_t1_pkey_2 on t1(a, b);
insert into t1 (select a * 2, a / 10, a, 't' from generate_series(0, 10) a);

explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order by t1.a,t1.b,t1.c,t1.d;
                QUERY PLAN
------------------------------------------
 Merge Join
   Merge Cond: (t.a = t1.a)
   ->  Index Scan using i_t_pkey on t
   ->  Index Scan using i_t1_pkey_1 on t1
(4 rows)

Here we can notice that there is no separate sort key in the plan. Now drop i_t1_pkey_1 and check the query plan again.

drop index i_t1_pkey_1;
explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order by t1.a,t1.b,t1.c,t1.d;
                QUERY PLAN
------------------------------------------
 Sort
   Sort Key: t.a, t1.b, t1.c, t1.d
   ->  Merge Join
         Merge Cond: (t.a = t1.a)
         ->  Index Scan using i_t_pkey on t
         ->  Index Scan using i_t1_pkey_2 on t1
(6 rows)

Can't the above plan eliminate the Sort even after dropping the index (i_t1_pkey_1)? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
Hi, On 2014-07-25 14:11:32 -0400, Robert Haas wrote: Attached is a contrib module that lets you launch arbitrary command in a background worker, and supporting infrastructure patches for core. Cool. I assume this 'fell out' of the work towards parallelism? Do you think all of the patches (except the contrib one) are required for that or is some, e.g. 3), only required to demonstrate the others? Patch 3 adds the ability for a backend to request that the protocol messages it would normally send to the frontend get redirected to a shm_mq. I did this by adding a couple of hook functions. The best design is definitely arguable here, so if you'd like to bikeshed, this is probably the patch to look at. Uh. This doesn't sound particularly nice. Shouldn't this rather be clearly layered by making reading/writing from the client a proper API instead of adding hook functions here and there? Also, you seem to have only touched receiving from the client, and not sending back to the subprocess. Is that actually sufficient? I'd expect that for this facility to be fully useful it'd have to be two way communication. But perhaps I'm overestimating what it could be used for. This patch also adds a function to help you parse an ErrorResponse or NoticeResponse and re-throw the error or notice in the originating backend. Obviously, parallelism is going to need this kind of functionality, but I suspect a variety of other applications people may develop using background workers may want it too; and it's certainly important for pg_background itself. I would have had use for it previously. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] BUG - broken make check if different options
As I was investigating and playing around with blocksize, I noticed that some test cases under make check vary depending on compilation parameters, as they:
 - do not order the result of queries, thus are not deterministic [join, with]
 - output query plans which differ depending on some parameters [updatable_views, union, select_views]
Thus make check fails.

sh ./configure --blocksize=4
sh make
sh make check
... 5 of 144 tests failed.

Adding some ORDER BY should solve the first issue, but ISTM that testing the output of query plans is a lost cause for determinism, so maybe these test cases should be ignored/skipped when parameters are different. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parametric block size?
Hi, On 2014-07-22 10:22:53 +0200, Fabien wrote: The default blocksize is currently 8k, which is not necessarily optimal for all setups, especially with SSDs where the latency is much lower than HDD. I don't think that really follows. There is a case for different values with significant impact on performance (up to a not-to-be-sneezed-at 10% on a pgbench run on SSD, see http://www.cybertec.at/postgresql-block-sizes-getting-started/), and ISTM that the ability to align PostgreSQL block size to the underlying FS/HW block size would be nice. I don't think that benchmark is very meaningful. Way too small a scale, way too short a runtime (there'll be barely any checkpoints, hot pruning, or vacuum at all). More advanced features, but with much more impact on the code, would be to be able to change the size at database/table level. That'd be pretty horrible because the size of pages in shared_buffers wouldn't be uniform anymore. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] get_loop_count() fails to ignore RELOPT_DEADREL rels
I've just been hacking away a bit more at the WIP patch that I posted a while back which allows join removals for SEMI and ANTI joins that could be proved useless due to the existence of a foreign key which matched the join condition (here http://www.postgresql.org/message-id/caaphdvq0nai8ceqtnndqg6mhfh__7_a6tn9xu4v0cut9wab...@mail.gmail.com ) On testing I found that the Assert(outer_rel->rows > 0) was failing in get_loop_count(). The relation which it was failing on was one that I had declared dead in remove_useless_joins(). I've not done very much work so far in the costing part of the planner, however I see that set_rel_size() seems to be in charge of dishing out row estimates in this case and that it naturally does nothing for rels marked with RELOPT_DEADREL. I've not yet determined if I can exploit this with the existing join removal code, but I can give it a try if required. In order to get my patch working with an Assert-enabled build I've had to apply the attached patch. Regards David Rowley get_loop_count_ignore_dead_rels.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] get_loop_count() fails to ignore RELOPT_DEADREL rels
On Sat, Jul 26, 2014 at 9:11 PM, David Rowley dgrowle...@gmail.com wrote: In order to get my patch working with an Assert enabled build I've had to apply the attached patch. Actually I meant to attach this patch instead. Regards David Rowley get_loop_count_ignore_dead_rels_v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SKIP LOCKED DATA (work in progress)
On 24 July 2014 00:52, Thomas Munro mu...@ip9.org wrote: On 23 July 2014 13:15, David Rowley dgrowle...@gmail.com wrote: I'm also wondering about this block of code in general: if (erm-waitPolicy == RWP_WAIT) wait_policy = LockWaitBlock; else if (erm-waitPolicy == RWP_SKIP ) wait_policy = LockWaitSkip; else /* erm-waitPolicy == RWP_NOWAIT */ wait_policy = LockWaitError; Just after this code heap_lock_tuple() is called, and if that returns HeapTupleWouldBlock, the code does a goto lnext, which then will repeat that whole block again. I'm wondering why there's 2 enums that are for the same thing? if you just had 1 then you'd not need this block of code at all, you could just pass erm-waitPolicy to heap_lock_tuple(). True. Somewhere upthread I mentioned the difficulty I had deciding how many enumerations were needed, for the various subsystems, ie which headers and type they were allowed to share. [...] I tried getting rid of the offending if-then-else enum conversion code and replaced it with a simple assignment -- please see attached. I also added compile time assertions that the enum values line up to make that work, and are correctly ordered for use in that 'Max' expression. Please let me know if you think this is an improvement or an abomination. I couldn't find an existing reasonable place to share a single wait policy enumeration between parser/planner/executor and the heap access module, and I get the feeling that it would be unacceptable to introduce one. I suppose that the LockClauseWaitPolicy and RowWaitPolicy could at least be merged into a single enum defined in nodes.h following the example of CmdType, which is also used by both parsenodes.h and plannnode.h, but do I detect a tiny hint of reluctance in its comment, so put it here...? (The attached patch also has a couple of trivial typo fixes in documentation and comments). Best regards, Thomas Munro diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml index 231dc6a..0469705 100644 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -45,7 +45,7 @@ SELECT [ ALL | DISTINCT [ ON ( replaceable class=parameterexpression/replac [ LIMIT { replaceable class=parametercount/replaceable | ALL } ] [ OFFSET replaceable class=parameterstart/replaceable [ ROW | ROWS ] ] [ FETCH { FIRST | NEXT } [ replaceable class=parametercount/replaceable ] { ROW | ROWS } ONLY ] -[ FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT ] [...] ] +[ FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT | SKIP LOCKED ] [...] ] phrasewhere replaceable class=parameterfrom_item/replaceable can be one of:/phrase @@ -1283,7 +1283,7 @@ FETCH { FIRST | NEXT } [ replaceable class=parametercount/replaceable ] { The locking clause has the general form synopsis -FOR replaceablelock_strength/ [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT ] +FOR replaceablelock_strength/ [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT | SKIP LOCKED ] /synopsis where replaceablelock_strength/ can be one of @@ -1359,11 +1359,17 @@ KEY SHARE para To prevent the operation from waiting for other transactions to commit, -use the literalNOWAIT/ option. With literalNOWAIT/, the statement -reports an error, rather than waiting, if a selected row -cannot be locked immediately. 
Note that literalNOWAIT/ applies only -to the row-level lock(s) mdash; the required literalROW SHARE/literal -table-level lock is still taken in the ordinary way (see +use either the literalNOWAIT/ or literalSKIP LOCKED/literal +option. With literalNOWAIT/, the statement reports an error, rather +than waiting, if a selected row cannot be locked immediately. +With literalSKIP LOCKED/literal, any selected rows that cannot be +immediately locked are skipped. Skipping locked rows provides an +inconsistent view of the data, so this is not suitable for general purpose +work, but can be used to avoid lock contention with multiple consumers +accessing a queue-like table. Note that literalNOWAIT/ +and literalSKIP LOCKED/literal apply only to the row-level lock(s) +mdash; the required literalROW SHARE/literal table-level lock is +still taken in the ordinary way (see xref linkend=mvcc). You can use xref linkend=sql-lock with the literalNOWAIT/ option first, @@ -1386,14 +1392,14 @@ KEY SHARE /para para -Multiple locking -clauses can be written if it is necessary to specify different locking -behavior for different tables. If the same table is mentioned (or -implicitly affected) by more than one locking clause, -then it is processed as if it was only specified by the strongest one. -Similarly, a table is processed -as
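For reference, the "simple assignment plus compile-time assertions" approach described above might look roughly like the following sketch in ExecLockRows(), replacing the if/else chain quoted earlier. The enum member names are taken from the discussion; the exact assertion wording is an assumption, not the patch itself:

    StaticAssertStmt((int) RWP_WAIT == (int) LockWaitBlock,
                     "RowWaitPolicy and LockWaitPolicy must be kept in sync");
    StaticAssertStmt((int) RWP_SKIP == (int) LockWaitSkip,
                     "RowWaitPolicy and LockWaitPolicy must be kept in sync");
    StaticAssertStmt((int) RWP_NOWAIT == (int) LockWaitError,
                     "RowWaitPolicy and LockWaitPolicy must be kept in sync");

    /* with the values guaranteed to line up, a cast replaces the branches */
    wait_policy = (LockWaitPolicy) erm->waitPolicy;

Of course, if the enums are merged into a single type (as discussed below), even the cast and the assertions become unnecessary.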
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
From: Robert Haas robertmh...@gmail.com I think the problem here is that it actually is possible for one session to access the temporary objects of another session: Now, we could prohibit that specific thing. But at the very least, it has to be possible for one session to drop another session's temporary objects, because autovacuum does it eventually, and superusers will want to do it sooner to shut autovacuum up. So it's hard to reason about whether and to what extent it's safe to not send sinval messages for temporary objects.

I was a bit surprised to learn that one session can access the data of another session's temporary tables. That implementation may be complicating the situation -- extra sinval messages.

I think you might be approaching this problem from the wrong end, though. The question in my mind is: why does the StartTransactionCommand() / CommitTransactionCommand() pair in ProcessCatchupEvent() end up writing a commit record? The obvious possibility that occurs to me is that maybe rereading the invalidated catalog entries causes a HOT prune, and maybe there ought to be some way for a transaction that has only done HOT pruning to commit asynchronously, just as we already do for transactions that only modify temporary tables. Or, failing that, maybe there's a way to suppress synchronous commit for this particular transaction.

I could figure out what log record was output in the transaction started in ProcessCatchupEvent() by inserting elog() in XLogInsert(). The log record was (RM_STANDBY_ID, XLOG_STANDBY_LOCK). The cause of the hang turned out to be clear. It was caused as follows:

1. When a transaction that used a temporary table created with ON COMMIT DELETE ROWS commits, the sinval catchup signal (SIGUSR1) is issued from smgrtruncate(). This is because the temporary table is truncated at transaction end.
2. Another session, which is waiting for a client request, receives SIGUSR1. It calls ProcessCatchupEvent().
3. ProcessCatchupEvent() calls StartTransactionCommand(), emits the XLOG_STANDBY_LOCK WAL record, and then calls CommitTransactionCommand().
4. It then calls SyncRepWaitForLSN(), which in turn waits on the latch.
5. But the WaitLatch() never returns, because the session is already running inside the SIGUSR1 handler from step 2. WaitLatch() needs SIGUSR1 to complete.

I think there is a problem with the latch and SIGUSR1 mechanism. How can we fix this problem? Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
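To make the failure mode concrete, here is a small standalone sketch (ordinary POSIX C, not PostgreSQL code) of the underlying problem: a latch-style wait performed inside the SIGUSR1 handler can only be woken by another SIGUSR1, but that signal is blocked for the duration of its own handler. The self-pipe stands in for the latch machinery, and the timeout exists only so the demo terminates:

    #include <poll.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static int selfpipe[2];
    static volatile sig_atomic_t in_catchup = 0;

    static void
    sigusr1_handler(int signo)
    {
        if (in_catchup)
        {
            /* analogue of latch_sigusr1_handler(): wake the waiter */
            (void) write(selfpipe[1], "x", 1);
            return;
        }

        /* analogue of ProcessCatchupEvent() reaching SyncRepWaitForLSN() */
        in_catchup = 1;
        struct pollfd pfd = {selfpipe[0], POLLIN, 0};
        /* WaitLatch() would wait forever; use 3s so the demo ends */
        poll(&pfd, 1, 3000);
        in_catchup = 0;
    }

    int
    main(void)
    {
        struct sigaction sa;

        if (pipe(selfpipe) != 0)
            return 1;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = sigusr1_handler;
        sigaction(SIGUSR1, &sa, NULL);

        raise(SIGUSR1);     /* steps 1-2: the catchup interrupt arrives */
        /*
         * A further SIGUSR1 (e.g. SetLatch() from a walsender, as in step 5)
         * would be left pending while the handler runs, so the poll() above
         * never sees the wakeup byte and only returns on the demo timeout.
         */
        puts("handler returned only because of the demo timeout");
        return 0;
    }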
Re: [HACKERS] SKIP LOCKED DATA (work in progress)
On Sat, Jul 26, 2014 at 9:34 PM, Thomas Munro mu...@ip9.org wrote: I couldn't find an existing reasonable place to share a single wait policy enumeration between parser/planner/executor and the heap access module, and I get the feeling that it would be unacceptable to introduce one. I guess the way I justify it in my head is something like: the 3 enums are for the same purpose, so having 3 exist all with different names is confusing and it makes the code harder to follow. So to fix that up I think, oh, we can just give them all the same name... But then, how can we be sure each definition matches the other 2? ... hmm, just merge it into one and put it somewhere that can be accessed from everywhere. Saying that, I don't know what the project best practices are for locations for sharing such things, but if nothing exists then maybe this would be a good time to invent somewhere. Maybe someone with more experience can chime in and give advice on this? Regards David Rowley
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
From: Andres Freund and...@2ndquadrant.com I think we should do what the first paragraph in http://archives.postgresql.org/message-id/20140707155113.GB1136%40alap3.anarazel.de outlined. As Tom says somewhere downthread that requires some code review, but other than that it should get rid of a fair amount of problems. As mentioned in the mail I've just sent, there seems to be a problem around the latch and/or sinval catchup implementation. Or, is it bad that so many things are done in the SIGUSR1 handler? If some processing in the SIGUSR1 handler requires waiting on a latch, it hangs at WaitLatch(). Currently, the only processing in the backend which requires a latch may be waiting for the sync standby. However, in the future, the latch may be used for more tasks. Another problem: might WaitLatch() return prematurely (before the actual waited-for event does SetLatch()) due to the SIGUSR1 issued for the sinval catchup event? How should we tackle these problems? Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parametric block size?
Hello Andres, The default blocksize is currently 8k, which is not necessarily optimal for all setups, especially with SSDs where the latency is much lower than HDD. I don't think that really follows. The rationale, which may be proven false, is that with an SSD the latency penalty for reading and writing randomly vs sequentially is much lower than for HDD, so there is less incentive to group stuff in larger chunks on that account. There is a case for different values with significant impact on performance (up to a not-to-be-sneezed-at 10% on a pgbench run on SSD, see http://www.cybertec.at/postgresql-block-sizes-getting-started/), and ISTM that the ability to align PostgreSQL block size to the underlying FS/HW block size would be nice. I don't think that benchmark is very meaningful. Way too small a scale, way too short a runtime (there'll be barely any checkpoints, hot pruning, or vacuum at all). These benchmarks have the merit of existing and of being consistent (the smaller the blocksize, the better the performance), and ISTM that the performance results suggest that this is worth investigating. Possibly the small scale means that the data fit in memory, so the benchmarks as run emphasize write performance linked to the INSERT/UPDATE. What would you suggest as meaningful for scale and run time, say on a dual-core 8GB memory 256GB SSD laptop? More advanced features, but with much more impact on the code, would be to be able to change the size at database/table level. That'd be pretty horrible because the size of pages in shared_buffers wouldn't be uniform anymore. Yep, I also thought of that, so I'm not planning to investigate. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BUG - broken make check if different options
Fabien COELHO coe...@cri.ensmp.fr writes: As I was investing playing around with blocksize, I noticed that some test cases under make check vary depending on compilation parameters, as they: There has never been any expectation that the regression tests would pass exactly no matter what the environment. If we tried to make them do so, we'd end up restricting the scope of testing so much as to be nearly useless. IOW, this is not a bug. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] get_loop_count() fails to ignore RELOPT_DEADREL rels
David Rowley dgrowle...@gmail.com writes: In order to get my patch working with an Assert enabled build I've had to apply the attached patch. That patch is entirely bogus. What you should be asking is why get_loop_count is being applied to a relation that's supposedly been removed from the query. It should only get applied to rels that are required outer rels for a parameterized path, and thus certainly not dead. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SKIP LOCKED DATA (work in progress)
Thomas Munro mu...@ip9.org writes: I couldn't find an existing reasonable place to share a single wait policy enumeration between parser/planner/executor and the heap access module, and I get the feeling that it would be unacceptable to introduce one. There is a precedent in the form of AclMode, which is needed throughout the system and is currently declared in parsenodes.h. I can't say I've ever been particularly pleased with that arrangement though, since it forces inclusion of parsenodes.h in many places that might not otherwise have any interest in parse nodes. It might be better if we'd declared AclMode in a single-purpose header, say utils/aclmode.h, and then #include'd that into parsenodes.h. There's certainly plenty of other single-datatype headers laying about. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] config.sgml referring to unix_socket_directories on older releases
Hi, While updating the french translation of the latest releases, I stumbled upon a small issue on the config.sgml file. It talks about unix_socket_directories whereas this parameter only appears with the 9.3 release. It should probably be replaced with unix_socket_directory for all releases where this has been commited (8.4 to 9.2). The patch attached does this. It applies cleanly on all releases (with a hunk though). Thanks. Regards. -- Guillaume. http://blog.guillaume.lelarge.info http://www.dalibo.com diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index d47dd9c..9f23e8c 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -473,7 +473,7 @@ SET ENABLE_SEQSCAN TO OFF; para This parameter is irrelevant on systems, notably Solaris as of Solaris 10, that ignore socket permissions entirely. There, one can achieve a -similar effect by pointing varnameunix_socket_directories/ to a +similar effect by pointing varnameunix_socket_directory/ to a directory having search permission limited to the desired audience. This parameter is also irrelevant on Windows, which does not have Unix-domain sockets. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BUG - broken make check if different options
As I was investigating and playing around with blocksize, I noticed that some test cases under make check vary depending on compilation parameters, as they: There has never been any expectation that the regression tests would pass exactly no matter what the environment. If we tried to make them do so, we'd end up restricting the scope of testing so much as to be nearly useless. Hmmm... ok, so this is a feature. I would have thought that tests about functional results should always pass, no matter what the environment, provided the environment allows running the test, so basically queries should return deterministic results, implying an ORDER BY when there is more than one row in the output. So for me this part looks more like a bug than a feature. For tests about plans, this is less obvious. Maybe test settings should control the environment enough so as to warrant deterministic results (say, tell the planner to assume this and this and this, and what is your plan?). That would also help to test plan decisions with more extreme hardware. However, this would probably imply a test infrastructure beyond what is currently available. So I would be more ok with considering that part a feature. Also, this means that changing the default block size is basically never tested, otherwise the buildfarm would be redder than it is. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
MauMau maumau...@gmail.com writes: [ sinval catchup signal - ProcessCatchupEvent - WaitLatch - deadlock ] Ugh. One line of thought is that it's pretty unsafe to be doing anything as complicated as transaction start/commit in a signal handler, even one that is sure it's not interrupting anything else. The only excuse ProcessCatchupEvent has for that is that it's trying to ensure proper cleanup from an error. Perhaps with closer analysis we could convince ourselves that errors thrown from AcceptInvalidationMessages() wouldn't need transaction abort cleanup. For that matter, it's not exactly clear that an error thrown out of there would result in good things even with the transaction infrastructure. Presuming that we're waiting for client input, an error longjmp out of ProcessCatchupEvent could mean taking control away from OpenSSL, and I bet that won't end well. Maybe we should just be doing PG_TRY AcceptInvalidationMessages(); PG_CATCH elog(FATAL, curl up and die); ProcessIncomingNotify is also using StartTransactionCommand()/CommitTransactionCommand(), so that would need some thought too. Or we could try to get rid of the need to do anything beyond setting a flag in the interrupt handler itself. But I'm afraid that's probably unworkable, especially now that we use SA_RESTART on all signals. Another line of thought is that we've been way too uncritical about shoving different kinds of events into the SIGUSR1 multiplexor. It might be a good idea to separate high level interrupts from low level ones, using say SIGUSR1 for the former and SIGUSR2 for the latter. However, that doesn't sound very back-patchable, even assuming that we can come up with a clean division. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
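Spelled out, the sketch above would be something like this (illustrative only -- whether FATAL is the right severity, and what cleanup is actually safe at that point, is exactly what is being questioned):

    /* inside the catchup interrupt path, instead of a full transaction: */
    PG_TRY();
    {
        AcceptInvalidationMessages();
    }
    PG_CATCH();
    {
        /*
         * We can't safely propagate an error out of a signal handler that
         * interrupted client I/O, so just curl up and die.
         */
        elog(FATAL, "error while processing sinval catchup interrupt");
    }
    PG_END_TRY();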
Re: [HACKERS] parametric block size?
Hi, On 2014-07-26 12:50:30 +0200, Fabien COELHO wrote: The default blocksize is currently 8k, which is not necessarily optimal for all setups, especially with SSDs where the latency is much lower than HDD. I don't think that really follows. The rationale, which may be proven false, is that with an SSD the latency penalty for reading and writing randomly vs sequentially is much lower than for HDD, so there is less incentive to group stuff in larger chunks on that account. A higher number of blocks has overhead unrelated to this though: increased waste/lower storage density as it becomes more frequent that tuples don't fit into a page; more locks; a higher number of buffer headers; more toasted rows; smaller toast chunks; more vacuuming/heap pruning WAL records, ... Now obviously there's also an inverse to this, otherwise we'd all be using 1GB page sizes. But I don't think storage latency has much to do with it - it's imo more about write amplification (i.e. turning a single row update into an 8/4/16/32 kb write). There is a case for different values with significant impact on performance (up to a not-to-be-sneezed-at 10% on a pgbench run on SSD, see http://www.cybertec.at/postgresql-block-sizes-getting-started/), and ISTM that the ability to align PostgreSQL block size to the underlying FS/HW block size would be nice. I don't think that benchmark is very meaningful. Way too small a scale, way too short a runtime (there'll be barely any checkpoints, hot pruning, or vacuum at all). These benchmarks have the merit of existing and of being consistent (the smaller the blocksize, the better the performance), and ISTM that the performance results suggest that this is worth investigating. Well, it's easy to make claims that aren't meaningful with bad benchmarks. Those numbers are *far* too low for the presented SSD - invalidating the entire thing. That's the speed you'd expect for rotating media, not an SSD. My laptop has the 1TB variant of that disk and I get nearly 10x that number of TPS. With a parallel make running, a profiler started, and assertions enabled. This isn't an actual benchmark, sorry. It's SEO. Possibly the small scale means that the data fit in memory, so the benchmarks as run emphasize write performance linked to the INSERT/UPDATE. Well, the generated data is 160MB in size. Nobody with a concurrent write-heavy OLTP load has that little data. What would you suggest as meaningful for scale and run time, say on a dual-core 8GB memory 256GB SSD laptop? At the very least scale 100 - then it likely doesn't fit into internal caches on common consumer drives anymore. But more importantly the test has to run over several checkpoint cycles, so hot pruning and vacuuming are also measured. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:

+	pq_mq_busy = true;
+
+	iov[0].data = &msgtype;
+	iov[0].len = 1;
+	iov[1].data = s;
+	iov[1].len = len;
+
+	Assert(pq_mq_handle != NULL);
+	result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
+
+	pq_mq_busy = false;

Don't you need a PG_TRY block here to reset pq_mq_busy? No. If shm_mq_sendv is interrupted, we can't use the shm_mq any more. But since that should only happen if an interrupt arrives while the queue is full, I think that's OK. (Think about the alternatives: if the queue is full, we have no way of notifying the launching process without waiting for it to retrieve the results, but it might not do that right away, and if we've been killed we need to die *now*, not later.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
On 2014-07-26 11:32:24 -0400, Tom Lane wrote: MauMau maumau...@gmail.com writes: [ sinval catchup signal -> ProcessCatchupEvent -> WaitLatch -> deadlock ] Ugh. One line of thought is that it's pretty unsafe to be doing anything as complicated as transaction start/commit in a signal handler, even one that is sure it's not interrupting anything else. Yea, that's really not nice. The only excuse ProcessCatchupEvent has for that is that it's trying to ensure proper cleanup from an error. Perhaps with closer analysis we could convince ourselves that errors thrown from AcceptInvalidationMessages() wouldn't need transaction abort cleanup. Hm. I'm not convinced that's going to be easy. Wouldn't it be better to move the catchup interrupt processing out of the signal handler? For normal backends we only enable it when reading from the client and DoingCommandRead is set. How about setting a variable in the signal handler and doing the actual catchup processing after the recv() returns EINTR? That'd require either reneging on SA_RESTART or using WaitLatchOrSocket() and nonblocking send/recv. I think that'd be a much safer model, and after researching it a bit for the idle-in-transaction timeout thing it doesn't look super hard. Even Windows seems to already support everything necessary. Or we could try to get rid of the need to do anything beyond setting a flag in the interrupt handler itself. But I'm afraid that's probably unworkable, especially now that we use SA_RESTART on all signals. Yea :(. Several platforms actually ignore SA_RESTART for send/recv/select/... under some circumstances (notably linux), but it'd probably be hard to get it right for all. I don't think we can continue to use the blocking calls for much longer unless we allow them to be interruptible. But I doubt that change would be backpatchable. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
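As a rough illustration of that "set a flag, act on EINTR" idea (plain C sketch; the flag, handler and wrapper names are made up, only ProcessCatchupEvent() is the real backend function, and this presumes the signal actually interrupts the recv(), i.e. SA_RESTART is not in effect -- the caveat raised downthread):

    #include <errno.h>
    #include <signal.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    extern void ProcessCatchupEvent(void);   /* backend-internal; shown for illustration */

    static volatile sig_atomic_t catchup_pending = 0;

    /* the SIGUSR1 handler only records that work is needed */
    static void
    catchup_signal_handler(int signo)
    {
        catchup_pending = 1;
    }

    /* ... and the client-read loop does the work in straight-line code */
    static ssize_t
    read_from_client(int sock, char *buf, size_t buflen)
    {
        for (;;)
        {
            ssize_t n = recv(sock, buf, buflen, 0);

            if (n >= 0)
                return n;
            if (errno != EINTR)
                return -1;              /* real error, let the caller handle it */

            if (catchup_pending)
            {
                catchup_pending = 0;
                ProcessCatchupEvent();  /* no longer running inside a signal handler */
            }
            /* interrupted but nothing (more) to do: retry the recv() */
        }
    }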
Re: [HACKERS] pg_background (and more parallelism infrastructure patches)
On Sat, Jul 26, 2014 at 4:37 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-07-25 14:11:32 -0400, Robert Haas wrote: Attached is a contrib module that lets you launch arbitrary command in a background worker, and supporting infrastructure patches for core. Cool. I assume this 'fell out' of the work towards parallelism? Do you think all of the patches (except the contrib one) are required for that or is some, e.g. 3), only required to demonstrate the others? I'm fairly sure that patches 3, 4, and 5 are all required in some form as building blocks for parallelism. Patch 1 contains two functions, one of which (shm_mq_set_handle) I think is generally useful for people using background workers, but not absolutely required; and one of which is infrastructure for patch 3 which might not be necessary with different design choices (shm_mq_sendv). Patch 2 is only included because pg_background can benefit from it; we could instead use an eoxact callback, at the expense of doing cleanup at end-of-transaction rather than end-of-query. But it's a mighty small patch and seems like a reasonable extension to the API, so I lean toward including it. Patch 3 adds the ability for a backend to request that the protocol messages it would normally send to the frontend get redirected to a shm_mq. I did this by adding a couple of hook functions. The best design is definitely arguable here, so if you'd like to bikeshed, this is probably the patch to look at. Uh. This doesn't sound particularly nice. Shouldn't this rather be clearly layered by making reading/writing from the client a proper API instead of adding hook functions here and there? I don't know exactly what you have in mind here. There is an API for writing to the client that is used throughout the backend, but right now the client always has to be a socket. Hooking a couple of parts of that API lets us write someplace else instead. If you've got another idea how to do this, suggest away... Also, you seem to have only touched receiving from the client, and not sending back to the subprocess. Is that actually sufficient? I'd expect that for this facility to be fully useful it'd have to be two way communication. But perhaps I'm overestimating what it could be used for. Well, the basic shm_mq infrastructure can be used to send any kind of messages you want between any pair of processes that care to establish them. But in general I expect that data is going to flow mostly in one direction - the user backend will launch workers and give them an initial set of instructions, and then results will stream back from the workers to the user backend. Other messaging topologies are certainly possible, and probably useful for something, but I don't really know exactly what those things will be yet, and I'm not sure the FEBE protocol will be the right tool for the job anyway. But error propagation, which is the main thrust of this, seems like a need that will likely be pretty well ubiquitous. This patch also adds a function to help you parse an ErrorResponse or NoticeResponse and re-throw the error or notice in the originating backend. Obviously, parallelism is going to need this kind of functionality, but I suspect a variety of other applications people may develop using background workers may want it too; and it's certainly important for pg_background itself. I would have had use for it previously. Cool. I know Petr was interested as well (possibly for the same project?). 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
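To make the shape of that concrete, a worker-side replacement for the normal client-write path could look roughly like the sketch below, built around the fragment Alvaro quoted. The hook name and exact signature are illustrative guesses (the real patch's hooks may be structured differently), while the shm_mq_iovec/shm_mq_sendv usage follows the quoted patch code:

    /* Hypothetical hook mirroring pq_putmessage()'s signature. */
    typedef int (*pq_putmessage_hook_type) (char msgtype, const char *s, size_t len);
    extern pq_putmessage_hook_type pq_putmessage_hook;      /* illustrative */

    /*
     * Worker-side implementation: send the protocol message to a shm_mq
     * instead of the client socket (cf. the fragment quoted upthread).
     */
    static int
    mq_putmessage(char msgtype, const char *s, size_t len)
    {
        shm_mq_iovec    iov[2];
        shm_mq_result   result;

        iov[0].data = &msgtype;
        iov[0].len = 1;
        iov[1].data = s;
        iov[1].len = len;

        result = shm_mq_sendv(pq_mq_handle, iov, 2, false);

        return (result == SHM_MQ_SUCCESS) ? 0 : EOF;
    }

The user backend at the other end of the queue then reads each message, looks at the msgtype byte, and either forwards it to the real client or, for ErrorResponse/NoticeResponse, re-throws it locally as described above.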
Re: [HACKERS] SKIP LOCKED DATA (work in progress)
On 26 July 2014 15:43, Tom Lane t...@sss.pgh.pa.us wrote: Thomas Munro mu...@ip9.org writes: I couldn't find an existing reasonable place to share a single wait policy enumeration between parser/planner/executor and the heap access module, and I get the feeling that it would be unacceptable to introduce one. There is a precedent in the form of AclMode, which is needed throughout the system and is currently declared in parsenodes.h. I can't say I've ever been particularly pleased with that arrangement though, since it forces inclusion of parsenodes.h in many places that might not otherwise have any interest in parse nodes. It might be better if we'd declared AclMode in a single-purpose header, say utils/aclmode.h, and then #include'd that into parsenodes.h. There's certainly plenty of other single-datatype headers laying about. Here is a new version of the patch with a single enum LockWaitPolicy defined in utils/lockwaitpolicy.h. Best regards, Thomas Munro diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml index 231dc6a..0469705 100644 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -45,7 +45,7 @@ SELECT [ ALL | DISTINCT [ ON ( replaceable class=parameterexpression/replac [ LIMIT { replaceable class=parametercount/replaceable | ALL } ] [ OFFSET replaceable class=parameterstart/replaceable [ ROW | ROWS ] ] [ FETCH { FIRST | NEXT } [ replaceable class=parametercount/replaceable ] { ROW | ROWS } ONLY ] -[ FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT ] [...] ] +[ FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT | SKIP LOCKED ] [...] ] phrasewhere replaceable class=parameterfrom_item/replaceable can be one of:/phrase @@ -1283,7 +1283,7 @@ FETCH { FIRST | NEXT } [ replaceable class=parametercount/replaceable ] { The locking clause has the general form synopsis -FOR replaceablelock_strength/ [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT ] +FOR replaceablelock_strength/ [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT | SKIP LOCKED ] /synopsis where replaceablelock_strength/ can be one of @@ -1359,11 +1359,17 @@ KEY SHARE para To prevent the operation from waiting for other transactions to commit, -use the literalNOWAIT/ option. With literalNOWAIT/, the statement -reports an error, rather than waiting, if a selected row -cannot be locked immediately. Note that literalNOWAIT/ applies only -to the row-level lock(s) mdash; the required literalROW SHARE/literal -table-level lock is still taken in the ordinary way (see +use either the literalNOWAIT/ or literalSKIP LOCKED/literal +option. With literalNOWAIT/, the statement reports an error, rather +than waiting, if a selected row cannot be locked immediately. +With literalSKIP LOCKED/literal, any selected rows that cannot be +immediately locked are skipped. Skipping locked rows provides an +inconsistent view of the data, so this is not suitable for general purpose +work, but can be used to avoid lock contention with multiple consumers +accessing a queue-like table. Note that literalNOWAIT/ +and literalSKIP LOCKED/literal apply only to the row-level lock(s) +mdash; the required literalROW SHARE/literal table-level lock is +still taken in the ordinary way (see xref linkend=mvcc). 
You can use xref linkend=sql-lock with the literalNOWAIT/ option first, @@ -1386,14 +1392,14 @@ KEY SHARE /para para -Multiple locking -clauses can be written if it is necessary to specify different locking -behavior for different tables. If the same table is mentioned (or -implicitly affected) by more than one locking clause, -then it is processed as if it was only specified by the strongest one. -Similarly, a table is processed -as literalNOWAIT/ if that is specified in any of the clauses -affecting it. +Multiple locking clauses can be written if it is necessary to specify +different locking behavior for different tables. If the same table is +mentioned (or implicitly affected) by more than one locking clause, then +it is processed as if it was only specified by the strongest one. +Similarly, a table is processed as literalNOWAIT/ if that is specified +in any of the clauses affecting it. Otherwise, it is processed +as literalSKIP LOCKED/literal if that is specified in any of the +clauses affecting it. /para para @@ -1930,9 +1936,9 @@ SELECT distributors.* WHERE distributors.name = 'Westward'; productnamePostgreSQL/productname allows it in any commandSELECT/ query as well as in sub-commandSELECT/s, but this is an extension. The literalFOR NO KEY UPDATE/, literalFOR SHARE/ and -
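The contents of the new utils/lockwaitpolicy.h header are not included in the excerpt above; a minimal version consistent with the description might be as small as the following sketch (the value names come from the heapam-side code quoted earlier in the thread; the comments are assumptions, and the actual patch may differ):

    /* src/include/utils/lockwaitpolicy.h (sketch) */
    #ifndef LOCKWAITPOLICY_H
    #define LOCKWAITPOLICY_H

    /*
     * Policy for what to do when a row lock cannot be obtained immediately,
     * shared by the parser/planner/executor and the heap access module.
     */
    typedef enum LockWaitPolicy
    {
        LockWaitBlock,              /* wait for the lock to become available */
        LockWaitSkip,               /* skip rows that cannot be locked: SKIP LOCKED */
        LockWaitError               /* raise an error on a conflicting lock: NOWAIT */
    } LockWaitPolicy;

    #endif   /* LOCKWAITPOLICY_H */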
[HACKERS] building pdfs
Is there any standard set of packages on any supported platform that will allow for building the doc PDFs? So far I have not found one on either Fedora 20 or Ubuntu 14.04, but maybe I'm missing something. I am looking at adding a buildfarm facility to build the docs, but I'm not prepared to make buildfarm owners turn handsprings in order to do so. IMNSHO, if we can't build the docs in all their formats with standard distro packages then there's a problem right there we should address. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parametric block size?
The rationale, which may be proven false, is that with an SSD the latency penalty for reading and writing randomly vs sequentially is much lower than for HDD, so there is less incentive to group stuff in larger chunks on that account. A higher number of blocks has overhead unrelated to this though: increased waste/lower storage density as it becomes more frequent that tuples don't fit into a page; more locks; a higher number of buffer headers; more toasted rows; smaller toast chunks; more vacuuming/heap pruning WAL records, ... Now obviously there's also an inverse to this, otherwise we'd all be using 1GB page sizes. But I don't think storage latency has much to do with it - it's imo more about write amplification (i.e. turning a single row update into an 8/4/16/32 kb write). I agree with your interesting discussion above. I do not think it altogether invalidates my reasoning about latency and page size performance, but I may be wrong. On an HDD, writing a page takes +- the same time whatever the size of the page, so the incentive is to try to benefit as much as possible from this write, thus to use larger pages. On an SSD, the incentive is not the same: you can write smaller pages at a lower cost. Anyway, this needs measurements, not just words. ISTM that there is a tradeoff. Whether the current 8 kB page size is the best possible compromise, given the various effects and the evolving hardware, and whether that compromise would happen to be the same for an HDD and an SSD, does not look obvious to me. These benchmarks have the merit of existing and of being consistent (the smaller the blocksize, the better the performance), and ISTM that the performance results suggest that this is worth investigating. Well, it's easy to make claims that aren't meaningful with bad benchmarks. Sure. The basic claim that I'm making wrt this benchmark is that there may be a significant impact on performance from changing the block size, thus this is worth investigating. I think this claim is quite safe, even if the benchmark is not the best possible. What would you suggest as meaningful for scale and run time, say on a dual-core 8GB memory 256GB SSD laptop? At the very least scale 100 - then it likely doesn't fit into internal caches on common consumer drives anymore. But more importantly the test has to run over several checkpoint cycles, so hot pruning and vacuuming are also measured. Ok. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PL/PgSQL: EXIT USING ROLLBACK
Hello, Today I'd like to present a way to get rid of code like this: $$ BEGIN BEGIN INSERT INTO foo VALUES (1); -- run some tests/checks/whatever RAISE EXCEPTION 'OK'; EXCEPTION WHEN raise_exception THEN IF SQLERRM 'OK' THEN RAISE; END IF; END; RETURN 'success'; END $$ And replace it with code like this: $$ BEGIN testsomething BEGIN INSERT INTO foo VALUES (1); -- run some tests/checks/whatever EXIT USING ROLLBACK testsomething; EXCEPTION WHEN others THEN RAISE; END; RETURN 'success'; END $$ I'm not set on the USING ROLLBACK syntax; it was the only thing I could come up with that seemed even remotely sane and didn't break backwards compatibility. Thoughts? Patch attached, if someone cares. .marko *** a/doc/src/sgml/plpgsql.sgml --- b/doc/src/sgml/plpgsql.sgml *** *** 2107,2113 END LOOP optional replaceablelabel/replaceable /optional; /indexterm synopsis ! EXIT optional replaceablelabel/replaceable /optional optional WHEN replaceableboolean-expression/replaceable /optional; /synopsis para --- 2107,2113 /indexterm synopsis ! EXIT optional replaceablelabel/replaceable /optional optional USING ROLLBACK /optional optional WHEN replaceableboolean-expression/replaceable /optional; /synopsis para *** *** 2121,2126 EXIT optional replaceablelabel/replaceable /optional optional WHEN re --- 2121,2134 /para para + If literalUSING ROLLBACK/ is specified, instead of persisting the + changes made inside the escaped literalBEGIN/ blocks, they are + rolled back. The replaceablelabel/ must be the label of the + current or an outer level literalBEGIN/ block with an + literalEXCEPTION/ block. +/para + +para If literalWHEN/ is specified, the loop exit occurs only if replaceableboolean-expression/ is true. Otherwise, control passes to the statement after literalEXIT/. *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *** *** 1181,1188 exec_stmt_block(PLpgSQL_execstate *estate, PLpgSQL_stmt_block *block) resTypByVal, resTypLen); } ! /* Commit the inner transaction, return to outer xact context */ ! ReleaseCurrentSubTransaction(); MemoryContextSwitchTo(oldcontext); CurrentResourceOwner = oldowner; --- 1181,1192 resTypByVal, resTypLen); } ! if (rc == PLPGSQL_RC_EXIT estate-exitrollback) ! RollbackAndReleaseCurrentSubTransaction(); ! else ! /* Commit the inner transaction, return to outer xact context */ ! ReleaseCurrentSubTransaction(); ! MemoryContextSwitchTo(oldcontext); CurrentResourceOwner = oldowner; *** *** 1330,1335 exec_stmt_block(PLpgSQL_execstate *estate, PLpgSQL_stmt_block *block) --- 1334,1347 return PLPGSQL_RC_EXIT; if (strcmp(block-label, estate-exitlabel) != 0) return PLPGSQL_RC_EXIT; + if (estate-exitrollback) + { + if (!block-exceptions) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), +errmsg(the BEGIN block targeted by EXIT USING ROLLBACK must have an EXCEPTION clause))); + estate-exitrollback = false; + } estate-exitlabel = NULL; return PLPGSQL_RC_OK; *** *** 1789,1794 exec_stmt_loop(PLpgSQL_execstate *estate, PLpgSQL_stmt_loop *stmt) --- 1801,1810 return PLPGSQL_RC_EXIT; if (strcmp(stmt-label, estate-exitlabel) != 0) return PLPGSQL_RC_EXIT; + if (estate-exitrollback) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), +errmsg(the target of EXIT USING ROLLBACK must be a BEGIN block))); estate-exitlabel = NULL; return PLPGSQL_RC_OK; *** *** 1850,1855
Re: [HACKERS] parametric block size?
On 2014-07-26 19:06:58 +0200, Fabien COELHO wrote: The basic claim that I'm making wrt to this benchmark is that there may be a significant impact on performance with changing the block size, thus this is worth investigating. I think this claim is quite safe, even if the benchmark is not the best possible. Well, you went straight to making it something adjustable at run time. And I don't see that as being warranted at this point. But further benchmarks sound like a good idea. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parametric block size?
The basic claim that I'm making wrt this benchmark is that there may be a significant impact on performance from changing the block size, thus this is worth investigating. I think this claim is quite safe, even if the benchmark is not the best possible. Well, you went straight to making it something adjustable at run time. What I really did was to go straight to asking the question:-) Up to now I have two answers, or really caveats:
 - a varying blocksize implementation should have minimal effects on performance for users of the default settings.
 - the said benchmark may not be that meaningful, so the performance impact is to be assessed more thoroughly.
And I don't see that as being warranted at this point. But further benchmarks sound like a good idea. Yep. A 10% potential performance impact looks worth the investigation. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
Andres Freund and...@2ndquadrant.com writes: Wouldn't it be better to move the catchup interrupt processing out of the signal handler? For normal backends we only enable it when reading from the client and DoingCommandRead is set. How about setting a variable in the signal handler and doing the actual catchup processing after the recv() returns EINTR? Only it won't. See SA_RESTART. I think turning that off is a nonstarter, as per previous discussions. That'd require either reneging on SA_RESTART or using WaitLatchOrSocket() and nonblocking send/recv. Yeah, I was wondering about using WaitLatchOrSocket for client I/O too. We already have a hook that lets us do the actual recv even when using OpenSSL, and in principle that function could do interrupt-service-like functions if it got kicked off the recv(). Anything in this line is going to be a bigger change than I'd want to back-patch, though. Are we OK with not fixing the problem in the back branches? Given the shortage of field complaints, that might be all right. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
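A bare-bones sketch of what a WaitLatchOrSocket-based client read could look like (assuming the socket has been switched to nonblocking mode; error handling and SSL are glossed over, and this is a thought experiment rather than the eventual implementation):

    /* sketch: nonblocking recv() with a latch-aware wait in between */
    for (;;)
    {
        int         rc;
        ssize_t     n;

        n = recv(sock, buf, buflen, 0);
        if (n >= 0)
            break;                          /* got data (or EOF) */
        if (errno != EWOULDBLOCK && errno != EAGAIN)
            break;                          /* hard error: let the caller cope */

        rc = WaitLatchOrSocket(&MyProc->procLatch,
                               WL_LATCH_SET | WL_SOCKET_READABLE,
                               sock, -1L);  /* no timeout requested */
        if (rc & WL_LATCH_SET)
        {
            ResetLatch(&MyProc->procLatch);
            /* service whatever set the latch, e.g. catchup interrupts */
        }
    }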
Re: [HACKERS] building pdfs
Andrew Dunstan and...@dunslane.net writes: Is there any standard set of packages on any supported platform that will allow for building the doc PDFs? So far I have not found one on either Fedora 20 or Ubuntu 14.04, but maybe I'm missing something. On either Fedora or RHEL, installing the authoring tools package group (not quite the right name, but it's close) has worked for me for many years. I've not bothered to figure out what the minimum required subset of that might be. Not sure what the Debian equivalent is, but borka/guaibasaurus is able to build the docs, so the packages certainly are available. Peter might have a better idea. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
On 2014-07-26 13:58:38 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: That'd require either reneging on SA_RESTART or using WaitLatchOrSocket() and nonblocking send/recv. Yeah, I was wondering about using WaitLatchOrSocket for client I/O too. We already have a hook that lets us do the actual recv even when using OpenSSL, and in principle that function could do interrupt-service-like functions if it got kicked off the recv(). I've started playing with this. Looks clearly worthwhile. I think if we do it right we can pretty much get rid of the whole prepare_for_client_read() machinery and handle everything via ProcessInterrupts(). EnableCatchupInterrupt() et al don't really fill me with joy. I'm not yet entirely sure where the interrupt processing should happen, but I guess that'll fall out of the work at some point. The important bit imo is to *not* do anything but return with BIO_set_retry_*() from my_sock_read/write(). That then allows us to implement stuff like the idle-in-transaction timeout with far fewer problems. I probably won't finish doing this before leaving on holidays, so nobody should hesitate to look themselves if interested. If not, I plan to pick this up again. I think it's a prerequisite to getting rid of the FATAL for recovery conflict interrupts, which I really would like to do. Anything in this line is going to be a bigger change than I'd want to back-patch, though. Agreed. I don't think it will, but it very well could have performance implications. Besides the obvious risk of bugs... Are we OK with not fixing the problem in the back branches? Given the shortage of field complaints, that might be all right. I'm not really comfortable with that. How about simply flagging a couple of contexts to not do the SyncRepWaitForLSN() dance? Possibly just by doing something ugly like SetConfigOption('synchronous_commit', 'off', PGC_INTERNAL, PGC_S_OVERRIDE, GUC_ACTION_LOCAL, true, ERROR)? during startup, inval and similar transaction commands? Not pretty, but it looks simple enough to be backpatchable. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] building pdfs
On 2014-07-26 14:14:15 -0400, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Is there any standard set of packages on any supported platform that will allow for building the doc PDFs? So far I have not found one on either Fedora 20 or Ubuntu 14.04, but maybe I'm missing something. On either Fedora or RHEL, installing the authoring tools package group (not quite the right name, but it's close) has worked for me for many years. I've not bothered to figure out what the minimum required subset of that might be. Not sure what the Debian equivalent is, but borka/guaibasaurus is able to build the docs, so the packages certainly are available. Peter might have a better idea. 'apt-get build-dep postgresql-$version' IIRC does the trick there. Installs a thing or two more, but those are things that a debian buildfarm member is going have installed anyway. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/PgSQL: EXIT USING ROLLBACK
Marko Tiikkaja ma...@joh.to writes: Hello, Today I'd like to present a way to get rid of code like this: $$ BEGIN BEGIN INSERT INTO foo VALUES (1); -- run some tests/checks/whatever RAISE EXCEPTION 'OK'; EXCEPTION WHEN raise_exception THEN IF SQLERRM 'OK' THEN RAISE; END IF; END; RETURN 'success'; END $$ And replace it with code like this: $$ BEGIN testsomething BEGIN INSERT INTO foo VALUES (1); -- run some tests/checks/whatever EXIT USING ROLLBACK testsomething; EXCEPTION WHEN others THEN RAISE; END; RETURN 'success'; END $$ Somehow I'm failing to see that as much of an improvement; in fact, it's probably less clear than before. I don't much care for the idea that EXIT should take on some transaction-control properties instead of being a simple transfer of control. In particular, what happens if someone attaches USING ROLLBACK to an EXIT that does not lead from inside to outside a BEGIN/EXCEPTION block? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/PgSQL: EXIT USING ROLLBACK
On 7/26/14, 8:22 PM, Tom Lane wrote: In particular, what happens if someone attaches USING ROLLBACK to an EXIT that does not lead from inside to outside a BEGIN/EXCEPTION block? I'm not sure which case you're envisioning. A label is required, and the label must be that of a BEGIN block with an EXCEPTION block if USING ROLLBACK is specified. If that doesn't answer your question, could you try and explain (perhaps in the form of an example) what problem you're seeing? .marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PL/PgSQL: RAISE and the number of parameters
Me again, Here's a patch for making PL/PgSQL throw an error during compilation (instead of at runtime) if the number of parameters passed to RAISE doesn't match the number of placeholders in the error message. I'm sure people can see the pros of doing it this way. .marko *** a/src/pl/plpgsql/src/pl_gram.y --- b/src/pl/plpgsql/src/pl_gram.y *** *** 106,111 static void check_labels(const char *start_label, --- 106,112 static PLpgSQL_expr *read_cursor_args(PLpgSQL_var *cursor, int until, const char *expected); static List *read_raise_options(void); + static void check_raise_parameters(PLpgSQL_stmt_raise *stmt); %} *** *** 1849,1854 stmt_raise : K_RAISE --- 1850,1857 new->options = read_raise_options(); } + check_raise_parameters(new); + $$ = (PLpgSQL_stmt *)new; } ; *** *** 3768,3773 read_raise_options(void) --- 3771,3812 } /* + * Check that the number of parameter placeholders in the message matches the + * number of parameters passed to it, if message was defined. + */ + static void + check_raise_parameters(PLpgSQL_stmt_raise *stmt) + { + char *cp; + int expected_nparams = 0; + + if (stmt->message == NULL) + return; + + for (cp = stmt->message; *cp; cp++) + { + if (cp[0] != '%') + continue; + /* literal %, ignore */ + if (cp[1] == '%') + { + cp++; + continue; + } + expected_nparams++; + } + + if (expected_nparams < list_length(stmt->params)) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("too many parameters specified for RAISE"))); + if (expected_nparams > list_length(stmt->params)) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("too few parameters specified for RAISE"))); + } + + /* * Fix up CASE statement */ static PLpgSQL_stmt * *** a/src/test/regress/expected/plpgsql.out --- b/src/test/regress/expected/plpgsql.out *** *** 2446,2463 begin return $1; end; $$ language plpgsql; - select raise_test1(5); ERROR: too many parameters specified for RAISE ! CONTEXT: PL/pgSQL function raise_test1(integer) line 3 at RAISE create function raise_test2(int) returns int as $$ begin raise notice 'This message has too few parameters: %, %, %', $1, $1; return $1; end; $$ language plpgsql; - select raise_test2(10); ERROR: too few parameters specified for RAISE ! CONTEXT: PL/pgSQL function raise_test2(integer) line 3 at RAISE -- Test re-RAISE inside a nested exception block. This case is allowed -- by Oracle's PL/SQL but was handled differently by PG before 9.1. CREATE FUNCTION reraise_test() RETURNS void AS $$ --- 2446,2474 return $1; end; $$ language plpgsql; ERROR: too many parameters specified for RAISE ! CONTEXT: compilation of PL/pgSQL function raise_test1 near line 3 create function raise_test2(int) returns int as $$ begin raise notice 'This message has too few parameters: %, %, %', $1, $1; return $1; end; $$ language plpgsql; ERROR: too few parameters specified for RAISE ! CONTEXT: compilation of PL/pgSQL function raise_test2 near line 3 ! create function raise_test3(int) returns int as $$ ! begin ! raise notice 'This message has no parameters (despite having %% signs in it)!'; ! return $1; ! end; ! $$ language plpgsql; ! select raise_test3(1); ! NOTICE: This message has no parameters (despite having % signs in it)! ! raise_test3 ! ------------- ! 1 ! (1 row) ! -- Test re-RAISE inside a nested exception block. This case is allowed -- by Oracle's PL/SQL but was handled differently by PG before 9.1. 
CREATE FUNCTION reraise_test() RETURNS void AS $$ *** a/src/test/regress/sql/plpgsql.sql --- b/src/test/regress/sql/plpgsql.sql *** *** 2078,2085 begin end; $$ language plpgsql; - select raise_test1(5); - create function raise_test2(int) returns int as $$ begin raise notice 'This message has too few parameters: %, %, %', $1, $1; --- 2078,2083 *** *** 2087,2093 begin end; $$ language plpgsql; ! select raise_test2(10); -- Test re-RAISE inside a nested exception block. This case is allowed -- by
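For readers skimming the flattened diff above, the heart of the new compile-time check is just a placeholder-counting pass over the RAISE format string. A standalone restatement of that logic (not the patch's literal code: the PL/pgSQL statement struct and the ereport() calls are replaced by a plain return value) could look like this:

    #include <stddef.h>

    /*
     * Count the '%' placeholders in a RAISE format string, treating "%%" as
     * a literal percent sign.  The patch compares this count against the
     * number of parameters supplied to RAISE and reports "too many"/"too few
     * parameters specified for RAISE" at compile time rather than at run time.
     */
    static int
    count_raise_placeholders(const char *message)
    {
        int         expected_nparams = 0;
        const char *cp;

        if (message == NULL)
            return 0;

        for (cp = message; *cp; cp++)
        {
            if (cp[0] != '%')
                continue;
            if (cp[1] == '%')
            {
                cp++;           /* literal %, skip it */
                continue;
            }
            expected_nparams++;
        }
        return expected_nparams;
    }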
Re: [HACKERS] PL/PgSQL: EXIT USING ROLLBACK
Marko Tiikkaja ma...@joh.to writes: On 7/26/14, 8:22 PM, Tom Lane wrote: In particular, what happens if someone attaches USING ROLLBACK to an EXIT that does not lead from inside to outside a BEGIN/EXCEPTION block? I'm not sure which case you're envisioning. A label is required, and the label must be that of a BEGIN block with an EXCEPTION block if USING ROLLBACK is specified. If that doesn't answer your question, could you try and explain (perhaps in the form of an example) what problem you're seeing? Well, restrictions of that sort might dodge the implementation problem, but they make the construct even less orthogonal. (And the restriction as stated isn't good enough anyway, since I could still place such an EXIT in the EXCEPTION part of the block.) Basically my point is that this just seems like inventing another way to do what one can already do with RAISE, and it doesn't have much redeeming social value to justify the cognitive load of inventing another construct. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/PgSQL: EXIT USING ROLLBACK
On 7/26/14, 8:39 PM, Tom Lane wrote: Marko Tiikkaja ma...@joh.to writes: I'm not sure which case you're envisioning. A label is required, and the label must be that of a BEGIN block with an EXCEPTION block if USING ROLLBACK is specified. If that doesn't answer your question, could you try and explain (perhaps in the form of an example) what problem you're seeing? Well, restrictions of that sort might dodge the implementation problem, but they make the construct even less orthogonal. (And the restriction as stated isn't good enough anyway, since I could still place such an EXIT in the EXCEPTION part of the block.) That's a good point; the patch would have to be changed to disallow this case. Basically my point is that this just seems like inventing another way to do what one can already do with RAISE, and it doesn't have much redeeming social value to justify the cognitive load of inventing another construct. Yes, you can already do this with RAISE, but that seems more like an accident than anything else. I feel a dedicated syntax is less error-prone and makes the intent clearer to people reading the code. But I realize I might be in the minority with this. .marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] building pdfs
On 07/26/2014 02:22 PM, Andres Freund wrote: On 2014-07-26 14:14:15 -0400, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Is there any standard set of packages on any supported platform that will allow for building the doc PDFs? So far I have not found one on either Fedora 20 or Ubuntu 14.04, but maybe I'm missing something. On either Fedora or RHEL, installing the authoring tools package group (not quite the right name, but it's close) has worked for me for many years. I've not bothered to figure out what the minimum required subset of that might be. Not sure what the Debian equivalent is, but borka/guaibasaurus is able to build the docs, so the packages certainly are available. Peter might have a better idea. 'apt-get build-dep postgresql-$version' IIRC does the trick there. It installs a thing or two more, but those are things that a Debian buildfarm member is going to have installed anyway. Yes, I did that and generated a PDF, but I got an enormous number of errors or warnings. See https://www.dropbox.com/s/9n4hhijin3qn8mw/postgres-US.log for example. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] building pdfs
Andrew Dunstan and...@dunslane.net writes: Yes, I did that and generated a PDF, but I got an enormous number of errors or warnings. See https://www.dropbox.com/s/9n4hhijin3qn8mw/postgres-US.log for example. If they're things like overfull hbox from the TeX step, they're expected. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] building pdfs
On 07/26/2014 06:44 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Yes, I did that and generated a PDF, but I got an enormous number of errors or warnings. See https://www.dropbox.com/s/9n4hhijin3qn8mw/postgres-US.log for example. If they're things like overfull hbox from the TeX step, they're expected. That's rather sad. How would we find out that something has actually gone wrong, short of it failing to write a PDF altogether? Searching through 204,000 lines of output doesn't sound like fun. There are lots of these: Package Fancyhdr Warning: \fancyhead's `E' option without twoside option is useless on input line 83877. and quite a few overfull hboxes that are more than 72pt too wide. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parametric block size?
On 26/07/14 21:05, Andres Freund wrote: More advanced features, but with much more impact on the code, would be to be able to change the size at database/table level. That'd be pretty horrible because the size of pages in shared_buffers wouldn't be uniform anymore. Possibly stopping at the tablespace level might be more straightforward. To avoid messing up the pages in shared buffers we'd perhaps need something like several shared buffer pools, each with either its own blocksize or associated with a (set of) tablespace(s). Obviously this sort of thing has a pretty big architecture/code impact, so it's probably better to consider a first iteration where the block size is specifiable only at initdb time (as that would still be very convenient)! Regards Mark -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/PgSQL: EXIT USING ROLLBACK
Hello 2014-07-26 19:14 GMT+02:00 Marko Tiikkaja ma...@joh.to: Hello, Today I'd like to present a way to get rid of code like this: $$ BEGIN BEGIN INSERT INTO foo VALUES (1); -- run some tests/checks/whatever RAISE EXCEPTION 'OK'; EXCEPTION WHEN raise_exception THEN IF SQLERRM <> 'OK' THEN RAISE; END IF; END; RETURN 'success'; END $$ And replace it with code like this: $$ BEGIN <<testsomething>> BEGIN INSERT INTO foo VALUES (1); -- run some tests/checks/whatever EXIT USING ROLLBACK testsomething; EXCEPTION WHEN others THEN RAISE; END; RETURN 'success'; END $$ I'm not set on the USING ROLLBACK syntax; it was the only thing I could come up with that seemed even remotely sane and didn't break backwards compatibility. Thoughts? Patch attached, if someone cares. -1. I don't think we need to cobolize PL/pgSQL any further. There is no strong reason to introduce it: you don't save any code, and you don't gain any performance. Regards Pavel .marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/PgSQL: RAISE and the number of parameters
Hi 2014-07-26 20:39 GMT+02:00 Marko Tiikkaja ma...@joh.to: Me again, Here's a patch for making PL/PgSQL throw an error during compilation (instead of at runtime) if the number of parameters passed to RAISE doesn't match the number of placeholders in the error message. I'm sure people can see the pros of doing it this way. +1 Regards Pavel .marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Use unique index for longer pathkeys.
On Sat, Jul 26, 2014 at 11:53 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Fri, Jul 25, 2014 at 12:48 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: I think there is one more disadvantage in the way current patch is done which is that you need to collect index path keys for all relations irrespective of whether they will be of any use to eliminate useless pathkeys from query_pathkeys. One trivial case that comes to mind is when there are multiple relations involved in query and ORDER BY is base on columns of only part of the tables involved in query. Like this? select x.a, x.b, y.b from x, y where x.a = y.a order by x.a, x.b; Equivalent class consists of (x.a=y.a) and (x.b), so index pathkeys for i_y is (y.a.=x.a). As a result, no common primary pathkeys found. I think it will find common pathkey incase you have an unique index on x.a (please see the example below), but currently I am not clear why there is a need for a common index path key in such a case to eliminate useless keys in ORDER BY, why can't we do it based on individual table's path key. Example: create table t (a int not null, b int not null, c int, d text); create unique index i_t_pkey on t(a, b); insert into t (select a % 10, a / 10, a, 't' from generate_series(0, 10) a); analyze; create table t1 (a int not null, b int not null, c int, d text); create unique index i_t1_pkey_1 on t1(a); create unique index i_t1_pkey_2 on t1(a, b); insert into t1 (select a * 2, a / 10, a, 't' from generate_series(0, 10) a); explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order by t1.a,t1.b,t1.c,t1.d; QUERY PLAN -- Merge Join Merge Cond: (t.a = t1.a) - Index Scan using i_t_pkey on t - Index Scan using i_t1_pkey_1 on t1 (4 rows) Here we can notice that there is no separate sort key in plan. Now drop the i_t1_pkey_1 and check the query plan again. drop index i_t1_pkey_1; explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order by t1.a,t1.b,t1.c,t1.d; QUERY PLAN Sort Sort Key: t.a, t1.b, t1.c, t1.d - Merge Join Merge Cond: (t.a = t1.a) - Index Scan using i_t_pkey on t - Index Scan using i_t1_pkey_2 on t1 (6 rows) Can't above plan eliminate Sort Key even after dropping index (i_t1_pkey_1)? Here I have one additional thought which I would like to share with you to see if this patch can be done in a simpler way. In function standard_qp_callback(), can we directly trim the sortclause list based on index information in PlannerInfo. We have access to target list in this function to know exactly the relation/column information of sortclause. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
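A very rough sketch of the "trim the sortclause directly" idea follows. It is purely illustrative: the helper name is made up, and the hard part, computing how many leading ORDER BY items are already covered by a unique index (using the index information in PlannerInfo), is assumed to happen elsewhere.

    #include "postgres.h"
    #include "nodes/pg_list.h"

    /*
     * Hypothetical helper: once the first n_covered ORDER BY items are known
     * to be covered by a unique index on the relation they come from, the
     * remaining items cannot change the ordering of already-unique rows, so
     * the sort clause can simply be truncated to its useful prefix.
     */
    static List *
    trim_useless_sort_items(List *sortClause, int n_covered)
    {
        if (n_covered > 0 && n_covered < list_length(sortClause))
            return list_truncate(sortClause, n_covered);
        return sortClause;
    }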
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
From: Tom Lane t...@sss.pgh.pa.us [ sinval catchup signal -> ProcessCatchupEvent -> WaitLatch -> deadlock ] I must add one thing. After some client processes closed the connection without any hang, their server processes were stuck with a stack trace like this (I'll look for and show the exact stack trace tomorrow): WaitLatchOrSocket SyncRepWaitForLSN CommitTransaction CommitTransactionCommand ProcessCatchupEvent ... shmem_exit proc_exit_prepare proc_exit PostgresMain ... The process appears to be hanging during session termination, so the problem is not limited to waiting for client requests. Another line of thought is that we've been way too uncritical about shoving different kinds of events into the SIGUSR1 multiplexor. It might be a good idea to separate high level interrupts from low level ones, using say SIGUSR1 for the former and SIGUSR2 for the latter. However, that doesn't sound very back-patchable, even assuming that we can come up with a clean division. This seems to be one step in the right direction. There are two issues in the current implementation: [Issue 1] [ sinval catchup signal -> ProcessCatchupEvent -> WaitLatch -> deadlock ] This is (partly) because the latch wakeup and other processing use the same SIGUSR1 in the normal backend, the autovacuum launcher/worker, and background workers with database access. On the other hand, the other background daemon processes properly use SIGUSR1 only for latch wakeup, and SIGUSR2 for other tasks. [Issue 2] WaitLatch() returns prematurely due to the sinval catchup signal, even though the target event (e.g. reply from standby) hasn't occurred and called SetLatch() yet. This is because procsignal_sigusr1_handler() always calls latch_sigusr1_handler(). This is probably not related to the cause of the hang. So, as you suggest, I think it would be a good idea to move the latch_sigusr1_handler() call into a separate signal handler dedicated to latch wakeup, as the other daemon processes do, and leave the rest in procsignal_sigusr1_handler(), renaming that function to procsignal_sigusr2_handler() or procsignal_handler(). In the first place, it's not natural that the current procsignal_sigusr1_handler() contains the latch_sigusr1_handler() call, because latch processing is not based on the procsignal mechanism (SetLatch() doesn't call SendProcSignal()). I'll try the fix tomorrow if possible. What kind of problems do you foresee for back-patching? Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
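A minimal sketch of the split MauMau describes, under the assumption that the procsignal machinery keeps its current shape and only the signal wiring changes (the handler name below is made up for illustration): SIGUSR1 would service nothing but the latch, and everything procsignal_sigusr1_handler() currently multiplexes would move to a handler registered for SIGUSR2.

    #include "postgres.h"
    #include <errno.h>
    #include "storage/latch.h"

    /*
     * Sketch only: a SIGUSR1 handler that does nothing but service the latch,
     * so an unrelated procsignal (such as the sinval catchup interrupt) can no
     * longer wake WaitLatch() prematurely or re-enter transaction code from a
     * signal-driven path.
     */
    static void
    latch_only_sigusr1_handler(SIGNAL_ARGS)
    {
        int     save_errno = errno;

        latch_sigusr1_handler();

        errno = save_errno;
    }

    /*
     * Registration would then look roughly like:
     *   pqsignal(SIGUSR1, latch_only_sigusr1_handler);
     *   pqsignal(SIGUSR2, procsignal_handler);   (the renamed multiplexer)
     */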