Re: [HACKERS] Vacuuming big btree indexes without pages with deleted items

2015-04-01 Thread Vladimir Borodin

> On 31 March 2015, at 23:33, Kevin Grittner wrote:
> 
> Jim Nasby  wrote:
>> On 3/27/15 5:15 AM, Vladimir Borodin wrote:
> 
>>> Master writes this record to xlog in btvacuumscan function after
>>> vacuuming of all index pages. And in case of no pages with
>>> deleted items xlog record would contain lastBlockVacuumed 0.
>>> 
>>> In btree_xlog_vacuum replica reads all blocks from
>>> lastBlockVacuumed to last block of the index while applying this
>>> record because there is no api in the buffer manager to
>>> understand if the page is unpinned.
>>> 
>>> So if the index is quite big (200+ GB in described case) it
>>> takes much time to do it.
> 
>>> 2. Is it possible not to write to xlog record with
>>> lastBlockVacuumed 0 in some cases? For example, in case of not
>>> deleting any pages.
> 
>> Possibly, but that's much higher risk. Without studying it, if we
>> wanted to mess around with that it might actually make more sense
>> to XLOG a set of blkno's that got vacuumed, but I suspect that
>> wouldn't be a win.
> 
> I feel pretty confident that it would be a win in some significant
> cases, but it could be worse in some cases by changing sequential
> access to random, unless we use heuristics to protect against
> that.  But...
> 
>>> Or maybe there are some better ways of improving this situation?
> 
> This is a start of a better way:
> 
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2ed5b87f96d473962ec5230fd820abfeaccb2069
> 
> If we expand on that commit to cover non-MVCC index scans,
> index-only scans, and index scans of non-WAL-logged indexes, then
> this whole aspect of btree vacuum can be eliminated.  It seems
> extremely dubious that all of that could be done for 9.5, and it's
> certainly not material for back-patching to any stable branches,
> but it would be a more complete and better-performing fix than the
> alternatives being discussed here.

Kevin, thanks for your work in this direction.

This way seems definitely better. It doesn't matter that it won't be 
included in 9.5 or back-patched to stable versions. This thread is mostly 
about what could be done in the future. If the other cases (including 
index-only scans) were addressed in 9.6, for example, that would be really cool.
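For scale, here is a rough sketch (ordinary arithmetic, not PostgreSQL code; the helper name is ours) of why replaying such a record hurts on a replica: with lastBlockVacuumed = 0, the replay scan covers every block of the index.

```python
# Rough sketch, not PostgreSQL code: estimate how many blocks
# btree_xlog_vacuum must visit on the replica when the xlog record carries
# lastBlockVacuumed = 0, i.e. the scan starts from the beginning.

BLCKSZ = 8192  # default PostgreSQL block size in bytes

def blocks_to_scan(index_size_bytes, last_block_vacuumed=0):
    """Blocks visited during replay: from lastBlockVacuumed to the last block."""
    total_blocks = index_size_bytes // BLCKSZ
    return total_blocks - last_block_vacuumed

# The 200 GB index from the report: ~26 million block visits per replay.
print(blocks_to_scan(200 * 1024**3))  # 26214400
```

Even at a generous read rate, tens of millions of buffer-manager probes per vacuum record explains the replay stalls described above.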

> 
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


--
May the force be with you…
https://simply.name



Re: [HACKERS] vac truncation scan problems

2015-04-01 Thread Kyotaro HORIGUCHI
By the way, what should we do about this?

 - Waiting for someone to pick this up.

 - Starting another thread to attract notice.

 - Otherwise..

At Wed, 1 Apr 2015 10:49:55 +0900, Michael Paquier  
wrote in 
> On Wed, Apr 1, 2015 at 2:18 AM, Jeff Janes  wrote:
> 
> > On Tue, Mar 31, 2015 at 1:28 AM, Kyotaro HORIGUCHI <
> > horiguchi.kyot...@lab.ntt.co.jp> wrote:
> >
> >> Hi, this is a bug in commit 0d831389749a3baaced7b984205b9894a82444b9.
> >>
> >> It allows vacuum freeze to be skipped and conversely lets regular
> >> vacuum wait for the lock. The attached patch fixes it.
> >>
> >>
> >> In table_recheck_autovac, vacuum options are determined as following,
> >>
> >>  >  tab->at_vacoptions = VACOPT_SKIPTOAST |
> >>  >  (dovacuum ? VACOPT_VACUUM : 0) |
> >>  >  (doanalyze ? VACOPT_ANALYZE : 0) |
> >> !>  (wraparound ? VACOPT_NOWAIT : 0);
> >>
> >> The line prefixed by '!' looks inverted.
> >>
> >
> > Thanks, it is obvious once you see it!
> >
> 
> Nice catch, Horiguchi-san.

Thank you:)
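For readers skimming the archive, the inversion can be sketched as follows (the flag values are made up for illustration, not the real ones from the headers):

```python
# Sketch of the fix, not the actual C code; flag values are illustrative.
# An anti-wraparound (freeze) vacuum must wait for the lock, so
# VACOPT_NOWAIT belongs on the non-wraparound branch.

VACOPT_VACUUM, VACOPT_ANALYZE, VACOPT_SKIPTOAST, VACOPT_NOWAIT = 1, 2, 4, 8

def at_vacoptions(dovacuum, doanalyze, wraparound):
    return (VACOPT_SKIPTOAST
            | (VACOPT_VACUUM if dovacuum else 0)
            | (VACOPT_ANALYZE if doanalyze else 0)
            # fixed line: was the equivalent of "(wraparound ? VACOPT_NOWAIT : 0)"
            | (0 if wraparound else VACOPT_NOWAIT))

# A wraparound vacuum now waits for the lock; a regular one may skip it.
assert not (at_vacoptions(True, False, True) & VACOPT_NOWAIT)
assert at_vacoptions(True, False, False) & VACOPT_NOWAIT
```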

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] vac truncation scan problems

2015-04-01 Thread Michael Paquier
On Wed, Apr 1, 2015 at 4:35 PM, Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp> wrote:

> By the way, what should we do about this?
>
>  - Waiting for someone to pick this up.
>  - Starting another thread to attract notice.
>  - Otherwise..


I am sure someone will show up quickly and push the fix you provided.
--
Michael


Re: [HACKERS] vac truncation scan problems

2015-04-01 Thread Kyotaro HORIGUCHI
Hi,

At Wed, 1 Apr 2015 16:50:41 +0900, Michael Paquier  
wrote in 
> I am sure someone will show up quickly and push the fix you provided.

Ok, I'll be a good boy.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] pg_basebackup may fail to send feedbacks.

2015-04-01 Thread Fujii Masao
On Tue, Mar 10, 2015 at 5:29 PM, Kyotaro HORIGUCHI
 wrote:
> Hi, the attached is the v5 patch.
>
> - Do feGetCurrentTimestamp() only when necessary.
> - Rebased to current master
>
>
> At Mon, 2 Mar 2015 20:21:36 +0900, Fujii Masao  wrote 
> in 
>> On Tue, Feb 24, 2015 at 6:44 PM, Kyotaro HORIGUCHI
>>  wrote:
>> > Hello, the attached is the v4 patch that checks feedback timing
>> > every WAL segments boundary.
> ..
>> > I said that checking whether to send feedback every boundary
>> > between WAL segments seemed too long but after some rethinking, I
>> > changed my mind.
>> >
>> >  - The largest possible delay source in the busy-receive loop
>> >is fsyncing when closing the just-written WAL segment file; this can
>> >take several seconds.  A freeze longer than the timeout
>> >interval cannot be saved and is not worth saving anyway.
>> >
>> >  - An interval of 16 MB of disk writes between gettimeofday()
>> >calls seems gentle enough even on platforms where gettimeofday() is
>> >rather heavy.
>>
>> Sounds reasonable to me.
>>
>> So we don't need to address the problem in walreceiver side so proactively
>> because it tries to send the feedback every after flushing the WAL records.
>> IOW, the problem you observed is less likely to happen. Right?
>>
>> +now = feGetCurrentTimestamp();
>> +if (standby_message_timeout > 0 &&
>
> Certainly it would hardly ever happen, especially with an ordinary
> configuration.
>
> However, continuously receiving the replication stream is quite
> normal behavior, even if the problem rarely occurs.  So the receiver
> should guarantee by logic that feedback is sent as long as it
> is working normally, provided the code for the special case
> won't be too large and won't take too long :).
>
> The current walreceiver doesn't appear to try to send feedback
> after flushing every WAL record.  HandleCopyStream will
> restlessly process the records in a gapless replication stream,
> sending feedback only when a keepalive packet arrives.  The response
> to a keepalive does help greatly, but that is not what keepalives are
> for; they are intended to confirm that a silent receiver is still alive.
>
> Even with this fix, the case where a single write or flush takes longer
> than the feedback interval cannot be saved, but that would be OK
> since it should be regarded as a malfunction.
>
>
>> Minor comment: should feGetCurrentTimestamp() be called after the check of
>> standby_message_timeout > 0, to avoid unnecessary calls of that?
>
> Ah, you're right. I'll fix it.

Even if the timeout has not elapsed yet, if synchronous mode is true, shouldn't
we send feedback here?
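The decision being discussed can be sketched like this (hypothetical helper, not the actual receivelog.c code): the clock is read only when a timeout is configured, and the synchronous short-circuit is the addition Fujii is asking about.

```python
import time

# Hypothetical sketch, not the actual pg_basebackup logic: decide whether a
# feedback message should be sent now.  The timestamp is fetched only when
# standby_message_timeout is set, and synchronous mode forces feedback
# regardless of whether the timeout has elapsed.

def should_send_feedback(last_send, standby_message_timeout, synchronous,
                         now_fn=time.monotonic):
    if synchronous:
        return True     # flush position must reach the server promptly
    if standby_message_timeout <= 0:
        return False    # no feGetCurrentTimestamp()-style clock call at all
    return now_fn() - last_send >= standby_message_timeout

# With a 10 s timeout, feedback is due once the interval has elapsed.
assert should_send_feedback(90, 10, False, now_fn=lambda: 100)
assert not should_send_feedback(95, 10, False, now_fn=lambda: 100)
assert should_send_feedback(95, 10, True, now_fn=lambda: 100)
```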

Regards,

-- 
Fujii Masao




Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2015-04-01 Thread Jehan-Guillaume de Rorthais
Hi,

As I'm writing a doc patch for 9.4 -> 9.0, I'll discuss this formula below,
as it is the last one accepted by most of you.

On Mon, 3 Nov 2014 12:39:26 -0800
Jeff Janes  wrote:

> It looked to me that the formula, when descending from a previously
> stressed state, would be:
> 
> greatest(1 + checkpoint_completion_target) * checkpoint_segments,
> wal_keep_segments) + 1 +
> 2 * checkpoint_segments + 1

It lacks a closing parenthesis. I guess the formula is:

  greatest (
(1 + checkpoint_completion_target) * checkpoint_segments,
 wal_keep_segments
  ) 
  + 1 + 2 * checkpoint_segments + 1

> This assumes logs are filled evenly over a checkpoint cycle, which is
> probably not true because there is a spike in full page writes right after
> a checkpoint starts.
> 
> But I didn't have a great deal of confidence in my analysis.

The only problem I have with this formula is that considering
checkpoint_completion_target ~ 1 and wal_keep_segments = 0, it becomes:

  4 * checkpoint_segments + 2

which violates the well-known, observed and default one:

  3 * checkpoint_segments + 1

A value above this formula means the system cannot cope with the number of
files to flush. The doc is right about that:

   If, due to a short-term peak of log output rate, there
   are more than 3 * checkpoint_segments + 1
   segment files, the unneeded segment files will be deleted

The formula in the doc is wrong when wal_keep_segments <> 0.

> The first line reflects the number of WAL that will be retained as-is,

I agree these files MUST be retained: the set of checkpoint_segments WALs
being flushed and the checkpoint_completion_target ones written in
the meantime.

> the second is the number that will be recycled for future use before starting 
> to delete them.

I disagree, because the WAL files being written actually consume recycled
WALs in the meantime.

Your formula expects written files to be newly created and recycled ones never
touched, leading to the checkpoint_segments + 1 difference between the formulas.

> My reading of the code is that wal_keep_segments is computed from the
> current end of WAL (i.e the checkpoint record), not from the checkpoint
> redo point.  If I distribute the part outside the 'greatest' into both
> branches of the 'greatest', I don't get the same answer as you do for
> either branch.

So the formula, using checkpoint_completion_target=1, should be:

  greatest (
 checkpoint_segments,
 wal_keep_segments
  ) 
  + 2 * checkpoint_segments + 1
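The two formulas in this sub-thread can be compared numerically (function names are ours, not PostgreSQL's; checkpoint_completion_target is taken as ~1 as above):

```python
# Sketch comparing the two estimates discussed above.  Jeff's version
# counts recycled segments on top of the written ones; the corrected one
# lets writers consume recycled segments, dropping one checkpoint_segments
# term (and the stray +1).

def max_wal_jeff(checkpoint_segments, wal_keep_segments, target=1.0):
    return (max((1 + target) * checkpoint_segments, wal_keep_segments)
            + 1 + 2 * checkpoint_segments + 1)

def max_wal_corrected(checkpoint_segments, wal_keep_segments):
    return max(checkpoint_segments, wal_keep_segments) + 2 * checkpoint_segments + 1

# With wal_keep_segments = 0 the corrected formula matches the well-known
# 3 * checkpoint_segments + 1 bound, while the other gives 4 * cs + 2.
assert max_wal_corrected(3, 0) == 3 * 3 + 1   # 10
assert max_wal_jeff(3, 0) == 4 * 3 + 2        # 14
```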

Please find attached to this email a documentation patch for 9.4 using this
formula.

Regards,
-- 
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index d2392b2..1ed780b 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -546,17 +546,18 @@
 
   
There will always be at least one WAL segment file, and will normally
-   not be more than (2 + checkpoint_completion_target) * checkpoint_segments + 1
-   or checkpoint_segments +  + 1
-   files.  Each segment file is normally 16 MB (though this size can be
+   not be more than greatest(checkpoint_segments, )
+   + (1 + checkpoint_completion_target) * checkpoint_segments
+   + 1 files.  Each segment file is normally 16 MB (though this size can be
altered when building the server).  You can use this to estimate space
requirements for WAL.
Ordinarily, when old log segment files are no longer needed, they
are recycled (that is, renamed to become future segments in the numbered
sequence). If, due to a short-term peak of log output rate, there
-   are more than 3 * checkpoint_segments + 1
-   segment files, the unneeded segment files will be deleted instead
-   of recycled until the system gets back under this limit.
+   are more than greatest(checkpoint_segments, wal_keep_segments)
+   + 2 * checkpoint_segments + 1 segment files, the
+   unneeded segment files will be deleted instead of recycled until the system
+   gets back under this limit.
   
 
   



Re: [HACKERS] Parallel Seq Scan

2015-04-01 Thread Amit Kapila
On Mon, Mar 30, 2015 at 8:35 PM, Robert Haas  wrote:
>
> On Wed, Mar 25, 2015 at 6:27 AM, Amit Kapila 
wrote:
> > Apart from that I have moved the Initialization of dsm segement from
> > InitNode phase to ExecFunnel() (on first execution) as per suggestion
> > from Robert.  The main idea is that as it creates large shared memory
> > segment, so do the work when it is really required.
>
> So, suppose we have a plan like this:
>
> Append
> -> Funnel
>   -> Partial Seq Scan
> -> Funnel
>   -> Partial Seq Scan
> (repeated many times)
>
> In earlier versions of this patch, that was chewing up lots of DSM
> segments.  But it seems to me, on further reflection, that it should
> never use more than one at a time.  The first funnel node should
> initialize its workers and then when it finishes, all those workers
> should get shut down cleanly and the DSM destroyed before the next
> scan is initialized.
>
> Obviously we could do better here: if we put the Funnel on top of the
> Append instead of underneath it, we could avoid shutting down and
> restarting workers for every child node.  But even without that, I'm
> hoping it's no longer the case that this uses more than one DSM at a
> time.  If that's not the case, we should see if we can't fix that.
>

Currently it doesn't behave the way you are expecting: it destroys the DSM and
performs a clean shutdown of the workers (DestroyParallelContext()) at the
time of ExecEndFunnel(), which in this case happens when we finish
execution of the Append node.

One way to change it is to do the cleanup for the parallel context when we
fetch the last tuple from the Funnel node (in ExecFunnel), as at that point
we are sure that we don't need the workers or the DSM anymore.  Does that
sound reasonable to you?
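The proposed lifecycle change can be sketched as follows (class and method names are illustrative, not the executor's actual API):

```python
# Illustrative sketch, not executor code: tear down the parallel context
# (workers + DSM) as soon as the Funnel yields its last tuple, rather than
# waiting for ExecEndFunnel at the end of the enclosing Append.

class Funnel:
    def __init__(self, tuples):
        self._tuples = iter(tuples)
        self.dsm_live = False            # stands in for the DSM + worker state

    def exec_funnel(self):
        if not self.dsm_live:
            self.dsm_live = True         # lazily create DSM and launch workers
        for t in self._tuples:
            yield t
        # Last tuple fetched: workers and DSM are no longer needed.
        self.destroy_parallel_context()

    def destroy_parallel_context(self):
        self.dsm_live = False

# An Append over several Funnels now holds at most one DSM at a time.
funnels = [Funnel([1, 2]), Funnel([3])]
out = [t for f in funnels for t in f.exec_funnel()]
assert out == [1, 2, 3]
assert not any(f.dsm_live for f in funnels)
```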


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Parallel Seq Scan

2015-04-01 Thread Amit Kapila
On Mon, Mar 30, 2015 at 8:31 PM, Robert Haas  wrote:
>
> On Wed, Mar 18, 2015 at 11:43 PM, Amit Kapila 
wrote:
> >> I think I figured out the problem.  That fix only helps in the case
> >> where the postmaster noticed the new registration previously but
> >> didn't start the worker, and then later notices the termination.
> >> What's much more likely to happen is that the worker is started and
> >> terminated so quickly that both happen before we create a
> >> RegisteredBgWorker for it.  The attached patch fixes that case, too.
> >
> > The patch fixes the problem, and now for Rescan we don't need to wait
> > for workers to finish.
>
> I realized that there is a problem with this.  If an error occurs in
> one of the workers just as we're deciding to kill them all, then the
> error won't be reported.

We send SIGTERM to terminate the worker, so
if the error occurs before the signal is received, it should be
sent to the master backend.  Am I missing something here?

> Also, the new code to propagate
> XactLastRecEnd won't work right, either.

As we generate a FATAL error on termination of a worker
(bgworker_die()), won't it be handled in the AbortTransaction path
by the code below from the parallel-mode patch?

+ if (!parallel)
+     latestXid = RecordTransactionAbort(false);
+ else
+ {
+     latestXid = InvalidTransactionId;
+
+     /*
+      * Since the parallel master won't get our value of XactLastRecEnd in this
+      * case, we nudge WAL-writer ourselves in this case.  See related comments
+      * in RecordTransactionAbort for why this matters.
+      */
+     XLogSetAsyncXactLSN(XactLastRecEnd);
+ }


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Parallel Seq Scan

2015-04-01 Thread Robert Haas
On Tue, Mar 31, 2015 at 8:53 AM, Amit Kapila  wrote:
>> It looks to me like that is an InitPlan, not a subplan.  There
>> shouldn't be any problem with a Funnel node having an InitPlan; it
>> looks to me like all of the InitPlan stuff is handled by common code
>> within the executor (grep for initPlan), so it ought to work here the
>> same as it does for anything else.  What I suspect is failing
>> (although you aren't being very clear about it here) is the passing
>> down of the parameters set by the InitPlan to the workers.
>
> It is failing because we are not passing the InitPlan itself (an InitPlan is
> nothing but a list of SubPlans), and I tried to describe in a previous
> mail [1] what we need to do to achieve that, but in short: it is not
> difficult to pass down the required parameters (like plan->initPlan or
> plannedstmt->subplans); rather, the main missing part is the handling
> of such parameters on the worker side (mainly we need to provide support
> for all plan nodes which can be passed as part of an InitPlan in readfuncs.c).
> I am not against supporting InitPlans on the worker side, but just wanted to
> say: if possible, why not leave that for the first version?

Well, if we *don't* handle it, we're going to need to insert some hack
to ensure that the planner doesn't create such plans.  And that seems
pretty unappealing.  Maybe it'll significantly compromise plan
quality, and maybe it won't, but at the least, it's ugly.

> [1]
> I have tried to evaluate what it would take us to support execution
> of subplans by parallel workers.  We need to pass the subplans
> stored in the Funnel node (initPlan) and the corresponding subplans stored
> in the planned statement (subplans), as the subplans stored in the Funnel
> node have references to subplans in the planned statement.  Next, currently
> readfuncs.c (functions to read different types of nodes) doesn't support
> reading any type of plan node, so we need to add support for reading all
> kinds of plan nodes (as a subplan can have any type of plan node) and,
> similarly, to execute any type of plan node we might need more work
> (infrastructure).

I don't think you need to do anything that complicated.  I'm not
proposing to *run* the initPlan in the workers, just to pass the
parameter values down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Seq Scan

2015-04-01 Thread Robert Haas
On Wed, Apr 1, 2015 at 6:30 AM, Amit Kapila  wrote:
> On Mon, Mar 30, 2015 at 8:35 PM, Robert Haas  wrote:
>> So, suppose we have a plan like this:
>>
>> Append
>> -> Funnel
>>   -> Partial Seq Scan
>> -> Funnel
>>   -> Partial Seq Scan
>> (repeated many times)
>>
>> In earlier versions of this patch, that was chewing up lots of DSM
>> segments.  But it seems to me, on further reflection, that it should
>> never use more than one at a time.  The first funnel node should
>> initialize its workers and then when it finishes, all those workers
>> should get shut down cleanly and the DSM destroyed before the next
>> scan is initialized.
>>
>> Obviously we could do better here: if we put the Funnel on top of the
>> Append instead of underneath it, we could avoid shutting down and
>> restarting workers for every child node.  But even without that, I'm
>> hoping it's no longer the case that this uses more than one DSM at a
>> time.  If that's not the case, we should see if we can't fix that.
>>
> Currently it doesn't behave the way you are expecting: it destroys the DSM
> and performs a clean shutdown of the workers (DestroyParallelContext()) at
> the time of ExecEndFunnel(), which in this case happens when we finish
> execution of the Append node.
>
> One way to change it is to do the cleanup for the parallel context when we
> fetch the last tuple from the Funnel node (in ExecFunnel), as at that point
> we are sure that we don't need the workers or the DSM anymore.  Does that
> sound reasonable to you?

Yeah, I think that's exactly what we should do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Seq Scan

2015-04-01 Thread Robert Haas
On Wed, Apr 1, 2015 at 7:30 AM, Amit Kapila  wrote:
>> > The patch fixes the problem, and now for Rescan we don't need to wait
>> > for workers to finish.
>>
>> I realized that there is a problem with this.  If an error occurs in
>> one of the workers just as we're deciding to kill them all, then the
>> error won't be reported.
>
> We send SIGTERM to terminate the worker, so
> if the error occurs before the signal is received, it should be
> sent to the master backend.  Am I missing something here?

The master only checks for messages at intervals - each
CHECK_FOR_INTERRUPTS(), basically.  So when the master terminates the
workers, any errors generated after the last check for messages will
be lost.

>> Also, the new code to propagate
>> XactLastRecEnd won't work right, either.
>
> As we generate a FATAL error on termination of a worker
> (bgworker_die()), won't it be handled in the AbortTransaction path
> by the code below from the parallel-mode patch?

That will asynchronously flush the WAL, but if the master goes on to
commit, we'll wait synchronously for WAL flush, and possibly sync rep.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] TABLESAMPLE patch

2015-04-01 Thread Tomas Vondra

On 03/15/15 16:21, Petr Jelinek wrote:


I also did all the other adjustments we talked about up-thread and
rebased against current master (there was conflict with 31eae6028).



Hi,

I did a review of the version submitted on 03/15 today, and only found a 
few minor issues:


1) The documentation of the pg_tablesample_method catalog is missing
   documentation of the 'tsmpagemode' column, which was added later.

2) The patch modifies a comment in transformTableEntry() in parse_clause.c
   in a way that made sense before part of the code was moved to a separate
   function. I suggest reverting the comment changes and instead adding
   the comment to transformTableSampleEntry().

3) The "shared" parts of the block sampler in sampling.c (e.g. in
   BlockSampler_Next) reference Vitter's algorithm (both the code and
   comments) which is a bit awkward as the only part that uses it is
   analyze.c. The other samplers using this code (system / bernoulli)
   don't use Vitter's algorithm.

   I don't think it's possible to separate this piece of code, though.
   It simply has to be in there somewhere, so I'd recommend adding here
   a simple comment explaining that it's needed because of analyze.c.

Otherwise I think the patch is OK. I'll wait for Petr to fix these 
issues, and then mark it as ready for committer.


What do you think, Amit? (BTW you should probably add yourself as 
reviewer in the CF app, as you've provided a lot of feedback here.)


--
Tomas Vondra   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] TABLESAMPLE patch

2015-04-01 Thread Amit Kapila
On Wed, Apr 1, 2015 at 6:31 PM, Tomas Vondra 
wrote:
>
> On 03/15/15 16:21, Petr Jelinek wrote:
>>
>>
>> I also did all the other adjustments we talked about up-thread and
>> rebased against current master (there was conflict with 31eae6028).
>>
>
> Hi,
>
> I did a review of the version submitted on 03/15 today, and only found a
few minor issues:
>
> 1) The documentation of the pg_tablesample_method catalog is missing
>documentation of the 'tsmpagemode' column, which was added later.
>
> 2) transformTableEntry() in parse_clause modifies the comment, in a way
>that made sense before part of the code was moved to a separate
>function. I suggest to revert the comment changes, and instead add
>the comment to transformTableSampleEntry()
>
> 3) The "shared" parts of the block sampler in sampling.c (e.g. in
>BlockSampler_Next) reference Vitter's algorithm (both the code and
>comments) which is a bit awkward as the only part that uses it is
>analyze.c. The other samplers using this code (system / bernoulli)
>don't use Vitter's algorithm.
>
>I don't think it's possible to separate this piece of code, though.
>It simply has to be in there somewhere, so I'd recommend adding here
>a simple comment explaining that it's needed because of analyze.c.
>
> Otherwise I think the patch is OK. I'll wait for Petr to fix these
issues, and then mark it as ready for committer.
>
> What do you think, Amit? (BTW you should probably add yourself as
reviewer in the CF app, as you've provided a lot of feedback here.)
>

I am still not sure whether it is okay to move REPEATABLE from the
unreserved to the other-keyword category.  In fact, last weekend I spent some
time looking into the exact reason for the shift/reduce errors and tried some
approaches, but didn't find a way to avoid them.  I am planning to
spend some more time on this, probably in the next few days; if I still
cannot find a way, I will share my findings and then
re-review the changes Petr made in the last version.  I think the patch is in
good shape now overall, although I haven't looked into the DDL-support part of
the patch, which I thought could be done as a separate patch as well.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Parallel Seq Scan

2015-04-01 Thread Amit Kapila
On Wed, Apr 1, 2015 at 6:03 PM, Robert Haas  wrote:
>
> On Tue, Mar 31, 2015 at 8:53 AM, Amit Kapila 
wrote:
> >> It looks to me like that is an InitPlan, not a subplan.  There
> >> shouldn't be any problem with a Funnel node having an InitPlan; it
> >> looks to me like all of the InitPlan stuff is handled by common code
> >> within the executor (grep for initPlan), so it ought to work here the
> >> same as it does for anything else.  What I suspect is failing
> >> (although you aren't being very clear about it here) is the passing
> >> down of the parameters set by the InitPlan to the workers.
> >
> > It is failing because we are not passing the InitPlan itself (an InitPlan
> > is nothing but a list of SubPlans), and I tried to describe in a previous
> > mail [1] what we need to do to achieve that, but in short: it is not
> > difficult to pass down the required parameters (like plan->initPlan or
> > plannedstmt->subplans); rather, the main missing part is the handling
> > of such parameters on the worker side (mainly we need to provide support
> > for all plan nodes which can be passed as part of an InitPlan in
> > readfuncs.c).
> > I am not against supporting InitPlans on the worker side, but just wanted
> > to say: if possible, why not leave that for the first version?
>
> Well, if we *don't* handle it, we're going to need to insert some hack
> to ensure that the planner doesn't create such plans.  And that seems
> pretty unappealing.  Maybe it'll significantly compromise plan
> quality, and maybe it won't, but at the least, it's ugly.
>

I also think changing anything in the planner related to this is not a
good idea, but what about detecting this during execution (in
ExecFunnel) and then just running the plan locally (in the master backend)?

> > [1]
> > I have tried to evaluate what it would take us to support execution
> > of subplans by parallel workers.  We need to pass the subplans
> > stored in the Funnel node (initPlan) and the corresponding subplans stored
> > in the planned statement (subplans), as the subplans stored in the Funnel
> > node have references to subplans in the planned statement.  Next, currently
> > readfuncs.c (functions to read different types of nodes) doesn't support
> > reading any type of plan node, so we need to add support for reading all
> > kinds of plan nodes (as a subplan can have any type of plan node) and,
> > similarly, to execute any type of plan node we might need more work
> > (infrastructure).
>
> I don't think you need to do anything that complicated.  I'm not
> proposing to *run* the initPlan in the workers, just to pass the
> parameter values down.
>

Sorry, but I am not able to understand how it will help if just the
parameters (if I understand correctly, you mean Bitmapset *extParam and
Bitmapset *allParam in the Plan structure) are passed to the workers; I
think they are already getting passed.  Only the initPlan and the related
SubPlan in the planned statement are not passed, and the reason is that
ss_finalize_plan() attaches the initPlan to the top node (which in this case
is the Funnel node and not the PartialSeqScan).

By any chance, do you mean that we run part of the statement in the
workers and then run the initPlan in the master backend?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Parallel Seq Scan

2015-04-01 Thread Robert Haas
On Wed, Apr 1, 2015 at 10:28 AM, Amit Kapila  wrote:
>> Well, if we *don't* handle it, we're going to need to insert some hack
>> to ensure that the planner doesn't create such plans.  And that seems
>> pretty unappealing.  Maybe it'll significantly compromise plan
>> quality, and maybe it won't, but at the least, it's ugly.
>
> I also think changing anything in the planner related to this is not a
> good idea, but what about detecting this during execution (in
> ExecFunnel) and then just running the plan locally (in the master backend)?

That seems like an even bigger hack; we want to minimize the number of
cases where we create a parallel plan and then don't go parallel.
Doing that in the hopefully-rare case where we manage to blow out the
DSM segments seems OK, but doing it every time a plan of a certain
type gets created doesn't seem very appealing to me.

>> > [1]
>> > I have tried to evaluate what it would take us to support execution
>> > of subplans by parallel workers.  We need to pass the subplans
>> > stored in the Funnel node (initPlan) and the corresponding subplans stored
>> > in the planned statement (subplans), as the subplans stored in the Funnel
>> > node have references to subplans in the planned statement.  Next,
>> > currently readfuncs.c (functions to read different types of nodes) doesn't
>> > support reading any type of plan node, so we need to add support for
>> > reading all kinds of plan nodes (as a subplan can have any type of plan
>> > node) and, similarly, to execute any type of plan node we might need more
>> > work (infrastructure).
>>
>> I don't think you need to do anything that complicated.  I'm not
>> proposing to *run* the initPlan in the workers, just to pass the
>> parameter values down.
>
> Sorry, but I am not able to understand how it will help if just the
> parameters (if I understand correctly, you mean Bitmapset *extParam and
> Bitmapset *allParam in the Plan structure) are passed to the workers; I
> think they are already getting passed.  Only the initPlan and the related
> SubPlan in the planned statement are not passed, and the reason is that
> ss_finalize_plan() attaches the initPlan to the top node (which in this case
> is the Funnel node and not the PartialSeqScan).
>
> By any chance, do you mean that we run part of the statement in the
> workers and then run the initPlan in the master backend?

If I'm not confused, it would be the other way around.  We would run
the initPlan in the master backend *first* and then the rest in the
workers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] How about to have relnamespace and relrole?

2015-04-01 Thread Robert Haas
On Sat, Mar 28, 2015 at 6:59 PM, Andrew Dunstan  wrote:
> I have just claimed this as committer in the CF, but on reviewing the emails
> it looks like there is disagreement about the need for it at all, especially
> from Tom and Robert.
>
> I confess I have often wanted regnamespace, particularly, and occasionally
> regrole, simply as a convenience. But I'm not going to commit it against
> substantial opposition.
>
> Do we need a vote?

Seeing this committed wouldn't be my first choice, but I can live
with it, as long as the patch is good technically.  As useful as these
sorts of types are, I'm not convinced that notational convenience for
people steeped in backend internals is a sufficiently-good reason to
clutter the system with more built-in types.  But I'm probably in the
minority on that; and it's clearly a judgement call anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] How about to have relnamespace and relrole?

2015-04-01 Thread Andrew Dunstan


On 03/31/2015 04:48 PM, Tom Lane wrote:


In view of that, you could certainly argue that if someone's bothered
to make a patch to add a new regFOO type, it's useful enough.  I don't
want to end up with thirtysomething of them, but we don't seem to be
trending in that direction.

Or in short, objection withdrawn.  (As to the concept, anyway.
I've not read the patch...)

The only possible issue I see on reading the patches is that these are 
treated differently for dependencies than other regFOO types. Rather 
than create a dependency if a value is used in a default expression, an 
error is raised if one is found. Are we OK with that?


cheers

andrew




Re: [HACKERS] Combining Aggregates

2015-04-01 Thread Robert Haas
On Mon, Mar 30, 2015 at 1:28 AM, Michael Paquier
 wrote:
>> I've been thinking of bumping this patch to the June commitfest as the
>> patch only exists to provide the basic infrastructure for things like
>> parallel aggregation, aggregate before join, and perhaps auto updating
>> materialised views.
>>
>> It seems unlikely that any of those things will happen for 9.5.
>
> Yeah, I guess so...
>
>> Does anybody object to me moving this to June's commitfest?
>
> Not from my side FWIW. I think it actually makes sense.

+1.  I'd like to devote some time to looking at this, but I don't have
the time right now.  The chances that we can do something useful with
it in 9.6 seem good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Bug fix for missing years in make_date()

2015-04-01 Thread David Fetter
On Tue, Mar 31, 2015 at 12:22:39PM -0700, David Fetter wrote:
> On Tue, Mar 31, 2015 at 12:58:27PM -0400, Tom Lane wrote:
> > David Fetter  writes:
> > > On Tue, Mar 31, 2015 at 10:34:45AM -0400, Adam Brightwell wrote:
> > >> Previously, zero was rejected, what does it do now? I'm sure it
> > >> represents 0 AD/CE, however, is that important enough to note
> > >> given that it was not allowed previously?
> > 
> > > Now, it's supposed to take 0 as 1 BCE, -1 as 2 BCE, etc.  There
> > > should probably be tests for that.
> > 
> > Surely that is *not* what we want?
> 
> It is if we're to be consistent with the rest of the system, to wit:
> 
> SELECT to_date('','');
> to_date
> ---
>  0001-01-01 BC
> (1 row)

Looking at this further, I think that it should be consistent with
cast rather than with to_date().

SELECT date '0000-01-01';
ERROR:  date/time field value out of range: "0000-01-01"
LINE 1: SELECT date '0000-01-01';

Please find attached the next revision of the patch.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c
index d66f640..bbf10e9 100644
--- a/src/backend/utils/adt/date.c
+++ b/src/backend/utils/adt/date.c
@@ -248,10 +248,15 @@ make_date(PG_FUNCTION_ARGS)
tm.tm_mday = PG_GETARG_INT32(2);
 
/*
-* Note: we'll reject zero or negative year values.  Perhaps negatives
-* should be allowed to represent BC years?
+* Note: Negative years are taken to be BCE.
 */
-   dterr = ValidateDate(DTK_DATE_M, false, false, false, &tm);
+   if (tm.tm_year >= 0)
+   dterr = ValidateDate(DTK_DATE_M, false, false, false, &tm);
+   else
+   {
+   tm.tm_year = -1 * tm.tm_year;
+   dterr = ValidateDate(DTK_DATE_M, false, false, true, &tm);
+   }
 
if (dterr != 0)
ereport(ERROR,
diff --git a/src/test/regress/expected/date.out 
b/src/test/regress/expected/date.out
index 8923f60..953e262 100644
--- a/src/test/regress/expected/date.out
+++ b/src/test/regress/expected/date.out
@@ -1191,6 +1191,12 @@ select make_date(2013, 7, 15);
  07-15-2013
 (1 row)
 
+select make_date(-44, 3, 15);  -- Negative years are BCE
+   make_date   
+---
+ 03-15-0044 BC
+(1 row)
+
 select make_time(8, 20, 0.0);
  make_time 
 ---
@@ -1198,14 +1204,14 @@ select make_time(8, 20, 0.0);
 (1 row)
 
 -- should fail
+select make_date(0, 1, 1);
+ERROR:  date field value out of range: 0-01-01
 select make_date(2013, 2, 30);
 ERROR:  date field value out of range: 2013-02-30
 select make_date(2013, 13, 1);
 ERROR:  date field value out of range: 2013-13-01
 select make_date(2013, 11, -1);
 ERROR:  date field value out of range: 2013-11--1
-select make_date(-44, 3, 15);  -- perhaps we should allow this sometime?
-ERROR:  date field value out of range: -44-03-15
 select make_time(10, 55, 100.1);
 ERROR:  time field value out of range: 10:55:100.1
 select make_time(24, 0, 2.1);
diff --git a/src/test/regress/sql/date.sql b/src/test/regress/sql/date.sql
index a62e92a..bd9a845 100644
--- a/src/test/regress/sql/date.sql
+++ b/src/test/regress/sql/date.sql
@@ -279,11 +279,12 @@ select isfinite('infinity'::date), 
isfinite('-infinity'::date), isfinite('today'
 
 -- test constructors
 select make_date(2013, 7, 15);
+select make_date(-44, 3, 15);  -- Negative years are BCE
 select make_time(8, 20, 0.0);
 -- should fail
+select make_date(0, 1, 1);
 select make_date(2013, 2, 30);
 select make_date(2013, 13, 1);
 select make_date(2013, 11, -1);
-select make_date(-44, 3, 15);  -- perhaps we should allow this sometime?
 select make_time(10, 55, 100.1);
 select make_time(24, 0, 2.1);



Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Robert Haas
On Tue, Mar 31, 2015 at 8:49 AM, Aliouii Ali  wrote:
> hi all,
> back in
> 2011(http://www.postgresql.org/message-id/1305138588.8811.3.ca...@vanquo.pezone.net),
> a question much the same as this one was asked.
> The answer was:
>
> I think they're very useful on views, but I
> couldn't think of a use-case for having them on tables. ISTM that
> anything an INSTEAD OF trigger on a table could do, could equally well
> be done in a BEFORE trigger.
> No, not really; there is a use-case: in a partitioned table (instead of
> defining a BEFORE trigger on the master table that returns NULL, as the docs
> state, it would be good to have an INSTEAD OF trigger that returns NEW),
> so that queries like INSERT/UPDATE ... RETURNING will be handy and gain
> some performance; otherwise we have to do an insert and a select to get
> the same job done.

I don't see how this helps.  The problem with partitioning is that you
need a way to redirect the INSERT to another table, and there's no
built-in way to do that, so you have to simulate it somehow.  That
issue seems largely separate from how the CREATE TRIGGER command is
spelled.  Maybe I'm missing something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Zero-padding and zero-masking fixes for to_char(float)

2015-04-01 Thread Bruce Momjian
On Tue, Mar 24, 2015 at 09:47:56AM -0400, Noah Misch wrote:
> On Sun, Mar 22, 2015 at 10:53:12PM -0400, Bruce Momjian wrote:
> > On Sun, Mar 22, 2015 at 04:41:19PM -0400, Noah Misch wrote:
> > > On Wed, Mar 18, 2015 at 05:52:44PM -0400, Bruce Momjian wrote:
> > > > This "junk" digit zeroing matches the Oracle behavior:
> > > > 
> > > > SELECT to_char(1.123456789123456789123456789d, 
> > > > '9.99999999999999999') as x from dual;
> > > > --
> > > > 1.12345678912345680
> > > > 
> > > > Our output with the patch would be:
> > > > 
> > > > SELECT to_char(float8 '1.123456789123456789123456789', 
> > > > '9.99999999999999999');
> > > > --
> > > > 1.12345678912345000
> 
> > > These outputs show Oracle treating 17 digits as significant while 
> > > PostgreSQL
> > > treats 15 digits as significant.  Should we match Oracle in this respect 
> > > while
> > > we're breaking compatibility anyway?  I tend to think yes.
> > 
> > Uh, I am hesistant to adjust our precision to match Oracle as I don't
> > know what they are using internally.
> 
> http://sqlfiddle.com/#!4/8b4cf/5 strongly implies 17 significant digits for
> float8 and 9 digits for float4.

I was able to get proper rounding with the attached patch.

test=> SELECT to_char(float8 '1.123456789123456789123456789', 
'9.99999999999999999');
 to_char
--
  1.12345678912346000
(1 row)

Handling rounding for exponent-format values turned out to be simple. 
What has me stuck now is how to do rounding in the non-decimal part of
the number, e.g.

test=> SELECT to_char(float4 
'15.9123456789123456789000',
  repeat('9', 50) || '.' || repeat('9', 50));
to_char


  
1555753984.00
(1 row)

This should return something like 16.000... (per Oracle
output at the URL above, float4 has 6 significant digits on my compiler)
but I can't seem to figure how to get printf() to round non-fractional
parts.  I am afraid the only solution is to use printf's %e format and
place the decimal point myself.

The fact I still don't have a complete solution suggests this is 9.6
material but I still want to work on it so it is ready.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
new file mode 100644
index 40a353f..2b5a440
*** a/src/backend/utils/adt/formatting.c
--- b/src/backend/utils/adt/formatting.c
***
*** 113,125 
  #define DCH_MAX_ITEM_SIZ	   12		/* max localized day name		*/
  #define NUM_MAX_ITEM_SIZ		8		/* roman number (RN has 15 chars)	*/
  
- /* --
-  * More is in float.c
-  * --
-  */
- #define MAXFLOATWIDTH	60
- #define MAXDOUBLEWIDTH	500
- 
  
  /* --
   * Format parser structs
--- 113,118 
*** static DCHCacheEntry *DCH_cache_getnew(c
*** 989,994 
--- 982,988 
  static NUMCacheEntry *NUM_cache_search(char *str);
  static NUMCacheEntry *NUM_cache_getnew(char *str);
  static void NUM_cache_remove(NUMCacheEntry *ent);
+ static char *add_zero_padding(char *num_str, int pad_digits);
  
  
  /* --
*** do { \
*** 5016,5021 
--- 5010,5056 
  	SET_VARSIZE(result, len + VARHDRSZ); \
  } while (0)
  
+ /*
+  * add_zero_padding
+  *
+  * Some sprintf() implementations have a 512-digit precision limit, and we
+  * need sprintf() to round to the internal precision, so this function adds
+  * zero padding between the mantissa and exponent of an exponential-format
+  * number, or after the supplied string for non-exponent strings.
+  */
+ static char *
+ add_zero_padding(char *num_str, int pad_digits)
+ {
+ 	/* one for decimal point, one for trailing null byte */
+ 	char *out = palloc(strlen(num_str) + pad_digits + 1 + 1), *out_p = out;
+ 	char *num_str_p = num_str;
+ 	bool found_decimal = false;
+ 
+ 	/* copy the number before 'e', or the entire string if no 'e' */
+ 	while (*num_str_p && *num_str_p != 'e' && *num_str_p != 'E')
+ 	{
+ 		if (*num_str_p == '.')
+ 			found_decimal = true;
+ 		*(out_p++) = *(num_str_p++);
+ 	}
+ 
+ 	if (!found_decimal)
+ 		*(out_p++) = '.';
+ 
+ 	/* add zero pad digits */
+ 	while (pad_digits-- > 0)
+ 		*(out_p++) = '0';
+ 
+ 	/* copy 'e' and everything after */
+ 	while (*num_str_p)
+ 		*(out_p++) = *(num_str_p++);
+ 
+ 	*(out_p++) = '\0';
+ 
+ 	pfree(num_str);
+ 	return out;	
+ }
+ 
  /* ---

Re: [HACKERS] TABLESAMPLE patch

2015-04-01 Thread Robert Haas
On Wed, Apr 1, 2015 at 9:49 AM, Amit Kapila  wrote:
> I am still not sure whether it is okay to move REPEATABLE from
> unreserved to other category.  In-fact last weekend I have spent some
> time to see the exact reason for shift/reduce errors and tried some ways
> but didn't find a way to get away with the same.  Now I am planning to
> spend some more time on the same probably in next few days and then
> still if I cannot find a way, I will share my findings and then once
> re-review
> the changes made by Petr in last version.  I think overall the patch is in
> good shape now although I haven't looked into DDL support part of the
> patch which I thought could be done in a separate patch as well.

That seems like a legitimate concern.  We usually try not to make
keywords more reserved in PostgreSQL than they are in the SQL
standard, and REPEATABLE is apparently non-reserved there:

http://www.postgresql.org/docs/devel/static/sql-keywords-appendix.html

This also makes "method" an unreserved keyword, which I'm not wild
about either.  Adding new keyword doesn't cost *much*, but is this
SQL-mandated syntax or something we created?  If the latter, can we
find something to call it that doesn't require new keywords?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Andres Freund
On 2015-04-01 11:40:13 -0400, Robert Haas wrote:
> On Tue, Mar 31, 2015 at 8:49 AM, Aliouii Ali  wrote:
> I don't see how this helps.  The problem with partitioning is that you
> need a way to redirect the INSERT to another table, and there's no
> built-in way to do that, so you have to simulate it somehow.  That
> issue seems largely separate from how the CREATE TRIGGER command is
> spelled.  Maybe I'm missing something.

Without INSTEAD OF you can't, to my knowledge, return a valid tuple from
the top level table without also inserting into it. Returning NULL after
redirecting the tuple into a child table will break RETURNING; not
returning NULL will insert the tuple in the top level table.

So the only way to do redirection that doesn't break RETURNING without
rules is to insert the tuple in the child in the BEFORE trigger return
NEW and delete the top level table row in an AFTER trigger. That sucks.
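The workaround Andres describes can be sketched as follows; the table and column names (parent, parent_child, id) are hypothetical, purely for illustration:

```sql
-- BEFORE trigger: insert the row into the child, then return NEW so that
-- RETURNING sees a tuple -- at the cost of also inserting into the parent.
CREATE FUNCTION redirect_to_child() RETURNS trigger AS $$
BEGIN
    INSERT INTO parent_child VALUES (NEW.*);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- AFTER trigger: delete the duplicate row from the parent itself.
CREATE FUNCTION delete_from_parent() RETURNS trigger AS $$
BEGIN
    DELETE FROM ONLY parent WHERE id = NEW.id;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER redirect BEFORE INSERT ON parent
    FOR EACH ROW EXECUTE PROCEDURE redirect_to_child();
CREATE TRIGGER cleanup AFTER INSERT ON parent
    FOR EACH ROW EXECUTE PROCEDURE delete_from_parent();
```

An INSTEAD OF trigger on the table would collapse this to a single trigger that inserts into the child and returns NEW without the parent insert or the cleanup pass.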

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: [pgsql-pkg-debian] Updated libpq5 packages cause connection errors on postgresql 9.2

2015-04-01 Thread Bruce Momjian
On Sat, Dec 20, 2014 at 12:27:05PM +0100, Magnus Hagander wrote:
> I haven't seen a specific number, it might depend on exactly which cipher is
> negotiated. See for example http://openssl.6102.n7.nabble.com/
> What-is-the-reason-for-error-quot-SSL-negotiation-failed-error-04075070-rsa-routines-RSA-sign-digest-td43953.html
> 
> All references I have found say at least 1024 is safe and 512 is broken, but
> there are points in between that apparently work in some cases only.
> 
> I think if we say "use 1024 bits or more" we err on the safe side. 

Did we ever decide on this?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Bogus WAL segments archived after promotion

2015-04-01 Thread Bruce Momjian
On Fri, Dec 19, 2014 at 10:26:34PM +0200, Heikki Linnakangas wrote:
> On 12/19/2014 02:55 PM, Heikki Linnakangas wrote:
> >I'm thinking that we should add a step to promotion, where we scan
> >pg_xlog for any segments higher than the timeline switch point, and
> >remove them, or mark them with .done so that they are not archived.
> >There might be some real WAL that was streamed from the primary, but not
> >yet applied, but such WAL is of no interest to that server anyway, after
> >it's been promoted. It's a bit disconcerting to zap WAL that's valid,
> >even if doesn't belong to the current server's timeline history, because
> >as a general rule it's good to avoid destroying evidence that might be
> >useful in debugging. There isn't much difference between removing them
> >immediately and marking them as .done, though, because they will
> >eventually be removed/recycled anyway if they're marked as .done.
> 
> This is what I came up with. This patch removes the suspect segments
> at timeline switch. The alternative of creating .done files for them
> would preserve more evidence for debugging, but OTOH it would also
> be very confusing to have valid-looking WAL segments in pg_xlog,
> with .done files, that in fact contain garbage.
> 
> The patch is a bit longer than it otherwise would be, because I
> moved the code to remove a single file from RemoveOldXlogFiles() to
> a new function. I think that makes it more readable in any case,
> simply because it was so deeply indented in RemoveOldXlogFiles.

Where are we on this?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] How about to have relnamespace and relrole?

2015-04-01 Thread Tom Lane
Andrew Dunstan  writes:
> The only possible issue I see on reading the patches is that these are 
> treated differently for dependencies than other regFOO types. Rather 
> than create a dependency if a value is used in a default expression, an 
> error is raised if one is found. Are we OK with that?

Why would it be a good idea to act differently from the others?

regards, tom lane




Re: [HACKERS] TABLESAMPLE patch

2015-04-01 Thread Petr Jelinek

On 01/04/15 17:52, Robert Haas wrote:

On Wed, Apr 1, 2015 at 9:49 AM, Amit Kapila  wrote:

I am still not sure whether it is okay to move REPEATABLE from
unreserved to other category.  In-fact last weekend I have spent some
time to see the exact reason for shift/reduce errors and tried some ways
but didn't find a way to get away with the same.  Now I am planning to
spend some more time on the same probably in next few days and then
still if I cannot find a way, I will share my findings and then once
re-review
the changes made by Petr in last version.  I think overall the patch is in
good shape now although I haven't looked into DDL support part of the
patch which I thought could be done in a separate patch as well.


That seems like a legitimate concern.  We usually try not to make
keywords more reserved in PostgreSQL than they are in the SQL
standard, and REPEATABLE is apparently non-reserved there:

http://www.postgresql.org/docs/devel/static/sql-keywords-appendix.html

This also makes "method" an unreserved keyword, which I'm not wild
about either.  Adding new keyword doesn't cost *much*, but is this
SQL-mandated syntax or something we created?  If the latter, can we
find something to call it that doesn't require new keywords?



REPEATABLE is mandated by standard. I did try for quite some time to 
make it unreserved but was not successful (I can only make it unreserved 
if I make it mandatory, but that's not a solution). In fact I haven't even 
been able to find out what it actually conflicts with...


METHOD is something I added. I guess we could find a way to name this 
differently if we really tried. The reason why I picked METHOD was that 
I already added the same unreserved keyword in the sequence AMs patch 
and in that one any other name does not really make sense.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Move inet_gist to right place in pg_amproc

2015-04-01 Thread Heikki Linnakangas

On 03/31/2015 11:00 PM, Andreas Karlsson wrote:

Hi,

The pg_amproc functions for inet_gist were accidentally added under the
gin heading. I have attached a patch which moves them to the gist
heading where they belong.


Thanks, moved.

- Heikki





Re: [HACKERS] TABLESAMPLE patch

2015-04-01 Thread Robert Haas
On Wed, Apr 1, 2015 at 12:15 PM, Petr Jelinek  wrote:
> REPEATABLE is mandated by standard. I did try for quite some time to make it
> unreserved but was not successful (I can only make it unreserved if I make
> it mandatory but that's not a solution). I haven't been in fact even able to
> find out what it actually conflicts with...

Yeah, that can be hard to figure out.  Did you run bison with -v and
poke around in gram.output?

> METHOD is something I added. I guess we could find a way to name this
> differently if we really tried. The reason why I picked METHOD was that I
> already added the same unreserved keyword in the sequence AMs patch and in
> that one any other name does not really make sense.

I see.  Well, I guess if we've got more than one use for it it's not so bad.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] How about to have relnamespace and relrole?

2015-04-01 Thread Andrew Dunstan


On 04/01/2015 12:13 PM, Tom Lane wrote:

Andrew Dunstan  writes:

The only possible issue I see on reading the patches is that these are
treated differently for dependencies than other regFOO types. Rather
than create a dependency if a value is used in a default expression, an
error is raised if one is found. Are we OK with that?

Why would it be a good idea to act differently from the others?




I have no idea.

It was mentioned here 
 
but nobody seems to have commented. I'm not sure why it was done like 
this. Adding the dependencies seems to be no harder than raising the 
exception. I think we can kick this back to the author to fix.


cheers

andrew








Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Robert Haas
On Wed, Apr 1, 2015 at 12:04 PM, Andres Freund  wrote:
> On 2015-04-01 11:40:13 -0400, Robert Haas wrote:
>> I don't see how this helps.  The problem with partitioning is that you
>> need a way to redirect the INSERT to another table, and there's no
>> built-in way to do that, so you have to simulate it somehow.  That
>> issue seems largely separate from how the CREATE TRIGGER command is
>> spelled.  Maybe I'm missing something.
>
> Without INSTEAD OF you can't, to my knowledge, return a valid tuple from
> the top level table without also inserting into it. Returning NULL after
> redirecting the tuple into a child table will break RETURNING; not
> returning NULL will insert the tuple in the top level table.
>
> So the only way to do redirection that doesn't break RETURNING without
> rules is to insert the tuple in the child in the BEFORE trigger return
> NEW and delete the top level table row in an AFTER trigger. That sucks.

So, the idea is that INSTEAD OF would behave like BEFORE but the tuple
it returns wouldn't actually be inserted?  That wasn't clear to me
from the OP, but I guess it would be a reasonable way to go.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Maximum number of WAL files in the pg_xlog directory

2015-04-01 Thread Fujii Masao
On Wed, Apr 1, 2015 at 7:00 PM, Jehan-Guillaume de Rorthais
 wrote:
> Hi,
>
> As I'm writing a doc patch for 9.4 -> 9.0, I'll discuss below on this formula
> as this is the last one accepted by most of you.
>
> On Mon, 3 Nov 2014 12:39:26 -0800
> Jeff Janes  wrote:
>
>> It looked to me that the formula, when descending from a previously
>> stressed state, would be:
>>
>> greatest(1 + checkpoint_completion_target) * checkpoint_segments,
>> wal_keep_segments) + 1 +
>> 2 * checkpoint_segments + 1
>
> It lacks a closing parenthesis. I guess the formula is:
>
>   greatest (
> (1 + checkpoint_completion_target) * checkpoint_segments,
>  wal_keep_segments
>   )
>   + 1 + 2 * checkpoint_segments + 1
>
>> This assumes logs are filled evenly over a checkpoint cycle, which is
>> probably not true because there is a spike in full page writes right after
>> a checkpoint starts.
>>
>> But I didn't have a great deal of confidence in my analysis.
>
> The only problem I have with this formula is that considering
> checkpoint_completion_target ~ 1 and wal_keep_segments = 0, it becomes:
>
>   4 * checkpoint_segments + 2
>
> Which violate the well known, observed and default one:
>
>   3 * checkpoint_segments + 1
>
> A value above this formula means the system cannot cope with the number of
> files to flush. The doc is right about that:
>
>If, due to a short-term peak of log output rate, there
>are more than 3 * checkpoint_segments + 1
>segment files, the unneeded segment files will be deleted
>
> The formula is wrong in the doc when wal_keep_segments <> 0
>
>> The first line reflects the number of WAL that will be retained as-is,
>
> I agree that these files MUST be retained: the set of checkpoint_segments WALs
> being flushed and the checkpoint_completion_target ones written in
> the meantime.
>
>> the second is the number that will be recycled for future use before starting
>> to delete them.
>
> disagree, because the WAL files being written are actually consuming recycled
> WALs in the meantime.
>
> Your formula expects written files to be created and recycled ones never
> touched, leading to this checkpoint_segment + 1 difference between formulas.
>
>> My reading of the code is that wal_keep_segments is computed from the
>> current end of WAL (i.e the checkpoint record), not from the checkpoint
>> redo point.  If I distribute the part outside the 'greatest' into both
>> branches of the 'greatest', I don't get the same answer as you do for
>> either branch.
>
> So The formula, using checkpoint_completion_target=1, should be:
>
>   greatest (
>  checkpoint_segments,
>  wal_keep_segments
>   )
>   + 2 * checkpoint_segments + 1

No. Please imagine how many WAL files can exist at the end of checkpoint.
At the end of checkpoint, we have to leave all the WAL files which were
generated since the starting point of previous checkpoint for the future
crash recovery. The number of these WAL files is

(1 + checkpoint_completion_target) * checkpoint_segments

or

wal_keep_segments

In addition to these files, at the end of checkpoint, old WAL files which were
generated before the starting point of previous checkpoint are recycled.
The number of these WAL files is at most

2 * checkpoint_segments + 1

Note that *usually* there are not that many WAL files at the end of a
checkpoint. But this can happen after a peak of WAL logging generates
too many WAL files.

So the sum of those is the right formula, i.e.,

(3 + checkpoint_completion_target) * checkpoint_segments + 1

   or

wal_keep_segments + 2 * checkpoint_segments + 1

If checkpoint_completion_target is 1 and wal_keep_segments is 0,
it can become 4 * checkpoint_segments + 1.

Regards,

-- 
Fujii Masao




Re: [HACKERS] How about to have relnamespace and relrole?

2015-04-01 Thread Tom Lane
Andrew Dunstan  writes:
> On 04/01/2015 12:13 PM, Tom Lane wrote:
>> Andrew Dunstan  writes:
>>> The only possible issue I see on reading the patches is that these are
>>> treated differently for dependencies than other regFOO types. Rather
>>> than create a dependency if a value is used in a default expression, an
>>> error is raised if one is found. Are we OK with that?

>> Why would it be a good idea to act differently from the others?

> I have no idea.

> It was mentioned here 
> 
>  
> but nobody seems to have commented. I'm not sure why it was done like 
> this. Adding the dependencies seems to be no harder than raising the 
> exception. I think we can kick this back to the author to fix.

After a bit more thought it occurred to me that a dependency on a role
would need to be a shared dependency, and the existing infrastructure
for recordDependencyOnExpr() wouldn't support that.

I'm not sure that it's worth adding the complexity to allow shared
dependencies along with normal ones there.  This might be a reason
to reject the regrole part of the patch, as requiring more complexity
than it's worth.

But in any case I cannot see a reason to treat regnamespace differently
from the existing types on this point.

regards, tom lane




Re: [HACKERS] [ADMIN] Permission select pg_stat_replication

2015-04-01 Thread Stephen Frost
Denish, all,

Moved over to -hackers to discuss specifics around addressing this.

* Denish Patel (den...@omniti.com) wrote:
> Fair enough but they should be able to achieve their goal to avoid granting
> SUPER to monitoring user. They have to tweak the grant/revoke as desired.

That's correct, but the problem we currently have is that none of the
monitoring systems will "just work" with this approach because they're
hard-coded to what we actually provide by default.

I'm hoping to fix that problem by having a C function which returns
everything and then locking *that* down and only allowing it to be
executed by a superuser by default.  Administrators would then be able
to grant access to those functions and the monitoring systems could be
built on top of using those functions, and they'd work just fine if the
monitoring user is a superuser but they'd also work if the monitoring
user *isn't*, provided the correct GRANTs are done.
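As an illustration of the intended GRANT model (the function and role names here are hypothetical, not actual catalog objects):

```sql
-- A C function that returns everything, executable only by superusers
-- by default:
REVOKE EXECUTE ON FUNCTION pg_stat_get_activity_full() FROM PUBLIC;

-- An administrator can then open it up to a non-superuser monitoring
-- role, and monitoring tools built on it work the same either way:
GRANT EXECUTE ON FUNCTION pg_stat_get_activity_full() TO monitoring;
```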

What's getting tricky about all of this is making our existing views
work against the C function but without depending on it to handle the
filtering and instead doing it in SQL.  That sounds simple until you
look at the filtering we're actually doing and that functions defined in
views run as the querier of the view, which means you have to have a
security definer function involved to query the protected function
underneath, and that function has to be callable by everyone, but it has
to return values based on the permissions of the querier, which it
doesn't know because we don't provide that information anywhere.

I'm working on solving that problem by having a function which can
return the value of "who called this function", a capability a *lot* of
people have asked me about in the past, but that's pretty darn grotty
given how GUCs work (we have "show_hook"s, but those operate against
whatever the C variable is currently set to, and I really don't want to
be playing with setting/resetting that just for this..).  Rather than
try to re-engineer how GUCs work, I'm looking at doing this specifically
for this specific case of role information.  I don't hear a lot of
people asking for the value of other GUCs (except perhaps search_path,
but that's easier since we don't have a show_hook for that..).

Would certainly appreciate any thoughts from others on all of the above.

Thanks!

Stephen




Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Andres Freund
On 2015-04-01 12:46:05 -0400, Robert Haas wrote:
> On Wed, Apr 1, 2015 at 12:04 PM, Andres Freund  wrote:
> > On 2015-04-01 11:40:13 -0400, Robert Haas wrote:
> > Without INSTEAD OF you can't, to my knowledge, return a valid tuple from
> > the top level table without also inserting into it. Returning NULL after
> > redirecting the tuple into a child table will break RETURNING; not
> > returning NULL will insert the tuple in the top level table.
> >
> > So the only way to do redirection that doesn't break RETURNING without
> > rules is to insert the tuple in the child in the BEFORE trigger return
> > NEW and delete the top level table row in an AFTER trigger. That sucks.
> 
> So, the idea is that INSTEAD OF would behave like BEFORE but the tuple
> it returns wouldn't actually be inserted?  That wasn't clear to me
> from the OP, but I guess it would be a reasonable way to go.

I'm not sure what the OP intended, but to me that's pretty much the only
reasonable definition of INSTEAD OF for tables that I can think of.

Greetings,

Andres Freund




Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Tom Lane
Andres Freund  writes:
> On 2015-04-01 12:46:05 -0400, Robert Haas wrote:
>> So, the idea is that INSTEAD OF would behave like BEFORE but the tuple
>> it returns wouldn't actually be inserted?  That wasn't clear to me
>> from the OP, but I guess it would be a reasonable way to go.

> I'm not sure what the OP intended, but to me that's pretty much the only
> reasonable definition of INSTEAD OF for tables that I can think of.

If you have such a trigger, it's impossible to insert any rows, which
means the table doesn't need storage, which means it may as well be a
view, no?  So this still seems to me like a wart not a useful feature.
I think it would create confusion because a table with such a trigger
would act so much unlike other tables.

regards, tom lane




Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Andres Freund
On 2015-04-01 13:15:26 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-04-01 12:46:05 -0400, Robert Haas wrote:
> >> So, the idea is that INSTEAD OF would behave like BEFORE but the tuple
> >> it returns wouldn't actually be inserted?  That wasn't clear to me
> >> from the OP, but I guess it would be a reasonable way to go.
> 
> > I'm not sure what the OP intended, but to me that's pretty much the only
> > reasonable definition of INSTEAD OF for tables that I can think of.
> 
> If you have such a trigger, it's impossible to insert any rows, which
> means the table doesn't need storage, which means it may as well be a
> view, no?  So this still seems to me like a wart not a useful feature.
> I think it would create confusion because a table with such a trigger
> would act so much unlike other tables.

For one you can't easily add partitions to a view (and
constraint_exclusion = partition IIRC doesn't work if you use UNION ALL),
for another there's WHEN for triggers that should allow dealing with
that.

Greetings,

Andres Freund




Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Alvaro Herrera
Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-04-01 12:46:05 -0400, Robert Haas wrote:
> >> So, the idea is that INSTEAD OF would behave like BEFORE but the tuple
> >> it returns wouldn't actually be inserted?  That wasn't clear to me
> >> from the OP, but I guess it would be a reasonable way to go.
> 
> > I'm not sure what the OP intended, but to me that's pretty much the only
> > reasonable definition of INSTEAD OF for tables that I can think of.
> 
> If you have such a trigger, it's impossible to insert any rows, which
> means the table doesn't need storage, which means it may as well be a
> view, no?

The interesting difference, as per upthread, is that you can have child
tables (partitions) and don't need a defining query but instead have a
defined set of named and typed columns.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Tom Lane
Andres Freund  writes:
> On 2015-04-01 13:15:26 -0400, Tom Lane wrote:
>> If you have such a trigger, it's impossible to insert any rows, which
>> means the table doesn't need storage, which means it may as well be a
>> view, no?  So this still seems to me like a wart not a useful feature.
>> I think it would create confusion because a table with such a trigger
>> would act so much unlike other tables.

> For one you can't easily add partitions to a view (and
> constraint_exclusion = partition IIRC doesn't work if you use UNION ALL),
> for another there's WHEN for triggers that should allow dealing with
> that.

WHEN won't help; if there are any INSTEAD OF triggers, no insert will
happen, whether the triggers actually fire or not.

As for partitioning, you could do this:

create table parent(...);
create table child(...) inherits(parent); -- repeat as needed
create view v as select * from parent;
attach INSTEAD OF triggers to v

Now the application deals only with v, and thinks that's the real
table.
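
Filled out with invented column and partition names, that sketch might
look like this (a minimal illustration, not a complete partitioning
setup):

```sql
CREATE TABLE parent (id int NOT NULL, payload text);
CREATE TABLE child_a (CHECK (id < 1000)) INHERITS (parent);
CREATE VIEW v AS SELECT * FROM parent;

CREATE FUNCTION v_insert_trig() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO child_a VALUES (NEW.*);  -- route into a partition
    RETURN NEW;                          -- keeps RETURNING working
END $$;

CREATE TRIGGER v_insert INSTEAD OF INSERT ON v
    FOR EACH ROW EXECUTE PROCEDURE v_insert_trig();
```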

regards, tom lane




Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Andres Freund
On 2015-04-01 13:29:33 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-04-01 13:15:26 -0400, Tom Lane wrote:
> >> If you have such a trigger, it's impossible to insert any rows, which
> >> means the table doesn't need storage, which means it may as well be a
> >> view, no?  So this still seems to me like a wart not a useful feature.
> >> I think it would create confusion because a table with such a trigger
> >> would act so much unlike other tables.
> 
> > For one you can't easily add partitions to a view (and
> > constraint_exclusion = partition IIRC doesn't work if you use UNION ALL),
> > for another there's WHEN for triggers that should allow dealing with
> > that.
> 
> WHEN won't help; if there are any INSTEAD OF triggers, no insert will
> happen, whether the triggers actually fire or not.

Well, right now it doesn't work at all. It seems pretty reasonable to
define things so that the insert happens normally if there's no matching
INSTEAD OF trigger.

> As for partitioning, you could do this:
> 
> create table parent(...);
> create table child(...) inherits(parent); -- repeat as needed
> create view v as select * from parent;
> attach INSTEAD OF triggers to v
> 
> Now the application deals only with v, and thinks that's the real
> table.

Sure, but that's just making things unnecessarily hard. That then
requires also defining UPDATE/DELETE INSTEAD triggers which otherwise
would just work.

Greetings,

Andres Freund




Re: [HACKERS] Auditing extension for PostgreSQL (Take 2)

2015-04-01 Thread David Steele
Hi Sawada,

On 3/25/15 9:24 AM, David Steele wrote:
> On 3/25/15 7:46 AM, Sawada Masahiko wrote:
>> 2.
>> I got ERROR when executing function uses cursor.
>>
>> 1) create empty table (hoge table)
>> 2) create test function as follows.
>>
>> create function test() returns int as $$
>> declare
>> cur1 cursor for select * from hoge;
>> tmp int;
>> begin
>> open cur1;
>> fetch cur1 into tmp;
>>return tmp;
>> end$$
>> language plpgsql ;
>>
>> 3) execute test function (got ERROR)
>> =# select test();
>> LOG:  AUDIT: SESSION,6,1,READ,SELECT,,,selecT test();
>> LOG:  AUDIT: SESSION,6,2,FUNCTION,EXECUTE,FUNCTION,public.test,selecT test();
>> LOG:  AUDIT: SESSION,6,3,READ,SELECT,,,select * from hoge
>> CONTEXT:  PL/pgSQL function test() line 6 at OPEN
>> ERROR:  pg_audit stack is already empty
>> STATEMENT:  selecT test();
>>
>> It seems like that the item in stack is already freed by deleting
>> pg_audit memory context (in MemoryContextDelete()),
>> before calling stack_pop in dropping of top-level Portal.

This has been fixed and I have attached a new patch.  I've seen this
with cursors before where the parent MemoryContext is freed before
control is returned to ProcessUtility.  I think that's strange behavior
but there's not a lot I can do about it.

The code I put in to deal with this situation was not quite robust
enough so I had to harden it a bit more.

Let me know if you see any other issues.

Thanks,
-- 
- David Steele
da...@pgmasters.net
diff --git a/contrib/Makefile b/contrib/Makefile
index 195d447..d8e75f4 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -29,6 +29,7 @@ SUBDIRS = \
pageinspect \
passwordcheck   \
pg_archivecleanup \
+   pg_audit\
pg_buffercache  \
pg_freespacemap \
pg_prewarm  \
diff --git a/contrib/pg_audit/Makefile b/contrib/pg_audit/Makefile
new file mode 100644
index 000..32bc6d9
--- /dev/null
+++ b/contrib/pg_audit/Makefile
@@ -0,0 +1,20 @@
+# pg_audit/Makefile
+
+MODULE = pg_audit
+MODULE_big = pg_audit
+OBJS = pg_audit.o
+
+EXTENSION = pg_audit
+
+DATA = pg_audit--1.0.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_audit
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_audit/pg_audit--1.0.0.sql 
b/contrib/pg_audit/pg_audit--1.0.0.sql
new file mode 100644
index 000..9d9ee83
--- /dev/null
+++ b/contrib/pg_audit/pg_audit--1.0.0.sql
@@ -0,0 +1,22 @@
+/* pg_audit/pg_audit--1.0.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_audit" to load this file.\quit
+
+CREATE FUNCTION pg_audit_ddl_command_end()
+   RETURNS event_trigger
+   LANGUAGE C
+   AS 'MODULE_PATHNAME', 'pg_audit_ddl_command_end';
+
+CREATE EVENT TRIGGER pg_audit_ddl_command_end
+   ON ddl_command_end
+   EXECUTE PROCEDURE pg_audit_ddl_command_end();
+
+CREATE FUNCTION pg_audit_sql_drop()
+   RETURNS event_trigger
+   LANGUAGE C
+   AS 'MODULE_PATHNAME', 'pg_audit_sql_drop';
+
+CREATE EVENT TRIGGER pg_audit_sql_drop
+   ON sql_drop
+   EXECUTE PROCEDURE pg_audit_sql_drop();
diff --git a/contrib/pg_audit/pg_audit.c b/contrib/pg_audit/pg_audit.c
new file mode 100644
index 000..b34df5a
--- /dev/null
+++ b/contrib/pg_audit/pg_audit.c
@@ -0,0 +1,1688 @@
+/*--
+ * pg_audit.c
+ *
+ * An auditing extension for PostgreSQL. Improves on standard statement logging
+ * by adding more logging classes, object level logging, and providing
+ * fully-qualified object names for all DML and many DDL statements (See
+ * pg_audit.sgml for details).
+ *
+ * Copyright (c) 2014-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *   contrib/pg_audit/pg_audit.c
+ *--
+ */
+#include "postgres.h"
+
+#include "access/htup_details.h"
+#include "access/sysattr.h"
+#include "access/xact.h"
+#include "catalog/catalog.h"
+#include "catalog/objectaccess.h"
+#include "catalog/pg_class.h"
+#include "catalog/namespace.h"
+#include "commands/dbcommands.h"
+#include "catalog/pg_proc.h"
+#include "commands/event_trigger.h"
+#include "executor/executor.h"
+#include "executor/spi.h"
+#include "miscadmin.h"
+#include "libpq/auth.h"
+#include "nodes/nodes.h"
+#include "tcop/utility.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/lsyscache.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/syscache.h"
+#include "utils/timestamp.h"
+
+PG_MODULE_MAGIC;
+
+void _PG_init(void);
+
+/*
+ * Event trigger prototypes
+ */
+Datum pg_audit_ddl_command_end(PG_FUNCTION_ARGS);
+Datum pg_audit_

Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Tom Lane
Andres Freund  writes:
> On 2015-04-01 13:29:33 -0400, Tom Lane wrote:
>> WHEN won't help; if there are any INSTEAD OF triggers, no insert will
>> happen, whether the triggers actually fire or not.

> Well, right now it doesn't work at all. It seems pretty reasonable to
> define things so that the insert happens normally if there's no matching
> INSTEAD OF trigger.

It would absolutely *not* be reasonable for WHEN conditions for triggers
on tables to work completely differently than they do for triggers on
views.  That ship's sailed.

regards, tom lane




Re: [HACKERS] proposal: row_to_array function

2015-04-01 Thread Merlin Moncure
On Sun, Mar 29, 2015 at 1:27 PM, Tom Lane  wrote:
> Pavel Stehule  writes:
>> here is rebased patch.
>> It contains both patches - row_to_array function and foreach array support.
>
> While I don't have a problem with hstore_to_array, I don't think that
> row_to_array is a very good idea; it's basically encouraging people to
> throw away SQL datatypes altogether and imagine that everything is text.
> They've already bought into that concept if they are using hstore or
> json, so smashing elements of those containers to text is not a problem.
> But that doesn't make this version a good thing.
>
> (In any case, those who insist can get there through row_to_json, no?)

You have a point.  What does the attached do that to_json does not do,
besides completely discarding type information?  Our json api is pretty
rich and getting richer.  For better or ill, we dumped all json
support into the already stupendously bloated public namespace and so
it's always available.

merlin




Re: [HACKERS] Auditing extension for PostgreSQL (Take 2)

2015-04-01 Thread David Steele
On 3/23/15 12:40 PM, David Steele wrote:
> On 3/23/15 1:31 AM, Abhijit Menon-Sen wrote:
>> I'm experimenting with a few approaches to do this without reintroducing
>> switch statements to test every command. That will require core changes,
>> but I think we can find an acceptable arrangement. I'll post a proof of
>> concept in a few days.

Any progress on the POC?  I'm interested in trying to get the ROLE class
back in before the Commitfest winds up, but not very happy with my
current string-matching options.

>>> + * Takes an AuditEvent and, if it log_check(), writes it to the audit
>>> log.
>>
>> I don't think log_check is the most useful name, because this sentence
>> doesn't tell me what the function may do. Similarly, I would prefer to
>> have log_acl_check be renamed acl_grants_audit or similar. (These are
>> all static functions anyway, I don't think a log_ prefix is needed.)
> 
> log_check() has become somewhat vestigial at this point since it is only
> called from one place - I've been considering removing it and merging
> into log_audit_event().  For the moment I've improved the comments.

log_check() got rolled into log_audit_event().

> I like acl_grants_audit() and agree that it's a clearer name.  I'll
> incorporate that into the next version and apply the same scheme to the
> other ACL functions as well as do a general review of naming.

I ended up going with audit_on_acl, audit_on_relation, etc. and reworked
some of the other function names.

I attached the v6 patch to my previous email, or you can find it on the
CF page.

-- 
- David Steele
da...@pgmasters.net





Re: [HACKERS] How about to have relnamespace and relrole?

2015-04-01 Thread Andrew Dunstan


On 04/01/2015 12:53 PM, Tom Lane wrote:

Andrew Dunstan  writes:

On 04/01/2015 12:13 PM, Tom Lane wrote:

Andrew Dunstan  writes:

The only possible issue I see on reading the patches is that these are
treated differently for dependencies than other regFOO types. Rather
than create a dependency if a value is used in a default expression, an
error is raised if one is found. Are we OK with that?

Why would it be a good idea to act differently from the others?

I have no idea.
It was mentioned here

but nobody seems to have commented. I'm not sure why it was done like
this. Adding the dependencies seems to be no harder than raising the
exception. I think we can kick this back to the author to fix.

After a bit more thought it occurred to me that a dependency on a role
would need to be a shared dependency, and the existing infrastructure
for recordDependencyOnExpr() wouldn't support that.

I'm not sure that it's worth adding the complexity to allow shared
dependencies along with normal ones there.  This might be a reason
to reject the regrole part of the patch, as requiring more complexity
than it's worth.

But in any case I cannot see a reason to treat regnamespace differently
from the existing types on this point.




Good points.

I agree re namespace. And I also agree that shared dependency support is 
not worth the trouble, especially not just to support regrole. I'm  not 
sure that's a reason to reject regrole entirely, though. However, I also 
think there is a significantly less compelling case for it than for 
regnamespace, based on the number of times I have wanted each.


Anybody else have thoughts on this?

cheers

andrew




Re: [HACKERS] Tables cannot have INSTEAD OF triggers

2015-04-01 Thread Dean Rasheed
On 1 April 2015 at 18:37, Andres Freund  wrote:
> On 2015-04-01 13:29:33 -0400, Tom Lane wrote:
>> As for partitioning, you could do this:
>>
>> create table parent(...);
>> create table child(...) inherits(parent); -- repeat as needed
>> create view v as select * from parent;
>> attach INSTEAD OF triggers to v
>>
>> Now the application deals only with v, and thinks that's the real
>> table.
>
> Sure, but that's just making things unnecessarily hard. That then
> requires also defining UPDATE/DELETE INSTEAD triggers which otherwise
> would just work.
>

No, because as defined above the view v would be auto-updatable, so
updates and deletes on v would just do the matching update/delete on
parent.
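
To illustrate (table definition invented): a simple SELECT * view is
auto-updatable in 9.3 and later, so UPDATE and DELETE pass straight
through to the underlying table without any triggers:

```sql
CREATE TABLE parent (id int, payload text);
CREATE VIEW v AS SELECT * FROM parent;  -- simple enough to be auto-updatable

INSERT INTO parent VALUES (1, 'old');
UPDATE v SET payload = 'new' WHERE id = 1;  -- updates parent directly
DELETE FROM v WHERE id = 1;                 -- deletes from parent
```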

Regards,
Dean




Re: [HACKERS] Bug #10432 failed to re-find parent key in index

2015-04-01 Thread Heikki Linnakangas

On 03/31/2015 09:19 PM, Joshua D. Drake wrote:


On 03/31/2015 10:51 AM, Andres Freund wrote:


On 2015-03-31 10:49:06 -0700, Joshua D. Drake wrote:

On 03/31/2015 04:20 AM, Heikki Linnakangas wrote:

Perhaps we could consider it after a year or two, once 9.4 is indeed
very stable, but at that point you have to wonder if it's really worth
the trouble anymore. If someone has runs into that issue frequently, he
probably should just upgrade to 9.4.


Ouch. That is a really poor way to look at this.


Man.

Easy for you to say. You're not doing the work (which would be
significant in this case). You're not going to be blamed if the backport
breaks more things than it fixed.


I understand that. I am not picking on anyone. I am just saying that
looking at the problem this way is poor, which it is. We are saying as a
community: Your option to remove this data loss bug is to upgrade. That
is generally not how we approach things.


Hmm, I've never considered this to be a data loss bug. I guess you can 
view it that way: if you have a standby following the master, and the 
master fails so that you fail over to the standby, the standby will 
refuse to start up because of this, so you can't access the data. 
However, the table itself is OK, it's just the index that's corrupt. 
You'll need some hackery to force the system out of standby mode, but 
it's not like the data has been overwritten and lost forever.


Greg Stark suggested downgrading the error to warning during recovery 
mode, so that the error would not prevent you from starting up the 
system. That makes a lot of sense, I think we should do that in the 
back-branches.


- Heikki





Re: [HACKERS] printing table in asciidoc with psql

2015-04-01 Thread Bruce Momjian
On Tue, Mar 31, 2015 at 05:06:49PM -0400, Bruce Momjian wrote:
> Uh, you broke asciidoctor 1.5.2.   ;-)  LOL
> 
> I installed the Asciidoctor Firefox plugin:

Asciidoctor has confirmed they have a bug and hope to fix it in their
next release:


http://discuss.asciidoctor.org/Problem-with-table-heading-ending-in-a-pipe-tp2902p2916.html

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Re: [pgsql-pkg-debian] Updated libpq5 packages cause connection errors on postgresql 9.2

2015-04-01 Thread Christoph Berg
Re: Bruce Momjian 2015-04-01 <20150401160907.gj4...@momjian.us>
> On Sat, Dec 20, 2014 at 12:27:05PM +0100, Magnus Hagander wrote:
> > I haven't seen a specific number, it might depend on exactly which cipher is
> > negotiated. See for example http://openssl.6102.n7.nabble.com/
> > What-is-the-reason-for-error-quot-SSL-negotiation-failed-error-04075070-rsa-routines-RSA-sign-digest-td43953.html
> > 
> > All references I have found say at least 1024 is safe and 512 is broken, but
> > there are points in between that apparently work in some cases only.
> > 
> > I think if we say "use 1024 bits or more" we err on the safe side. 
> 
> Did we ever decide on this?

The question seems to be if we want to recommend "1024 or more" or
something more sophisticated like "use something between 512 and 1024,
but ymmv; 1024 should work in any case".  Given that more bits
should be more secure, and 1024 shouldn't pose a performance problem
for anyone, going for the short version shouldn't do any harm.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] Selectivity estimation for inet operators

2015-04-01 Thread Tom Lane
Emre Hasegeli  writes:
> [ inet-selfuncs-v14.patch ]

After further reflection I concluded that the best way to deal with the
O(N^2) runtime problem for the join selectivity function was to set a
limit on the number of statistics values we'd consider, as was discussed
awhile back IIRC.  We can easily consider just the first N values of the
MCV arrays, since those are sorted by decreasing frequency anyway.  For
the histogram arrays, taking every K'th value seems like the thing to do.

I made the limit be 1024 elements as that seemed to give an acceptable
maximum runtime (a couple hundred msec on my machine).  We could probably
reduce that if anyone feels the max runtime needs to be less.

I had to drop the idea of breaking out of the histogram loop early as that
didn't play nicely with the decimation logic, unfortunately.

Anyway, pushed.  Thanks for your perseverance on this!

regards, tom lane




Re: [HACKERS] pg_upgrade needs postmaster [sic]

2015-04-01 Thread Bruce Momjian
On Mon, Dec 22, 2014 at 06:18:35PM -0500, Bruce Momjian wrote:
> On Mon, Dec 22, 2014 at 11:48:52PM +0100, Christoph Berg wrote:
> > Hi,
> > 
> > I've played with trying to find out which minimal set of files I need
> > from the old version to make pg_upgrade work. Interestingly, this
> > includes the good old postmaster binary:
> > 
> > $ sudo -u postgres pgsql/bin/pg_upgrade -b /var/tmp/pgsql/bin/ -B 
> > /usr/lib/postgresql/9.5/bin/ -d /etc/postgresql/9.5/main -D /tmp/9.5/data
> > Finding the real data directory for the old cluster sh: 1: 
> > /var/tmp/pgsql/bin/postmaster: not found
> > 
> > Could not get data directory using "/var/tmp/pgsql/bin/postmaster" -D 
> > "/etc/postgresql/9.5/main" -C data_directory: No such file or directory
> > Failure, exiting
> > 
> > I think it should just use "postgres" there, patch attached. (If we
> > need to be compatible with postmaster-only PG versions from the last
> > century, it should try both names.)
> 
> Yes, I think you are right.  I see pg_ctl using the 'postgres' binary,
> so pg_upgrade should do the same.  I will apply this patch soon, and see
> if there are any other 'postmaster' mentions that need updating.

Patch applied.  Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Something is rotten in the state of Denmark...

2015-04-01 Thread Tom Lane
I wrote:
> Observe these recent buildfarm failures:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mule&dt=2015-03-21%2000%3A30%3A02
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=guaibasaurus&dt=2015-03-23%2004%3A17%3A01
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mule&dt=2015-03-31%2023%3A30%3A02

> Three similar-looking failures, on two different machines, in a regression
> test that has existed for less than three weeks.  Something is very wrong.

I've been able to reproduce this.  The triggering event seems to be that
the "VACUUM FULL pg_am" in vacuum.sql has to happen while another backend
is starting up.  With a ten-second delay inserted at the bottom of
PerformAuthentication(), it's trivial to hit it manually.  The reason we'd
not seen this before the rolenames.sql test was added is that none of the
other tests that run concurrently with vacuum.sql perform mid-test
reconnections, or ever have AFAIR.  So as long as they all managed to
start up before vacuum.sql got to the dangerous step, no problem.

I've not fully tracked it down, but I think that the blame falls on the
MVCC-snapshots-for-catalog-scans patch; it appears that it's trying to
read pg_am's pg_class entry with a snapshot that's too old, possibly
because it assumes that sinval signaling is alive which I think ain't so.

For even more fun, try "VACUUM FULL pg_class" instead:

psql: PANIC:  could not open critical system index 2662

regards, tom lane




Re: [HACKERS] POLA violation with \c service=

2015-04-01 Thread Alvaro Herrera
I have pushed this after some rework.  For instance, the 9.0 and 9.1
versions believed that URIs were accepted, but that stuff was introduced
in 9.2.  I changed some other minor issues -- I hope not to have broken
too many other things in the process.  Please give the whole thing a
look, preferably in all branches, and let me know if I've broken
something in some horrible way.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] POLA violation with \c service=

2015-04-01 Thread David Fetter
On Wed, Apr 01, 2015 at 08:13:02PM -0300, Alvaro Herrera wrote:
> I have pushed this after some rework.  For instance, the 9.0 and 9.1
> versions believed that URIs were accepted, but that stuff was introduced
> in 9.2.  I changed some other minor issues -- I hope not to have broken
> too many other things in the process.  Please give the whole thing a
> look, preferably in all branches, and let me know if I've broken
> something in some horrible way.

Thanks for taking the time to do this.  Will test.  I'm a little
unsure as to how regression tests involving different hosts might
work, but I'll see what I can do.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] TABLESAMPLE patch

2015-04-01 Thread Petr Jelinek

On 01/04/15 18:38, Robert Haas wrote:

On Wed, Apr 1, 2015 at 12:15 PM, Petr Jelinek  wrote:

REPEATABLE is mandated by standard. I did try for quite some time to make it
unreserved but was not successful (I can only make it unreserved if I make
it mandatory, but that's not a solution). In fact, I haven't even been able
to find out what it actually conflicts with...


Yeah, that can be hard to figure out.  Did you run bison with -v and
poke around in gram.output?



Oh, no I didn't (I didn't know gram.output would be generated). This 
helped quite a bit. Thanks.


I now found the reason, it's conflicting with alias but that's my 
mistake - alias should be before TABLESAMPLE clause as per standard and 
I put it after in parser. Now that I put it at correct place REPEATABLE 
can be unreserved keyword. This change requires making TABLESAMPLE a 
"type_func_name_keyword" but that's probably not really an issue as 
TABLESAMPLE is reserved keyword per standard.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] The return value of allocate_recordbuf()

2015-04-01 Thread Bruce Momjian
On Thu, Feb 12, 2015 at 04:02:52PM +0900, Michael Paquier wrote:
> Yes, why not use palloc_extended instead of palloc_noerror, which has been
> clearly rejected in the other thread. Now, for palloc_extended we should copy
> the flags of MemoryContextAllocExtended to fe_memutils.h and have the same
> interface between frontend and backend (note that MCXT_ALLOC_HUGE has no real
> utility for frontends). Attached is a patch to achieve that, completed with a
> second patch for the problem related to this thread. Note that in the second
> patch I have added an ERROR in logical.c after calling XLogReaderAllocate;
> this was missing.
> 
> 
> > Btw, the huge flag should be used as well as palloc only allows
> > allocation up to 1GB, and this is incompatible with ~9.4 where malloc
> > is used.
> 
> I think that flag should be used only if it's needed, not just on the
> grounds that 9.4 has no such limit.  In general, limiting allocations
> to 1GB has been good to us; for example, it prevents a corrupted TOAST
> length from causing a memory allocation large enough to crash the
> machine (yes, there are machines you can crash with a giant memory
> allocation!).  We should bypass that limit only where it is clearly
> necessary.
> 
> 
> Fine for me to error out in this code path if we try more than 1GB of
> allocation.

Where are we on this?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] The return value of allocate_recordbuf()

2015-04-01 Thread Michael Paquier
On Thu, Apr 2, 2015 at 9:10 AM, Bruce Momjian  wrote:

> Where are we on this?
>

If we want to have allocate_recordbuf error out properly on the frontend side,
we are going to need an equivalent of MemoryContextAllocExtended for
frontends, in the shape of a palloc_extended able to take control flags.
That's what the patch I sent previously proposes. And this is 9.5 material,
unless we accept that allocate_recordbuf() will fail all the time in
case of an OOM (the only code path where we don't need to fail outright on
OOM in this routine is used only with WAL_DEBUG in xlog.c). Now if we
push forward with this patch, and I think we should in order to maintain
compatibility with WAL_DEBUG in previous versions, we'll also need to add
an ERROR when an OOM occurs after XLogReaderAllocate in logical.c, and
in pg_rewind's parsexlog.c.
-- 
Michael


Re: [HACKERS] Relation extension scalability

2015-04-01 Thread Jim Nasby

On 3/30/15 10:48 PM, Amit Kapila wrote:

 > If we're able to extend based on page-level locks rather than the global
 > relation locking that we're doing now, then I'm not sure we really need
 > to adjust how big the extents are any more.  The reason for making
 > bigger extents is because of the locking problem we have now when lots
 > of backends want to extend a relation, but, if I'm following correctly,
 > that'd go away with Andres' approach.
 >

The benefit of extending in bigger chunks in the background is that a backend
would need to perform such an operation less frequently, which in itself
could be a win.


The other potential advantage (and I have to think this could be a BIG 
advantage) is extending by a large amount makes it more likely you'll 
get contiguous blocks on the storage. That's going to make a big 
difference for SeqScan speed. It'd be interesting if someone with access 
to some real systems could test that. In particular, seqscan of a 
possibly fragmented table vs one of the same size but created at once. 
For extra credit, compare to dd bs=8192 of a file of the same size as 
the overall table.


What I've seen in the real world is very, very poor SeqScan performance 
of tables that were relatively large. So bad that I had to SeqScan 8-16 
tables in parallel to max out the IO system the same way I could with a 
single dd bs=8k of a large file (in this case, something like 480MB/s). 
A single SeqScan would only do something like 30MB/s.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] POLA violation with \c service=

2015-04-01 Thread Alvaro Herrera
David Fetter wrote:
> On Wed, Apr 01, 2015 at 08:13:02PM -0300, Alvaro Herrera wrote:
> > I have pushed this after some rework.  For instance, the 9.0 and 9.1
> > versions believed that URIs were accepted, but that stuff was introduced
> > in 9.2.  I changed some other minor issues -- I hope not to have broken
> > too many other things in the process.  Please give the whole thing a
> > look, preferably in all branches, and let me know if I've broken
> > something in some horrible way.
> 
> Thanks for taking the time to do this.  Will test.  I'm a little
> unsure as to how regression tests involving different hosts might
> work, but I'll see what I can do.

Well, contrib/dblink is failing all over the place, for one thing.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pg_rewind tests

2015-04-01 Thread Michael Paquier
On Tue, Mar 31, 2015 at 4:37 PM, Michael Paquier 
wrote:

> While looking at that I noticed two additional issues:
> - In remote mode, the connection string to the promoted standby was
> incorrect when running pg_rewind, leading to connection errors
> - At least in my environment, a sleep of 1 after the standby promotion was
> not sufficient to make the tests work.
>

While working on another patch for TAP tests, I noticed that relying on
environment variables to run tests is a bad idea as well, as other tests do
not do it, so attached is a patch that refactors the tests so that they do
not use environment variables and so that it is not necessary to pass
arguments to prove.
The trick is to use sub-routines in each test and invoke each subroutine
for both 'local' and 'remote'. This changes the order the tests are run in,
but I guess that's not a big deal as long as the tests are run, and this
approach looks more solid to me as it makes pg_rewind's tests more
consistent with the rest. The log files of each test are still separated
the same way as before.
Regards,
-- 
Michael
From 1892bc680313cc5ab0044357842a04c853ce1cec Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 2 Apr 2015 10:46:47 +0900
Subject: [PATCH] Refactor TAP tests of pg_rewind

This removes the dependency on passing arguments to prove, something
incompatible with perl 5.8, the minimum requirement supported by the
in-core TAP tests.

Clean up at the same time a couple of issues:
- RewindTest.pm should use strict and warnings, fixing at the same
  time the multiple warnings reported.
- In remote mode, the connection string to the promoted standby was
  incorrect when running pg_rewind, leading to connection errors.
- a sleep of 1 after standby promotion may not be sufficient in some
  environments, leading to failures.
---
 src/bin/pg_rewind/Makefile|   5 +-
 src/bin/pg_rewind/RewindTest.pm   |  24 +---
 src/bin/pg_rewind/t/001_basic.pl  | 105 ++---
 src/bin/pg_rewind/t/002_databases.pl  |  46 +--
 src/bin/pg_rewind/t/003_extrafiles.pl | 108 +++---
 5 files changed, 159 insertions(+), 129 deletions(-)

diff --git a/src/bin/pg_rewind/Makefile b/src/bin/pg_rewind/Makefile
index efd4988..e3400f5 100644
--- a/src/bin/pg_rewind/Makefile
+++ b/src/bin/pg_rewind/Makefile
@@ -47,6 +47,5 @@ clean distclean maintainer-clean:
 	rm -f pg_rewind$(X) $(OBJS) xlogreader.c
 	rm -rf tmp_check regress_log
 
-check: all
-	$(prove_check) :: local
-	$(prove_check) :: remote
+check:
+	$(prove_check)
diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm
index 0f8f4ca..7a20e79 100644
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@ -29,6 +29,9 @@ package RewindTest;
 # master and standby servers. The data directories are also available
 # in paths $test_master_datadir and $test_standby_datadir
 
+use strict;
+use warnings;
+
 use TestLib;
 use Test::More;
 
@@ -58,8 +61,8 @@ our @EXPORT = qw(
 
 # Adjust these paths for your environment
 my $testroot = "./tmp_check";
-$test_master_datadir="$testroot/data_master";
-$test_standby_datadir="$testroot/data_standby";
+our $test_master_datadir="$testroot/data_master";
+our $test_standby_datadir="$testroot/data_standby";
 
 mkdir $testroot;
 
@@ -73,8 +76,8 @@ my $port_standby=$port_master + 1;
 my $log_path;
 my $tempdir_short;
 
-$connstr_master="port=$port_master";
-$connstr_standby="port=$port_standby";
+my $connstr_master="port=$port_master";
+my $connstr_standby="port=$port_standby";
 
 $ENV{PGDATABASE} = "postgres";
 
@@ -127,7 +130,8 @@ sub append_to_file
 
 sub init_rewind_test
 {
-	($testname, $test_mode) = @_;
+	my $testname = shift;
+	my $test_mode = shift;
 
 	$log_path="regress_log/pg_rewind_log_${testname}_${test_mode}";
 
@@ -195,11 +199,13 @@ sub promote_standby
 	# Now promote slave and insert some new data on master, this will put
 	# the master out-of-sync with the standby.
 	system_or_bail("pg_ctl -w -D $test_standby_datadir promote >>$log_path 2>&1");
-	sleep 1;
+	sleep 2;
 }
 
 sub run_pg_rewind
 {
+	my $test_mode = shift;
+
 	# Stop the master and be ready to perform the rewind
 	system_or_bail("pg_ctl -w -D $test_master_datadir stop -m fast >>$log_path 2>&1");
 
@@ -212,7 +218,7 @@ sub run_pg_rewind
 	# overwritten during the rewind.
 	copy("$test_master_datadir/postgresql.conf", "$testroot/master-postgresql.conf.tmp");
 	# Now run pg_rewind
-	if ($test_mode == "local")
+	if ($test_mode eq "local")
 	{
 		# Do rewind using a local pgdata as source
 		# Stop the master and be ready to perform the rewind
@@ -225,12 +231,12 @@ sub run_pg_rewind
 '>>', $log_path, '2>&1');
 		ok ($result, 'pg_rewind local');
 	}
-	elsif ($test_mode == "remote")
+	elsif ($test_mode eq "remote")
 	{
 		# Do rewind using a remote connection as source
 		my $result =
 			run(['./pg_rewind',
- "--source-server=\"port=$port_standby dbname=postgres\"",
+ "--source-server

Re: [HACKERS] POLA violation with \c service=

2015-04-01 Thread Michael Paquier
On Thu, Apr 2, 2015 at 9:38 AM, Alvaro Herrera  wrote:
>
> David Fetter wrote:
> > On Wed, Apr 01, 2015 at 08:13:02PM -0300, Alvaro Herrera wrote:
> > > I have pushed this after some rework.  For instance, the 9.0 and 9.1
> > > versions believed that URIs were accepted, but that stuff was introduced
> > > in 9.2.  I changed some other minor issues -- I hope not to have broken
> > > too many other things in the process.  Please give the whole thing a
> > > look, preferably in all branches, and let me know if I've broken
> > > something in some horrible way.
> >
> > Thanks for taking the time to do this.  Will test.  I'm a little
> > unsure as to how regression tests involving different hosts might
> > work, but I'll see what I can do.
>
> Well, contrib/dblink is failing all over the place, for one thing.


OS X and Windows builds are broken as well; libpq is missing dependencies
on connstrings.c. A patch is here FWIW:
http://www.postgresql.org/message-id/cab7npqto1fn7inxaps2exdy4wxbncwnooaweptt-jvorli3...@mail.gmail.com
-- 
Michael




Re: [HACKERS] Sloppy SSPI error reporting code

2015-04-01 Thread Bruce Momjian
On Sat, Jan 10, 2015 at 02:53:13PM -0500, Tom Lane wrote:
> While looking at fe-auth.c I noticed quite a few places that weren't
> bothering to make error messages localizable (ie, missing libpq_gettext
> calls), and/or were failing to add a trailing newline as expected in
> libpq error messages.  Perhaps these are intentional but I doubt it.
> Most of the instances seemed to be SSPI-related.
> 
> I have no intention of fixing these myself, but whoever committed that
> code should take a second look.

I looked through that file and only found two cases;  patch attached.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
new file mode 100644
index 8927df4..c267f72
*** a/src/interfaces/libpq/fe-auth.c
--- b/src/interfaces/libpq/fe-auth.c
*** pg_SSPI_continue(PGconn *conn)
*** 333,339 
  			 * authentication. Keep check in case it shows up with other
  			 * authentication methods later.
  			 */
! 			printfPQExpBuffer(&conn->errorMessage, "SSPI returned invalid number of output buffers\n");
  			return STATUS_ERROR;
  		}
  
--- 333,339 
  			 * authentication. Keep check in case it shows up with other
  			 * authentication methods later.
  			 */
! 			printfPQExpBuffer(&conn->errorMessage, libpq_gettext("SSPI returned invalid number of output buffers\n"));
  			return STATUS_ERROR;
  		}
  
*** pg_fe_sendauth(AuthRequest areq, PGconn
*** 691,697 
  			if (pg_password_sendauth(conn, conn->pgpass, areq) != STATUS_OK)
  			{
  printfPQExpBuffer(&conn->errorMessage,
! 	 "fe_sendauth: error sending password authentication\n");
  return STATUS_ERROR;
  			}
  			break;
--- 691,697 
  			if (pg_password_sendauth(conn, conn->pgpass, areq) != STATUS_OK)
  			{
  printfPQExpBuffer(&conn->errorMessage,
! 	 libpq_gettext("fe_sendauth: error sending password authentication\n"));
  return STATUS_ERROR;
  			}
  			break;



Re: [HACKERS] POLA violation with \c service=

2015-04-01 Thread David Fetter
On Thu, Apr 02, 2015 at 11:46:53AM +0900, Michael Paquier wrote:
> On Thu, Apr 2, 2015 at 9:38 AM, Alvaro Herrera  
> wrote:
> >
> > David Fetter wrote:
> > > On Wed, Apr 01, 2015 at 08:13:02PM -0300, Alvaro Herrera wrote:
> > > > I have pushed this after some rework.  For instance, the 9.0
> > > > and 9.1 versions believed that URIs were accepted, but that
> > > > stuff was introduced in 9.2.  I changed some other minor
> > > > issues -- I hope not to have broken too many other things in
> > > > the process.  Please give the whole thing a look, preferably
> > > > in all branches, and let me know if I've broken something in
> > > > some horrible way.
> > >
> > > Thanks for taking the time to do this.  Will test.  I'm a little
> > > unsure as to how regression tests involving different hosts
> > > might work, but I'll see what I can do.
> >
> > Well, contrib/dblink is failing all over the place, for one thing.
> 
> 
> OS X and Windows builds are broken as well; libpq is missing
> dependencies on connstrings.c. A patch is here FWIW:
> http://www.postgresql.org/message-id/cab7npqto1fn7inxaps2exdy4wxbncwnooaweptt-jvorli3...@mail.gmail.com

I haven't checked yet, but could this be because people aren't using
--enable-depend with ./configure ?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Relation extension scalability

2015-04-01 Thread Amit Langote
On 02-04-2015 AM 09:24, Jim Nasby wrote:
> The other potential advantage (and I have to think this could be a BIG
> advantage) is extending by a large amount makes it more likely you'll get
> contiguous blocks on the storage. That's going to make a big difference for
> SeqScan speed. It'd be interesting if someone with access to some real systems
> could test that. In particular, seqscan of a possibly fragmented table vs one
> of the same size but created at once. For extra credit, compare to dd bs=8192
> of a file of the same size as the overall table.
> 

Orthogonal to the topic of the thread, but this comment made me recall a
proposal from a couple of years ago [0] to add (posix_)fallocate to
mdextend(). I wonder if it would help this case?

Amit

[0]
http://www.postgresql.org/message-id/cadupchw1pomsunonmdvawltq-a3x_a3zqmusjhs4rcexipg...@mail.gmail.com






Re: [HACKERS] Relation extension scalability

2015-04-01 Thread Stephen Frost
* Amit Langote (langote_amit...@lab.ntt.co.jp) wrote:
> On 02-04-2015 AM 09:24, Jim Nasby wrote:
> > The other potential advantage (and I have to think this could be a BIG
> > advantage) is extending by a large amount makes it more likely you'll get
> > contiguous blocks on the storage. That's going to make a big difference for
> > SeqScan speed. It'd be interesting if someone with access to some real 
> > systems
> > could test that. In particular, seqscan of a possibly fragmented table vs 
> > one
> > of the same size but created at once. For extra credit, compare to dd 
> > bs=8192
> > of a file of the same size as the overall table.
> 
> Orthogonal to the topic of the thread, but this comment made me recall a
> proposal from a couple of years ago [0] to add (posix_)fallocate to
> mdextend(). I wonder if it would help this case?

As I recall, it didn't, and further, modern filesystems are pretty good
about avoiding fragmentation anyway.

I'm not saying Jim's completely off-base with this idea, I'm just not
sure that it'll really buy us much.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] authentication_timeout ineffective for replication connections

2015-04-01 Thread Bruce Momjian
On Tue, Jan 13, 2015 at 03:29:04PM +0100, Andres Freund wrote:
> Hi,
> 
> I just noticed that authentication_timeout is ineffective for
> replication=true type connections. That's because walsender doesn't
> register a SIGINT handler and authentication_timeout relies on having
> one.
> 
> There's no problem with reading the initial startup packet
> (ProcessStartupPacket/BackendInitialize) because we use a separate
> handler there. But once that's done, before finishing authentication,
> WalSndSignals() will have set SIGINT's handler to SIG_IGN.
> 
> Demo python program attached. You'll only see the problem if the
> authentication method requires a password/addititional packets.
> 
> I think we could fix this by simply mapping SIGINT to die() instead
> SIG_IGN.

What is the status on this?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Additional role attributes && superuser review

2015-04-01 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > REVOKE'ing access *without* removing the permissions checks would defeat
> > the intent of these changes, which is to allow an administrator to grant
> > the ability for a certain set of users to cancel and/or terminate
> > backends started by other users, without also granting those users
> > superuser rights.
> 
> I see: we have two different use-cases and no way for GRANT/REVOKE
> to manage both cases using permissions on a single object.  Carry
> on then.

Alright, after going about this three or four different ways, I've
settled on the approach proposed in the attached patch.  Most of it is
pretty straight-forward: drop the hard-coded check in the function
itself and REVOKE EXECUTE on the function from PUBLIC.  The interesting
bits are those pieces which are more complex than the "all-or-nothing"
cases.

This adds a few new SQL-level functions which return unfiltered results,
if you're allowed to call them based on the GRANT system (and EXECUTE
privileges for them are REVOKE'd from PUBLIC, of course).  These are:
pg_stat_get_activity_all, pg_stat_get_wal_senders_all, and
pg_signal_backend (which is similar but not the same -- that one allows you to
send signals to other backends as a regular user, if you are allowed to
call the function and the other backend wasn't started by a superuser).

There are two new views added, which simply sit on top of the new
functions: pg_stat_activity_all and pg_stat_replication_all.

Minimizing the amount of code duplication wasn't too hard in the
pg_stat_get_wal_senders case, as it was already returning a tuplestore,
so I simply moved most of the logic into a "helper" function which handles
populating the tuplestore; each call path is then relatively small and
mostly boilerplate.  pg_stat_get_activity required a bit more in the way
of changes, but they were essentially to modify it to return a tuplestore
like pg_stat_get_wal_senders, and then add the extra function and
boilerplate for the non-filtered path.

I had considered (and spent a good bit of time implementing...) a number
of alternatives, mostly around trying to do more at the SQL level when
it came to filtering, but that got very ugly very quickly.  There's no
simple way to find out "what was the effective role of the caller of
this function" from inside of a security definer function (which would
be required for the filtering, as it would need to be able to call the
function underneath).  This is necessary, of course, because functions
in views run as the caller of the view, not as the view owner (that's
useful in some cases, less useful in others).  I looked at exposing the
information about the effective role which was calling a function, but
that got pretty ugly too.  The GUC system has a stack, but we don't
actually update the 'role' GUC when we are in security definer
functions.  There's the notion of an "outer" role ID, but that doesn't
account for intermediate security definer functions and so, while it's
fine for what it does, it wasn't really helpful in this case.

I do still think it'd be nice to provide that information and perhaps we
can do so with fmgr_security_definer(), but it's beyond what's really
needed here and I don't think it'd end up being particularly cleaner to
do it all in SQL now that I've refactored pg_stat_get_activity.

I'd further like to see the ability to declare, either on a function call
by function call basis (inside a view definition) or even at the view
level (as that would allow me to build up multiple views with different
parameters), "who to run this function as", where the options would be
"view owner" or "view querier", very similar to our SECURITY DEFINER vs.
SECURITY INVOKER options for functions today.  This is interesting
because, more and more, we're building functions which are actually
returning data sets, not individual values, and we want to filter those
sets sometimes and not other times.  In any case, those are really
thoughts for the future and get away from what this is all about, which
is reducing the need for monitoring and/or "regular" admins to have the
"superuser" bit when they don't really need it.

Clearly, further testing and documentation is required and I'll be
getting to that over the next couple of days, but it's pretty darn late
and I'm currently getting libpq undefined reference errors, probably
because I need to set LD_LIBRARY_PATH, but I'm just too tired to now. :)

Thoughts?

Thanks!

Stephen
diff --git a/contrib/test_decoding/expected/permissions.out b/contrib/test_decoding/expected/permissions.out
new file mode 100644
index 212fd1d..538ebdc
*** a/contrib/test_decoding/expected/permissions.out
--- b/contrib/test_decoding/expected/permissions.out
*** SET synchronous_commit = on;
*** 4,9 
--- 4,16 
  CREATE ROLE lr_normal;
  CREATE ROLE lr_superuser SUPERUSER;
  CREATE ROLE lr_replication REPLICATION;
+ GRANT EXECUTE ON FUN

Re: [HACKERS] Sloppy SSPI error reporting code

2015-04-01 Thread Noah Misch
On Wed, Apr 01, 2015 at 10:49:01PM -0400, Bruce Momjian wrote:
> On Sat, Jan 10, 2015 at 02:53:13PM -0500, Tom Lane wrote:
> > While looking at fe-auth.c I noticed quite a few places that weren't
> > bothering to make error messages localizable (ie, missing libpq_gettext
> > calls), and/or were failing to add a trailing newline as expected in
> > libpq error messages.  Perhaps these are intentional but I doubt it.
> > Most of the instances seemed to be SSPI-related.
> > 
> > I have no intention of fixing these myself, but whoever committed that
> > code should take a second look.
> 
> I looked through that file and only found two cases;  patch attached.

Tom mentioned newline omissions, which you'll find in pg_SSPI_error().

> ! printfPQExpBuffer(&conn->errorMessage, 
> libpq_gettext("SSPI returned invalid number of output buffers\n"));
> !  libpq_gettext("fe_sendauth: error 
> sending password authentication\n"));

The first message is too internals-focused for a translation to be useful.  If
it were in the backend, we would use elog() and not translate.  The second is
a non-actionable message painting over a wide range of specific failures;
translating it would just put lipstick on the pig.




Re: [HACKERS] Parallel Seq Scan

2015-04-01 Thread Amit Kapila
On Wed, Apr 1, 2015 at 8:18 PM, Robert Haas  wrote:
>
> >> I don't think you need to do anything that complicated.  I'm not
> >> proposing to *run* the initPlan in the workers, just to pass the
> >> parameter values down.
> >
> > Sorry, but I am not able to understand how it will help if just
parameters
> > (If I understand correctly you want to say about Bitmapset  *extParam;
> > Bitmapset  *allParam; in Plan structure) are passed to workers and I
> > think they are already getting passed only initPlan and related Subplan
> > in planned statement is not passed and the reason is that
ss_finalize_plan()
> > attaches initPlan to top node (which in this case is Funnel node and not
> > PartialSeqScan)
> >
> > By any chance, do you mean that we run the part of the statement in
> > workers and then run initPlan in master backend?
>
> If I'm not confused, it would be the other way around.  We would run
> the initPlan in the master backend *first* and then the rest in the
> workers.
>

Either one of us is confused, let me try to describe my understanding in
somewhat more detail.  Let me try to explain w.r.t the tab completion
query [1].  In this, the initPlan is generated for Qualification expression
[2], so it will be executed during qualification and the callstack will
look like:

  postgres.exe!ExecSeqScan(ScanState * node=0x0c33bce8)  Line 113 C
  postgres.exe!ExecProcNode(PlanState * node=0x0c33bce8)  Line 418
+ 0xa bytes C
  postgres.exe!ExecSetParamPlan(SubPlanState * node=0x0c343930,
ExprContext * econtext=0x0c33de50)  Line 1001 + 0xa bytes C
> postgres.exe!ExecEvalParamExec(ExprState * exprstate=0x0c33f980,
ExprContext * econtext=0x0c33de50, char *
isNull=0x0c33f481, ExprDoneCond * isDone=0x)  Line
 C
  postgres.exe!ExecMakeFunctionResultNoSets(FuncExprState *
fcache=0x0c33f0d0, ExprContext * econtext=0x0c33de50, char
* isNull=0x0042f1c8, ExprDoneCond * isDone=0x)
 Line 1992 + 0x2d bytes C
  postgres.exe!ExecEvalOper(FuncExprState * fcache=0x0c33f0d0,
ExprContext * econtext=0x0c33de50, char *
isNull=0x0042f1c8, ExprDoneCond * isDone=0x)  Line
2443 C
  postgres.exe!ExecQual(List * qual=0x0c33fa08, ExprContext *
econtext=0x0c33de50, char resultForNull=0)  Line 5206 + 0x1a bytes C
  postgres.exe!ExecScan(ScanState * node=0x0c33dd38, TupleTableSlot
* (ScanState *)* accessMtd=0x000140232940, char (ScanState *,
TupleTableSlot *)* recheckMtd=0x0001402329e0)  Line 195 + 0x1a bytes C
  postgres.exe!ExecSeqScan(ScanState * node=0x0c33dd38)  Line 114 C

Basically here initPlan is getting executed during Qualification.

So now the point I am not able to understand from your explanation
is that how the worker will perform qualification without the knowledge
of initPlan?

[1]
SELECT pg_catalog.quote_ident(c.relname) FROM pg_catalog.pg_class c WHERE
c.relkind IN ('r', 'S', 'v', 'm',
'f') AND substring(pg_catalog.quote_ident(c.relname),1,3)='pgb' AND
pg_catalog.pg_table_is_visible(c.oid) AND
c.relnamespace <> (SELECT oid FROM pg_catalog.pg_namespace WHERE nspname =
'pg_catalog') UNION SELECT
pg_catalog.quote_ident(n.nspname) || '.' FROM pg_catalog.pg_namespace n
WHERE substring
(pg_catalog.quote_ident(n.nspname) || '.',1,3)='pgb' AND (SELECT
pg_catalog.count(*) FROM
pg_catalog.pg_namespace WHERE substring(pg_catalog.quote_ident(nspname) ||
'.',1,3) = substring
('pgb',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1)) > 1 UNION
SELECT pg_catalog.quote_ident
(n.nspname) || '.' || pg_catalog.quote_ident(c.relname) FROM
pg_catalog.pg_class c, pg_catalog.pg_namespace n
WHERE c.relnamespace = n.oid AND c.relkind IN ('r', 'S', 'v', 'm', 'f') AND
substring(pg_catalog.quote_ident
(n.nspname) || '.' || pg_catalog.quote_ident(c.relname),1,3)='pgb' AND
substring(pg_catalog.quote_ident
(n.nspname) || '.',1,3) =
substring('pgb',1,pg_catalog.length(pg_catalog.quote_ident(n.nspname))+1)
AND
(SELECT pg_catalog.count(*) FROM pg_catalog.pg_namespace WHERE
substring(pg_catalog.quote_ident(nspname) ||
'.',1,3) =
substring('pgb',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1)) =
1 LIMIT 1000;

[2]
(SELECT pg_catalog.count(*) FROM pg_catalog.pg_namespace WHERE
substring(pg_catalog.quote_ident(nspname) ||
'.',1,3) =
substring('pgb',1,pg_catalog.length(pg_catalog.quote_ident(nspname))+1)

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Logical decoding (contrib/test_decoding) walsender broken in 9.5 master?

2015-04-01 Thread Craig Ringer
Hi all

It appears that logical decoding may be broken in 9.5 at the moment.



With HEAD at f6caf5a:

./configure --enable-debug --enable-cassert --prefix=/home/craig/pg/95
CFLAGS="-Og -ggdb -fno-omit-frame-pointer"

make clean install

make -C contrib/test_decoding clean install

PGPORT=5142 PATH=/home/craig/pg/95/bin:$PATH initdb -D ~/tmp/slottest95

PGPORT=5142 PATH=/home/craig/pg/95/bin:$PATH postgres -D ~/tmp/slottest95

and in another session:

psql -p 5142 -c "SELECT pg_create_logical_replication_slot('test',
'test_decoding');"

in yet another:

PGPORT=5142 PATH=$HOME/pg/95/bin:$PATH pg_recvlogical -d postgres -S test
--start -f -

and back in the psql session do some work:

psql -p 5142 -c 'CREATE TABLE x AS SELECT xx FROM generate_series(1,1)
xx;'



This works fine in REL9_4_STABLE at a44e54c.

Decoding over the SQL protocol works fine, and "make check" in
contrib/test_decoding passes without errors. This issue only arises in
decoding in a walsender.

I haven't bisected it back to a specific change yet, I just wanted to give
early heads-up. Also, our testing clearly needs to cover logical decoding
over walsenders.

See attachment for the bt.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Core was generated by `postgres: wal sender pr'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7f4005aa28d7 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:55
55return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);

(gdb) bt
#0  0x7f4005aa28d7 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:55
#1  0x7f4005aa453a in __GI_abort () at abort.c:89
#2  0x008e3a00 in ExceptionalCondition (conditionName=0xae3528 
"!(!((&RegisteredSnapshots)->ph_root == ((void *)0)))", errorType=0xae323a 
"FailedAssertion", fileName=0xae3230 "snapmgr.c", lineNumber=677) at assert.c:54
#3  0x00929082 in UnregisterSnapshotFromOwner (snapshot=0x7f3ff5232f58, 
owner=0x1cd4698) at snapmgr.c:677
#4  0x0092901b in UnregisterSnapshot (snapshot=0x7f3ff5232f58) at 
snapmgr.c:663
#5  0x004c4b76 in systable_endscan (sysscan=0x1d1b210) at genam.c:504
#6  0x008cf3d7 in RelationBuildTupleDesc (relation=0x7f40066ffe78) at 
relcache.c:569
#7  0x008d055c in RelationBuildDesc (targetRelId=3455, insertIt=1 
'\001') at relcache.c:1036
#8  0x008d221c in RelationIdGetRelation (relationId=3455) at 
relcache.c:1778
#9  0x004aad94 in relation_open (relationId=3455, lockmode=1) at 
heapam.c:1047
#10 0x004c4efb in index_open (relationId=3455, lockmode=1) at 
indexam.c:167
#11 0x004c45ff in systable_beginscan (heapRelation=0x7f40067584e0, 
indexId=3455, indexOK=1 '\001', snapshot=0x0, nkeys=2, key=0x7ffcde1e7900) at 
genam.c:334
#12 0x008da269 in RelidByRelfilenode (reltablespace=0, 
relfilenode=1247) at relfilenodemap.c:204
#13 0x00750dd7 in ReorderBufferCommit (rb=0x1c62788, xid=1865, 
commit_lsn=25360464, end_lsn=25360872, commit_time=481271227439738) at 
reorderbuffer.c:1338
#14 0x0074bac7 in DecodeCommit (ctx=0x1c60778, buf=0x7ffcde1e7c80, 
parsed=0x7ffcde1e7bc0, xid=1865) at decode.c:494
#15 0x0074b43a in DecodeXactOp (ctx=0x1c60778, buf=0x7ffcde1e7c80) at 
decode.c:211
#16 0x0074b1a7 in LogicalDecodingProcessRecord (ctx=0x1c60778, 
record=0x1c60a38) at decode.c:100
#17 0x0075ad78 in XLogSendLogical () at walsender.c:2425
#18 0x0075a078 in WalSndLoop (send_data=0x75acd7 ) at 
walsender.c:1834
#19 0x007590f0 in StartLogicalReplication (cmd=0x1c59eb8) at 
walsender.c:997
#20 0x00759713 in exec_replication_command (cmd_string=0x1cb3108 
"START_REPLICATION SLOT \"test\" LOGICAL 0/0") at walsender.c:1326
#21 0x007aea57 in PostgresMain (argc=1, argv=0x1c40350, 
dbname=0x1c40240 "postgres", username=0x1c40220 "craig") at postgres.c:4021
#22 0x007354e3 in BackendRun (port=0x1c64900) at postmaster.c:4141
#23 0x00734c4b in BackendStartup (port=0x1c64900) at postmaster.c:3826
#24 0x0073150a in ServerLoop () at postmaster.c:1594
#25 0x00730b95 in PostmasterMain (argc=3, argv=0x1c3f410) at 
postmaster.c:1241
#26 0x00690e64 in main (argc=3, argv=0x1c3f410) at main.c:221

(gdb) bt full
#0  0x7f4005aa28d7 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:55
resultvar = 0
pid = 22806
selftid = 22806
#1  0x7f4005aa453a in __GI_abort () at abort.c:89
save_stage = 2
act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, 
sa_mask = {__val = {140724035024416, 4590256, 139912960889648, 11420318053453, 
72057594037927936, 1, 30519912, 140724035024352, 9497950, 72057594068447848, 
2520, 0, 139912950442304, 0, 139912953362992, 139912953358080}}, sa_flags = 
98728496, sa_restorer = 0x7f4006734700}
sigs = {__val = {32, 0 }}
#2  0x008e3

Re: [HACKERS] Logical decoding (contrib/test_decoding) walsender broken in 9.5 master?

2015-04-01 Thread Michael Paquier
On Thu, Apr 2, 2015 at 3:48 PM, Craig Ringer  wrote:
> Also, our testing clearly needs to cover logical decoding over walsenders.

Noted.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers