Re: [HACKERS] Postgres_fdw join pushdown - getting server crash in left outer join of three table

2016-05-16 Thread Rajkumar Raghuwanshi
Thanks for the commit. I have tested it again. Not getting server crash now.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

On Mon, May 16, 2016 at 9:38 PM, Robert Haas  wrote:

> On Fri, May 13, 2016 at 6:40 PM, Michael Paquier
>  wrote:
> > On Fri, May 13, 2016 at 11:14 PM, Robert Haas 
> wrote:
> >> So, barring objections, I intend to apply the attached fixup patch,
> >> which replaces Michael's logic with Ashutosh's logic and rewrites the
> >> comments such to be much more explicit.
> >
> > Re-oops. I didn't check what was committed to be honest. And it should
> > not have been my version, definitely.
>
> OK, committed.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] 10.0

2016-05-16 Thread Craig Ringer
On 14 May 2016 at 02:49, Tom Lane  wrote:


>
> * This year's major release will be 9.6.0, with minor updates 9.6.1,
> 9.6.2, etc.  It's too late to do otherwise for this release cycle.
>
> * Next year's major release will be 10.0, with minor updates 10.1,
> 10.2, etc.
>
> * The year after, 11.0.  Etc cetera.
>
>
Yes. Please!

I get tired of explaining to people that PostgreSQL "9.x" isn't a thing,
that yes, 9.3 and 9.4 really _do_ have incompatible data directories and
replication protocols, and that when the docs say "major version" they
don't mean "major version as you might actually expect" but "first two
version number parts".

Let's get rid of this user-baffling wart.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] A Modest Upgrade Proposal

2016-05-16 Thread Alvaro Herrera
Joshua D. Drake wrote:
> On 05/16/2016 06:32 PM, Alvaro Herrera wrote:
> >Joshua D. Drake wrote:
> >
> >>Alvaro,
> >>
> >>Thank you for bringing this to light. Is there a contributor FAQ for
> >>PgLogical so that people can help?
> >
> >Hmm, I don't think there's any contributor FAQ.  It's supposed to be a
> >regular patch submission, after all -- it needs user interface review, a
> >review of the communication protocol, tests, code-level review, etc.
> 
> O.k. so we should discuss all PgLogical things here and not on Github? I am
> just trying to figure out what the proper mode here is. I don't think
> anybody wants us to double up efforts.

As far as I am concerned, by all means ignore Github and discuss issues
in pgsql-hackers.  Github is being used only because it provides a
convenient Git mirror, which is said to be easier to use than attaching
huge patches back and forth.  I think it may be more convenient also to
keep track of issues people have reported so that they can be marked as
fixed in commit messages, etc.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A Modest Upgrade Proposal

2016-05-16 Thread David Fetter
On Mon, May 16, 2016 at 06:20:34PM -0700, Joshua D. Drake wrote:
> On 05/16/2016 05:52 PM, David Fetter wrote:
> > Folks,
> 
> > This path would, of course, run either locally or across a
> > network, and be testable in both cases.  There would be a
> > downgrade path, namely switching origin nodes.
> > 
> > What say?
> 
> What happens when the database is 5TB in size and you only have
> 500GB available but that 500GB won't exhaust before the 18 month
> lease expiry?

We cannot prepare for every eventuality.

The downside risk of a binary upgrade in the type of case you describe
is in no conceivable instance better than "rent or borrow another
server with more storage attached and replicate to it."

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] A Modest Upgrade Proposal

2016-05-16 Thread Joshua D. Drake

On 05/16/2016 06:32 PM, Alvaro Herrera wrote:

Joshua D. Drake wrote:


Alvaro,

Thank you for bringing this to light. Is there a contributor FAQ for
PgLogical so that people can help?


Hmm, I don't think there's any contributor FAQ.  It's supposed to be a
regular patch submission, after all -- it needs user interface review, a
review of the communication protocol, tests, code-level review, etc.


O.k. so we should discuss all PgLogical things here and not on Github? I 
am just trying to figure out what the proper mode here is. I don't think 
anybody wants us to double up efforts.


JD






--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.




Re: [HACKERS] A Modest Upgrade Proposal

2016-05-16 Thread Alvaro Herrera
Joshua D. Drake wrote:
 
> Alvaro,
> 
> Thank you for bringing this to light. Is there a contributor FAQ for
> PgLogical so that people can help?

Hmm, I don't think there's any contributor FAQ.  It's supposed to be a
regular patch submission, after all -- it needs user interface review, a
review of the communication protocol, tests, code-level review, etc.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] A Modest Upgrade Proposal

2016-05-16 Thread Joshua D. Drake

On 05/16/2016 06:22 PM, Alvaro Herrera wrote:

David Fetter wrote:


As a relatively (to our users) minor course correction, I would like
to propose the following:



- Develop a logical upgrade path as a part of the (Yay! Sexy!) logical
   replication that's already in large part built.

This path would, of course, run either locally or across a network,
and be testable in both cases.


This is one use case that pglogical intends to fulfill.  If you're able
to contribute to that project, I'm sure many would appreciate it.  Right
now the hottest question seems to be: is this something that should be
an extension, or should it be part of core with its own set of DDL etc?
The current patch is geared towards the former, so if the community at
large prefers to have it as the latter and would oppose the former, now
is the time to speak up so that the course can be corrected.


Alvaro,

Thank you for bringing this to light. Is there a contributor FAQ for 
PgLogical so that people can help?


Sincerely,

jD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.




Re: [HACKERS] A Modest Upgrade Proposal

2016-05-16 Thread Alvaro Herrera
David Fetter wrote:

> As a relatively (to our users) minor course correction, I would like
> to propose the following:

> - Develop a logical upgrade path as a part of the (Yay! Sexy!) logical
>   replication that's already in large part built.
>   
> This path would, of course, run either locally or across a network,
> and be testable in both cases.

This is one use case that pglogical intends to fulfill.  If you're able
to contribute to that project, I'm sure many would appreciate it.  Right
now the hottest question seems to be: is this something that should be
an extension, or should it be part of core with its own set of DDL etc?
The current patch is geared towards the former, so if the community at
large prefers to have it as the latter and would oppose the former, now
is the time to speak up so that the course can be corrected.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] A Modest Upgrade Proposal

2016-05-16 Thread Joshua D. Drake

On 05/16/2016 05:52 PM, David Fetter wrote:

Folks,



This path would, of course, run either locally or across a network,
and be testable in both cases.  There would be a downgrade path,
namely switching origin nodes.

What say?


What happens when the database is 5TB in size and you only have 500GB 
available but that 500GB won't exhaust before the 18 month lease expiry?


JD




Cheers,
David.




--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.




[HACKERS] A Modest Upgrade Proposal

2016-05-16 Thread David Fetter
Folks,

We have a problem.

With all due respect to the valiant efforts of people over the years
who have tried to make an upgrade-in-place system work, I would like
to note that such a system faces what I believe are insurmountable
barriers to being done correctly.  I will then propose an alternative.

We have seen each one of the following on multiple occasions:

- It's extraordinarily unglamorous work.  This further restricts the
  already tiny pool of people who might work on it.  If somebody has a
  sustainable way to increase the glamour, that might help, but...

- To do correctly, it requires broad and intimate knowledge of the
  storage system and the systems below it (what is and isn't actually
  invariant across filesystems and kernels, e.g.) at a level that even
  most core engine hackers do not possess.

- It's always done under extreme time pressure, namely between feature
  freeze (more properly, all-other-code-freeze, if it's to be actually
  correct) and release.  We haven't even attempted the "properly"
  version for what I hope are pretty obvious reasons.

- It's extraordinarily difficult to test even normal cases, let alone
  corner cases, especially in light of the time pressure.

- Failure modes tend to be silent (or at least whispering) data
  corruption, not infrequently permanent.

That all sounds grim because it is.

HOWEVER

All is not lost.

We can relax the in-place requirement because of the economics of
computing.  The components of a node have been getting drastically
cheaper for decades while (amazingly, if you think about it)
increasing in quality. Rented ("cloud") nodes have gotten steadily
cheaper and better, too, although not over quite as long a haul.

In light of the above, it is perfectly reasonable to require, at least
temporarily, setting up duplicate storage, or another node.

I am aware that some cases exist where this is not possible, but I
don't think we should twist ourselves into pretzels to accommodate a
tiny minority of our users, which my experience in the field leads me
to believe is the case.

As a relatively (to our users) minor course correction, I would like
to propose the following:

- Keep the current pg_upgrade code, but put loud deprecation warnings
  all over it, most emphatically all over its documentation.

- Develop a logical upgrade path as a part of the (Yay! Sexy!) logical
  replication that's already in large part built.
  
This path would, of course, run either locally or across a network,
and be testable in both cases.  There would be a downgrade path,
namely switching origin nodes.

What say?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




[HACKERS] Re: [COMMITTERS] pgsql: Correctly align page's images in generic wal API

2016-05-16 Thread Teodor Sigaev

Instead of allocating this memory unconditionally for each buffer,
wouldn't it be better to set all the page pointers to NULL in
GenericXLogStart and allocate memory only once a buffer is registered
in GenericXLogRegisterBuffer when finding a free slot? This patch is
wasting many cycles.



GenericXLogRegisterBuffer() could be called in another MemoryContext, which
could cause strange bugs. Right now only a few pages can be involved in one
round of generic WAL, so I don't believe that this allocation could cause
noticeable performance degradation. Although I didn't check that.
--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] [COMMITTERS] pgsql: Correctly align page's images in generic wal API

2016-05-16 Thread Michael Paquier
On Tue, May 17, 2016 at 6:01 AM, Teodor Sigaev  wrote:
> Correctly align page's images in generic wal API
>
> Page image should be MAXALIGN'ed because existing code could directly align
> pointers in page instead of align offset from beginning of page.
>
> Found during play with indexes as extenstion, Alexander Korotkov and me

Instead of allocating this memory unconditionally for each buffer,
wouldn't it be better to set all the page pointers to NULL in
GenericXLogStart and allocate memory only once a buffer is registered
in GenericXLogRegisterBuffer when finding a free slot? This patch is
wasting many cycles.
-- 
Michael




Re: [HACKERS] pg_upgrade fails in 9.6 beta1

2016-05-16 Thread Michael Paquier
On Tue, May 17, 2016 at 12:07 AM, Palle Girgensohn  wrote:
> pg_restore: creating TABLE "public.cal_event"
> pg_restore: [archiver (db)] Error while PROCESSING TOC:
> pg_restore: [archiver (db)] Error from TOC entry 205; 1259 3538451239 TABLE 
> cal_event nobody
> pg_restore: [archiver (db)] could not execute query: ERROR:  could not create 
> directory "pg_tblspc/16400/PG_9.4_201605051/16403": No such file or directory

That's weird; if I follow that correctly, it is trying to create a
directory with the old version's name in the new 9.6 cluster.

> Command was:
> -- For binary upgrade, must preserve pg_type oid
> SELECT pg_catalog.binary_upgrade_set_next_pg_type_oid('3538451241'::pg_cat...
>
> now
> "pg_tblspc/16400/PG_9.4_201605051/16403"
> is a mixup of the new cluster pg_tblspc/16400, the old version PG_9.4_, and 
> the new cluster 201605051/
> Seems like a bug to me, it shouldn't create pg_tblspc/16400/PG_9.4_201605051, 
> the 9.6 tablespace is in data96/pg_tblspc/16400/PG_9.6_201605051  (not 9.4).

Based on the information you gave out, a tablespace and visibly a
custom type, I have tried a 9.4->9.6 upgrade with this schema and
things are working for me with -k:
CREATE TABLESPACE popo LOCATION '$TBSPACE_DIR';
CREATE TYPE yoyo AS (a int);
CREATE TABLE toto (a int PRIMARY KEY, b yoyo) TABLESPACE popo;

The old and new tablespaces are in correct shape, even if I did not
provide a tablespace map.

Could you provide more details about the schema causing this error? Do
you have for example some roles beginning with "pg_" as prefix?
-- 
Michael




[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618

2016-05-16 Thread Michael Paquier
On Tue, May 17, 2016 at 6:01 AM, Michael Paquier
 wrote:
> EACCES is a bit too low-level... I had in mind to check GetLastError
> with only ERROR_ACCESS_DENIED, and retry only in this case, which is
> the case where one postmaster is trying to access the segment of
> another.

s/low/high/.
-- 
Michael




[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618

2016-05-16 Thread Michael Paquier
On Tue, May 17, 2016 at 4:16 AM, Amit Kapila  wrote:
> On Mon, May 16, 2016 at 9:45 AM, Michael Paquier 
> wrote:
>>
>> On Sun, May 15, 2016 at 3:34 PM, Amit Kapila 
>> wrote:
>> > Sounds sensible, but if we want to that route, shall we have some
>> > mechanism
>> > such that if retrying it for 10 times (10 is somewhat arbitrary, but we
>> > retry 10 times in PGSharedMemoryCreate, so may be there is some
>> > consistency)
>> > doesn't give us unique name and we are getting EACCES error, then just
>> > throw
>> > the error instead of more retries.  This is to ensure that if the API is
>> > returning EACCES due to reason other than duplicate handle, then we
>> > won't
>> > retry indefinitely.
>>
>> The logic in win32_shmem.c relies on the fact that a segment will be
>> recycled, and the retry is here because it may take time at OS level.
>> On top of that it relies on the segment names being unique across
>> systems. So it seems to me that it is not worth the complication to
>> duplicate that logic in the dsm implementation.
>
> If we don't do retry for fixed number of times, then how will we handle the
> case if EACCES is due to the reason other than duplicate handle?

EACCES is a bit too low-level... I had in mind to check GetLastError
with only ERROR_ACCESS_DENIED, and retry only in this case, which is
the case where one postmaster is trying to access the segment of
another.
-- 
Michael




Re: [HACKERS] A couple of cosmetic changes around shared memory code

2016-05-16 Thread Michael Paquier
On Tue, May 17, 2016 at 4:40 AM, Piotr Stefaniak
 wrote:
> while investigating the shm_mq code and its testing module I made some
> cosmetic improvements there. You can see them in the attached diff file.

-toc_bytes = offsetof(shm_toc, toc_entry) +nentry * sizeof(shm_toc_entry)
+toc_bytes = offsetof(shm_toc, toc_entry) + nentry * sizeof(shm_toc_entry)
 + allocated_bytes;
I don't recall the exact reason, but this is intentional style (memories
of a patchwork with Tom). See for example geo_ops.c or pl_funcs.c. Though
it is true that the code is not completely consistent about offsetof.

-seg = dsm_create(shm_toc_estimate(&e), 0);
+seg = dsm_create(segsize, 0);
Yep.

-proc_exit(1);
+proc_exit(0);
Agreed here. I don't see why this should not exit with 0 if there have
not been any errors.
-- 
Michael




[HACKERS] A couple of cosmetic changes around shared memory code

2016-05-16 Thread Piotr Stefaniak

Hello,

while investigating the shm_mq code and its testing module I made some 
cosmetic improvements there. You can see them in the attached diff file.
commit 0e202cb6e0eca2e7fb3e1353b550f3d2ace9680e
Author: Piotr Stefaniak 
Date:   Thu Apr 28 18:36:16 2016 +0200

Cosmetic improvements around shm_mq and test_shm_mq.

diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index 7859f42..292d515 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -103,7 +103,7 @@ struct shm_mq
  * locally by copying the chunks into a backend-local buffer.  mqh_buffer is
  * the buffer, and mqh_buflen is the number of bytes allocated for it.
  *
- * mqh_partial_message_bytes, mqh_expected_bytes, and mqh_length_word_complete
+ * mqh_partial_bytes, mqh_expected_bytes, and mqh_length_word_complete
  * are used to track the state of non-blocking operations.  When the caller
  * attempts a non-blocking operation that returns SHM_MQ_WOULD_BLOCK, they
  * are expected to retry the call at a later time with the same argument;
diff --git a/src/backend/storage/ipc/shm_toc.c b/src/backend/storage/ipc/shm_toc.c
index 55248c2..e1d6bd1 100644
--- a/src/backend/storage/ipc/shm_toc.c
+++ b/src/backend/storage/ipc/shm_toc.c
@@ -96,7 +96,7 @@ shm_toc_allocate(shm_toc *toc, Size nbytes)
 	total_bytes = vtoc->toc_total_bytes;
 	allocated_bytes = vtoc->toc_allocated_bytes;
 	nentry = vtoc->toc_nentry;
-	toc_bytes = offsetof(shm_toc, toc_entry) +nentry * sizeof(shm_toc_entry)
+	toc_bytes = offsetof(shm_toc, toc_entry) + nentry * sizeof(shm_toc_entry)
 		+ allocated_bytes;
 
 	/* Check for memory exhaustion and overflow. */
@@ -132,7 +132,7 @@ shm_toc_freespace(shm_toc *toc)
 	nentry = vtoc->toc_nentry;
 	SpinLockRelease(&toc->toc_mutex);
 
-	toc_bytes = offsetof(shm_toc, toc_entry) +nentry * sizeof(shm_toc_entry);
+	toc_bytes = offsetof(shm_toc, toc_entry) + nentry * sizeof(shm_toc_entry);
 	Assert(allocated_bytes + BUFFERALIGN(toc_bytes) <= total_bytes);
 	return total_bytes - (allocated_bytes + BUFFERALIGN(toc_bytes));
 }
@@ -176,7 +176,7 @@ shm_toc_insert(shm_toc *toc, uint64 key, void *address)
 	total_bytes = vtoc->toc_total_bytes;
 	allocated_bytes = vtoc->toc_allocated_bytes;
 	nentry = vtoc->toc_nentry;
-	toc_bytes = offsetof(shm_toc, toc_entry) +nentry * sizeof(shm_toc_entry)
+	toc_bytes = offsetof(shm_toc, toc_entry) + nentry * sizeof(shm_toc_entry)
 		+ allocated_bytes;
 
 	/* Check for memory exhaustion and overflow. */
diff --git a/src/test/modules/test_shm_mq/setup.c b/src/test/modules/test_shm_mq/setup.c
index 5bd2820..a0f3962 100644
--- a/src/test/modules/test_shm_mq/setup.c
+++ b/src/test/modules/test_shm_mq/setup.c
@@ -125,7 +125,7 @@ setup_dynamic_shared_memory(int64 queue_size, int nworkers,
 	segsize = shm_toc_estimate(&e);
 
 	/* Create the shared memory segment and establish a table of contents. */
-	seg = dsm_create(shm_toc_estimate(&e), 0);
+	seg = dsm_create(segsize, 0);
 	toc = shm_toc_create(PG_TEST_SHM_MQ_MAGIC, dsm_segment_address(seg),
 		 segsize);
 
diff --git a/src/test/modules/test_shm_mq/worker.c b/src/test/modules/test_shm_mq/worker.c
index 638649b..a94414a 100644
--- a/src/test/modules/test_shm_mq/worker.c
+++ b/src/test/modules/test_shm_mq/worker.c
@@ -139,7 +139,7 @@ test_shm_mq_main(Datum main_arg)
 	 * we can go ahead and exit.
 	 */
 	dsm_detach(seg);
-	proc_exit(1);
+	proc_exit(0);
 }
 
 /*



[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618

2016-05-16 Thread Amit Kapila
On Mon, May 16, 2016 at 9:45 AM, Michael Paquier  wrote:
>
> On Sun, May 15, 2016 at 3:34 PM, Amit Kapila  wrote:
> > Sounds sensible, but if we want to that route, shall we have some
> > mechanism such that if retrying it for 10 times (10 is somewhat
> > arbitrary, but we retry 10 times in PGSharedMemoryCreate, so may be
> > there is some consistency) doesn't give us unique name and we are
> > getting EACCES error, then just throw the error instead of more
> > retries.  This is to ensure that if the API is returning EACCES due
> > to reason other than duplicate handle, then we won't retry
> > indefinitely.
>
> The logic in win32_shmem.c relies on the fact that a segment will be
> recycled, and the retry is here because it may take time at OS level.
> On top of that it relies on the segment names being unique across
> systems. So it seems to me that it is not worth the complication to
> duplicate that logic in the dsm implementation.

If we don't retry a fixed number of times, how will we handle the case
where EACCES is due to a reason other than a duplicate handle?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] 10.0

2016-05-16 Thread Jeff Janes
On Sat, May 14, 2016 at 8:37 PM, Tom Lane  wrote:
> Jeff Janes  writes:
>> There are lots of improvement which get done to in-memory data
>> structures that wouldn't require a pg_dump/pg_upgrade, which could in
>> principle be ported into prior major versions if we had the resources
>> (reviewing, testing, packaging) to do it, with an increase in the
>> middle number.  Maybe we will never find the resources to do that, but
>> why should that assumption get baked into the numbering scheme?
>
> If we were to do that today, it'd just be an increase in the minor number.
> I don't see why we'd need to change that approach.

We've rejected back-patching such improvements in the past on the
grounds that it was at least theoretically possible that it would
negatively affect someone, even if it were a win overall for most
people, and users shouldn't be forced to adopt that risk in order to
get security or corruption bug fixes that go into the minor number
increments.

> The real blocking
> factors there are about manpower and stability of the resulting code, not
> about whether you need some special version numbering to describe it.

If we did overcome the manpower and stability problems, we would
certainly run into the version numbering one pretty quickly, under both
the existing versioning system and the proposed two-part system.

And I don't think that using something at least vaguely like SemVer is
really "special"; if anything, it is less special than either the
existing scheme or the dominant proposal.

Cheers,

Jeff




Re: [HACKERS] HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)

2016-05-16 Thread Jeff Janes
On Mon, May 16, 2016 at 8:28 AM, Andres Freund  wrote:
> On 2016-05-10 17:58:33 -0700, Andres Freund wrote:
>> FWIW, I've commented out the relevant sections from xlog_redo and since
>> then I've not been able to reproduce the issue.
>
> A couple days of running later, and it hasn't yet re-triggered. So I
> assume that's indeed the fix.

Yes, I've also run it to wrap-around a few times with that commented
out, so I agree that that is a fix. But I don't think we should apply it
until the larger issue with hint bits is fixed first; otherwise we are
masking one problem by fixing another.

Although it doesn't really matter to me, because I'm not using HEAD to
run the tests anyway, as I can't get the error to trigger on HEAD.  As
a side note, using 48354581a49c30f5 and without commenting out the
part of the replay code we are talking about, I also need
track_io_timing to be on in order to get the error.  I don't think
there is any direct causation there, just another fiddly timing issue.

I've tried fixing the hint-bit issue as well, but I'm afraid it is over
my head.  I'll be happy to test proposed patches, though.

Cheers,

Jeff




[HACKERS] trivia: cancel{,l}{ed,ing,ation}

2016-05-16 Thread Robert Haas
I never know how to spell the various derived forms of "cancel" that
come up from time to time.

[rhaas pgsql]$ for a in canceled cancelled canceling cancelling
cancelation cancellation; do echo -n $a:; git grep $a | wc -l; done
canceled: 161
cancelled:  10
canceling: 141
cancelling:   4
cancelation:   0
cancellation:  51

So, apparently, we have a strong preference for spelling cancellation
with two "l"s, but we have a strong preference for spelling
"canceling" and "canceled" with only one "l".  Here are the number of
Google hits for each of these spellings:

canceled: About 1,520,000,000 results
cancelled: About 144,000,000 results
canceling: About 12,200,000 results
cancelling: About 20,900,000 results
cancelation: About 1,550,000 results (did you mean cancellation?)
cancellation: About 109,000,000 results

So Google thinks that our practice of using one "l" in canceled and
two in "cancellation" is typical, but our use of a single "l" for
"canceling" is a minority position (though it's close).

http://grammarist.com/spelling/cancel/ thinks that cancellation should
have two "l"s and that the others have two "l"s outside the United
States but that the United States is switching from two "L"s to one,
with a single "L" now predominating.  I have to admit that before I
got involved with PostgreSQL, I thought that all of these had two
"L"s, which maybe shows that I am old-fashioned, or just old.  I
personally find this whole thing quite annoying as it takes up brain
space that seems like it might be better used for other things, but
unless we're willing to standardize on the double-"L" spelling for
everything, I guess I'll just have to keep checking our practice for
that particular word every time I use any of them.

I will be heading up to Ottawa in a few hours, assuming my plane
flight does not get canceled.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Postgres_fdw join pushdown - getting server crash in left outer join of three table

2016-05-16 Thread Robert Haas
On Fri, May 13, 2016 at 6:40 PM, Michael Paquier
 wrote:
> On Fri, May 13, 2016 at 11:14 PM, Robert Haas  wrote:
>> So, barring objections, I intend to apply the attached fixup patch,
>> which replaces Michael's logic with Ashutosh's logic and rewrites the
>> comments such to be much more explicit.
>
> Re-oops. I didn't check what was committed to be honest. And it should
> not have been my version, definitely.

OK, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)

2016-05-16 Thread Andres Freund
On 2016-05-10 17:58:33 -0700, Andres Freund wrote:
> FWIW, I've commented out the relevant sections from xlog_redo and since
> then I've not been able to reproduce the issue.

A couple days of running later, and it hasn't yet re-triggered. So I
assume that's indeed the fix.




Re: [HACKERS] Odd oid-system-column handling in postgres_fdw

2016-05-16 Thread Robert Haas
On Wed, May 11, 2016 at 3:44 AM, Etsuro Fujita
 wrote:
> I'll add this to the next CF.

Makes sense.  I think this is basically a new feature, so let's treat
it that way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-05-16 Thread Robert Haas
On Wed, May 11, 2016 at 3:20 AM, Etsuro Fujita
 wrote:
> Thanks for the review!
>
> I'll add this to the next CF.  I think this should be addressed in advance
> of the release of 9.6, though.

I agree.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] pg_upgrade fails in 9.6 beta1

2016-05-16 Thread Palle Girgensohn
Hi,

When trying an upgrade from postgresql94 -> postgresql96 (beta1) using 
pg_upgrade:

pg_upgrade -p 5433 -P 5434 -b /usr/local/bin  -B /home/girgen/postgres96/bin -d 
/tank/opt96/pgsql/data94 -D /tank/opt96/pgsql/data96 -U pgsql96 -k


I eventually get this:

pg_restore: creating TABLE "public.cal_event"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 205; 1259 3538451239 TABLE 
cal_event nobody
pg_restore: [archiver (db)] could not execute query: ERROR:  could not create 
directory "pg_tblspc/16400/PG_9.4_201605051/16403": No such file or directory
Command was:
-- For binary upgrade, must preserve pg_type oid
SELECT pg_catalog.binary_upgrade_set_next_pg_type_oid('3538451241'::pg_cat...




Now,

"pg_tblspc/16400/PG_9.4_201605051/16403"

is a mixup of the new cluster's pg_tblspc/16400, the old version's PG_9.4_ 
prefix, and the new cluster's 201605051 catalog date;

that is, pg_upgrade seems to use the old version ( PG_9.4_ ) when 
constructing the directory path for the new tablespace.


This happens in pg_restore, it seems:

Restoring database schemas in the new cluster
  loadtest
*failure*


Seems like a bug to me: it shouldn't create pg_tblspc/16400/PG_9.4_201605051; 
the 9.6 tablespace is in data96/pg_tblspc/16400/PG_9.6_201605051 (not 9.4).

Palle





Re: [HACKERS] Reviewing freeze map code

2016-05-16 Thread Robert Haas
On Tue, May 10, 2016 at 10:40 PM, Masahiko Sawada  wrote:
> Or second way I came up with is having tool to remove particular _vm
> file safely, which is executed via SQL or client tool like
> pg_resetxlog.
>
> Attached updated VACUUM SCAN_ALL patch.
> Please find it.

We should support scan_all only with the new-style options syntax for
VACUUM; that is, VACUUM (scan_all) relname.  That doesn't require
making scan_all a keyword, which is good: this is a minor feature, and
we don't want to bloat the parsing tables for it.
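For reference, a sketch of the two syntaxes being contrasted (SCAN_ALL here is the patch's proposed option name, not a released feature, and some_table is a placeholder):

```sql
-- New-style, parenthesized options syntax, as proposed;
-- works without adding a parser keyword:
VACUUM (SCAN_ALL) some_table;

-- Old-style syntax, which would require making SCAN_ALL
-- a keyword and thus bloating the parsing tables:
VACUUM SCAN_ALL some_table;
```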

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 10.0

2016-05-16 Thread Tom Lane
Peter Eisentraut  writes:
> On 5/16/16 9:53 AM, Greg Stark wrote:
>> I thought the idea was that Berkeley tossed a source tree over the
>> wall with no version number and then the first five releases were
>> Postgres95 0.x, Postgres95 1.0, Postgres95 1.0.1, Postgres95 1.0.2,
>> Postgres95 1.0.9. Then the idea was that PostgreSQL 6.0 was the sixth
>> major release counting those as the first five releases.

> The last release out of Berkeley was 4.2.

Correct --- I have a copy of that tarball.

> Then Postgres95 was "5", and then PostgreSQL started at 6.

I wasn't actually around at the time, but our commit history starts
with this:

Author: Marc G. Fournier 
Branch: master Release: REL6_1 [d31084e9d] 1996-07-09 06:22:35 +

Postgres95 1.01 Distribution - Virgin Sources

The first mention of 6.anything is here:

Author: Bruce Momjian 
Branch: master Release: REL6_1 [a2b7f6297] 1996-12-28 02:01:58 +

Updated changes for 6.0.

I see no references in the commit history to 5.anything, but there
are some references like this:

Author: Marc G. Fournier 
Branch: master Release: REL6_1 [491b9b89c] 1996-08-26 20:38:52 +

The patch that is applied at the end of the email makes sure that these
conditions are always met. The patch can be applied to any version
of Postgres95 from 1.02 to 1.05. After applying the patch, queries
using indices on bpchar and varchar fields should (hopefully ;-) )
always return the same tuple set regardless to the fact whether
indices are used or not.
Submitted by: Gerhard Reithofer 

So I think that the Postgres95 releases were retrospectively redefined as
being the "5.x" series, but no release was ever made with such a number.

regards, tom lane




Re: [HACKERS] 10.0

2016-05-16 Thread Peter Eisentraut

On 5/16/16 9:53 AM, Greg Stark wrote:

On Sat, May 14, 2016 at 1:00 AM, Tom Lane  wrote:


If that were the standard, we'd never have bumped the major version at
all, and would still be on 4.something (or whatever Berkeley was using
when they tossed it over the wall; I'm not too clear on whether there was
ever a 5.x release).


I thought the idea was that Berkeley tossed a source tree over the
wall with no version number and then the first five releases were
Postgres95 0.x, Postgres95 1.0, Postgres95 1.0.1, Postgres95 1.0.2,
Postgres95 1.0.9. Then the idea was that PostgreSQL 6.0 was the sixth
major release counting those as the first five releases.


The last release out of Berkeley was 4.2.  Then Postgres95 was "5", and 
then PostgreSQL started at 6.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] 10.0

2016-05-16 Thread Greg Stark
On Sat, May 14, 2016 at 1:00 AM, Tom Lane  wrote:

> If that were the standard, we'd never have bumped the major version at
> all, and would still be on 4.something (or whatever Berkeley was using
> when they tossed it over the wall; I'm not too clear on whether there was
> ever a 5.x release).

I thought the idea was that Berkeley tossed a source tree over the
wall with no version number and then the first five releases were
Postgres95 0.x, Postgres95 1.0, Postgres95 1.0.1, Postgres95 1.0.2,
Postgres95 1.0.9. Then the idea was that PostgreSQL 6.0 was the sixth
major release counting those as the first five releases.

-- 
greg




[HACKERS] [PATCH][Documentation] Add optional USING keyword before opclass name in CREATE INDEX statement

2016-05-16 Thread Nikolay Shaplov

Reading the gram.y code for the CREATE INDEX statement, I see that there is 
an optional USING keyword before the opclass name:


opt_class:  any_name            { $$ = $1; }
            | USING any_name    { $$ = $2; }
            | /*EMPTY*/         { $$ = NIL; }
        ;

but in the documentation this keyword is omitted.
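For example, both of these forms are accepted by the grammar above (a sketch; the table, column, and index names are placeholders, and text_pattern_ops stands in for any operator class):

```sql
-- Documented form: opclass follows the column directly.
CREATE INDEX t_col_idx  ON t (col text_pattern_ops);

-- Also accepted, but currently undocumented: USING before the opclass.
CREATE INDEX t_col_idx2 ON t (col USING text_pattern_ops);
```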

I'd like to offer a patch that fixes this problem.

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 7dee405..788bc3f 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
  
 
 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON table_name [ USING method ]
-( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
+( { column_name | ( expression ) } [ COLLATE collation ] [ [ USING ] opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
 [ WITH ( storage_parameter = value [, ... ] ) ]
 [ TABLESPACE tablespace_name ]
 [ WHERE predicate ]



Re: [HACKERS] Declarative partitioning

2016-05-16 Thread Ildar Musin

Hi Amit,

I'm running some experiments based on your infrastructure, trying to 
optimize SELECT queries. At some point I need to get the PartitionDesc for 
a relation, and to do that I'm using the RelationGetPartitionDesc() 
function. The problem is that this function copies relcache data, which can 
be quite slow for large numbers (thousands) of partitions. The comment on 
the function says that we cannot use the relation->rd_partdesc pointer into 
the relcache because of the possibility of relcache invalidation. Could you 
please tell me whether relcache invalidation can occur during a 
SELECT/UPDATE/DELETE query?


Thanks!

--
Ildar Musin
i.mu...@postgrespro.ru





[HACKERS] Backup doc typo

2016-05-16 Thread Amit Langote
Hi,

The attached patch adds a missing "is" in a sentence in backup.sgml.

Thanks,
Amit
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index b036183..96fc9a0 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -906,8 +906,8 @@ SELECT * FROM pg_stop_backup(false);
 
  Once the WAL segment files active during the backup are archived, you are
  done.  The file identified by pg_stop_backup's first return
- value the last segment that is required to form a complete set of backup
- files.  If archive_mode is enabled,
+ value is the last segment that is required to form a complete set of
+ backup files.  If archive_mode is enabled,
  pg_stop_backup does not return until the last segment has
  been archived.
  Archiving of these files happens automatically since you have



[HACKERS] The rewriting of join conditions caused a very slow query plan.

2016-05-16 Thread chang chao
Hi,all

I have a query that is very slow, and the reason may be the rewriting of 
join conditions.

Here is a simplified version of the tables and the key part of the SQL.

level1_table and level2_table hold the tree data nodes,
and the all_level_status table holds the current status of all nodes of all levels.
(I know that there would be much less trouble with performance if 
all_level_status were divided into two tables, namely level1_status and 
level2_status.)

table1: level1_table
  level1_no          PK: serial
  level1_node_name:  varchar

table2: level2_table
  level2_no          PK: serial
  parent_no          FK to level1_table.level1_no
  level2_node_name:  varchar

table3: all_level_status
  level:    1 OR 2 (PK part 1)
  node_no:  level1_table.level1_no or level2_table.level2_no (PK part 2)
  status:   0 OR 1 (normal or abnormal)


The SQL to find all level2 nodes whose parent nodes are in normal status:

explain analyze
select * from level2_table l2
join (
 select l1.* from level1_table l1
 join all_level_status als on (als.level=1 and als.node_no=l1.level1_no)
 where  als.status=0
) normal_l1 on l2.parent_no=normal_l1.level1_no;


This is the query plan:

"Merge Join  (cost=3.38..5.13 rows=3 width=158) (actual time=0.087..0.179 
rows=21 loops=1)"
"  Merge Cond: (als.node_no = l2.parent_no)"
"  ->  Merge Join  (cost=1.63..7.66 rows=19 width=80) (actual time=0.067..0.126 
rows=18 loops=1)"
"Merge Cond: (als.node_no = l1.level1_no)"
"->  Index Scan using all_level_status_pkey on all_level_status als  
(cost=0.00..21.74 rows=19 width=4) (actual time=0.037..0.079 rows=18 loops=1)"
"  Index Cond: (level = 1)"
"  Filter: (status = 0)"
"->  Sort  (cost=1.63..1.68 rows=20 width=76) (actual time=0.026..0.026 
rows=20 loops=1)"
"  Sort Key: l1.level1_no"
"  Sort Method:  quicksort  Memory: 27kB"
"  ->  Seq Scan on level1_table l1  (cost=0.00..1.20 rows=20 
width=76) (actual time=0.005..0.009 rows=20 loops=1)"
"  ->  Sort  (cost=1.75..1.81 rows=23 width=82) (actual time=0.016..0.024 
rows=23 loops=1)"
"Sort Key: l2.parent_no"
"Sort Method:  quicksort  Memory: 28kB"
"->  Seq Scan on level2_table l2  (cost=0.00..1.23 rows=23 width=82) 
(actual time=0.003..0.005 rows=23 loops=1)"
"Total runtime: 0.307 ms"


Please note that the join condition on line 2 of the query plan has been 
rewritten to "als.node_no = l2.parent_no".
level1 and level2 nodes are in a 1:n relationship, and because 
all_level_status.node_no represents different things (level1_table.level1_no 
and level2_table.level2_no use separate serials), once this rewriting is 
applied, the MCV statistics of all_level_status.node_no and level2.parent_no 
are used for the row selectivity estimate; as can be anticipated, a large gap 
opens between actual rows and estimated rows.
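My understanding (a sketch of the reasoning, not a statement of the planner internals) is that the derived clause falls out of equivalence-class reasoning over the two original join conditions:

```sql
-- Original join clauses in the query:
--   als.node_no  = l1.level1_no      (join to all_level_status)
--   l2.parent_no = l1.level1_no      (join to level2_table)
-- From these the planner forms the equivalence class
--   { als.node_no, l1.level1_no, l2.parent_no }
-- and is then free to implement the join using any member pair, e.g.
--   als.node_no = l2.parent_no
-- which is exactly the condition shown on line 2 of the plan.
```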

The SQL above is one simplified part of a long query; because of this gap, 
the estimated row count becomes 1 in the outer sub-query, which in reality 
produces a large number of rows, and so the very slow nested-loop join is 
selected.

Had the rewriting of the join condition not been done, perhaps a much 
faster query plan would have been selected.

So I'm wondering what the reason behind the join-condition rewriting is.
Is it just that join conditions where both the left and right sides have 
MCVs are preferable to those where neither side has them?
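For what it's worth, the split I mentioned above would look roughly like this (hypothetical table names); it keeps the two unrelated serial sequences out of a single node_no column, so its statistics no longer mix them:

```sql
-- Per-level status tables instead of one all_level_status table (sketch):
CREATE TABLE level1_status (
    node_no int PRIMARY KEY REFERENCES level1_table (level1_no),
    status  int NOT NULL   -- 0 = normal, 1 = abnormal
);

CREATE TABLE level2_status (
    node_no int PRIMARY KEY REFERENCES level2_table (level2_no),
    status  int NOT NULL
);
```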


Chao.



Re: [HACKERS] Let's (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0

2016-05-16 Thread Bruce Momjian
On Sun, May 15, 2016 at 03:23:52PM -0500, Jim Nasby wrote:
> 2) There's no ability at all to revert, other than restore a backup. That
> means if you pull the trigger and discover some major performance problem,
> you have no choice but to deal with it (you can't switch back to the old
> version without losing data).

In --link mode only.

> For many users those issues just don't matter; but in my work with financial
> data it's why I've never actually used it. #2 especially was good to have
> (in our case, via londiste). It also made it a lot easier to find
> performance issues beforehand, by switching reporting replicas over to the
> new version first.
> 
> One other consideration is cut-over time. Swapping logical master and
> replica can happen nearly instantly, while pg_upgrade needs some kind of
> outage window.

Right.  I am thinking of writing some docs about how to avoid downtime
for upgrades of various types.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

