Re: [HACKERS] 9.4 broken on alpha

2015-08-30 Thread Christoph Berg
Re: Michael Cree 2015-08-26 20150826052530.GA4256@tower
 I reported the failure to build on Alpha, with an explanation and a
 patch to fix it, to the Debian package maintainers over a year ago,
 and within about a month of version 9.4 being uploaded to Debian.
 
 My recollection is that prior versions (9.2 and 9.3) compiled on
 Alpha so the use of the wrong barrier, and the fix, was in fact
 reported in a timely fashion following the first reasonable chance to
 observe the problem.
 
 It has been built and running at Debian-Ports for over a year now as
 I uploaded the fixed version to the Alpha unreleased distribution.

Hi Michael,

(I've only discovered this branch of the thread now; I got removed
from CC.)

Sorry for letting that rot for so long - I'd blame the Debian
infrastructure for not showing ports information in the usual places.
I've really only discovered the problem because buildd.debian.org is
now showing the non-main architectures as well. (Of course we could
just have looked at the bug report...) I guess we should look into
making that even more visible. Is there a list of packages that have
ports-only patches applied which we could use to make maintainers
aware via ddpo/pts/tracker? Having a porter box available would help
as well.

  It'd be easy enough to s/rmb/mb/ in 9.4 ... but not sure it's worth
  the trouble, since we're desupporting Alpha as of 9.5 anyway.
 
 That is disappointing to hear.  Why is that?  It is still in use on
 Alpha.  What is the maintenance load for keeping the Alpha arch
 specific code?
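
For readers without the earlier context: the s/rmb/mb/ suggestion refers to
the Alpha read-barrier. Alpha only has "mb" (full barrier) and "wmb" (write
barrier) instructions, so a read barrier spelled "rmb" does not assemble and
the build fails. A rough sketch of the kind of change involved (the macro
spelling below is an assumption for illustration, not the exact 9.4 source):

    /*
     * Illustrative sketch only -- macro names are assumptions, not the
     * actual 9.4 code.  Alpha provides "mb" and "wmb" but no "rmb"
     * instruction, so a read barrier written as "rmb" fails to assemble.
     */
    #if defined(__alpha) || defined(__alpha__)
    /* broken: "rmb" is not a valid Alpha instruction */
    /* #define pg_read_barrier()  __asm__ __volatile__ ("rmb" : : : "memory") */
    /* fix: fall back to a full memory barrier */
    #define pg_read_barrier()   __asm__ __volatile__ ("mb"  : : : "memory")
    #define pg_write_barrier()  __asm__ __volatile__ ("wmb" : : : "memory")
    #define pg_memory_barrier() __asm__ __volatile__ ("mb"  : : : "memory")
    #endif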

Fwiw I'd be curious to see if 9.5 still works using the generic
primitives, but atm it's blocking on perl5.20:

Dependency installability problem for postgresql-9.5 on alpha:

postgresql-9.5 build-depends on:
- alpha:libipc-run-perl
alpha:libipc-run-perl depends on:
- alpha:libio-pty-perl
alpha:libio-pty-perl depends on missing:
- alpha:perlapi-5.20.0

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-08-30 Thread Shulgin, Oleksandr
On Aug 29, 2015 7:31 PM, Pavel Stehule pavel.steh...@gmail.com wrote:



 2015-08-29 18:36 GMT+02:00 Andres Freund and...@anarazel.de:

 On 2015-08-29 18:27:59 +0200, Pavel Stehule wrote:
  2015-08-29 18:25 GMT+02:00 Shulgin, Oleksandr 
oleksandr.shul...@zalando.de
   Good point.  There's still hope to set a flag and process it later on.
   Will have to check if it's possible to stay in the scope of a loaded
   module though.

  I had a workable prototype - and it was implemented very similarly to
  how CANCEL is handled

 Where did you put the handling of that kind of interrupt? Directly into
 ProcessInterrupts()?


 Probably. I don't remember it well, but it needed hacking core code - it
 cannot be done from an extension.

Do you still have the code somewhere around? Did it see production use?
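
For what it's worth, the pattern being discussed is the usual one: the
signal handler does nothing beyond setting a flag, and the flag is acted on
later from a safe point such as ProcessInterrupts(). A minimal sketch, with
entirely hypothetical names (this is not auto_explain or core code):

    #include "postgres.h"
    #include "miscadmin.h"
    #include "storage/latch.h"

    /* hypothetical flag, set by the handler and consumed later */
    static volatile sig_atomic_t explain_requested = 0;

    /* signal handler: only set the flag and poke the latch */
    static void
    handle_explain_request(SIGNAL_ARGS)
    {
        int     save_errno = errno;

        explain_requested = 1;
        SetLatch(MyLatch);
        errno = save_errno;
    }

    /*
     * Called from a safe spot -- e.g. from ProcessInterrupts() if hacked
     * into core, or from an executor hook if done in an extension.
     */
    static void
    process_explain_request(void)
    {
        if (explain_requested)
        {
            explain_requested = 0;
            /* ... produce and ship the EXPLAIN output here ... */
        }
    }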

Thanks!
--
Alex


[HACKERS] datestyle=postgres broken with timezone=UTC+N

2015-08-30 Thread Christoph Berg
Discovered when debugging libpqtypes test failures:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=795729

postgres =# set timezone = 'Etc/UTC+1';
SET
postgres =# set datestyle = 'postgres';
SET
postgres =# select '2015-01-01 01:00:00 +0100'::timestamptz;
 Wed 31 Dec 23:00:00 2014 ETC/UTC

postgres =# select 'Wed 31 Dec 23:00:00 2014 ETC/UTC'::timestamptz;
 Wed 31 Dec 22:00:00 2014 ETC/UTC


That is, feeding back a value will apply the UTC offset shift again.
(9.6 head, also seen on 9.4.)

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-08-30 Thread Pavel Stehule
2015-08-30 10:30 GMT+02:00 Shulgin, Oleksandr oleksandr.shul...@zalando.de
:

 On Aug 29, 2015 7:31 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 
 
 
  2015-08-29 18:36 GMT+02:00 Andres Freund and...@anarazel.de:
 
  On 2015-08-29 18:27:59 +0200, Pavel Stehule wrote:
   2015-08-29 18:25 GMT+02:00 Shulgin, Oleksandr 
 oleksandr.shul...@zalando.de
    Good point.  There's still hope to set a flag and process it later on.
    Will have to check if it's possible to stay in the scope of a loaded
    module though.
 
   I had a workable prototype - and it was implemented very similarly to
   how CANCEL is handled
 
  Where did you put the handling of that kind of interrupt? Directly into
  ProcessInterrupts()?
 
 
  Probably. I don't remember it well, but it needed hacking core code - it
  cannot be done from an extension.

 Do you still have the code somewhere around? Did it see production use?

I am not sure I am able to find it - I'll try. We didn't use it in
production.


 Thanks!
 --
 Alex



Re: [HACKERS] Potential GIN vacuum bug

2015-08-30 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 Attached is a patch to deal with this without the heavyweight locks.
 I realized that using the clean up lock on the meta page, rather than the
 pending head page, would be easier to implement as a pin was already held
 on the meta page throughout, and the pending head page can change during
 the cleanup if we need multiple passes.
 ...
 I still prefer the heavy-weight approach.  The buffer clean up lock for
 vacuuming seems fragile to start with, and abusing it for other purposes
 doesn't improve on that.

FWIW, I would go with the heavyweight lock approach as well.  The extra
cycles needed for a heavyweight lock don't seem significant in this
context, and you have far more control over which other operations
conflict or don't conflict with the lock.  Taking a buffer cleanup lock on
the metapage sounds really scary from that viewpoint; it's basically going
to conflict with everything else, even if the other operations only take
it for short intervals, and you have no freedom to adjust that.

Your earlier point about how the current design throttles insertions to
keep the pending list from growing without bound seems like a bigger deal
to worry about.  I think we'd like to have some substitute for that.
Perhaps we could make the logic in insertion be something like

if (pending-list-size > threshold)
{
    if (conditional-lock-acquire(...))
    {
        do-pending-list-cleanup;
        lock-release;
    }
    else if (pending-list-size > threshold * 2)
    {
        unconditional-lock-acquire(...);
        if (pending-list-size > threshold)
            do-pending-list-cleanup;
        lock-release;
    }
}

so that once the pending list got too big, incoming insertions would wait
for it to be cleared.  Whether to use a 2x safety margin or something else
could be a subject for debate, of course.
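
In terms of the existing lock-manager API, that pseudocode could be sketched
roughly as below; using the metapage block as the lock target and the way
the pending-list size and threshold are passed in are simplifying
assumptions for illustration, not a finished patch:

    #include "postgres.h"
    #include "access/gin_private.h"
    #include "storage/lmgr.h"

    /* rough sketch of the throttling idea, not a finished patch */
    static void
    maybe_clean_pending_list(Relation index, Size pendingSize, Size threshold)
    {
        if (pendingSize > threshold)
        {
            if (ConditionalLockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock))
            {
                /* got the lock cheaply: do the cleanup ourselves */
                /* ... do-pending-list-cleanup ... */
                UnlockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock);
            }
            else if (pendingSize > threshold * 2)
            {
                /* list is badly oversized: wait for whoever is cleaning it */
                LockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock);
                /* ... re-check the size and clean up if still needed ... */
                UnlockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock);
            }
        }
    }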

regards, tom lane




Re: [HACKERS] Removing dead support for pre-POSIX signals

2015-08-30 Thread Andres Freund
Hi,

On 2015-08-30 14:59:41 -0400, Tom Lane wrote:
 1. No buildfarm member in the available history (going back to 2012-01-01)
 has ever reported not having the POSIX signal interface, nor sig_atomic_t.
 (We don't run the POSIX-signals check on Windows systems, though.)

We, afaik, don't use any signals on windows anyway...


 I think we should rip out the configure checks for HAVE_POSIX_SIGNALS,
 HAVE_SIGPROCMASK, HAVE_SIGSETJMP, and probably HAVE_SIG_ATOMIC_T, as well
 as the C code that tries to make up for not having these things (on
 non-Windows systems that is).  It's not being exercised and it's fair to
 doubt that those code paths would even work reliably anymore.  For
 instance, it seems likely that our current latch code has never been run
 on a system without these APIs, and even more likely that Andres' recent
 fooling around with signal handling (eg commit 675f5) has never been
 checked on such a system.

Sounds good to me.


 HAVE_SIG_ATOMIC_T is a debatable case, in that the only thing we're
 doing with it is c.h's
 
 /* sig_atomic_t is required by ANSI C, but may be missing on old platforms */
 #ifndef HAVE_SIG_ATOMIC_T
 typedef int sig_atomic_t;
 #endif
 
 which should be safe enough (if int isn't atomically stored/fetched we
 already have big problems elsewhere).  Still, the configure test for it
 appears to be a complete waste of cycles.

What are you proposing to do instead? Replace sig_atomic_t by int
everywhere? Or unconditionally do the typedef? Because the latter won't
work well if it's already typedef'ed...

Andres




Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Oleg Bartunov
On Sun, Aug 30, 2015 at 5:31 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 30 August 2015 at 03:17, Bruce Momjian br...@momjian.us wrote:

 I have recently increased my public statements about the idea of adding
 horizontal scaling/sharding to Postgres.


 Glad to see it. Many people have been pushing such things for years, so it
 is good to finally see some debate about this on Hackers.


 I wanted to share with hackers
 a timeline of how we got here, and where I think we are going in the
 short term:

 2012-2013:  As part of writing my scaling talk
 (http://momjian.us/main/presentations/overview.html#scaling), studying
 Oracle RAC, and talking to users, it became clear that an XC-like
 architecture (sharding) was the only architecture that was going to allow
 for write scaling.


 What other architectures were discussed? Where was that discussion?


 Users and conference attendees I talked to were increasingly concerned
 about the ability of Postgres to scale for high write volumes.  They
 didn't
 necessarily need that scale now, but they needed to know they could get
 it if they wanted it, and wouldn't need to switch to a new database in
 the future.  This is similar to wanting a car that can get you on a
 highway
 on-ramp fast --- even if you don't need it, you want to know it is there.


 +1


 2014:  I started to shop around the idea that we could use FDWs,
 parallelism, and a transaction/snapshot manager to get XC features
 as built-in to Postgres.  (I don't remember where the original idea
 came from.)  It was clear that having separate forks of the source code
 in XC and XL was never going to achieve critical mass --- there just
 aren't enough people who need high write scale right now, and the fork
 maintenance overhead is a huge burden.


 I personally support the view that we should put scalability features into
 Postgres core, rather than run separate forks.


 I realized that we would never get community acceptance to dump the XC
 (or XL) code needed for sharding into community Postgres


 How or why did you realize that? There has never been any such discussion,
 AFAIK. Surely it can be possible to move required subsystems across?


 , but with FDWs,
 we could add the features as _part_ of improving FDWs, which would benefit
 FDWs _and_ would be useful for sharding.  (We already see some of those
 FDW features in 9.5.)


 That is a huge presumption. Not discussed or technically analyzed in any
 way with the community.


 October, 2014:  EDB and NTT started working together in the community
 to start improving FDWs as a basis for an FDW-based sharding solution.
 Many of the 9.5 FDW improvements that also benefit sharding were developed
 by a combined EDB/NTT team.  The features improved FDWs independent of
 sharding, so they didn't need community buy-in on sharding to get them
 accepted.

 June, 2015:  I attended the PGCon sharding unconference session and
 there was a huge discussion about where we should go with sharding.
 I think the big take-away was that most people liked the FDW approach,
 but had business/customer reasons for wanting to work on XC or XL because
 those would be production-ready faster.


 Cough, cough. You must surely be joking that most people liked the FDW
 approach? How did we measure the acceptance of this approach?

 What actually is the FDW approach? Since it's not been written down
 anywhere, or even explained verbally, how can anyone actually agree to it?


 July, 2015:  Oleg Bartunov and his new company Postgres Professional (PP)
 started to think about joining the FDW approach, rather than working on
 XL, as they had stated at PGCon in June.  A joint NTT/EDB/PP phone-in
 meeting is scheduled for September 1.




A little correction about Postgres Professional: we are concentrating on
the idea of having one distributed transaction manager, originally called
DTM and now, with a better name, XTM, which is neutral to the actual
cluster implementation. For example, we are testing it with XL ported to
9.4, but we were planning to extend the tests to pg_shard and
postgres_fdw. My idea was to have at least XTM committed to 9.6, so all
parties could work on their implementations much more easily.



 August, 2015:  While speaking at SFPUG, Citus Data approached me about
 joining the FDW sharding team.  They have been invited to the September
 1 meeting, as have the XC and XL people.


 2ndQuadrant is working in this area, specifically bringing XL 9.5
 forwards. Please can invites be posted to myself, Pavan Deolasee and Petr
 Jelinek also? I'll pass on to others also.

 Koichi Suzuki is arranging a meeting in Hong Kong for XC/XL discussions.
 Presumably EDB is invited also? If Koichi is a leading organizer of this,
 why are there two meetings?

 October, 2015:  EDB is sponsoring a free 3-hour summit about FDW sharding
 at the PG-EU conference in Vienna.   Everyone is invited, but it is hoped
 most of the September 1 folks can attend.



 February, 2016:  Oleg is planning a similar meeting at their 

Re: [HACKERS] One question about security label command

2015-08-30 Thread Joe Conway

On 08/28/2015 07:21 PM, Adam Brightwell wrote:
 On 08/28/2015 08:37 AM, Joe Conway wrote:
 So given all that, here is what I propose we do:
 
 1.) Commit Kouhei's patch against HEAD and 9.5 (Joe)
 2.) Commit my modified patch against 9.4 and 9.3 (Joe)
 3.) Rework patch for 9.2 (Kouhei)
 4.) Finish standing up the RHEL/CentOS 7.x buildfarm member to test
 sepgsql on 9.2 and up. The animal (rhinoceros) is running already,
 but still needs some custom scripting. (Joe, Andrew)
 5.) Additionally stand up a RHEL/CentOS 6.x buildfarm member to test
 sepgsql on 9.1 (no changes) (Joe).
 
 Sound like a plan?
 
 I think this makes sense.  Getting buildfarm coverage on any level
 is better than nothing, IMHO.  Kind of a bummer that 9.1 is the
 only version that will work as-is on EL6 but it is what it is for
 now, I suppose.

#1 and 2 above are done -- committed and pushed to HEAD and 9.3-9.5.

Additionally I have #5 (the 6.x buildfarm member) created/working and
I sent in my registration form. It has not been accepted yet.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




[HACKERS] Removing dead support for pre-POSIX signals

2015-08-30 Thread Tom Lane
A question in another project's mailing list prompted me to investigate
whether our support for non-POSIX signals is actually still of any value.
I grepped through the buildfarm server's configure-stage reports, and
found that:

1. No buildfarm member in the available history (going back to 2012-01-01)
has ever reported not having the POSIX signal interface, nor sig_atomic_t.
(We don't run the POSIX-signals check on Windows systems, though.)

2. No non-Windows member has ever reported not having sigprocmask nor
sigsetjmp, save for two one-time failures that look to be due to configure
failing for being out of disk space.

3. The check for sys_siglist is still of use, which is unsurprising
because it is not required by POSIX.

The POSIX signal APIs, including sigprocmask and sigsetjmp, are required
by Single Unix Spec v2, which is what we've usually considered to be our
minimum baseline for supported Unix-ish systems.

I think we should rip out the configure checks for HAVE_POSIX_SIGNALS,
HAVE_SIGPROCMASK, HAVE_SIGSETJMP, and probably HAVE_SIG_ATOMIC_T, as well
as the C code that tries to make up for not having these things (on
non-Windows systems that is).  It's not being exercised and it's fair to
doubt that those code paths would even work reliably anymore.  For
instance, it seems likely that our current latch code has never been run
on a system without these APIs, and even more likely that Andres' recent
fooling around with signal handling (eg commit 675f5) has never been
checked on such a system.

HAVE_SIG_ATOMIC_T is a debatable case, in that the only thing we're
doing with it is c.h's

/* sig_atomic_t is required by ANSI C, but may be missing on old platforms */
#ifndef HAVE_SIG_ATOMIC_T
typedef int sig_atomic_t;
#endif

which should be safe enough (if int isn't atomically stored/fetched we
already have big problems elsewhere).  Still, the configure test for it
appears to be a complete waste of cycles.
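
For reference, the only thing sig_atomic_t really has to support is the
classic set-a-flag-in-the-handler pattern, which is also why an int fallback
would be safe anyway. A minimal, generic illustration (plain C, not a piece
of the backend):

    #include <signal.h>
    #include <stdio.h>

    /*
     * The canonical use of sig_atomic_t: a flag written by a signal
     * handler and read by the main program.  C89 requires the type to
     * exist and to be safe for exactly this.
     */
    static volatile sig_atomic_t got_signal = 0;

    static void
    handler(int signo)
    {
        (void) signo;
        got_signal = 1;             /* the only work a handler should do */
    }

    int
    main(void)
    {
        signal(SIGINT, handler);
        while (!got_signal)
            ;                       /* real code would work or sleep here */
        printf("caught signal, exiting\n");
        return 0;
    }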

Comments?

regards, tom lane




Re: [HACKERS] Removing dead support for pre-POSIX signals

2015-08-30 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On 2015-08-30 14:59:41 -0400, Tom Lane wrote:
 HAVE_SIG_ATOMIC_T is a debatable case, in that the only thing we're
 doing with it is c.h's
 
 /* sig_atomic_t is required by ANSI C, but may be missing on old platforms */
 #ifndef HAVE_SIG_ATOMIC_T
 typedef int sig_atomic_t;
 #endif
 
 which should be safe enough (if int isn't atomically stored/fetched we
 already have big problems elsewhere).  Still, the configure test for it
 appears to be a complete waste of cycles.

 What are you proposing to do instead? Replace sig_atomic_t by int
 everywhere? Or unconditionally do the typedef? Because the latter won't
 work well if it's already typedef'ed...

No no no, I'm proposing to remove the above-quoted lines and the configure
test.  sig_atomic_t is required by C89; there is no reason anymore to
cope with it not being provided by signal.h.

regards, tom lane




Re: [HACKERS] Removing dead support for pre-POSIX signals

2015-08-30 Thread Andres Freund
On 2015-08-30 15:28:42 -0400, Tom Lane wrote:
 No no no, I'm proposing to remove the above-quoted lines and the configure
 test.  sig_atomic_t is required by C89; there is no reason anymore to
 cope with it not being provided by signal.h.

Ok, that works for me. You seemed to be a bit more doubtful about the
sig_atomic_t support; that's why I thought you might want to do
something other than rip it out. Seems like a pretty low risk thing to try.

Andres




Re: [HACKERS] Potential GIN vacuum bug

2015-08-30 Thread Jeff Janes
On Sat, Aug 22, 2015 at 11:25 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 On Tue, Aug 18, 2015 at 8:59 AM, Robert Haas robertmh...@gmail.com
 wrote:

 On Mon, Aug 17, 2015 at 5:41 PM, Jeff Janes jeff.ja...@gmail.com wrote:
  User backends attempt to take the lock conditionally, because otherwise
  they would cause an autovacuum already holding the lock to cancel
  itself, which seems quite bad.
 
  Note that this is a substantial behavior change, in that with this code
  the user backends which find the list already being cleaned will just
  add to the end of the pending list and go about their business.  So if
  things are added to the list faster than they can be cleaned up, the
  size of the pending list can increase without bound.
 
  Under the existing code each concurrent user backend will try to clean
  the pending list at the same time.  The work doesn't parallelize, so
  doing this just burns CPU (and possibly consumes up to
  maintenance_work_mem for *each* backend), but it does serve to throttle
  the insertion rate and so keep the list from growing without bound.
 
  This is just a proof-of-concept patch, because I don't know if this
  approach is the right approach.

 I'm not sure if this is the right approach, but I'm a little wary of
 involving the heavyweight lock manager in this.  If pending list
 cleanups are frequent, this could involve a lot of additional lock
 manager traffic, which could be bad for performance.



 Even if they are
 infrequent, it seems like it would be more natural to handle this
 with some regime of locks and pins and buffer cleanup locks on the
 buffers that are storing the pending list, rather than a heavyweight
 lock on the whole relation.  But I am just waving my hands wildly
 here.


 I also thought of a buffer clean up lock on the pending list head buffer
 to represent the right to do a clean up.  But with the proviso that once
 you have obtained the clean up lock, you can then drop the exclusive buffer
 content lock and continue to hold the conceptual lock just by maintaining
 the pin.  I think that this would be semantically correct, but backends
 doing a cleanup would have to get the lock conditionally, and I think you
 would have too many chances for false failures where it bails out when the
 other party simply holds a pin.  I guess I could implement it and see how
 it fares in my test case.


Attached is a patch to deal with this without the heavyweight locks.

I realized that using the clean up lock on the meta page, rather than the
pending head page, would be easier to implement as a pin was already held
on the meta page throughout, and the pending head page can change during
the cleanup if we need multiple passes.

Also, I think the clean up lock on the metapage should actually be easier to get.
All queries need to visit the pending head, and they hold it long enough to
check all the keys (possibly hundreds, checked with arbitrarily slow SQL
functions) on that page.  The metapage is only checked for a few variables
which are C types.

I thought of checking for metadata->head == InvalidBlockNumber with a
share lock before getting the clean-up lock and then again after, but decided
against it as the user backends already check that immediately before
calling this function, and wouldn't call it if there was no pending list
as-of that check.

I exchange the exclusive content lock given to us by
LockBufferForCleanup for a share content lock, so as to not hold the
exclusive lock over the IO possibly needed to read the pending head page
into buffers.  I don't know that this is actually a win.
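
For clarity, the buffer-lock dance described above amounts to roughly the
following sketch (assumed to run with the GIN index relation already open,
and with all error handling omitted; this is not the actual patch):

    #include "postgres.h"
    #include "access/gin_private.h"
    #include "storage/bufmgr.h"

    /* sketch of the metapage cleanup-lock approach described above */
    static void
    cleanup_with_metapage_cleanup_lock(Relation index)
    {
        Buffer      metabuffer = ReadBuffer(index, GIN_METAPAGE_BLKNO);

        /* cleanup lock = exclusive content lock plus "no other pins" */
        LockBufferForCleanup(metabuffer);

        /*
         * Downgrade to a share content lock so the exclusive lock is not
         * held across the I/O needed to read the pending-list head page;
         * the "right to clean" is conceptually kept via the pin.
         */
        LockBuffer(metabuffer, BUFFER_LOCK_UNLOCK);
        LockBuffer(metabuffer, BUFFER_LOCK_SHARE);

        /* ... read the metadata and walk/clean the pending list ... */

        UnlockReleaseBuffer(metabuffer);
    }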

This fixed the same problem I was seeing that was fixed by the previous
heavy-weight lock patch.

I still prefer the heavy-weight approach.  The buffer clean up lock for
vacuuming seems fragile to start with, and abusing it for other purposes
doesn't improve on that.

Whichever approach is taken, more work is needed on the comments.  And the
code that currently checks for concurrent cleanup and bails out needs to be
changed to throw errors or something instead.  But I don't want to make too
many changes until I know which approach to take, and whether it will be
back-patched.

Cheers,

Jeff


gin_pending_lwlock.patch
Description: Binary data



Re: [HACKERS] datestyle=postgres broken with timezone=UTC+N

2015-08-30 Thread Christoph Berg
Re: Tom Lane 2015-08-30 20976.1440946...@sss.pgh.pa.us
 Christoph Berg c...@df7cb.de writes:
  postgres =# set timezone = 'Etc/UTC+1';
  SET
  postgres =# set datestyle = 'postgres';
  SET
  postgres =# select '2015-01-01 01:00:00 +0100'::timestamptz;
   Wed 31 Dec 23:00:00 2014 ETC/UTC
 
  postgres =# select 'Wed 31 Dec 23:00:00 2014 ETC/UTC'::timestamptz;
   Wed 31 Dec 22:00:00 2014 ETC/UTC
 
 Not sure I see your point?  Lacking any entry for ETC/UTC in the
 timezone_abbreviations file, we'll look it up in the zoneinfo database,
 and that will tell us it means UTC+0.  So the input means 2300 UTC, and
 that gets displayed as 2200 in your UTC-1 display timezone.
 
 No, this isn't entirely consistent with the way the timezone name is being
 used in output.  But that's because you've only half-configured the system
 for a weird timezone name.  You would need an abbreviation entry as well
 if you want ETC/UTC to be read as UTC-1.
 
 For more info read
 http://www.postgresql.org/docs/9.4/static/datatype-datetime.html#DATATYPE-TIMEZONES
 --- see particularly the FOOBAR0 example, which is exactly what you've

Hi Tom,

thanks for the heads-up. I have to confess that I was chasing a ghost
here; the original report for the libpqtypes test failure I was
investigating was mentioning Etc/GMT+12 (instead of UTC) which is a
legal zone, and with which the timestamptz output is correct even with
datestyle=postgres. I'll report back if the problem is on the PG or
libpqtypes side, sorry for the premature noise.

 got here.  By and large, I don't recommend use of POSIX notation in
 timezone settings; the IANA geographical zone names are much less likely
 to bite you on the rear.

The report is from Debian's reproducible builds project; they try to
make package builds independent of as many environment factors as
possible, and one variation they test is timezones on the far plus or
minus side. Luckily, PostgreSQL itself is already reproducible (though
not cross-build friendly, more on that later), but libpqtypes got
caught by a TZ problem in the test suite.

https://wiki.debian.org/ReproducibleBuilds

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] Removing dead support for pre-POSIX signals

2015-08-30 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On 2015-08-30 15:28:42 -0400, Tom Lane wrote:
 No no no, I'm proposing to remove the above-quoted lines and the configure
 test.  sig_atomic_t is required by C89; there is no reason anymore to
 cope with it not being provided by signal.h.

 Ok, that works for me. You seemed to be a bit more doubtful about the
 sig_atomic_t support; that's why I thought you might want to do
 something other than rip it out. Seems like a pretty low risk thing to try.

The distinction I was making was that it's fair to suspect that the
pre-POSIX-signal code is actively broken.  (For instance, as far back as
commit 8408f6525 we were aware that libpq's thread-safe signal logic would
not work with pre-POSIX signals.  How many other cases do you think have
snuck in unnoticed since then?)  On the other hand, there's no reason to
think that the substitute sig_atomic_t typedef wouldn't work fine if it
were used.  The argument for ripping that out is merely that it's silly to
continue expending configure cycles on something that's required by C89.

regards, tom lane




Re: [HACKERS] 9.4 broken on alpha

2015-08-30 Thread Tom Lane
David Fetter da...@fetter.org writes:
 At a minimum, we should de-support every platform on which literally
 no new deployments will ever happen.
 I'm looking specifically at you, HPUX, and I could make a pretty good
 case for the idea that we can relegate 32-bit platforms to the ash
 heap of history, at least on the server side.

This wasn't responded to very much, but I wanted to put on record
that I don't agree with the concept.  There are several reasons:

1. "Run on every platform you can" is in the DNA of this and just about
every other successful open-source project.  You don't want to drive away
potential users by not supporting their platform.  If they're still
getting good use out of an old OS or non-mainstream architecture, who are
we to tell them not to?

2. Even if a particular platform is no longer a credible target for
production deployments, it can be a useful test case to ensure that we
don't get frozen into a narrow "FooOS on x86_64 is the only case worth
considering" straitjacket.  Software monocultures are bad news; they tend
not to adapt very well when the world changes.  So for instance I'm
reluctant to shut down pademelon, even though its compiler is old enough
to vote, because it's one of not too darn many buildfarm animals whose
compilers are not gcc or derivatives.  We need cases like that to keep us
from building in gcc-isms.  In short, supporting old platforms is one of
the ways that we stay flexible enough to be able to support new platforms
in the future.

3. I see no reason to desupport platforms when we don't gain anything by
it.  In the case of Alpha, it's pretty clear what we gain: we don't have
to worry about its unlike-anything-else memory coherency model.  (I'm not
very worried that future platforms will adopt that idea, either.)  And the
lack of any support from its remaining user community tilts the scales
pretty heavily against it.  I'll be happy to drop testing on HPUX 10.20,
or the ancient OS X versions my other buildfarm critters run, the minute
there is some feature we have a clear need for that one of them doesn't
have.  But I don't think it's desirable to cut anything off as long as
it's still able to run a buildfarm member.  I think those critters are
still capable of catching unexpected portability issues that might affect
more-viable platforms too.


A useful comparison point is the testing Greg Stark did recently for VAX.
Certainly no-one's ever again going to try to get useful work done with
Postgres on a VAX, but that still taught us some good things about
unnecessary IEEE-floating-point dependencies that had snuck into the code.
Someday, that might be important; IEEE 754 won't be the last word on
float arithmetic forever.

As an example of a desupport proposal that I think *is* well-founded,
see my nearby message 27975.1440961...@sss.pgh.pa.us.

regards, tom lane




Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-08-30 Thread David Fetter
On Sat, Aug 29, 2015 at 12:48:23AM +0200, Daniel Verite wrote:
  Hi,
 
 This is a reboot of my previous proposal for pivoting results in psql,
 with a new patch that generalizes the idea further through a command
 now named \rotate, and some examples.

Neat!

Thanks for putting this together :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Commitfest remaining Needs Review items

2015-08-30 Thread Michael Paquier
On Wed, Aug 26, 2015 at 4:59 PM, Michael Paquier michael.paqu...@gmail.com
wrote:

 On Tue, Aug 25, 2015 at 4:39 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  On Mon, Aug 10, 2015 at 4:34 PM, Heikki Linnakangas hlinn...@iki.fi
 wrote:
  Hello Hackers,
 
  There are a few Needs Review items remaining in the July commitfest.
  Reviewers, please take action - you are holding up the commitfest.
 
  In addition to these items, there are a bunch of items in Ready for
  Committer state. Committers: please help with those.
 
  At this stage, there are currently 26 patches in need of actions for
  the current CF:

 And now we are down to 2, the following ones:
 -- Backpatch Resource Owner reassign locks cache. Still not sure if
 Andres is planning a backpatch to 9.2.


This has been closed by Tom.


 -- self-defined policy for sepgsql regression test. Joe is wrapping up
 that.
 The rest has been updated in the CF app. If you have any complaints or
 remarks, feel free to raise them on the thread of the related patch or
 here, that's fine.


This has been moved to the next CF with the same status.
And the commit fest of 2015-07 is now closed with the following score:
Committed: 58.
Moved to next CF: 25.
Rejected: 9.
Returned with Feedback: 25.
Total: 117.
Thanks!
-- 
Michael


Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Simon Riggs
On 30 August 2015 at 03:17, Bruce Momjian br...@momjian.us wrote:

 I have recently increased my public statements about the idea of adding
 horizontal scaling/sharding to Postgres.


Glad to see it. Many people have been pushing such things for years, so it
is good to finally see some debate about this on Hackers.


 I wanted to share with hackers
 a timeline of how we got here, and where I think we are going in the
 short term:

 2012-2013:  As part of writing my scaling talk
 (http://momjian.us/main/presentations/overview.html#scaling), studying
 Oracle RAC, and talking to users, it became clear that an XC-like
 architecture (sharding) was the only architecture that was going to allow
 for write scaling.


What other architectures were discussed? Where was that discussion?


 Users and conference attendees I talked to were increasingly concerned
 about the ability of Postgres to scale for high write volumes.  They didn't
 necessarily need that scale now, but they needed to know they could get
 it if they wanted it, and wouldn't need to switch to a new database in
 the future.  This is similar to wanting a car that can get you on a highway
 on-ramp fast --- even if you don't need it, you want to know it is there.


+1


 2014:  I started to shop around the idea that we could use FDWs,
 parallelism, and a transaction/snapshot manager to get XC features
 as built-in to Postgres.  (I don't remember where the original idea
 came from.)  It was clear that having separate forks of the source code
 in XC and XL was never going to achieve critical mass --- there just
 aren't enough people who need high write scale right now, and the fork
 maintenance overhead is a huge burden.


I personally support the view that we should put scalability features into
Postgres core, rather than run separate forks.


 I realized that we would never get community acceptance to dump the XC
 (or XL) code needed for sharding into community Postgres


How or why did you realize that? There has never been any such discussion,
AFAIK. Surely it can be possible to move required subsystems across?


 , but with FDWs,
 we could add the features as _part_ of improving FDWs, which would benefit
 FDWs _and_ would be useful for sharding.  (We already see some of those
 FDW features in 9.5.)


That is a huge presumption. Not discussed or technically analyzed in any
way with the community.


 October, 2014:  EDB and NTT started working together in the community
 to start improving FDWs as a basis for an FDW-based sharding solution.
 Many of the 9.5 FDW improvements that also benefit sharding were developed
 by a combined EDB/NTT team.  The features improved FDWs independent of
 sharding, so they didn't need community buy-in on sharding to get them
 accepted.

 June, 2015:  I attended the PGCon sharding unconference session and
 there was a huge discussion about where we should go with sharding.
 I think the big take-away was that most people liked the FDW approach,
 but had business/customer reasons for wanting to work on XC or XL because
 those would be production-ready faster.


Cough, cough. You must surely be joking that most people liked the FDW
approach? How did we measure the acceptance of this approach?

What actually is the FDW approach? Since it's not been written down
anywhere, or even explained verbally, how can anyone actually agree to it?


 July, 2015:  Oleg Bartunov and his new company Postgres Professional (PP)
 started to think about joining the FDW approach, rather than working on
 XL, as they had stated at PGCon in June.  A joint NTT/EDB/PP phone-in
 meeting is scheduled for September 1.



 August, 2015:  While speaking at SFPUG, Citus Data approached me about
 joining the FDW sharding team.  They have been invited to the September
 1 meeting, as have the XC and XL people.


2ndQuadrant is working in this area, specifically bringing XL 9.5 forwards.
Please can invites be posted to myself, Pavan Deolasee and Petr Jelinek
also? I'll pass on to others also.

Koichi Suzuki is arranging a meeting in Hong Kong for XC/XL discussions.
Presumably EDB is invited also? If Koichi is a leading organizer of this,
why are there two meetings?

October, 2015:  EDB is sponsoring a free 3-hour summit about FDW sharding
 at the PG-EU conference in Vienna.   Everyone is invited, but it is hoped
 most of the September 1 folks can attend.



 February, 2016:  Oleg is planning a similar meeting at their February
 Moscow conference.



 Anyway, I wanted to explain the work that has been happening around
 sharding.


Thanks


 As things move forward, I am increasingly convinced that write
 scaling will be needed soon,


+1


 that the XC approach is the only reasonable way to do it,



 and that FDWs are the cleanest way to get it into community
 Postgres.


Those two things aren't at all obvious to me.

Please don't presume my opposition. If the technical information were made
public, I might understand and agree with the FDW approach, perhaps

[HACKERS] SimpleTee flush

2015-08-30 Thread Kevin Grittner
I find the TestLib.pm framework to be more friendly with the
attached tweak to SimpleTee.pm.  It's not a big deal, so if anyone
objects I'll let it drop, and I don't think it merits anything
fancier.

Objections?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

SimpleTee-flush.diff
Description: invalid/octet-stream



Re: [HACKERS] SimpleTee flush

2015-08-30 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:

 Just out of curiosity --- what's the rationale for flushing after
 every line, rather than once at the end of the loop?  Is there
 actual input-reading going on underneath the @$self notation?
 If it's just dumping data from process memory, I should think
 one flush would be sufficient.

Isn't the loop over file handles written to?  What would you flush
outside the loop?


--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SimpleTee flush

2015-08-30 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Just out of curiosity --- what's the rationale for flushing after
 every line, rather than once at the end of the loop?

 Isn't the loop over file handles written to?

Oh... I just automatically read it as looping over some data,
not printing the same data to multiple filehandles.  That's kind of
odd ... but yeah, you would have to do it like that.

regards, tom lane




Re: [HACKERS] datestyle=postgres broken with timezone=UTC+N

2015-08-30 Thread Tom Lane
Christoph Berg c...@df7cb.de writes:
 postgres =# set timezone = 'Etc/UTC+1';
 SET
 postgres =# set datestyle = 'postgres';
 SET
 postgres =# select '2015-01-01 01:00:00 +0100'::timestamptz;
  Wed 31 Dec 23:00:00 2014 ETC/UTC

 postgres =# select 'Wed 31 Dec 23:00:00 2014 ETC/UTC'::timestamptz;
  Wed 31 Dec 22:00:00 2014 ETC/UTC

Not sure I see your point?  Lacking any entry for ETC/UTC in the
timezone_abbreviations file, we'll look it up in the zoneinfo database,
and that will tell us it means UTC+0.  So the input means 2300 UTC, and
that gets displayed as 2200 in your UTC-1 display timezone.

No, this isn't entirely consistent with the way the timezone name is being
used in output.  But that's because you've only half-configured the system
for a weird timezone name.  You would need an abbreviation entry as well
if you want ETC/UTC to be read as UTC-1.

For more info read
http://www.postgresql.org/docs/9.4/static/datatype-datetime.html#DATATYPE-TIMEZONES
--- see particularly the FOOBAR0 example, which is exactly what you've
got here.  By and large, I don't recommend use of POSIX notation in
timezone settings; the IANA geographical zone names are much less likely
to bite you on the rear.

regards, tom lane




Re: [HACKERS] SimpleTee flush

2015-08-30 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 I find the TestLib.pm framework to be more friendly with the
 attached tweak to SimpleTee.pm.  It's not a big deal, so if anyone
 objects I'll let it drop, and I don't think it merits anything
 fancier.

Just out of curiosity --- what's the rationale for flushing after
every line, rather than once at the end of the loop?  Is there
actual input-reading going on underneath the @$self notation?
If it's just dumping data from process memory, I should think
one flush would be sufficient.

regards, tom lane




Re: [HACKERS] Information of pg_stat_ssl visible to all users

2015-08-30 Thread Andres Freund
On 2015-08-30 11:33:28 -0400, Stephen Frost wrote:
 Yeah, I'm not really thrilled with all of this information being
 available to everyone on the system.  We already get ding'd by people
 for not limiting who can see what connections there are to the database
 and this is doubling-down on that.

I don't buy that the relevant piece of information is the CN when the
connection itself is visible. Neither do I buy the argument that later
hiding this for ssl once we have more granular permissions is going to
be relevantly painful in comparison to changing the contents of
pg_stat_activity itself.




Re: [HACKERS] WIP: About CMake v2

2015-08-30 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 I imagine that esoteric platforms are not going to have cmake at all and
 are going to need their own installation anyway.  Not sure if that's
 going to be more onerous than the requirement to install GNU make.

If we get to the point where this is starting to look like a real
proposal, I'll try to install cmake from source on my buildfarm critters,
none of which have it ATM.  And we'll need to get other buildfarm owners
to do likewise.  So we'll be able to acquire data points on whether that's
a big problem or not.

regards, tom lane




Re: [HACKERS] Information of pg_stat_ssl visible to all users

2015-08-30 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
 On Sun, Aug 30, 2015 at 5:27 AM, Bruce Momjian wrote:
 
  I know I am coming in late here, but I know Heroku uses random user
  names to allow a cluster to have per-user databases without showing
  external user name details:
  [...]
  I can see them having problems with a user being able to see the SSL
  remote user names of all connected users.
 
 
 Yep, and I can imagine that this is the case of any company managing cloud
 nodes with Postgres embedded, and at least to me that's a real concern.

Yeah, I'm not really thrilled with all of this information being
available to everyone on the system.  We already get ding'd by people
for not limiting who can see what connections there are to the database
and this is doubling-down on that.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Potential GIN vacuum bug

2015-08-30 Thread Jeff Janes
On Sun, Aug 30, 2015 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Jeff Janes jeff.ja...@gmail.com writes:

 Your earlier point about how the current design throttles insertions to
 keep the pending list from growing without bound seems like a bigger deal
 to worry about.  I think we'd like to have some substitute for that.
 Perhaps we could make the logic in insertion be something like

 if (pending-list-size > threshold)
 {
     if (conditional-lock-acquire(...))
     {
         do-pending-list-cleanup;
         lock-release;
     }
     else if (pending-list-size > threshold * 2)
     {
         unconditional-lock-acquire(...);
         if (pending-list-size > threshold)
             do-pending-list-cleanup;
         lock-release;
     }
 }

 so that once the pending list got too big, incoming insertions would wait
 for it to be cleared.  Whether to use a 2x safety margin or something else
 could be a subject for debate, of course.


If the goal is to not change existing behavior (like for back patching) the
margin should be 1, always wait.  But we would still have to deal with the
fact that an unconditional acquire attempt by the backends will cause a vacuum
to cancel itself, which is undesirable.  If we define a new namespace for
this lock (like the relation extension lock has its own namespace) then
perhaps the cancellation code could be made to not cancel on that
condition.  But that too seems like a lot of work to backpatch.

Would we bother to back-patch a theoretical bug which there is no evidence
is triggering in the field?  Of course, if people are getting bit by this,
they probably wouldn't know.  You search for malevolent unicorns, get no
hits, and just assume there are no hits, without scouring the table and
seeing it is an index problem.  Or if you do realize it is an index
problem, you would probably never trace it back to the cause of the
problem.  There are quite a few reports of mysterious index corruptions
which never get resolved.

If we want to improve the current behavior rather than fix a bug, then I
think that if the list is greater than threshold*2 and the cleaning lock is
unavailable, what it should do is proceed to insert the tuple's keys into
the index itself, as if fastupdate = off.  That would require some major
surgery to the existing code, as by the time it invokes the clean up, it is
too late to not insert into the pending list.


Cheers,

Jeff


Re: [HACKERS] 9.4 broken on alpha

2015-08-30 Thread Thomas Munro
On Mon, Aug 31, 2015 at 8:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 A useful comparison point is the testing Greg Stark did recently for VAX.
 Certainly no-one's ever again going to try to get useful work done with
 Postgres on a VAX, but that still taught us some good things about
 unnecessary IEEE-floating-point dependencies that had snuck into the code.
 Someday, that might be important; IEEE 754 won't be the last word on
 float arithmetic forever.


Just by the way, there is at least one example of a non-IEEE  floating
point format supported by a current production compiler and hardware: IBM
XL C on z/OS (and possibly other platforms) can use either IEEE or IBM's
hex float format, depending on a compiler option.

https://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture

-- 
Thomas Munro
http://www.enterprisedb.com


Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Michael Paquier
On Mon, Aug 31, 2015 at 11:08 AM, Bruce Momjian br...@momjian.us wrote:

 On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote:
  Well, I have had many such discussions with XC/XL folks, and that was my
  opinion.  I have seen almost no public discussion about this because the
  idea had almost no chance of success.  If it was possible, someone would
  have already suggested it on this list.
 
 
  Or perhaps people invested in this area had other obligations or lacked
  the motivation and/or time to push things into core. That's not
  possible to know, and what is done is done.

 Well, I have talked to everyone privately about this, and concluded that
 while horizontal scalability/sharding is useful, it is unlikely that the
 code volume of something like XC or XL would be accepted into the
 community, and frankly, now that we have FDWs, it is hard to imagine why
 we would _not_ go in the FDW direction.

If I recall correctly in terms of numbers, that's indeed 40k of code,
the main areas of XC code being the GTM, the planner changes for
expression and join push down, and the connection pooler for parallel
query execution.

ISTM that FDW is a portion of the puzzle, there are other pieces that
could be used toward an in-core integration, like the parallel stuff
Amit Kapila is working on to allow remote query execution in parallel
of local scans. Also, XC/XL were performing well on OLTP thanks to the
connection pooler: this should indeed be part of the FDW portion
managing the foreign scans. This may sound like a minor issue compared
to the others, but already established connections help a lot when
scaling out with foreign servers.

 Of course, people have concerns, and FDWs might need to be improved, but
 it is something worth researching.  We might find out FDWs can't be used
 at all, and that we have to either add much more code to Postgres to do
 sharding, do something like pg_shard, or not implement built-in sharding
 at all, but at least it is time to research this.

I am really looking forward to hearing the arguments of the authors of
pg_shard on the matter.

  OK, I will send you a separate email and you can then supply their email
  addresses.
 
 
  FWIW, I would be interested in that as well. I worked in this area of things
  for a couple of years as well FWIW.

 OK, I will send you an email.

Thanks.
-- 
Michael




Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Bruce Momjian
On Sun, Aug 30, 2015 at 10:36:23PM +0300, Oleg Bartunov wrote:
 Honestly, I still don't know which approach is better; we already played
 with XL (ported to 9.4) and identified some very strong issues with
 inconsistency, which scared us, especially taking into account how easily
 we found them. The XC people have fixed them, but I'm not sure whether
 they were fundamental and whether we could construct more sophisticated
 tests and find more issues in XC/XL. We were also a bit disappointed by
 Huawei's position on the CSN patch, which we had hoped to use for our
 XTM.  The FDW approach has been actively criticized by the pg_shard
 people, and that also made me a bit suspicious.

Yep, that has me concerned too.  The pg_shard people will be on the
September 1 call and are working on a Google document to explain their
concerns about FDWs for sharding.

 It looks like we are doomed to continue with several development forks,
 so we decided to work on a very important common project, XTM, which we
 hoped could be accepted by all parties and eventually committed to 9.6.
 Now I see we were right, unfortunately.

Yes, the ability to add independent parts that can eventually be used
for sharding is a strong indication that doing this incrementally is a
good approach.

 Again, could we organize meeting somewhere in September ?  US is not good for
 us, but other places should be ok. I want to have an agreement  at least on
 XTM. We still are testing various approaches, though. We could present results
 of our experiments and are open to discussion. It's not easy project, but it's
 something we could do for 9.6.

Good.  XTM is a must-have for several use-cases, including sharding.

 I'm very glad Bruce started this discussion in -hackers, since it's silly to 
 me
 to participate in both threads :)  Let's meet in September !

In summary, I think we need to start working on built-in sharding, and
FDWs are the only way I can see to do it with minimal code changes,
which I think might be a community requirement.  It might not work, but
right now, it is the only possible approach I can see.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Michael Paquier
On Mon, Aug 31, 2015 at 7:29 AM, Bruce Momjian br...@momjian.us wrote:

 On Sun, Aug 30, 2015 at 03:31:10PM +0100, Simon Riggs wrote:
  I realized that we would never get community acceptance to dump the
 XC
  (or XL) code needed for sharding into community Postgres
 
 
  How or why did you realize that? There has never been any such
 discussion,
  AFAIK. Surely it can be possible to move required subsystems across?

 Well, I have had many such discussions with XC/XL folks, and that was my
 opinion.  I have seen almost no public discussion about this because the
 idea had almost no chance of success.  If it was possible, someone would
 have already suggested it on this list.


Or perhaps people invested in this area had other obligations or lacked
the motivation and/or time to push things into core. That's not
possible to know, and what is done is done.


  July, 2015:  Oleg Bartunov and his new company Postgres Professional
 (PP)
  started to think about joining the FDW approach, rather than working
 on
  XL, as they had stated at PGCon in June.  A joint NTT/EDB/PP phone-in
  meeting is scheduled for September 1.
 
  August, 2015:  While speaking at SFPUG, Citus Data approached me
 about
  joining the FDW sharding team.  They have been invited to the
 September
  1 meeting, as have the XC and XL people.
 
 
  2ndQuadrant is working in this area, specifically bringing XL 9.5
 forwards.

 Yes, I saw the blog post about that:

 http://blog.2ndquadrant.com/working-towards-postgres-xl-9-5/

  Please can invites be posted to myself, Pavan Deolasee and Petr Jelinek
 also?
  I'll pass on to others also.

 OK, I will send you a separate email and you can then supply their email
 addresses.


FWIW, I would be interested in that as well; I worked in this area of
things for a couple of years myself.


  Koichi Suzuki is arranging a meeting in Hong Kong for XC/XL discussions.
  Presumably EDB is invited also? If Koichi is a leading organizer of
 this, why
  are there two meetings?

 I certainly have heard nothing about it, except third-hand people
 telling me a meeting is happening.  I assumed those meetings were
 XC/XL-specific.


Yep, that's my understanding as well, and as far as I know things have been
handled this way until now, i.e. XC/XL and Postgres core are meant to live
as separate communities.
-- 
Michael


Re: [HACKERS] Potential GIN vacuum bug

2015-08-30 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 On Sun, Aug 30, 2015 at 11:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Your earlier point about how the current design throttles insertions to
 keep the pending list from growing without bound seems like a bigger deal
 to worry about.  I think we'd like to have some substitute for that.
 ...

 If the goal is to not change existing behavior (like for back patching) the
 margin should be 1, always wait.

The current behavior is buggy, both as to performance and correctness,
so I'm not sure why "don't change the behavior" should be a
requirement.

 But we would still have to deal with the
 fact that unconditional acquire attempt by the backends will cause a vacuum
 to cancel itself, which is undesirable.

Good point.

 If we define a new namespace for
 this lock (like the relation extension lock has its own namespace) then
 perhaps the cancellation code could be made to not cancel on that
 condition.  But that too seems like a lot of work to backpatch.

We could possibly teach the autocancel logic to distinguish this lock type
from others without using a new namespace.  That seems a bit klugy, but
maybe better than adding a new namespace.  (For example, there are
probably only a couple of modes in which we take page-level locks at
present.  Choosing a currently unused, but self-exclusive, mode for taking
such a lock might serve.)
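
For concreteness, a minimal sketch of that idea (a hypothetical function, using
the existing ConditionalLockPage/UnlockPage calls on the GIN metapage and
ShareUpdateExclusiveLock as the self-exclusive mode; this is not the actual
GIN code):

/* Hypothetical sketch only -- not the actual GIN code.
 * Needs access/gin_private.h and storage/lmgr.h. */
static void
gin_try_pending_cleanup(Relation index)
{
    /*
     * Take a heavyweight page lock on the metapage in a self-exclusive
     * mode, so concurrent cleanups exclude each other; ordinary insertion
     * paths would use the conditional variant and never block behind a
     * running cleanup.
     */
    if (!ConditionalLockPage(index, GIN_METAPAGE_BLKNO, ShareUpdateExclusiveLock))
        return;     /* someone else is already cleaning up; just skip it */

    /* ... perform the pending-list cleanup here ... */

    UnlockPage(index, GIN_METAPAGE_BLKNO, ShareUpdateExclusiveLock);
}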

 Would we bother to back-patch a theoretical bug which there is no evidence
 is triggering in the field?

What's theoretical about it?  You seemed pretty sure that the issue in
http://www.postgresql.org/message-id/flat/CA+bfosGVGVQhMAa=0-mue6coo7dbsgayxb-xsnr5vm-s39h...@mail.gmail.com
was exactly this.

 If we want to improve the current behavior rather than fix a bug, then I
 think that if the list is greater than threshold*2 and the cleaning lock is
 unavailable, what it should do is proceed to insert the tuple's keys into
 the index itself, as if fastupdate = off.  That would require some major
 surgery to the existing code, as by the time it invokes the clean up, it is
 too late to not insert into the pending list.

Meh.  That's introducing a whole new behavioral regime, which quite aside
from correctness bugs might introduce new performance bugs of its own.
It certainly doesn't sound like a better back-patch candidate than the
other thing.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Bruce Momjian
On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote:
 Well, I have had many such discussions with XC/XL folks, and that was my
 opinion.  I have seen almost no public discussion about this because the
 idea had almost no chance of success.  If it was possible, someone would
 have already suggested it on this list.
 
 
 Or perhaps people invested in this area had other obligations or lacked
 motivation and/or time to work to push up for things in core. That's not
 possible to know, and what is done is done.

Well, I have talked to everyone privately about this, and concluded that
while horizontal scalability/sharding is useful, it is unlikely that the
code volume of something like XC or XL would be accepted into the
community, and frankly, now that we have FDWs, it is hard to imagine why
we would _not_ go in the FDW direction.  

Of course, people have concerns, and FDWs might need to be improved, but
it is something worth researching.  We might find out FDWs can't be used
at all, and that we have to either add much more code to Postgres to do
sharding, do something like pg_shard, or not implement built-in sharding
at all, but at least it is time to research this.

 OK, I will send you a separate email and you can then supply their email
 addresses.
 
 
 FWIW, I would be interested in that as well. I worked in this area of things
 for a couple of years, too.

OK, I will send you an email.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] icc vs. gcc-style asm blocks ... maybe the twain can meet?

2015-08-30 Thread Tom Lane
I wrote:
 I came across some websites suggesting that icc will handle gcc-style
 asm blocks as long as you give it the -fasm-blocks command line option.
 It would be awfully nice to get rid of the __INTEL_COMPILER special
 cases in s_lock.h and the atomics headers --- would someone who has
 icc at hand check into this theory?

Hmm ... wait a second.  The main collection of asm blocks in s_lock.h
believes that Intel's compiler will take gcc-style asm without any help:

#if defined(__GNUC__) || defined(__INTEL_COMPILER)

It has believed that since 2003.  There are just two stanzas in s_lock.h
that think icc needs its own implementation; one was introduced in 2005
and the other in 2014, and I'm betting both of them are confused about it.
The other places where __INTEL_COMPILER is used to exclude an asm block
are also of relatively recent vintage.  I'm suspecting that they are
cargo-cult programming rather than actually necessary special cases.

It's possible that these extra implementations are worth the trouble to
carry because icc is smarter about those special intrinsics than it is
about asm blocks.  However, unless someone can point to some solid
evidence of that, I think we should get rid of 'em.  That code is quite
enough of an #ifdef snake's nest without carrying versions we don't
demonstrably need.
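
(For reference, the kind of gcc-style asm block at issue looks roughly like
this -- an illustrative x86 test-and-set, not the exact s_lock.h code:)

static __inline__ int
tas_example(volatile unsigned char *lock)
{
    unsigned char res = 1;

    /* atomically exchange res with *lock; the old value tells us whether
     * the lock was already held */
    __asm__ __volatile__(
        "lock; xchgb %0,%1"
        : "+q"(res), "+m"(*lock)
        :
        : "memory", "cc");
    return (int) res;       /* nonzero: the lock was already taken */
}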

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PROPOSAL] Table Partition

2015-08-30 Thread My Life
Hi, everyone! I'd like to propose a Postgres partition implementation. First, I
will show the design to everyone and talk about it. If we think the design is
reasonable and can be committed to the PostgreSQL baseline, then I will post
the code to the community.
(Note: my English is not very good.)

Table Partition Design
======================
In this design, partitions are normal tables in inheritance hierarchies, with
the same table structure as the partitioned table.

In pg_class we have an additional relpartition field which has the following values:
    's'    /* single regular table */
    'r'    /* partitioned table by range */
    'l'    /* partitioned table by list */
    'h'    /* partitioned table by hash */
    'c'    /* child partition table */

Add a new system schema named 'pg_partition', just like 'pg_toast'; in it we can
create the partition catalog table to store the partition entries. Let's assume
the partition catalog's name is pg_partition_2586 (2586 is the partitioned
table's OID in pg_class).
A range or interval partition catalog's structure is as follows:

    column     data type                      comment
    --------   ----------------------------   -------------------------------------------
    partname   name                           a partition's name, this is the primary key
    partid     oid                            a partition's OID in pg_class
    interval   text                           an interval partition's interval (maybe an expression)
    partkey1   depends on partitioned table
    ...
    partkeyN   depends on partitioned table

partkey1, ..., partkeyN is a partition's upper bound.
Finally, make a unique constraint on partkey1, ..., partkeyN.
Every time we create a new partition, we insert a new tuple into this partition 
catalog.
Every time we drop an old partition, we delete the related tuple in this 
partition catalog.
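
As an illustration only (the real catalog would be created internally by the
backend; the names follow the example above and the key column type is assumed
to be integer here), such a catalog could look like:

CREATE TABLE pg_partition.pg_partition_2586 (
    partname   name PRIMARY KEY,   -- the partition's name
    partid     oid,                -- the partition's OID in pg_class
    "interval" text,               -- an interval partition's interval (maybe an expression)
    partkey1   integer             -- type depends on the partitioned table's key column
);
CREATE UNIQUE INDEX ON pg_partition.pg_partition_2586 (partkey1);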

For a partitioned table's CREATE action, we should transform the action into 
the CREATE action of partitioned table and partitions, and the INSERT action 
into the partition catalog.

For the INSERT action, we implement a RelationGetTuplePartid method, which can find
the partition the tuple belongs to (it does an index scan on the partition
catalog table, assume it is pg_partition_2586, to find the partition), and an
ExecGetPartitionResultRel method, which can return the partition's
ResultRelInfo to execute the INSERT action.
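
(Conceptually, and using the hypothetical catalog sketched above, the index scan
done by RelationGetTuplePartid is equivalent to something like the following;
the exact bound semantics are illustrative only:)

-- find the partition whose upper bound is the smallest one at or above the key
SELECT partid
  FROM pg_partition.pg_partition_2586
 WHERE partkey1 >= $1      -- $1 = the tuple's partition key value
 ORDER BY partkey1
 LIMIT 1;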

For a partitioned table's scan action and JOIN action, we implemented a plan
node named 'PartitionExpand'. The plan node can expand the partitioned table
scan node into a list of partition scan nodes according to the filters and
conditions, and it can expand a partitioned table JOIN node into a list of
partition JOIN nodes wisely.
We implemented a DynamicPrunePartition method, which can expand the partitioned
table's scan node into a list of partition scan nodes.
We implemented a DynamicPrunePartitionJoin method, which can expand the
partitioned table's JOIN node into a list of partition JOIN nodes.
These expand actions happen in the ExecInitPartitionExpand function, when the
executor is initialized, and all of these actions are implemented based on the
partition catalog.

For UPDATE and DELETE actions, we just set the real partition as the ResultRelInfo
when ExecPartitionExpand is running.

For the pg_dump backup action, we should dump the partition catalog and the
relpartition field in pg_class.

These are the main points of the design; I can show any detail you
wonder about later.

Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Bruce Momjian
On Sun, Aug 30, 2015 at 03:31:10PM +0100, Simon Riggs wrote:
 On 30 August 2015 at 03:17, Bruce Momjian br...@momjian.us wrote:
 
 I have recently increased my public statements about the idea of adding
 horizontal scaling/sharding to Postgres.

 Glad to see it. Many people have been pushing such things for years, so it is
 good to finally see some debate about this on Hackers.

Agreed.  Right now, in our community, we are only seeing users who are
happy with what Postgres offers but think they might need massive
horizontal scalability in the future.  I think there is a larger group
that cares about massive horizontal scalability, but those people are
using other software right now, so we don't see them yet.

Without a roadmap for built-in massive horizontal scalability, I think
Postgres adoption will eventually suffer.

 I wanted to share with hackers
 a timeline of how we got here, and where I think we are going in the
 short term:
 
 2012-2013:  As part of writing my scaling talk
 (http://momjian.us/main/presentations/overview.html#scaling), studying
 Oracle RAC, and talking to users, it became clear that an XC-like
 architecture (sharding) was the only architecture that was going to allow
 for write scaling.
 
 
 What other architectures were discussed? Where was that discussion?

That was mostly my conclusion.  I explained it to small groups at
conferences and Postgres user groups.  No one said I was wrong, but that
is about the level of debate I had.

 2014:  I started to shop around the idea that we could use FDWs,
 parallelism, and a transaction/snapshot manager to get XC features
 as built-in to Postgres.  (I don't remember where the original idea
 came from.)  It was clear that having separate forks of the source code
 in XC and XL was never going to achieve critical mass --- there just
  aren't enough people who need high write scale right now, and the fork
 maintenance overhead is a huge burden.
 
 
 I personally support the view that we should put scalability features into
 Postgres core, rather than run separate forks.

Good, I do think it is time, but as I stated above, there is limited
interest in our current community, so the tolerance for additional
community code to accomplish this is also limited.  This is the big
thing that had me excited about using FDWs --- FDW improvements can get
us closer to sharding without requiring community acceptance of
sharding-only features.

 I realized that we would never get community acceptance to dump the XC
 (or XL) code needed for sharding into community Postgres
 
 
 How or why did you realize that? There has never been any such discussion,
 AFAIK. Surely it can be possible to move required subsystems across?

Well, I have had many such discussions with XC/XL folks, and that was my
opinion.  I have seen almost no public discussion about this because the
idea had almost no chance of success.  If it was possible, someone would
have already suggested it on this list.

 , but with FDWs,
 we could add the features as _part_ of improving FDWs, which would benefit
 FDWs _and_ would be useful for sharding.  (We already see some of those
 FDW features in 9.5.)
 
 
 That is a huge presumption. Not discussed or technically analyzed in any way
 with the community.

True.  It seemed pretty obvious to me.

 October, 2014:  EDB and NTT started working together in the community
 to start improving FDWs as a basis for an FDW-based sharding solution.
 Many of the 9.5 FDW improvements that also benefit sharding were developed
 by a combined EDB/NTT team.  The features improved FDWs independent of
 sharding, so they didn't need community buy-in on sharding to get them
 accepted.
 
 June, 2015:  I attended the PGCon sharding unconference session and
 there was a huge discussion about where we should go with sharding.
 I think the big take-away was that most people liked the FDW approach,
 but had business/customer reasons for wanting to work on XC or XL because
 those would be production-ready faster.
 
 
 Cough, cough. You must surely be joking that most people liked the FDW
 approach? How did we measure the acceptance of this approach? 

Well, I didn't have my audience-meter with me at the time.  ;-)

The discussion was mostly in the hallway after the unconference session,
"Future of PostgreSQL shared-nothing cluster" by Konstantin Knizhnik,
Alexander Korotkov, and Oleg Bartunov.  Again, when I explained the
ability to use FDWs to get sharding into Postgres with minimal
additional code, no one said the idea was crazy, which I took as a big
thumbs-up!  When I asked why to continue with XC/XL, I was told those
were more mature and more customer-ready, which is true.  I will not
quote people from the hallway discussion for privacy reasons.

 What actually is the FDW approach? Since it's not been written down anywhere, or
 even 

Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Bruce Momjian
On Sun, Aug 30, 2015 at 10:08:06PM -0400, Bruce Momjian wrote:
 On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote:
  Well, I have had many such discussions with XC/XL folks, and that was my
  opinion.  I have seen almost no public discussion about this because the
  idea had almost no chance of success.  If it was possible, someone would
  have already suggested it on this list.
  
  
  Or perhaps people invested in this area had other obligations or lacked
  motivation and/or time to work to push up for things in core. That's not
  possible to know, and what is done is done.
 
 Well, I have talked to everyone privately about this, and concluded that
 while horizontal scalability/sharding is useful, it is unlikely that the
 code volume of something like XC or XL would be accepted into the
 community, and frankly, now that we have FDWs, it is hard to imagine why
 we would _not_ go in the FDW direction.  

Actually, there was hope that XC or XL would get popular enough that it
would justify adding their code into community Postgres, but that never
happened.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Michael Paquier
On Mon, Aug 31, 2015 at 11:48 AM, Bruce Momjian br...@momjian.us wrote:
 On Sun, Aug 30, 2015 at 10:08:06PM -0400, Bruce Momjian wrote:
 On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote:
  Well, I have had many such discussions with XC/XL folks, and that was 
  my
  opinion.  I have seen almost no public discussion about this because 
  the
  idea had almost no chance of success.  If it was possible, someone 
  would
  have already suggested it on this list.
 
 
  Or perhaps people invested in this area had other obligations or lacked
  motivation and/or time to work to push up for things in core. That's not
  possible to know, and what is done is done.

 Well, I have talked to everyone privately about this, and concluded that
 while horizontal scalability/sharding is useful, it is unlikely that the
 code volume of something like XC or XL would be accepted into the
 community, and frankly, now that we have FDWs, it is hard to imagine why
 we would _not_ go in the FDW direction.

 Actually, there was hope that XC or XL would get popular enough that it
 would justify adding their code into community Postgres, but that never
 happened.

Forks are bound to die without proper maintenance resources. Still,
for XC/XL, what does not help is the complication of the architecture
and SPOF management, particularly regarding the GTM, which was
something completely new and not well understood (there is a GTM
standby, but this model is weak IMO, does not scale the way standbys
do, and impacts the overall performance of the cluster).
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] icc vs. gcc-style asm blocks ... maybe the twain can meet?

2015-08-30 Thread Tom Lane
I came across some websites suggesting that icc will handle gcc-style
asm blocks as long as you give it the -fasm-blocks command line option.
It would be awfully nice to get rid of the __INTEL_COMPILER special
cases in s_lock.h and the atomics headers --- would someone who has
icc at hand check into this theory?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Extended query protocol violation?

2015-08-30 Thread Tatsuo Ishii
While playing with a Java application using the JDBC driver, I noticed
an interesting fact:

When autocommit is off, the JDBC driver issues the following messages:

20:10:54.731 (2)  FE=> Parse(stmt=S_1,query=BEGIN,oids={})
20:10:54.731 (2)  FE=> Bind(stmt=S_1,portal=null)
20:10:54.731 (2)  FE=> Execute(portal=null,limit=0)
20:10:54.732 (2)  FE=> Parse(stmt=null,query=SELECT ...
20:10:54.733 (2)  FE=> Bind(stmt=null,portal=null)
20:10:54.733 (2)  FE=> Describe(portal=null)
20:10:54.733 (2)  FE=> Execute(portal=null,limit=0)
20:10:54.733 (2)  FE=> Sync
20:10:54.734 (2)  <=BE ParseComplete [S_1]
20:10:54.734 (2)  <=BE BindComplete [null]
20:10:54.734 (2)  <=BE CommandStatus(BEGIN)
20:10:54.735 (2)  <=BE ParseComplete [null]
20:10:54.735 (2)  <=BE BindComplete [null]
20:10:54.735 (2)  <=BE RowDescription(15)

Notice that the JDBC driver sends Parse, Bind and Execute without a following
Sync and then immediately sends another Parse message.  I wonder if
this violates our extended query protocol.  From the manual:

At completion of each series of extended-query messages, the frontend
should issue a Sync message.

"each series of extended-query messages" is a little vague here, but it
seems to me that it refers to a sequence of messages starting with a Parse and
ending with an Execute message. If so, I think the above example of a
message sequence violates the protocol.

The application code that produces this packet sequence is essentially:

connection.setAutoCommit(false);
stmt = connection.createStatement();
rs = stmt.executeQuery(sql);

Comments?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extended query protocol violation?

2015-08-30 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 Notice that the JDBC driver sends Parse, Bind and Execute without a following
 Sync and then immediately sends another Parse message.

 I wonder if this violates our extended query protocol.

It does not.

 At completion of each series of extended-query messages, the frontend
 should issue a Sync message.

That's a "should", not a "must".  The important point here is that if the
BEGIN were to fail for some reason, the backend would skip the second
command altogether, since it would skip to the Sync before resuming
processing messages.  If the JDBC driver isn't expecting that behavior,
that's a bug in the driver --- but it's not a protocol violation.  In
fact, the protocol is intentionally designed to let you stack up commands
like that.  It gives you a tradeoff of potential concurrency vs possible
complexity in error recovery handling.
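
For example, a hypothetical trace of the failure case (if the first Parse
fails, everything up to the Sync is skipped, and only then does the backend
report readiness again):

FE=> Parse(stmt=S_1,query=BOGUS)     -- fails to parse
FE=> Bind(stmt=S_1,portal=null)      -- skipped by the backend
FE=> Execute(portal=null,limit=0)    -- skipped
FE=> Parse/Bind/Describe/Execute     -- second command, also skipped
FE=> Sync
<=BE ErrorResponse(syntax error)
<=BE ReadyForQuery(idle)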

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] checkpointer continuous flushing

2015-08-30 Thread Amit Kapila
On Mon, Aug 24, 2015 at 12:45 PM, Fabien COELHO coe...@cri.ensmp.fr wrote:


 Also check the file:

   sh file ./avg.py
   ./avg.py: Python script, UTF-8 Unicode text executable


There were some CRLF line terminators; after removing those, it worked
fine. Here are the results of some of the tests done for the sorting patch
(checkpoint-continuous-flush-10-a):

Config Used
--
M/c details


IBM POWER-8 24 cores, 192 hardware threads
RAM = 492GB

Test details
--
warmup=60
scale=300
max_connections=150
shared_buffers=8GB
checkpoint_timeout=2min
time=7200
synchronous_commit=on
max_wal_size=5GB

parallelism - 128 clients, 128 threads

Sort - off
avg over 7200: 8256.382528 ± 6218.769282 [0.00, 76.05,
10975.50, 13105.95, 21729.00]
percent of values below 10.0: 19.5%

Sort - on
avg over 7200: 8375.930639 ± 6148.747366 [0.00, 84.00,
10946.00, 13084.00, 20289.90]
percent of values below 10.0: 18.6%

Before going to the conclusion, let me try to explain the above data (I am
explaining again even though Fabien already has, to make it clear
for anyone who has not read his mail).

Let's try to understand this with the data for the sort-off option:

avg over 7200: 8256.382528 ± 6218.769282

8256.382528 - average tps for 7200s pgbench run
6218.769282 - standard deviation on per second figures

[0.00, 76.05, 10975.50, 13105.95, 21729.00]

These 5 values can be read as the minimum TPS, q1, median TPS, q3, and
maximum TPS over the 7200s pgbench run.  As far as I understand, q1 and q3
are the quartiles (medians of the lower and upper halves), which I didn't
focus on much.

percent of values below 10.0: 19.5%

Above means percent of time the result is below 10 tps.

Now about the test results: these tests are done for pgbench full-speed runs,
and the above results indicate that there is approximately a 1.5%
improvement in avg. TPS ((8375.93 - 8256.38) / 8256.38 is about 1.45%), a
reduction of roughly 1 percentage point in the time spent below 10 tps
(19.5% -> 18.6%) with sorting on, and almost no improvement in median or
maximum TPS values; instead they are slightly lower when sorting is
on, which could be due to run-to-run variation.

I have done more tests as well by varying time and number of clients
keeping other configuration same as above, but the results are quite
similar.

The results of the sorting patch for the tests done indicate that the win is not
big enough with just doing sorting during checkpoints; we should consider
the flush patch along with sorting.  I would like to perform some tests with
both the patches together (sort + flush), unless somebody else thinks that the
sorting patch alone is beneficial and we should test some other kind of
scenarios to see its benefit.


 The reason for the tablespace balancing is that in the current postgres
buffers are written more or less randomly, so it is (probably) implicitly
and statistically balanced over tablespaces because of this randomness, and
indeed, AFAIK, people with multi tablespace setup have not complained that
postgres was using the disks sequentially.

 However, once the buffers are sorted per file, the order becomes
deterministic and there is no more implicit balancing, which means that if
someone has a pg setup with several disks it will write sequentially on
these instead of in parallel.


What if tablespaces are not on separate disks, or there is not enough hardware
support to make writes parallel?  I think for such cases it might be
better to do it sequentially.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Amit Kapila
On Sun, Aug 30, 2015 at 7:47 AM, Bruce Momjian br...@momjian.us wrote:

 I have recently increased my public statements about the idea of adding
 horizontal scaling/sharding to Postgres. I wanted to share with hackers
 a timeline of how we got here, and where I think we are going in the
 short term:

 2012-2013:  As part of writing my scaling talk
 (http://momjian.us/main/presentations/overview.html#scaling), studying
 Oracle RAC, and talking to users, it became clear that an XC-like
 architecture (sharding) was the only architecture that was going to allow
 for write scaling.


I think a sharding-like architecture is quite useful for certain kinds of
workloads where users can arrange queries and data layout in an optimized
way (and which I hope users might agree to change if required).  One thing to
consider here is what kind of scaling we are expecting in such a system, and
whether it is sufficient given that we will keep focussed on this architecture
for horizontal scalability.
Generally speaking, the scaling in such systems is limited by the number of
profitable partitions a user can create based on the data, and cross-partition
transactions then hurt performance/scalability in such systems.  I
understand that there is definitely a benefit in proceeding with a
sharding-like architecture, as there are already some PostgreSQL-based forks
which use such an architecture, so if we follow the same way we can save some
effort rather than inventing or following some other architecture; however,
there is no harm in discussing the pros and cons of some other architectures
like Oracle RAC, Google F1 or others.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-08-30 Thread Pavel Stehule
2015-08-30 10:34 GMT+02:00 Pavel Stehule pavel.steh...@gmail.com:



 2015-08-30 10:30 GMT+02:00 Shulgin, Oleksandr 
 oleksandr.shul...@zalando.de:

 On Aug 29, 2015 7:31 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 
 
 
  2015-08-29 18:36 GMT+02:00 Andres Freund and...@anarazel.de:
 
  On 2015-08-29 18:27:59 +0200, Pavel Stehule wrote:
   2015-08-29 18:25 GMT+02:00 Shulgin, Oleksandr 
 oleksandr.shul...@zalando.de
Good point.  There's still hope to set a flag and process it later
 on.
Will have to check if it's possible to stay in the scope of a
 loaded module
though.
 
   I had a workable prototype - and It was implemented very similar as
   handling CANCEL
 
  Where did you put the handling of that kind of interrupt? Directly into
  ProcessInterrupts()?
 
 
  Probably. I don't remember it well, but it need hack code - it cannot
 be used from extension.

 Do you still have the code somewhere around? Did it see production use?

 I sent it to the mailing list a year ago

http://www.postgresql.org/message-id/cafj8praxcs9b8abgim-zauvggqdhpzoarz5ysp1_nhv9hp8...@mail.gmail.com

Regards

Pavel



 I am not sure I am able to find it - I'll try. We didn't use it on
 production.


 Thanks!
 --
 Alex





Re: [HACKERS] Horizontal scalability/sharding

2015-08-30 Thread Ashutosh Bapat


  At PGCon we agreed to have such a meeting in Vienna at least. But I think we
  should be prepared and try to clean up all our issues before. It looks like we
  are already out of time, but probably we could meet in Hong Kong?

 Honestly, I still don't know which approach is better, we already played
 with XL (ported to 9.4)  and identified some very strong issues with
 inconsistency, which scared us, especially taking into account how easy we
 found them. XC people have fixed them, but I'm not sure if they were
 fundamental and if we could construct more sophisticated tests and find
 more issues in XC/XL. We are also a bit disappointed by Huawei's position about
 the CSN patch, which we hoped to use for our XTM.  The FDW approach has been actively
 criticized by pg_shard people and that's also made me a bit suspicious.  It
 looks like  we are doomed to continue several development forks, so we
 decided to work on very important common project, XTM, which we hoped could
 be accepted by all parties and eventually committed to 9.6.  Now I see we
 were right, unfortunately.


Distributed transaction manager should support at least three things
1. Atomic commit
2. Atomic visibility
3. Consistent snapshots (e.g. required for repeatable reads and higher
isolation levels).
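
(For illustration only: atomic commit across nodes is essentially distributed
two-phase commit, which plain SQL can already express; the table name and
transaction identifier below are hypothetical, and this is not the API of any
patch discussed here.  It also assumes max_prepared_transactions > 0 on each
node.)

-- on node A:
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- hypothetical table
PREPARE TRANSACTION 'gtx_42';
-- on node B:
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
PREPARE TRANSACTION 'gtx_42';
-- once every node has prepared, the coordinator commits (or aborts) everywhere:
COMMIT PREPARED 'gtx_42';   -- or ROLLBACK PREPARED 'gtx_42' on failure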

I have submitted a patch implementing the first of these for FDWs. The patch adds
infrastructure to be used by all FDWs including postgres_fdw. It also adds
postgres_fdw code to use this infrastructure. The same can be used to
achieve atomic commit in postgres_fdw based sharding. Please see if XTM can
benefit from it. If there are things that are required by XTM, please post
the requirements on that thread and I will work on those. You can find the
latest patch at
http://www.postgresql.org/message-id/CAFjFpRfANWL53+x2HdM9TCNe5pup=opkqssj-kgfr-d2efj...@mail.gmail.com



 Again, could we organize meeting somewhere in September ?  US is not good
 for us, but other places should be ok. I want to have an agreement  at
 least on XTM. We still are testing various approaches, though. We could
 present results of our experiments and are open to discussion. It's not
 easy project, but it's something we could do for 9.6.

 I'm very glad Bruce started this discussion in -hackers, since it's silly
 to me to participate in both threads :)  Let's meet in September !



 --
 Simon Riggshttp://www.2ndQuadrant.com/
 http://www.2ndquadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services





-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] [PATCH] SQL function to report log message

2015-08-30 Thread dinesh kumar
On Sun, Aug 30, 2015 at 4:52 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 Hi

 I am starting to work on a review of this patch

 Hi Pavel,

Thanks for your review.


 2015-07-13 9:54 GMT+02:00 dinesh kumar dineshkuma...@gmail.com:

 Hi All,

 Greetings for the day.

 Would like to discuss on below feature here.

 Feature:
 Having an SQL function, to write messages to log destination.

 Justification:
 As of now, we don't have an SQL function to write custom/application
 messages to log destination. We have RAISE clause which is controlled by
 log_ parameters. If we have an SQL function which works irrespective of
  log settings, that would be good for many log parsers. What I mean is, in
 DBA point of view, if we route all our native OS stats to log files in a
 proper format, then we can have our log reporting tools to give most
 effective reports. Also, Applications can log their own messages to
 postgres log files, which can be monitored by DBAs too.

 Implementation:
  Implemented a new function pg_report_log which takes one argument
  as text, and returns void. I used the LOG prefix for all the reporting
  messages. I wasn't sure whether to go with a new prefix for this, since these
  are normal LOG messages. Let me know if I am wrong here.

 Here is the attached patch.


  This patch is not complex, but the implementation doesn't cover
  ereport well.

  Although this functionality could be provided by a custom function in any
  PL (now or in the near future), I am not against having this function in core.
  There are a lot of companies with strong resistance to stored procedures
  - and sometimes this functionality can help with SQL debugging.

 Issues:

  1. It supports only the MESSAGE field of the ereport - I expect support for
  all fields: HINT, DETAIL, ...
  2. Missing regression tests
  3. the parsing of the ereport level should be a public function shared with
  PLpgSQL and other PLs
  4. should hidestmt be a mandatory parameter?
  5. the function declaration is strange

 postgres=# \sf pg_report_log (text, anyelement, boolean)
 CREATE OR REPLACE FUNCTION pg_catalog.pg_report_log(text, anyelement,
 boolean)
  RETURNS void
  LANGUAGE sql
  STABLE STRICT COST 1
 AS $function$SELECT pg_report_log($1::pg_catalog.text,
 $2::pg_catalog.text, $3::boolean)$function$

 Why polymorphic? It is useless on any modern release

 postgres=# \sf pg_report_log (text, text, boolean)
 CREATE OR REPLACE FUNCTION pg_catalog.pg_report_log(text, text, boolean)
  RETURNS void
  LANGUAGE internal
  IMMUTABLE STRICT
 AS $function$pg_report_log$function$

 Why stable, why immutable? This function should be volatile.

  6. using the elog level enum as an errcode is the wrong idea - errcodes are
  defined in the table at
  http://www.postgresql.org/docs/9.4/static/errcodes-appendix.html
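
(For reference, a minimal PL/pgSQL sketch of the "custom function in any PL"
alternative mentioned above; the function name and exact behavior here are
purely illustrative:)

CREATE FUNCTION report_log(msg text) RETURNS void AS $$
BEGIN
    RAISE LOG '%', msg;   -- goes to the server log, subject to log_min_messages
END;
$$ LANGUAGE plpgsql VOLATILE;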


Let me go through each concern, and I will update you on this.

Regards,
Dinesh
manojadinesh.blogspot.com


 Regards

 Pavel



 Regards,
 Dinesh
 manojadinesh.blogspot.com