Re: [HACKERS] Incompatible trig error handling

2015-06-12 Thread Noah Misch
On Wed, Apr 29, 2015 at 05:11:48PM -0700, Tom Lane wrote:
> John Gorman  writes:
> > Two of the trigonometry functions have differing error condition behavior
> > between Linux and OSX. The Linux behavior follows the standard set by the
> > other trig functions.
> 
> We have never considered it part of Postgres' charter to try to hide
> platform-specific variations in floating-point behavior.  If we did,
> we'd spend all our time doing that rather than more productive stuff.
> 
> In particular, it appears to me that both of these behaviors are allowed
> per the POSIX standard, which makes it very questionable why we should
> insist that one is correct and the other is not.
> 
> In addition, the proposed patch turns *all* cases that return NaN into
> errors, which is wrong at least for the case where the input is NaN.

OS X is a MATH_ERREXCEPT, !MATH_ERRNO platform.  PostgreSQL wrongly assumes
that all platforms are MATH_ERRNO platforms.  The correct fix is to use
fetestexcept() on !MATH_ERRNO platforms.
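
For illustration, here is a minimal sketch of that fix (acos_checked() is a
hypothetical helper, not the actual patch): consult C99's math_errhandling
to decide which error-reporting channel the platform's libm uses.

#include <errno.h>
#include <fenv.h>
#include <math.h>

static double
acos_checked(double arg, int *domain_error)
{
	double		result;

	errno = 0;
	feclearexcept(FE_ALL_EXCEPT);

	result = acos(arg);

	if (math_errhandling & MATH_ERRNO)
		*domain_error = (errno == EDOM);	/* e.g. Linux */
	else
		*domain_error = (fetestexcept(FE_INVALID) != 0);	/* e.g. OS X */

	return result;
}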




Re: [HACKERS] Time to fully remove heap_formtuple() and friends?

2015-06-12 Thread Peter Geoghegan
On Fri, Jun 12, 2015 at 8:47 PM, Tom Lane  wrote:
> Peter Geoghegan  writes:
>> Attached patch actually removes heap_formtuple() and friends. Does
>> this seem worthwhile?
>
> Seems reasonable, but at this point I would say this is 9.6 material;
> third-party-module authors have enough to do with the API breaks we've
> already created for 9.5.  Please enter this in commitfest-next.

Fair enough. I have done so.

-- 
Peter Geoghegan




Re: [HACKERS] Collection of memory leaks for ECPG driver

2015-06-12 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Jun 12, 2015 at 10:01 PM, Michael Meskes  
> wrote:
>> More seriously, though, does anyone know of any platform where free(NULL) is 
>> *not* a noop?

> I recall reading that some past versions of SunOS crashed, but it is
> rather old...

Yeah, SunOS 4.x had issues here, but it's long dead.  More to the point,
both C89 and Single Unix Spec v2 clearly say that free(NULL) is a no-op;
and it's been many years since we agreed that we had no interest in
supporting platforms that didn't meet at least those minimum standards.
So there is no need to worry about any code of ours that does free(NULL).

But having said that, I would not be in a hurry to remove any existing
if-guards for the case.  For one thing, it makes the code look more
similar to backend code that uses palloc/pfree, where we do *not* allow
pfree(NULL).  There's also the point that changing longstanding code
creates back-patching hazards, so unless there's a clear gain it's best
not to.
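
For illustration, a hedged sketch of the idiom in question (cleanup() is a
hypothetical frontend routine):

#include <stdlib.h>

static void
cleanup(char **buf)
{
	/*
	 * The guard is redundant for free() -- C89 and SUSv2 guarantee that
	 * free(NULL) is a no-op -- but it mirrors backend code, where
	 * pfree(NULL) is not allowed.
	 */
	if (*buf != NULL)
		free(*buf);
	*buf = NULL;
}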

regards, tom lane




Re: [HACKERS] Time to fully remove heap_formtuple() and friends?

2015-06-12 Thread Tom Lane
Peter Geoghegan  writes:
> Attached patch actually removes heap_formtuple() and friends. Does
> this seem worthwhile?

Seems reasonable, but at this point I would say this is 9.6 material;
third-party-module authors have enough to do with the API breaks we've
already created for 9.5.  Please enter this in commitfest-next.

regards, tom lane




Re: [HACKERS] Collection of memory leaks for ECPG driver

2015-06-12 Thread Michael Paquier
On Fri, Jun 12, 2015 at 10:01 PM, Michael Meskes  wrote:
> On Mon, Jun 08, 2015 at 10:50:25PM +0900, Michael Paquier wrote:
> Right, that's because they were developed before free received the safeguard, 
> or the programmer simply didn't know at that point in time. Unless I'm 
> mistaken we have other code that only calls free() without an additional 
> safeguard, so why should ecpglib be special? If you don't like it using both 
> approaches, feel free to remove the check in the other case. :)

Well, I can send patches...

> More seriously, though, does anyone know of any platform where free(NULL) is 
> *not* a noop?

I recall reading that some past versions of SunOS crashed, but it is
rather old...
-- 
Michael




Re: [HACKERS] 9.5 release notes

2015-06-12 Thread Amit Kapila
On Sat, Jun 13, 2015 at 7:47 AM, Bruce Momjian  wrote:
>
> On Thu, Jun 11, 2015 at 09:02:35PM +0200, Magnus Hagander wrote:
> > On Thu, Jun 11, 2015 at 8:56 PM, Josh Berkus  wrote:
> >
> > On 06/10/2015 09:50 PM, Amit Kapila wrote:
> > > Also, shall we mention the below in the "Migration to 9.5" section?
> > >
> > > "pg_basebackup will not work in tar mode against 9.4 and older
> > > servers, as we have introduced a new protocol option in that mode."
> >
> > AFAIK, pg_basebackup has never worked across versions.  So there's no
> > reason for this note.
> >
> >
> > It has. The resulting backup has not been usable cross version, but
> > pg_basebackup itself has. Though not always, and I'm not sure we've ever
> > claimed it was supported, but it has worked.
>
> So we should still mention it in the release notes?
>

If it has never led to a usable backup when run cross-version, then I
think there is no pressing need to mention it.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] 9.5 release notes

2015-06-12 Thread Bruce Momjian
On Fri, Jun 12, 2015 at 02:47:22PM +0200, Petr Jelinek wrote:
> Hi,
> 
> +  
> +   
> +Add JSONB functions jsonb_set() and
> +jsonb_pretty (Dmitry Dolgov, Andrew Dunstan)
> +   
> +  
> 
> I believe I should be 3rd author for this one as I rewrote large
> parts of this functionality as part of the review.

Added.  Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] 9.5 release notes

2015-06-12 Thread Bruce Momjian
On Fri, Jun 12, 2015 at 12:49:11PM +0900, Fujii Masao wrote:
> On Thu, Jun 11, 2015 at 1:15 PM, Bruce Momjian  wrote:
> > I have committed the first draft of the 9.5 release notes.  You can view
> > the output here:
> >
> > http://momjian.us/pgsql_docs/release-9-5.html
> >
> > and it will eventually appear here:
> >
> > http://www.postgresql.org/docs/devel/static/release.html
> 
> I found some minor issues.
> 
> e.g. IDENTIFY_COMMAND, are not logged, even when
> log_statements is set to all.
> 
> Typos.
> s/IDENTIFY_COMMAND/IDENTIFY_SYSTEM
> s/log_statements/log_statement

Updated.

> RETURN WHERE
>
> 
> Looks like garbage.

It is actually a question;  I have reworded it.

>Add VERBOSE option to REINDEX (Fujii Masao)
> 
> Could you change the author name to Sawada Masahiko?
> He is the author of the feature.

Thanks, done.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] 9.5 release notes

2015-06-12 Thread Bruce Momjian
On Thu, Jun 11, 2015 at 05:17:54PM -0400, Robert Haas wrote:
> On Thu, Jun 11, 2015 at 4:23 PM, Peter Geoghegan  wrote:
> > Secondly, Robert didn't credit himself as an author in his commit
> > message for the abbreviated keys infrastructure + text opclass support
> > *at all*. However, I think that Robert should be listed as a secondary
> > author of the abbreviated keys infrastructure, and that he would agree
> > that I am clearly the primary author. Andrew Gierth did work on the
> > datum case for sortsupport + abbreviation, so I agree he should be
> > listed as a secondary author of the infrastructure too, after Robert.
> 
> I'd probably say Peter, Andrew, me.

Order changed, thanks.  This entry was a consolidation of several
commits so the proper order wasn't clear to me.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] 9.5 release notes

2015-06-12 Thread Bruce Momjian
On Thu, Jun 11, 2015 at 09:02:35PM +0200, Magnus Hagander wrote:
> On Thu, Jun 11, 2015 at 8:56 PM, Josh Berkus  wrote:
> 
> On 06/10/2015 09:50 PM, Amit Kapila wrote:
> > Also, shall we mention the below in the "Migration to 9.5" section?
> >
> > "pg_basebackup will not work in tar mode against 9.4 and older
> > servers, as we have introduced a new protocol option in that mode."
> 
> AFAIK, pg_basebackup has never worked across versions.  So there's no
> reason for this note.
> 
> 
> It has. The resulting backup has not been usable cross version, but
> pg_basebackup itself has. Though not always, and I'm not sure we've ever
> claimed it was supported, but it has worked. 

So we should still mention it in the release notes?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] 9.5 release notes

2015-06-12 Thread Bruce Momjian
On Thu, Jun 11, 2015 at 01:31:01PM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > On Thu, Jun 11, 2015 at 11:32 AM, Bruce Momjian  wrote:
> >> Improve hash creation and lookup performance (Tomas Vondra,
> >> Teodor Sigaev, Tom Lane, Robert Haas)
> 
> > I suggest having two separate items.  One of those is about the Hash
> > executor node and the other is about our dynahash stuff; they're
> > completely different code bases.
> 
> As far as 4a14f13a0 goes, I would think that ought to be mentioned under
> "Source Code" since it's a change in a rather widely used API.  I doubt
> that the performance aspect of it is really all that exciting to end
> users, but third-party modules calling the dynahash code would care.
> The hash join changes are a completely different thing.

Applied patch attached.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
commit 8bf51ad0cc26e80cbd082c111f45428db7a2f73b
Author: Bruce Momjian 
Date:   Fri Jun 12 22:16:08 2015 -0400

release notes:  split apart hash items

Report by Tom Lane, Robert Haas

diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
new file mode 100644
index 17301a4..283d061
*** a/doc/src/sgml/release-9.5.sgml
--- b/doc/src/sgml/release-9.5.sgml
***
*** 221,228 
  

 
! Improve hash creation and lookup performance (Tomas Vondra,
! Teodor Sigaev, Tom Lane, Robert Haas)
 

  
--- 221,227 
  

 
! Improve in-memory hash performance (Tomas Vondra, Robert Haas)
 

  
***
*** 1795,1800 
--- 1794,1805 
 

  
+   
+
+ Improve dynahash capabilities (Teodor Sigaev, Tom Lane)
+
+   
+ 

 
  Improve parallel execution infrastructure (Robert Haas, Amit



Re: [HACKERS] Entities created in one query not available in another in extended protocol

2015-06-12 Thread Simon Riggs
On 12 June 2015 at 20:06, Tom Lane  wrote:

> Simon Riggs  writes:
> > On 11 June 2015 at 22:12, Shay Rojansky  wrote:
> >> Just in case it's interesting to you... The reason we implemented things
> >> this way is in order to avoid a deadlock situation - if we send two
> >> queries as P1/D1/B1/E1/P2/D2/B2/E2, and the first query has a large
> >> resultset, PostgreSQL may block writing the resultset, since Npgsql isn't
> >> reading it at that point. Npgsql on its part may get stuck writing the
> >> second query (if it's big enough) since PostgreSQL isn't reading on its
> >> end (thanks to Emil Lenngren for pointing this out originally).
>
> > That part does sound like a problem that we have no good answer to.
> > Sounds worth starting a new thread on that.
>
> I do not accept that the backend needs to deal with that; it's the
> responsibility of the client side to manage buffering properly if it is
> trying to overlap sending the next query with receipt of data from a
> previous one.  See commit 2a3f6e368 for a related issue in libpq.
>

Then it's our responsibility to define what "manage buffering properly"
means and document it.

People should be able to talk to us without risk of deadlock.

-- 
Simon Riggs    http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] could not truncate directory "pg_subtrans": apparent wraparound

2015-06-12 Thread Thomas Munro
Hi

Since the multixact equivalent of this problem[1] fell through the
cracks on the multixact mega-thread, here is an updated patch that
addresses this problem for both pg_subtrans and pg_multixact/offsets
using the same approach: always step back one multixact/xid (rather
than doing so only if oldest == next, which seemed like an unnecessary
complication, and a bit futile since the result of such a test is only
an instantaneous snapshot).  I've added this to the commitfest[2].  I
am also attaching a new set of repro scripts including a pair to test
the case where next multixact/xid == first valid ID (the scripts with
'wraparound' in the name, which use dirty pg_resetxlog tricks to get
into that situation).  In my previous patch I naively subtracted one,
which didn't work for those (even rarer!) cases.  The new patch steps
over the special ID values.
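
To make that concrete, here is a standalone sketch of "step back one,
skipping the special IDs" (simplified types, not the patch itself; the
backend's TransactionIdRetreat() macro in access/transam.h behaves the
same way):

#include <stdint.h>

typedef uint32_t TransactionId;

/* xids 0..2 (invalid, bootstrap, frozen) are never issued to real
 * transactions, so they must be stepped over when wrapping around. */
#define FirstNormalTransactionId	((TransactionId) 3)

static TransactionId
retreat_one_xid(TransactionId xid)
{
	do
	{
		xid--;					/* unsigned wraparound is well defined */
	} while (xid < FirstNormalTransactionId);

	return xid;
}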

This is a low priority bug: it just produces low probability bogus
(but perhaps alarming) LOG messages and skips truncation during
checkpoints on low activity systems.  There have been occasional
reports of these pg_subtrans messages going back as far as 2007 (and
Alvaro was barking up the correct tree[3] back in 2010), so I figured
it was worth following up.

I also took a look at the pg_clog and pg_commit_ts truncation
functions.  You could argue that they have the same problem in theory
(they pass a page number derived from the oldest xid to
SimpleLruTruncate, and maybe there is a way for that to be an xid that
hasn't been issued yet), but in practice I don't think it's a
reachable condition.  They use the frozen xid that is updated by
vacuuming, but vacuum itself advances the next xid counter in the
process.  Is there a path through the vacuum code that ever exposes
frozen xid == next xid?  In contrast, for pg_subtrans we use
GetOldestXmin(), which is equal to the next xid if there are no
running transactions, and for pg_multixact we use the oldest
multixact, which can be equal to the next multixact ID after a
wraparound vacuum because vacuum itself doesn't always consume
multixacts.

[1] 
http://www.postgresql.org/message-id/CAEepm=0DqAtnM=23oq44bbnwvn3g6+dxx+s5g4jrbp-vy8g...@mail.gmail.com
[2] https://commitfest.postgresql.org/5/265/
[3] http://www.postgresql.org/message-id/1274373980-sup-3...@alvh.no-ip.org

-- 
Thomas Munro
http://www.enterprisedb.com


repro-bogus-multixact-error.sh
Description: Bourne shell script


repro-bogus-subtrans-error.sh
Description: Bourne shell script


repro-bogus-multixact-error-wraparound.sh
Description: Bourne shell script


repro-bogus-subtrans-error-wraparound.sh
Description: Bourne shell script


fix-bogus-truncation-errors.patch
Description: Binary data



Re: [HACKERS] 9.5 release notes

2015-06-12 Thread Bruce Momjian
On Thu, Jun 11, 2015 at 01:27:23PM -0400, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Thu, Jun 11, 2015 at 05:16:07PM +1200, David Rowley wrote:
> >> Would you also be able to mention something about f15821e and d222585?
> 
> > I am going to defer to Tom on that.  I have added optimizer changes in
> > the past but he didn't feel it made sense unless there was some
> > user-visible change.
> 
> I'd be inclined to document both of those.  We mentioned outer join
> removal when it was first added, in 9.0, so making a significant
> improvement in it seems worthy of mention also.  Both of these things
> are "user visible" to the extent that they affect EXPLAIN output.
> 
> I'm not sure whether we need to document the semantic hazard that the
> second commit message worries about.

OK, I have added two items;  applied patch attached.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/release-9.5.sgml b/doc/src/sgml/release-9.5.sgml
new file mode 100644
index 98f2107..17301a4
*** a/doc/src/sgml/release-9.5.sgml
--- b/doc/src/sgml/release-9.5.sgml
***
*** 242,247 
--- 242,262 
  

 
+ Allow the optimizer to remove unnecessary references to left
+ outer join subqueries (David Rowley)
+
+   
+ 
+   
+
+ Allow pushdown of query restrictions into window functions, where appropriate
+ (David Rowley)
+
+   
+ 
+   
+
  Speed up CRC (cyclic redundancy check) computations
  (Abhijit Menon-Sen, Heikki Linnakangas)
 



Re: [HACKERS] Is it possible to have a "fast-write" Index?

2015-06-12 Thread deavid
So I just ran a test case for hash, btree, gin_btree and brin indexes, as
well as without indexes and without primary keys.
* Testing the "deliverynotes" table.
  - Definition and use case:
It is a table containing real delivery note headers from several years.
It consists of 300k rows, 128 columns, 63 indexes, and 243MB of data
excluding indexes. Since it is a table visible to users, almost every
column can be searched, so we need lots of indexes. We do not need
searches to be the fastest possible; we only need to accelerate user
searches a bit without harming writes too much.
  - Things to test:
- measure index creation times
- measure index space
- with indexes but without a primary key
- with everything
- create fully, delete everything, and insert the data again in blocks
- test updates of recent data

I attached the logs for every test, in case anyone wants to see exactly
what I'm testing.
This was tested on my i5 laptop with 4GB of RAM and a 120GB SSD (OCZ
Agility 3). I'm trying to measure CPU time, not I/O time, so some
configurations and tests are specifically designed to avoid as much I/O
as I can. I'm using a dev build of PostgreSQL 9.5 from git sources.

Conclusions:
- gin_btree seems slower in almost every case. Its writes are marginally
better than regular btrees even when using work_mem=160MB (maybe 20%
faster than btree). The indexes are smaller than I thought.
- BRIN indexes seem very fast for writes. For selects they are maybe a
blend between having indexes and not having them. They don't recognize
that some values are simply out of the range of indexed values, and
that's a pity. If the values we want are packed together, I guess I would
get even better results.
- Primary keys and uniqueness checks don't seem to make any difference
here.
- Having no indexes at all is faster than I imagined (sometimes it beats
BRIN or btree), maybe because the I/O here is faster than usual.
- Hash indexes: I tried to do something, but they take too much time to
build and I don't know why. If creates are slow, updates should be slow
too. I'm not going to test them again.

And finally, I don't know why, but I couldn't vacuum or analyze tables; it
always stalled without doing anything, so I had to comment out every
vacuum. Maybe there is a bug in this dev version, or I misconfigured
something.

On Fri, 12 Jun 2015 at 7:27, Simon Riggs ()
wrote:

> On 5 June 2015 at 18:07, deavid  wrote:
>
>> There are several use cases where I see an index as useful, but adding it
>> will slow down inserts and updates too much.
>> For example, when we have 10 million rows in a table with frequent
>> updates, we need several indexes to speed up selects, but then we'll slow
>> down updates a lot, especially when we have 10 or more indexes.
>> Other cases involve indexes for text search, which are used only for user
>> searches and aren't that important, so we want to have them, but we don't
>> want the overhead they impose whenever we write to the table.
>> I know different approaches that already solve some of those problems in
>> some ways (table partitioning, partial indexes, etc), but I don't feel they
>> are the solution to every problem of this kind.
>>
>> Some people have already asked for "delayed write" indexes, but the idea
>> gets discarded because the index could get out of sync, omitting results,
>> which is unacceptable. But I think maybe that could be fixed in several
>> ways so that we can have a fast and reliable index (but maybe not so fast
>> on selects).
>>
>
> This is exactly the use case and mechanism for BRIN indexes.
>
> --
> Simon Riggs    http://www.2ndQuadrant.com/
> 
>
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
\timing
Timing is on.
\pset pager always
Pager is always used.
DROP TABLE IF EXISTS deliverynotes CASCADE;
psql:testA_prepare_full_brin.sql:1: NOTICE:  table "deliverynotes" does not exist, skipping
DROP TABLE
Time: 0,353 ms
CREATE TEMP TABLE deliverynotes AS (SELECT * FROM albaranescli ORDER BY idalbaran);
SELECT 333694
Time: 632,625 ms
ALTER TABLE deliverynotes SET (fillfactor = 60);
ALTER TABLE
Time: 0,398 ms
ALTER TABLE deliverynotes ADD CONSTRAINT deliverynotes_pkey 
	PRIMARY KEY(idalbaran)  WITH (FILLFACTOR=60);
ALTER TABLE
Time: 954,238 ms
	
CLUSTER deliverynotes USING 	deliverynotes_pkey;
CLUSTER
Time: 1674,489 ms
CREATE INDEX deliverynotes_codagencia_m1_idx
  ON deliverynotes
  USING brin
  (codagencia );
CREATE INDEX
Time: 482,985 ms
CREATE INDEX deliverynotes_codagenciaup_m1_idx
  ON deliverynotes
  USING brin
  (upper(codagencia::text) );
CREATE INDEX
Time: 611,131 ms
CREATE INDEX deliverynotes_codagente2_m1_idx
  ON deliverynotes
  USING brin
  (codagente2 );
CREATE INDEX
Time: 165,700 ms
CREATE INDEX deliverynotes_codagente2up_m1_idx
  ON deliverynotes
  USING brin
  (upper(codagente2::text) );
CREATE INDEX
Time: 173,112 ms
CREATE INDEX deliverynotes_codagente3_m1_idx
  ON deliverynotes
  USING brin
  (codagente3 );
CREATE INDEX
T

[HACKERS] Time to fully remove heap_formtuple() and friends?

2015-06-12 Thread Peter Geoghegan
Commit 902d1cb3, made in 2008, established that the functions
heap_formtuple(), heap_modifytuple(), and heap_deformtuple() were
deprecated. The commit also actually removed those routines, replacing
them with simple wrappers around their real replacements, which are
spelled slightly differently and have a slightly different interface
(their real replacements are heap_deform_tuple() and friends). The
wrappers fulfilled the old, legacy interface, and were added for
compatibility with third party code -- the old char 'n'/' ' convention
for indicating nulls was bolted on top of calls to the new variants.
Bolting that on to the new variants involves a new palloc() + pfree(),
which isn't cheap.

Attached patch actually removes heap_formtuple() and friends. Does
this seem worthwhile? Does this seem like something that there should
be a compatibility release note item for if we proceed? I have not
added one yet.

I think that enough time has passed that these wrappers are clearly
100% deadwood. If any external extensions are still using
heap_formtuple(), they ought to be updated to work with Postgres 9.5
by using the new variants -- a straight switch-over of callers in the
style of 902d1cb3 should now work against all supported versions of
PostgreSQL, and without macro hacks. Affected external code building
against the removed legacy interface will reliably fail to build,
forcing the third party code to conform in a non-surprising way.
Removing the code seems very low risk.
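
For example, a straight switch-over of a caller looks like this (a
hypothetical fragment; tupdesc and values are assumed to be in scope):

/* Old API, char 'n'/' ' null convention (removed by this patch): */
char		nulls[2] = {' ', 'n'};		/* second column is NULL */
HeapTuple	before = heap_formtuple(tupdesc, values, nulls);

/* Replacement API, bool arrays; works on all supported versions: */
bool		isnull[2] = {false, true};
HeapTuple	after = heap_form_tuple(tupdesc, values, isnull);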

-- 
Peter Geoghegan
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 09aea79..045e0a7 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -777,39 +777,6 @@ heap_form_tuple(TupleDesc tupleDescriptor,
 }
 
 /*
- *		heap_formtuple
- *
- *		construct a tuple from the given values[] and nulls[] arrays
- *
- *		Null attributes are indicated by a 'n' in the appropriate byte
- *		of nulls[]. Non-null attributes are indicated by a ' ' (space).
- *
- * OLD API with char 'n'/' ' convention for indicating nulls.
- * This is deprecated and should not be used in new code, but we keep it
- * around for use by old add-on modules.
- */
-HeapTuple
-heap_formtuple(TupleDesc tupleDescriptor,
-			   Datum *values,
-			   char *nulls)
-{
-	HeapTuple	tuple;			/* return tuple */
-	int			numberOfAttributes = tupleDescriptor->natts;
-	bool	   *boolNulls = (bool *) palloc(numberOfAttributes * sizeof(bool));
-	int			i;
-
-	for (i = 0; i < numberOfAttributes; i++)
-		boolNulls[i] = (nulls[i] == 'n');
-
-	tuple = heap_form_tuple(tupleDescriptor, values, boolNulls);
-
-	pfree(boolNulls);
-
-	return tuple;
-}
-
-
-/*
  * heap_modify_tuple
  *		form a new tuple from an old tuple and a set of replacement values.
  *
@@ -880,44 +847,6 @@ heap_modify_tuple(HeapTuple tuple,
 }
 
 /*
- *		heap_modifytuple
- *
- *		forms a new tuple from an old tuple and a set of replacement values.
- *		returns a new palloc'ed tuple.
- *
- * OLD API with char 'n'/' ' convention for indicating nulls, and
- * char 'r'/' ' convention for indicating whether to replace columns.
- * This is deprecated and should not be used in new code, but we keep it
- * around for use by old add-on modules.
- */
-HeapTuple
-heap_modifytuple(HeapTuple tuple,
- TupleDesc tupleDesc,
- Datum *replValues,
- char *replNulls,
- char *replActions)
-{
-	HeapTuple	result;
-	int			numberOfAttributes = tupleDesc->natts;
-	bool	   *boolNulls = (bool *) palloc(numberOfAttributes * sizeof(bool));
-	bool	   *boolActions = (bool *) palloc(numberOfAttributes * sizeof(bool));
-	int			attnum;
-
-	for (attnum = 0; attnum < numberOfAttributes; attnum++)
-	{
-		boolNulls[attnum] = (replNulls[attnum] == 'n');
-		boolActions[attnum] = (replActions[attnum] == 'r');
-	}
-
-	result = heap_modify_tuple(tuple, tupleDesc, replValues, boolNulls, boolActions);
-
-	pfree(boolNulls);
-	pfree(boolActions);
-
-	return result;
-}
-
-/*
  * heap_deform_tuple
  *		Given a tuple, extract data into values/isnull arrays; this is
  *		the inverse of heap_form_tuple.
@@ -1025,46 +954,6 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 }
 
 /*
- *		heap_deformtuple
- *
- *		Given a tuple, extract data into values/nulls arrays; this is
- *		the inverse of heap_formtuple.
- *
- *		Storage for the values/nulls arrays is provided by the caller;
- *		it should be sized according to tupleDesc->natts not
- *		HeapTupleHeaderGetNatts(tuple->t_data).
- *
- *		Note that for pass-by-reference datatypes, the pointer placed
- *		in the Datum will point into the given tuple.
- *
- *		When all or most of a tuple's fields need to be extracted,
- *		this routine will be significantly quicker than a loop around
- *		heap_getattr; the loop will become O(N^2) as soon as any
- *		noncacheable attribute offsets are involved.
- *
- * OLD API with char 'n'/' ' convention for indicating nulls.
- * This is deprecated and should not be used in new code, but we keep it
- *

Re: [HACKERS] Further issues with jsonb semantics, documentation

2015-06-12 Thread Peter Geoghegan
On Fri, Jun 12, 2015 at 4:31 PM, Andrew Dunstan  wrote:
> OK, pushed, although you'd have to be trying really hard to break this.
> Still, it's reasonable to defend against.

I was trying really hard.   :-)

Thanks
-- 
Peter Geoghegan




Re: [HACKERS] Further issues with jsonb semantics, documentation

2015-06-12 Thread Andrew Dunstan


On 06/12/2015 06:16 PM, Peter Geoghegan wrote:

On Thu, Jun 4, 2015 at 5:43 PM, Peter Geoghegan  wrote:

BTW, there is a bug here -- strtol() needs additional defenses [1]
(before casting to int):

postgres=# select jsonb_set('[1, 2, 3, 4,
5,6,7,8,9,10,11,12,13,14,15,16,17,18]',
'{"9223372036854775806"}'::text[], '"Input unsanitized"', false) ;
 jsonb_set
--
  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, "Input
unsanitized", 18]
(1 row)

[1] 
https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer

I attach a fix for this bug. The commit message explains everything.




OK, pushed, although you'd have to be trying really hard to break this. 
Still, it's reasonable to defend against.


cheers

andrew




Re: [HACKERS] [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-12 Thread Steve Kehlet
Just wanted to report that I rolled back my VM to where it was with 9.4.2
installed and it wouldn't start. I installed 9.4.4 and now it starts up
just fine:

> 2015-06-12 16:05:58 PDT [6453]: [1-1] LOG:  database system was shut down
> at 2015-05-27 13:12:55 PDT
> 2015-06-12 16:05:58 PDT [6453]: [2-1] LOG:  MultiXact member wraparound
> protections are disabled because oldest checkpointed MultiXact 1 does not
> exist on disk
> 2015-06-12 16:05:58 PDT [6457]: [1-1] LOG:  autovacuum launcher started
> 2015-06-12 16:05:58 PDT [6452]: [1-1] LOG:  database system is ready to
> accept connections
>  done
> server started

And this is showing up in my serverlog periodically as the emergency
autovacuums are running:

> 2015-06-12 16:13:44 PDT [6454]: [1-1] LOG:  MultiXact member wraparound
> protections are disabled because oldest checkpointed MultiXact 1 does not
> exist on disk

**Thank you Robert and all involved for the resolution to this.**

> With the fixes introduced in this release, such a situation will result
> in immediate emergency autovacuuming until a correct oldestMultiXid value
> can be determined

Okay, I notice these vacuums are of the "to prevent wraparound" type (like
VACUUM FREEZE), that do hold locks preventing ALTER TABLEs and such. Good
to know, we'll plan our software updates accordingly.

Is there any risk until these autovacuums finish?


Re: [HACKERS] Entities created in one query not available in another in extended protocol

2015-06-12 Thread Sehrope Sarkuni
The JDBC driver tries to handle this by estimating how much data has been
buffered. It mainly comes up when executing batch INSERTs, as a large number
of statements may be sent to the backend prior to reading back any results.

There's a nice write up of the potential deadlock and the driver's logic to
avoid it here:

https://github.com/pgjdbc/pgjdbc/blob/7c0655b3683efa38cbe0d029385d8889f6392f98/org/postgresql/core/v3/QueryExecutorImpl.java#L300
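
In outline, the strategy looks something like this (a C sketch under stated
assumptions: every type, helper, and the 64kB bound below are hypothetical,
not real pgjdbc or libpq API):

#include <stddef.h>

typedef struct Conn Conn;		/* opaque connection handle */
typedef struct Stmt Stmt;		/* one Parse/Bind/Execute unit */

size_t		estimate_reply_size(const Stmt *stmt);	/* hypothetical */
void		send_parse_bind_execute(Conn *conn, const Stmt *stmt);
void		flush_and_drain_replies(Conn *conn);

/* Assumed safe bound on replies the server can queue before blocking. */
enum { MAX_UNREAD_REPLY_BYTES = 64 * 1024 };

static size_t unread_estimate;	/* reply bytes the server may have queued */

static void
queue_statement(Conn *conn, const Stmt *stmt)
{
	size_t		reply_size = estimate_reply_size(stmt);

	/*
	 * If queueing one more statement could leave both ends blocked on
	 * write, flush and read the pending replies first.
	 */
	if (unread_estimate + reply_size > MAX_UNREAD_REPLY_BYTES)
	{
		flush_and_drain_replies(conn);
		unread_estimate = 0;
	}

	send_parse_bind_execute(conn, stmt);
	unread_estimate += reply_size;
}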


Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/


Re: [HACKERS] Further issues with jsonb semantics, documentation

2015-06-12 Thread Peter Geoghegan
On Thu, Jun 4, 2015 at 5:43 PM, Peter Geoghegan  wrote:
>
> BTW, there is a bug here -- strtol() needs additional defenses [1]
> (before casting to int):
>
> postgres=# select jsonb_set('[1, 2, 3, 4,
> 5,6,7,8,9,10,11,12,13,14,15,16,17,18]',
> '{"9223372036854775806"}'::text[], '"Input unsanitized"', false) ;
> jsonb_set
> --
>  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, "Input
> unsanitized", 18]
> (1 row)
>
> [1] 
> https://www.securecoding.cert.org/confluence/display/cplusplus/INT06-CPP.+Use+strtol()+or+a+related+function+to+convert+a+string+token+to+an+integer

I attach a fix for this bug. The commit message explains everything.


-- 
Peter Geoghegan
From 2f2042d93d00f85e52612bd7d7499c3238579d4d Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Fri, 12 Jun 2015 14:55:32 -0700
Subject: [PATCH 1/2] Fix "path" infrastructure bug affecting jsonb_set()

jsonb_set() and other clients of the setPathArray() utility function
could get spurious results when an array integer subscript is provided
that is not within the range of int.

To fix, ensure that the value returned by strtol() within setPathArray()
is within the range of int;  when it isn't, assume an invalid input in
line with existing, similar cases.  The path-orientated operators that
appeared in PostgreSQL 9.3 and 9.4 do not call setPathArray(), and
already independently take this precaution, so no change there.
---
 src/backend/utils/adt/jsonfuncs.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index c14d3f7..13d5b7a 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -3814,11 +3814,14 @@ setPathArray(JsonbIterator **it, Datum *path_elems, bool *path_nulls,
 	if (level < path_len && !path_nulls[level])
 	{
 		char	   *c = VARDATA_ANY(path_elems[level]);
+		long		lindex;
 
 		errno = 0;
-		idx = (int) strtol(c, &badp, 10);
-		if (errno != 0 || badp == c)
+		lindex = strtol(c, &badp, 10);
+		if (errno != 0 || badp == c || lindex > INT_MAX || lindex < INT_MIN)
 			idx = nelems;
+		else
+			idx = lindex;
 	}
 	else
 		idx = nelems;
-- 
1.9.1




Re: [HACKERS] On columnar storage

2015-06-12 Thread Michael Nolan
On Thu, Jun 11, 2015 at 7:03 PM, Alvaro Herrera 
wrote:

> We hope to have a chance to discuss this during the upcoming developer
> unconference in Ottawa.  Here are some preliminary ideas to shed some
> light on what we're trying to do.
>
>
> I've been trying to figure out a plan to enable native column stores
> (CS or "colstore") for Postgres.  Motivations:
>
> * avoid the 32 TB limit for tables
> * avoid the 1600 column limit for tables
> * increased performance
>
Are you looking to avoid all hardware-based limits, or would using a 64-bit
row pointer be possible?  That would give you 2^64 (about 1.8e19) unique
rows at whatever granularity/uniqueness you use (per table, per database,
etc.)
--
Mike Nolan.


Re: [HACKERS] Hash index creation warning

2015-06-12 Thread Thom Brown
On 18 October 2014 at 15:36, Bruce Momjian  wrote:
> On Fri, Oct 17, 2014 at 02:36:55PM -0400, Bruce Momjian wrote:
>> On Fri, Oct 17, 2014 at 12:56:52PM -0400, Tom Lane wrote:
>> > David G Johnston  writes:
>> > > The question is whether we explain the implications of not being 
>> > > WAL-logged
>> > > in an error message or simply state the fact and let the documentation
>> > > explain the hazards - basically just output:
>> > > "hash indexes are not WAL-logged and their use is discouraged"
>> >
>> > +1.  The warning message is not the place to be trying to explain all the
>> > details.
>>
>> OK, updated patch attached.
>
> Patch applied.

I only just noticed this item when I read the release notes.  Should
we bother warning when used on an unlogged table?

-- 
Thom




Re: [HACKERS] On columnar storage

2015-06-12 Thread Alvaro Herrera
Merlin Moncure wrote:
> On Thu, Jun 11, 2015 at 6:03 PM, Alvaro Herrera
>  wrote:
> > We hope to have a chance to discuss this during the upcoming developer
> > unconference in Ottawa.  Here are some preliminary ideas to shed some
> > light on what we're trying to do.
> 
> Quick thought.  We already support out of line columnar storage in a
> fashion with 'toast'.  Obviously that's a long way from where you want
> to go, but have you ruled out extending that system?

TOAST uses pointers in the heap row.  A toasted column is still present
in the heap -- there's no way to get across the 1600-column limitation
if we rely on anything stored in the heap.  Hence the design of the
feature at hand is that the columnar storage somehow "points" to the
heap, rather than the other way around.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Further issues with jsonb semantics, documentation

2015-06-12 Thread Peter Geoghegan
On Fri, Jun 12, 2015 at 12:26 PM, Andrew Dunstan  wrote:
> Here's some code for the count piece of that.

Thanks. I'll look into integrating this with what I have.

BTW, on reflection I'm not so sure about my decision to not touch the
logic within jsonb_delete_idx() (commit b81c7b409). I probably should
have changed it in line with the attached patch as part of that
commit. What do you think?
-- 
Peter Geoghegan
From 8232f360a0696eb9279c29dfa7464edde726c5ae Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Fri, 5 Jun 2015 13:55:48 -0700
Subject: [PATCH 1/2] Remove dead code for jsonb subscript deletion

---
 src/backend/utils/adt/jsonfuncs.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index c14d3f7..3fb8327 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -3411,10 +3411,8 @@ jsonb_delete_idx(PG_FUNCTION_ARGS)
 	it = JsonbIteratorInit(&in->root);
 
 	r = JsonbIteratorNext(&it, &v, false);
-	if (r == WJB_BEGIN_ARRAY)
-		n = v.val.array.nElems;
-	else
-		n = v.val.object.nPairs;
+	Assert (r == WJB_BEGIN_ARRAY);
+	n = v.val.array.nElems;
 
 	if (idx < 0)
 	{
@@ -3431,14 +3429,10 @@ jsonb_delete_idx(PG_FUNCTION_ARGS)
 
 	while ((r = JsonbIteratorNext(&it, &v, true)) != 0)
 	{
-		if (r == WJB_ELEM || r == WJB_KEY)
+		if (r == WJB_ELEM)
 		{
 			if (i++ == idx)
-			{
-if (r == WJB_KEY)
-	JsonbIteratorNext(&it, &v, true);	/* skip value */
 continue;
-			}
 		}
 
 		res = pushJsonbValue(&state, r, r < WJB_BEGIN_ARRAY ? &v : NULL);
-- 
1.9.1




Re: [HACKERS] Further issues with jsonb semantics, documentation

2015-06-12 Thread Andrew Dunstan


On 06/12/2015 12:29 PM, I wrote:




I agree that the json case looks a bit nasty. Maybe a better approach 
would be to provide a function that, given a JsonLexContext, returns 
the number of array elements of the current array. In get_array_start 
we could call that if the relevant path element is negative and adjust 
it accordingly.







Here's some code for the count piece of that.

cheers

andrew

diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 26d3843..a11b965 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -341,6 +341,53 @@ pg_parse_json(JsonLexContext *lex, JsonSemAction *sem)
 }
 
 /*
+ * json_count_array
+ *
+ * designed to be called from the array_start routine
+ *
+ * gives the number of elements in the array at whatever nesting level it is at.
+ *
+ */
+int
+json_count_array_elements(JsonLexContext *lex)
+{
+	JsonLexContext copylex;
+	int count = 0;
+
+	if (lex_peek(lex) != JSON_TOKEN_ARRAY_START)
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), /* XX FIX */
+ errmsg("json_count_array must be called at the start of an array")));
+
+	/*
+	 * It's safe to do this with a shallow copy because the lexical routines
+	 * don't scribble on the input. They do scribble on the other pointers etc,
+	 * so doing this with a copy makes that safe.
+	 */
+	memcpy(&copylex, lex, sizeof(JsonLexContext));
+	copylex.strval = NULL; /* not interested in values here */
+
+	copylex.lex_level++;
+
+	lex_expect(JSON_PARSE_ARRAY_START, &copylex, JSON_TOKEN_ARRAY_START);
+	if (lex_peek(&copylex) != JSON_TOKEN_ARRAY_END)
+	{
+		count++;
+		parse_array_element(&copylex, &nullSemAction);
+
+		while (lex_accept(&copylex, JSON_TOKEN_COMMA, NULL))
+		{
+			count++;
+			parse_array_element(&copylex, &nullSemAction);
+		}
+	}
+
+	lex_expect(JSON_PARSE_ARRAY_NEXT, &copylex, JSON_TOKEN_ARRAY_END);
+
+	return count;
+}
+
+/*
  *	Recursive Descent parse routines. There is one for each structural
  *	element in a json document:
  *	  - scalar (string, number, true, false, null)
diff --git a/src/include/utils/jsonapi.h b/src/include/utils/jsonapi.h
index 296d20a..1abb8b9 100644
--- a/src/include/utils/jsonapi.h
+++ b/src/include/utils/jsonapi.h
@@ -104,6 +104,13 @@ typedef struct JsonSemAction
 extern void pg_parse_json(JsonLexContext *lex, JsonSemAction *sem);
 
 /*
+ * json_count_array does a fast secondary parse to determine the number
+ * of elements in the current array. It must be called from an array_start
+ * action.
+ */
+extern int json_count_array_elements(JsonLexContext *lex);
+
+/*
  * constructors for JsonLexContext, with or without strval element.
  * If supplied, the strval element will contain a de-escaped version of
  * the lexeme. However, doing this imposes a performance penalty, so



Re: [HACKERS] The Future of Aggregation

2015-06-12 Thread Kevin Grittner
David Rowley  wrote:


>> I am a little curious what sort of machine you're running on,
>> because my i7 is much slower.  I ran a few other tests with your
>> table for perspective.
>
> Assert enabled build?

Mystery solved.  Too often I forget to reconfigure with
optimization and without cassert for quick tests like that.
Hopefully the results are not skewed *too* badly by that in this
case.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] On columnar storage

2015-06-12 Thread Tomas Vondra

Hi,

On 06/12/15 15:56, Merlin Moncure wrote:

On Thu, Jun 11, 2015 at 6:03 PM, Alvaro Herrera
 wrote:

We hope to have a chance to discuss this during the upcoming
developer unconference in Ottawa. Here are some preliminary ideas
to shed some light on what we're trying to do.


Quick thought. We already support out of line columnar storage in a
fashion with 'toast'. Obviously that's a long way from where you want
to go, but have you ruled out extending that system?


I doubt it would be a simple extension of TOAST, because TOAST is roughly 
the inverse of a columnar store: TOAST splits a single value into many 
pieces, while a colstore needs to store values from multiple rows together. 
I wouldn't really call TOAST columnar storage.


--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Entities created in one query not available in another in extended protocol

2015-06-12 Thread Tom Lane
Simon Riggs  writes:
> On 11 June 2015 at 22:12, Shay Rojansky  wrote:
>> Just in case it's interesting to you... The reason we implemented things
>> this way is in order to avoid a deadlock situation - if we send two queries
>> as P1/D1/B1/E1/P2/D2/B2/E2, and the first query has a large resultset,
>> PostgreSQL may block writing the resultset, since Npgsql isn't reading it
>> at that point. Npgsql on its part may get stuck writing the second query
>> (if it's big enough) since PostgreSQL isn't reading on its end (thanks to
>> Emil Lenngren for pointing this out originally).

> That part does sound like a problem that we have no good answer to. Sounds
> worth starting a new thread on that.

I do not accept that the backend needs to deal with that; it's the
responsibility of the client side to manage buffering properly if it is
trying to overlap sending the next query with receipt of data from a
previous one.  See commit 2a3f6e368 for a related issue in libpq.

regards, tom lane




Re: [HACKERS] On columnar storage

2015-06-12 Thread Tom Lane
Alvaro Herrera  writes:
> Amit Kapila wrote:
>> Will the column store obey snapshot model similar to current heap tuples,
>> if so will it derive the transaction information from heap tuple?

> Yes, visibility will be tied to the heap tuple -- a value is accessed
> only when its corresponding heap row has already been determined to be
> visible.  One interesting point that raises from this is about vacuum:
> when are we able to remove a value from the store?  I have some
> not-completely-formed ideas about this.

Hm.  This seems not terribly ambitious --- mightn't a column store
extension wish to store tables *entirely* in the column store, rather
than tying them to a perhaps-vestigial heap table?  Perhaps that's a
necessary restriction to get to something implementable, but considering
that the proposal mentions pluggable column stores I should think you'd
not want to tie it down that much.

I can't help thinking that this could tie in with the storage level API
that I was waving my arms about last year.  Or maybe not --- the goals
are substantially different --- but I think we ought to reflect on that
rather than just doing a narrow hack for column stores used in the
particular way you're describing here.

regards, tom lane




Re: [HACKERS] Entities created in one query not available in another in extended protocol

2015-06-12 Thread Simon Riggs
On 11 June 2015 at 22:12, Shay Rojansky  wrote:

> Thanks everyone for your time (or rather sorry for having wasted it).
>
> Just in case it's interesting to you... The reason we implemented things
> this way is in order to avoid a deadlock situation - if we send two queries
> as P1/D1/B1/E1/P2/D2/B2/E2, and the first query has a large resultset,
> PostgreSQL may block writing the resultset, since Npgsql isn't reading it
> at that point. Npgsql on its part may get stuck writing the second query
> (if it's big enough) since PostgreSQL isn't reading on its end (thanks to
> Emil Lenngren for pointing this out originally).
>

That part does sound like a problem that we have no good answer to. Sounds
worth starting a new thread on that.


> Of course this isn't an excuse for anything, we're looking into ways of
> solving this problem differently in our driver implementation.
>
> Shay
>
> On Thu, Jun 11, 2015 at 6:17 PM, Simon Riggs 
> wrote:
>
>> On 11 June 2015 at 16:56, Shay Rojansky  wrote:
>>
>>> Npgsql (currently) sends Parse for the second command before sending
>>> Execute for the first one.
>>>
>>
>> Look no further than that.
>>
>>

-- 
Simon Riggs    http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] On columnar storage

2015-06-12 Thread Alvaro Herrera
Amit Kapila wrote:
> On Fri, Jun 12, 2015 at 4:33 AM, Alvaro Herrera 
> wrote:

> > There are several parts to this:
> >
> > 1. the CSM API
> > 2. Cataloguing column stores
> > 3. Query processing: rewriter, optimizer, executor
> >
> 
> I think another important point is the format of column stores: is it the
> page format used by index/heap, and how are they organised?

Not really.  That stuff is part of the column store implementation
itself; we're not tackling that part just yet.  Eventually there might
be an implementation using ORC or other formats.  That doesn't matter at
this point -- we only need something that implements the specified API.

> > One critical detail is what will be used to identify a heap row when
> > talking to a CS implementation.  There are two main possibilities:
> >
> > 1. use CTIDs
> > 2. use some logical tuple identifier
> >
> > Using CTIDs is simpler.  One disadvantage is that every UPDATE of a row
> > needs to let the CS know about the new location of the tuple, so that
> > the value is known associated with the new tuple location as well as the
> > old.  This needs to happen even if the value of the column itself is not
> > changed.
> 
> Isn't this somewhat similar to an index segment?

Not sure what you mean with "index segment".  A column store is not an
index -- it is the primary storage for the column in question.  The heap
does not have a copy of the data.

> Will the column store obey snapshot model similar to current heap tuples,
> if so will it derive the transaction information from heap tuple?

Yes, visibility will be tied to the heap tuple -- a value is accessed
only when its corresponding heap row has already been determined to be
visible.  One interesting point that raises from this is about vacuum:
when are we able to remove a value from the store?  I have some
not-completely-formed ideas about this.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Why does replication need the old history file?

2015-06-12 Thread Josh Berkus

>>> Questions:
>>>
>>> A. Why does the replica need 0002.history?  Shouldn't it only need
>>> 0003.history?
>>
>> From where is the base backup taken in case of the node started at 5?

It is the same backup used to restore the master, restored to a point in
time 5 minutes earlier just to make sure the replica isn't ahead of the
master.

> 
> The related source code comment says
> 
> /*
>  * Get any missing history files. We do this always, even when we're
>  * not interested in that timeline, so that if we're promoted to
>  * become the master later on, we don't select the same timeline that
>  * was already used in the current master. This isn't bullet-proof -
>  * you'll need some external software to manage your cluster if you
>  * need to ensure that a unique timeline id is chosen in every case,
>  * but let's avoid the confusion of timeline id collisions where we
>  * can.
>  */
> WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);

So this seems to be something we're doing "just in case", which is
preventing a useful way to spin up large master/replica clusters from a
PITR backup.  Might this be something we want to change, and simply
report an error that we can't find the history file instead of a FATAL?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Further issues with jsonb semantics, documentation

2015-06-12 Thread Andrew Dunstan


On 06/10/2015 04:02 PM, Peter Geoghegan wrote:

On Wed, Jun 10, 2015 at 11:48 AM, Andrew Dunstan  wrote:

Sorry for the delay on this. I've been mostly off the grid, having an all
too rare visit from Tom "Mr Enum" Dunstan, and I misunderstood what you were
suggesting,

Thank you for working with me to address this. I've been busy with
other things the past few days too.


Please submit a patch to adjust the treatment of negative integers in the
old functions to be consistent with their treatment in the new functions.
i.e. in the range [-n,-1] they should refer to the corresponding element
counting from the right.

I've almost finished that patch. I'm currently blocked on deciding
what to do about the old path-orientated operators (#> and #>> for
json and jsonb types). It's rather painful to support pushing down
negative subscripting there -- maybe we should just not do so for
those variants, especially given that they're already notationally
inconsistent with the other operators that I'll be updating. What do
you think?

Maybe I'll come up with a simpler way of making that work by taking a
fresh look at it, but haven't done that yet.

My current, draft approach to making subscripting work with the json
variants (not the jsonb variants) is to use a second get_worker() call
in the event of a negative subscript, while making the first such call
(the existing get_worker() call) establish the number of top-level
array elements. That isn't beautiful, and involves some amount of
redundant work, but it's probably better than messing with
get_worker() in a more invasive way. Besides, this second get_worker()
call only needs to happen in the event of a negative subscript, and
I'm only really updating this (that is, updating routines like
json_array_element()) to preserve consistency with jsonb. What do you
think of that idea?




Just took a quick look. My impression is that the jsonb case should be 
fairly easy. If the index is negative, add JB_ROOT_COUNT(container) to 
it and use that as the argument to getIthJsonbValueFromContainer().
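
A minimal sketch of that suggestion (get_ith_allowing_negative() is a
hypothetical helper, not actual backend code):

#include "postgres.h"
#include "utils/jsonb.h"

static JsonbValue *
get_ith_allowing_negative(Jsonb *jb, int idx)
{
	if (idx < 0)
		idx += JB_ROOT_COUNT(jb);	/* so -1 addresses the last element */

	if (idx < 0)
		return NULL;			/* still before the first element */

	return getIthJsonbValueFromContainer(&jb->root, (uint32) idx);
}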


I agree that the json case looks a bit nasty. Maybe a better approach 
would be to provide a function that, given a JsonLexContext, returns the 
number of array elements of the current array. In get_array_start we 
could call that if the relevant path element is negative and adjust it 
accordingly.


cheers

andrew




Re: [HACKERS] The purpose of the core team

2015-06-12 Thread Robert Haas
On Fri, Jun 12, 2015 at 1:21 AM, Simon Riggs  wrote:
> Deciding "WHAT goes in the next release?" is what Committers do, by
> definition.
>
> It seems strange to have a different mailing list for "WHEN is the next
> release needed?", so those two things should be combined.

Core team members have sometimes taken the position that this is
already so; that releases should be discussed on pgsql-hackers and
pgsql-security as appropriate to the driver.  In theory, this may be
fine, but in practice, it doesn't seem to be working very well right
now.

> Packagers should be about "HOW do we make the next release", which is
> separate from the above.
>
> Ultimately, "How" affects "When", but "When is it needed?" is an earlier
> thought.

+1.  Completely agreed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] git push hook to check for outdated timestamps

2015-06-12 Thread Tom Lane
Andrew Dunstan  writes:
> On 06/12/2015 09:31 AM, Robert Haas wrote:
>> Could we update our git hook to refuse a push of a new commit whose
>> timestamp is more than, say, 24 hours in the past?  Our commit history
>> has some timestamps in it now that are over a month off, and it's
>> really easy to do, because when you rebase a commit, it keeps the old
>> timestamp.  If you then merge or cherry-pick that into an official
>> branch rather than patch + commit, you end up with this problem unless
>> you are careful to fix it by hand.  It would be nice to prevent
>> further mistakes of this sort, as they create confusion.

> I think 24 hours is probably fairly generous,

Yeah ... if we're going to try to enforce a linear-looking history, ISTM
the requirement ought to be "newer than the latest commit on the same
branch".  Perhaps that would be unduly hard to implement though.

FWIW, our git_changelog script tries to avoid this problem by paying
attention to CommitDate not Date.  But I agree that it's confusing when
those fields are far apart.

regards, tom lane




Re: [HACKERS] git push hook to check for outdated timestamps

2015-06-12 Thread Andrew Dunstan


On 06/12/2015 09:31 AM, Robert Haas wrote:

Could we update our git hook to refuse a push of a new commit whose
timestamp is more than, say, 24 hours in the past?  Our commit history
has some timestamps in it now that are over a month off, and it's
really easy to do, because when you rebase a commit, it keeps the old
timestamp.  If you then merge or cherry-pick that into an official
branch rather than patch + commit, you end up with this problem unless
you are careful to fix it by hand.  It would be nice to prevent
further mistakes of this sort, as they create confusion.



+1.

I think 24 hours is probably fairly generous, but in principle this 
seems right - the timestamp would ideally be pretty close to the time it 
hits the public tree.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reaper should restart archiver even on standby

2015-06-12 Thread Fujii Masao
On Thu, Jun 11, 2015 at 1:39 AM, Alvaro Herrera
 wrote:
> Fujii Masao wrote:
>
>> Agreed. The attached patch defines the macro to check whether archiver is
>> allowed to start up or not, and uses it everywhere except sigusr1_handler.
>> I made sigusr1_handler use a different condition because only it tries to
>> start archiver in PM_STARTUP postmaster state and it looks a bit messy
>> to add the check of that state into the centralized check condition.
>
> WFM, but do these macros in xlog.h need a one-line comment to state
> their purpose?

Yes, I added the comments and just pushed the patch. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reconsidering the behavior of ALTER COLUMN TYPE

2015-06-12 Thread Tom Lane
Noah Misch  writes:
> On Thu, Jun 11, 2015 at 03:41:49PM -0500, Merlin Moncure wrote:
>> On Thu, Jun 11, 2015 at 3:12 PM, Tom Lane  wrote:
>>> In any case, we oughta use two different error messages for the two cases,
>>> as per my comment in the above thread.  That seems like a back-patchable
>>> bug fix, though of course any semantics change should only be in HEAD.

>> I have a slight preference to keep it to tightening up the wording on
>> both the hint and the error (for example, "Perhaps you meant USING
>> foo::type?") but leaving the behavior alone.

> +1.  The HINT could certainly provide situation-specific help.

Fair enough, I'll go fix that but leave the semantics alone.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On columnar storage

2015-06-12 Thread Merlin Moncure
On Thu, Jun 11, 2015 at 6:03 PM, Alvaro Herrera
 wrote:
> We hope to have a chance to discuss this during the upcoming developer
> unconference in Ottawa.  Here are some preliminary ideas to shed some
> light on what we're trying to do.

Quick thought.  We already support out of line columnar storage in a
fashion with 'toast'.  Obviously that's a long way from where you want
to go, but have you ruled out extending that system?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Proposal] Progress bar for pg_dump/pg_restore

2015-06-12 Thread Andres Freund
Hi,

On 2015-06-12 12:45:50 +0000, Taiki Kondo wrote:
> Design & API
> 
> When pg_dump / pg_restore is running, a progress bar and the estimated time
> to finish are shown on screen like the following.
> 
> 
> =>   (50%)  15:50
> 
> The bar ("=>" above) and the percentage value ("50%" above) show the
> percentage of progress, and the time ("15:50" above) shows the estimated time
> to finish.
> (This percentage is the ratio for the whole processing.)
> 
> Percentage and time are calculated and shown every second.
> 
> In pg_dump, the information required for calculating the percentage and time
> comes from pg_class.
> 
> In pg_restore, to calculate the same things, I want to record the total
> number of commands in the pg_dump file, thus I would like to add a new
> element to the "Archive" structure.
> (This means that the version number of the archive format is changed.)

The question is how to actually get useful estimates. As there's no
progress report for individual COPY and CREATE INDEX commands, you'll often
have very irregular progress updates. In many cases most of the time is
spent on a very small subset of the total objects.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Proposal] Progress bar for pg_dump/pg_restore

2015-06-12 Thread Merlin Moncure
On Fri, Jun 12, 2015 at 7:45 AM, Taiki Kondo  wrote:
> Hi, all.
>
> I am a newbie on hackers.
> I have an idea from my point of view as a user, and I would like to propose
> the following.
>
>
> Progress bar for pg_dump / pg_restore
> =
>
> Motivation
> --
> "pg_dump" and "pg_restore" show nothing if users don't specify verbose (-v) 
> option.
> For a table too large to finish within a few minutes, this behavior worries
> some users about whether the situation (nothing showing up) is all right.
>
> I propose this feature to free these users from worrying.
>
>
> Design & API
> 
> When pg_dump / pg_restore is running, a progress bar and the estimated time
> to finish are shown on screen like the following.
>
>
> =>   (50%)  15:50
>
> The bar ("=>" above) and the percentage value ("50%" above) show the
> percentage of progress, and the time ("15:50" above) shows the estimated time
> to finish.
> (This percentage is the ratio for the whole processing.)
>
> Percentage and time are calculated and shown every second.
>
> In pg_dump, the information required for calculating the percentage and time
> comes from pg_class.
>
> In pg_restore, to calculate the same things, I want to record the total
> number of commands in the pg_dump file, thus I would like to add a new
> element to the "Archive" structure.
> (This means that the version number of the archive format is changed.)
>
>
> Usage
> --
> To use this feature, the user must specify the "-P" option on the command line.
> (This definition is also temporary, so it is changeable if it leads to
> problems.)
>
> $ pg_dump -Fc -P -f foo.pgdump foo
>
> I also think it would be better to enable this feature by default, without
> forcing users to specify any option, but that means changing the default
> behavior, and could cause problems in some programs expecting no output on
> stdout.
>
>
> I will implement this feature if this proposal is accepted by hackers.
> (Maybe I will not use ncurses to implement this feature, because ncurses
> cannot be used together with the standard printf family of functions.)
>
>
> Any comments are welcome.

*) how do you estimate %done and ETA when dumping?

*) what's the benefit of doing this instead of using a utility like 'pv'?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] git push hook to check for outdated timestamps

2015-06-12 Thread Robert Haas
Could we update our git hook to refuse a push of a new commit whose
timestamp is more than, say, 24 hours in the past?  Our commit history
has some timestamps in it now that are over a month off, and it's
really easy to do, because when you rebase a commit, it keeps the old
timestamp.  If you then merge or cherry-pick that into an official
branch rather than patch + commit, you end up with this problem unless
you are careful to fix it by hand.  It would be nice to prevent
further mistakes of this sort, as they create confusion.
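
Just to illustrate the shape of check I have in mind, here is a rough,
untested sketch of an "update" hook (a hook can be any executable; the
24-hour cutoff and the error message are of course placeholders):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Illustrative "update" hook: reject a push that introduces a commit
 * whose committer timestamp is more than 24 hours in the past.
 * Per githooks(5), argv[1..3] are refname, old SHA, and new SHA.
 */
int
main(int argc, char **argv)
{
    char        cmd[1024];
    char        line[64];
    FILE       *fp;
    time_t      cutoff = time(NULL) - 24 * 60 * 60;
    int         ok = 1;

    if (argc != 4)
        return 1;

    /* A newly created ref has an all-zero old SHA; skip it in this sketch. */
    if (strspn(argv[2], "0") == strlen(argv[2]))
        return 0;

    /* List the committer timestamps of all commits new to this ref. */
    snprintf(cmd, sizeof(cmd), "git log --format=%%ct %s..%s",
             argv[2], argv[3]);
    if ((fp = popen(cmd, "r")) == NULL)
        return 1;

    while (fgets(line, sizeof(line), fp) != NULL)
    {
        if ((time_t) atol(line) < cutoff)
        {
            fprintf(stderr, "rejected: %s contains a commit committed "
                    "more than 24 hours ago\n", argv[1]);
            ok = 0;
            break;
        }
    }
    pclose(fp);
    return ok ? 0 : 1;
}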

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Missing XLOG_DEBUG check in AdvanceXLInsertBuffer()?

2015-06-12 Thread Robert Haas
On Wed, Jun 10, 2015 at 7:02 AM, Andres Freund  wrote:
> DEBUG:  initialized 1 pages, upto 40/3977E000
> DEBUG:  initialized 9 pages, upto 40/39790000
> DEBUG:  initialized 1 pages, upto 40/39792000
> DEBUG:  initialized 1 pages, upto 40/39794000
> DEBUG:  initialized 1 pages, upto 40/39796000
> DEBUG:  initialized 1 pages, upto 40/39798000
> I find that quite annoying. That specific elog() has been there since
> 9a20a9b21 in 9.4.
>
> Does somebody mind me backpatching the missing XLOG_DEBUG &&?

While you're at it, maybe you should s/upto/up to/.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On columnar storage

2015-06-12 Thread Robert Haas
On Thu, Jun 11, 2015 at 7:03 PM, Alvaro Herrera
 wrote:
> I've been trying to figure out a plan to enable native column stores
> (CS or "colstore") for Postgres.  Motivations:
>
> * avoid the 32 TB limit for tables
> * avoid the 1600 column limit for tables
> * increased performance

To me, it feels like there are two different features here that would
be better separated.  First, there's the idea of having a table that
gets auto-joined to other tables whenever you access it, so that the
user sees one really wide table but really the data is segregated by
column groups under the hood.  That's a neat idea.  Second, there's
the idea of a way of storing tuples that is different from
PostgreSQL's usual mechanism - i.e. a storage manager API. I
understand your concerns about going through the FDW API so maybe
that's not the right way to do it, but it seems to me that in the end
you are going to end up with something awfully similar to that + a
local relfilenode + WAL logging support.  I'm not clear on why you
want to make the column store API totally different from the FDW API;
there may be a reason, but I don't immediately see it.

Each of these two features is independently useful.  If you just had
the first feature, you could use the existing table format as your
columnar store.  I'm sure it's possible to do a lot better in some
cases, but there could easily be cases where that's a really big win,
because the existing table format has far more sophisticated indexing
capabilities than any columnar store is likely to have in an early
version.  The second capability, of course, opens up all sorts of
interesting possibilities, like compressed read-only storage or
index-organized tables.  And it would also let you have an
"all-columnar" table, similar to what Citus's cstore_fdw does, which
doesn't seem like it would be supported in your design, and could be a
pretty big win for certain kinds of tables.

BTW, I'm not sure if it's a good idea to call all of this stuff
"cstore".  The name is intuitive, but cstore_fdw has enough traction
already that it may create some confusion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collection of memory leaks for ECPG driver

2015-06-12 Thread Michael Meskes
On Mon, Jun 08, 2015 at 10:50:25PM +0900, Michael Paquier wrote:
> And the caller needs to be sure that str actually allocates enough
> space.. That's not directly ECPG's business, still it feels

But there is no way for us to fix this as we want to implement the API as
defined in Informix.

> uncomfortable. I fixed it with the attached.

Thanks, committed.

> The world is full of surprises, and even if free(NULL) is a noop on
> modern platforms, I would rather take it defensively and check for a
> NULL pointer as Postgres supports many platforms. Other code paths
> already tend to do so; see for example ECPGconnect.

Right, that's because they were developed before free(NULL) was guaranteed to be
a no-op, or the programmer simply didn't know that at the time. Unless I'm
mistaken we have other code that calls free() without an additional safeguard,
so why should ecpglib be special? If you don't like having both approaches in
there, feel free to remove the check in the other case. :)

More seriously, though, does anyone know of any platform where free(NULL) is 
*not* a noop?
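
For illustration -- a hypothetical helper, not ecpglib code -- on any
C89-conforming libc these two are equivalent, since free(NULL) is defined
to do nothing:

#include <stdlib.h>

void
drop_buffer_guarded(char **buf)
{
    if (*buf != NULL)       /* redundant on a conforming platform */
        free(*buf);
    *buf = NULL;
}

void
drop_buffer(char **buf)
{
    free(*buf);             /* free(NULL) is a no-op per C89 7.10.3.2 */
    *buf = NULL;
}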

I don't like adding stuff just because the world is full of surprises.
There may also be other functions not working as advertised. Wouldn't that
then mean we cannot use any function not provided by ourselves?

> Perhaps I am overdoing it. I would have them for correctness (some
> other code paths do so) but if you think that the impact is minimal
> there then I am fine to not modify descriptor.c.

Sorry, I wasn't referring to descriptor.c but to the whole preproc/ subtree. 
But as I already said, since we went through the effort we can of course apply 
it.

Will be in my next push.

Thanks.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 release notes

2015-06-12 Thread Petr Jelinek

Hi,

+  <listitem>
+   <para>
+    Add <type>JSONB</type> functions <function>jsonb_set()</function> and
+    <function>jsonb_pretty()</function> (Dmitry Dolgov, Andrew Dunstan)
+   </para>
+  </listitem>

I believe I should be 3rd author for this one as I rewrote large parts 
of this functionality as part of the review.



--
Petr Jelinek  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [Proposal] Progress bar for pg_dump/pg_restore

2015-06-12 Thread Taiki Kondo
Hi, all.

I am a newbie on hackers.
I have an idea from my point of view as a user, and I would like to propose
the following.


Progress bar for pg_dump / pg_restore
=

Motivation
--
"pg_dump" and "pg_restore" show nothing if users don't specify verbose (-v) 
option.
For a table too large to finish within a few minutes, this behavior worries
some users about whether the situation (nothing showing up) is all right.

I propose this feature to free these users from worrying.


Design & API

When pg_dump / pg_restore is running, a progress bar and the estimated time to
finish are shown on screen like the following.


=>   (50%)  15:50

The bar ("=>" above) and the percentage value ("50%" above) show the percentage
of progress, and the time ("15:50" above) shows the estimated time to finish.
(This percentage is the ratio for the whole processing.)

Percentage and time are calculated and shown every second.

In pg_dump, the information required for calculating the percentage and time
comes from pg_class.

In pg_restore, to calculate the same things, I want to record the total number
of commands in the pg_dump file, thus I would like to add a new element to the
"Archive" structure.
(This means that the version number of the archive format is changed.)


Usage
--
To use this feature, the user must specify the "-P" option on the command line.
(This definition is also temporary, so it is changeable if it leads to
problems.)

$ pg_dump -Fc -P -f foo.pgdump foo

I also think it would be better to enable this feature by default, without
forcing users to specify any option, but that means changing the default
behavior, and could cause problems in some programs expecting no output on
stdout.


I will implement this feature if this proposal is accepted by hackers.
(Maybe I will not use ncurses to implement this feature, because ncurses
cannot be used together with the standard printf family of functions.)
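
For instance, a plain-printf rendering that redraws the line in place with
'\r' could be as simple as this (an illustrative sketch only, not part of
any patch):

#include <stdio.h>

static void
show_progress(double fraction, int eta_min, int eta_sec)
{
    int         width = 40;
    int         filled = (int) (fraction * width);
    int         i;

    fprintf(stderr, "\r");
    for (i = 0; i < width; i++)
        fputc(i < filled ? '=' : (i == filled ? '>' : ' '), stderr);
    fprintf(stderr, "  (%d%%)  %02d:%02d",
            (int) (fraction * 100), eta_min, eta_sec);
    fflush(stderr);
}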


Any comments are welcome.



Best Regards,

--
Taiki Kondo




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Aggregate Supporting Functions

2015-06-12 Thread David Rowley
On 10 June 2015 at 02:26, Tom Lane  wrote:

> Kevin Grittner  writes:
> > David Rowley  wrote:
> >> [ avoid duplicate calculations for related aggregates ]
>
> > From the information you have proposed storing, with cost factors
> > associated with the functions, it seems technically possible to
> > infer that you could run (for example) the avg() aggregate to
> > accumulate both but only run the final functions of the aggregates
> > referenced by the query.  That seems like an optimization to try
> > hard to forget about until you have at least one real-world use
> > case where it would yield a significant benefit.  It seems
> > premature to optimize for that before having the rest working.
>
> Actually, I would suggest that you forget about all the other aspects
> and *just* do that, because it could be made to work today on existing
> aggregate functions, and it would not require hundreds-to-thousands
> of lines of boilerplate support code in the grammar, catalog support,
> pg_dump, yadda yadda.  That is, look to see which aggregates use the
> same transition function and run that just once.  We already have the
> rule that the final function can't destroy the transition output,
> so running two different final functions on the same transition result
> should Just Work.
>
>
Good idea.
I believe the only extra check, besides whether they use the same transfn,
would be the initvalue of the aggregate.
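
Roughly, I'm picturing a check along these lines (an untested sketch; it
assumes the caller has already verified that the aggregates' input
expressions match):

#include "postgres.h"
#include "access/htup_details.h"
#include "catalog/pg_aggregate.h"
#include "utils/builtins.h"
#include "utils/syscache.h"

/*
 * Sketch: can two aggregates share one transition state?  They can if
 * they agree on the transition function and on the initial value.
 */
static bool
aggs_share_transition_state(Oid aggfnoid1, Oid aggfnoid2)
{
    HeapTuple   tup1, tup2;
    Form_pg_aggregate agg1, agg2;
    Datum       initval1, initval2;
    bool        isnull1, isnull2;
    bool        share;

    tup1 = SearchSysCache1(AGGFNOID, ObjectIdGetDatum(aggfnoid1));
    tup2 = SearchSysCache1(AGGFNOID, ObjectIdGetDatum(aggfnoid2));
    if (!HeapTupleIsValid(tup1) || !HeapTupleIsValid(tup2))
        elog(ERROR, "cache lookup failed for aggregate");
    agg1 = (Form_pg_aggregate) GETSTRUCT(tup1);
    agg2 = (Form_pg_aggregate) GETSTRUCT(tup2);

    /* agginitval is a text column, so fetch it the long way */
    initval1 = SysCacheGetAttr(AGGFNOID, tup1,
                               Anum_pg_aggregate_agginitval, &isnull1);
    initval2 = SysCacheGetAttr(AGGFNOID, tup2,
                               Anum_pg_aggregate_agginitval, &isnull2);

    share = agg1->aggtransfn == agg2->aggtransfn &&
        isnull1 == isnull2 &&
        (isnull1 ||
         strcmp(TextDatumGetCString(initval1),
                TextDatumGetCString(initval2)) == 0);

    ReleaseSysCache(tup1);
    ReleaseSysCache(tup2);
    return share;
}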

I'll write a patch and post in the next couple of days.

Regards

David Rowley

--
 David Rowley   http://www.2ndQuadrant.com/

 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pg_rewind failure by file deletion in source server

2015-06-12 Thread Fujii Masao
On Fri, Jun 12, 2015 at 4:29 PM, Michael Paquier
 wrote:
> On Fri, Jun 12, 2015 at 3:50 PM, Fujii Masao  wrote:
>> On Fri, Jun 12, 2015 at 3:17 PM, Michael Paquier
>>  wrote:
>>> On Thu, Jun 11, 2015 at 5:48 PM, Fujii Masao  wrote:
>>>> On Thu, Jun 11, 2015 at 2:14 PM, Michael Paquier
>>>>  wrote:
>>>>> On Thu, Jun 11, 2015 at 1:51 AM, Fujii Masao  wrote:
>>>>>> Shouldn't pg_rewind ignore that failure of operation? If the file is not
>>>>>> found in source server, the file doesn't need to be copied to destination
>>>>>> server obviously. So ISTM that pg_rewind safely can skip copying that
>>>>>> file.
>>>>>> Thought?
>>>>>
>>>>> I think that you should fail. Let's imagine that the master to be
>>>>> rewound has removed a relation file before being stopped cleanly after
>>>>> its standby has been promoted that was here at the last checkpoint
>>>>> before forking, and that the standby still has the relation file after
>>>>> promotion. You should be able to copy it to be able to replay WAL on
>>>>> it. If the standby has removed a file in the file map after taking the
>>>>> file map, I guess that the best thing to do is fail because the file
>>>>> that should be here for the rewound node cannot be fetched.
>>>>
>>>> In this case, why do you think that the file should exist in the old master?
>>>> Even if it doesn't exist, ISTM that the old master can safely replay the WAL
>>>> records related to the file when it restarts. So what's the problem
>>>> if the file doesn't exist in the old master?
>>>
>>> Well, some user may want to rewind the master down to the point where
>>> WAL forked, and then recover it immediately when a consistent point is
>>> reached just at restart instead of replugging it into the cluster. In
>>> this case I think that you need the relation file of the dropped
>>> relation to get a consistent state. That's still cheaper than
>>> recreating a node from a fresh base backup in some cases, particularly
>>> if the last base backup taken is far in the past for this cluster.
>>
>> So it's the case where a user wants to recover old master up to the point
>> BEFORE the file in question is deleted in new master. At that point,
>> since the file must exist, pg_rewind should fail if the file cannot be copied
>> from new master. Is my understanding right?
>
> Yep. We are on the same line.
>
>> As far as I read the code of pg_rewind, ISTM that your scenario never 
>> happens.
>> Because pg_rewind sets the minimum recovery point to the latest WAL location
>> in new master, i.e., AFTER the file is deleted. So old master cannot stop
>> recovering before the file is deleted in new master. If the recovery stops
>> at that point, it fails because the minimum recovery point is not reached 
>> yet.
>>
>> IOW, after pg_rewind runs, the old master has to replay the WAL records
>> which were generated by the deletion of the file in the new master.
>> So it's okay if the old master doesn't have the file after pg_rewind runs,
>> I think.
>
> Ah, right. I withdraw; indeed, what I thought cannot happen:
> /*
>  * Update control file of target. Make it ready to perform archive
>  * recovery when restarting.
>  *
>  * minRecoveryPoint is set to the current WAL insert location in the
>  * source server. Like in an online backup, it's important that we recover
>  * all the WAL that was generated while we copied the files over.
>  */
> So a rewound node will replay WAL up to the current insert location of
> the source, and will fail at recovery if recovery target is older than
> this insert location.
>
> You want to draft a patch? Should I?

Please feel free to try that! :)

> I think that we should have a
> test case as well in pg_rewind/t/.

Maybe.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The Future of Aggregation

2015-06-12 Thread David Rowley
On 11 June 2015 at 01:39, Kevin Grittner  wrote:

> David Rowley  wrote:
> >
> > /* setup */ create table millionrowtable as select
> > generate_series(1,1000000)::numeric as x;
> > /* test 1 */ SELECT sum(x) / count(x)  from millionrowtable;
> > /* test 2 */ SELECT avg(x) from millionrowtable;
> >
> > Test 1:
> > 274.979 ms
> > 272.104 ms
> > 269.915 ms
> >
> > Test 2:
> > 229.619 ms
> > 220.703 ms
> > 234.743 ms
> >
>
> > (About 19% slower)
>
> Of course, with Tom's approach you would see the benefit; the two
> statements should run at about the same speed.
>
> I am a little curious what sort of machine you're running on,
> because my i7 is much slower.  I ran a few other tests with your
> table for perspective.
>
>
Assert-enabled build?
My hardware is very unimpressive... an i5 from Q1 2010. Due to be replaced
very soon.


>
> One question that arose in my mind running this was whether we might
> be able to combine sum(x) with count(*) if x was NOT NULL, even
> though the arguments don't match.  It might not be worth the
> gymnastics of recognizing the special case, and I certainly
> wouldn't recommend looking at that optimization in a first pass;
> but it might be worth jotting down on a list somewhere.
>
>
I think it's worth looking into that at some stage. I think I might have
some of the code that would be required for the NULL checking over here ->
http://www.postgresql.org/message-id/CAApHDvqRB-iFBy68=dcgqs46arep7aun2pou4ktwl8kx9yo...@mail.gmail.com

I'm just not so sure what the logic would be to decide when we could apply
this. The only properties I can see that may be along the right lines are
pg_proc.pronargs for int8inc and int8inc_any.

Regards

David Rowley

--
 David Rowley   http://www.2ndQuadrant.com/

 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Why does replication need the old history file?

2015-06-12 Thread Fujii Masao
On Fri, Jun 12, 2015 at 5:18 PM, Michael Paquier
 wrote:
> On Fri, Jun 12, 2015 at 4:56 AM, Josh Berkus  wrote:
>> Hackers,
>>
>> Sequence of events:
>>
>> 1. PITR backup of server on timeline 2.
>>
>> 2. Restored the backup to a new server, new-master.
>>
>> 3. Restored the backup to another new server, new-replica.
>>
>> 4. Started and promoted new-master (now on Timeline 3).
>>
>> 5. Started new-replica, connecting over streaming to new-master.
>>
>> 6. Get error message:
>>
>> 2015-06-11 12:24:14.503 PDT,,,7465,,5579e05e.1d29,1,,2015-06-11 12:24:14
>> PDT,,0,LOG,00000,"fetching timeline history file for timeline 2 from
>> primary server",""
>> 2015-06-11 12:24:14.503 PDT,,,7465,,5579e05e.1d29,2,,2015-06-11 12:24:14
>> PDT,,0,FATAL,XX000,"could not receive timeline history file from the
>> primary server: ERROR:  could not open file
>> ""pg_xlog/00000002.history"": No such file or directory
>>
>> Questions:
>>
>> A. Why does the replica need 00000002.history?  Shouldn't it only need
>> 00000003.history?
>
> From where is the base backup taken in the case of the node started at step 5?

The related source code comment says

/*
 * Get any missing history files. We do this always, even when we're
 * not interested in that timeline, so that if we're promoted to
 * become the master later on, we don't select the same timeline that
 * was already used in the current master. This isn't bullet-proof -
 * you'll need some external software to manage your cluster if you
 * need to ensure that a unique timeline id is chosen in every case,
 * but let's avoid the confusion of timeline id collisions where we
 * can.
 */
WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);

>
>> B. Did something change in this regard in 9.3.6, 9.3.7 or 9.3.8?  It was
>> working in our previous setup, on 9.3.5, although that could have just
>> been that the history file hadn't been removed from the backups yet.
>
> At quick glance, I can see nothing in xlog.c between those releases.

Yep, I could reproduce the "trouble" even in 9.3.5 on my laptop.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The purpose of the core team

2015-06-12 Thread Oleg Bartunov
+1

On Thu, Jun 11, 2015 at 5:12 PM, Robert Haas  wrote:

> On Tue, Jun 9, 2015 at 6:35 PM, Bruce Momjian  wrote:
> > There has been some confusion by old and new community members about the
> > purpose of the core team, and this lack of understanding has caused some
> > avoidable problems.  Therefore, the core team has written a core charter
> > and published it on our website:
> >
> > http://www.postgresql.org/developer/core/
> >
> > Hopefully this will be helpful to people.
>
> I believe the core team is suffering from a lack of members who are
> involved in writing, reviewing, and committing patches.  Those things
> are not core functions of the core team, as that charter illustrates.
> However, the core team needs to know when it should initiate a
> release, and to do that it needs to understand the impact of bugs that
> have been fixed and bugs that have not been fixed.  The recent
> discussion of multixacts seems to indicate that the number of core
> team members who had a clear understanding of the issues was zero,
> which I view as unfortunate.  The core team also needs to make good
> decisions about who should be made a committer, and the people who are
> doing reviews and commits of other people's patches are in the best
> position to have an informed opinion on that topic.
>
> As a non-core team member, I find it quite frustrating that getting a
> release triggered requires emailing a closed mailing list.  I am not a
> party to all of the discussion on my request, and the other people who
> might know whether my request is technically sound or not are not
> party to that discussion either.  I disagreed with the decision to
> stamp 9.4.3 without waiting for
> b6a3444fa63519a0192447b8f9a332dddc66018f, but of course I couldn't
> comment on it, because it was decided in a forum in which I don't get
> to participate, on a thread on which I was not copied.  I realize
> that, because decisions about whether to release and when to release
> often touch on security issues, not all of this discussion can be
> carried on in public.  But when the cone of secrecy is drawn in so
> tightly that excludes everyone who actually understands the technical
> issues related to the proposed release, we have lost our way, and do
> our users a disservice.
>
> I am not sure whether the solution to this problem is to add more
> people to the core team, or whether the solution is to move release
> timing decisions and committer selection out of the core team to some
> newly-created group.  But I believe that change is needed.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Why does replication need the old history file?

2015-06-12 Thread Michael Paquier
On Fri, Jun 12, 2015 at 4:56 AM, Josh Berkus  wrote:
> Hackers,
>
> Sequence of events:
>
> 1. PITR backup of server on timeline 2.
>
> 2. Restored the backup to a new server, new-master.
>
> 3. Restored the backup to another new server, new-replica.
>
> 4. Started and promoted new-master (now on Timeline 3).
>
> 5. Started new-replica, connecting over streaming to new-master.
>
> 6. Get error message:
>
> 2015-06-11 12:24:14.503 PDT,,,7465,,5579e05e.1d29,1,,2015-06-11 12:24:14
> PDT,,0,LOG,00000,"fetching timeline history file for timeline 2 from
> primary server",""
> 2015-06-11 12:24:14.503 PDT,,,7465,,5579e05e.1d29,2,,2015-06-11 12:24:14
> PDT,,0,FATAL,XX000,"could not receive timeline history file from the
> primary server: ERROR:  could not open file
> ""pg_xlog/00000002.history"": No such file or directory
>
> Questions:
>
> A. Why does the replica need 00000002.history?  Shouldn't it only need
> 00000003.history?

From where is the base backup taken in the case of the node started at step 5?

> B. Did something change in this regard in 9.3.6, 9.3.7 or 9.3.8?  It was
> working in our previous setup, on 9.3.5, although that could have just
> been that the history file hadn't been removed from the backups yet.

At quick glance, I can see nothing in xlog.c between those releases.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind failure by file deletion in source server

2015-06-12 Thread Michael Paquier
On Fri, Jun 12, 2015 at 3:50 PM, Fujii Masao  wrote:
> On Fri, Jun 12, 2015 at 3:17 PM, Michael Paquier
>  wrote:
>> On Thu, Jun 11, 2015 at 5:48 PM, Fujii Masao  wrote:
>>> On Thu, Jun 11, 2015 at 2:14 PM, Michael Paquier
>>>  wrote:
>>>> On Thu, Jun 11, 2015 at 1:51 AM, Fujii Masao  wrote:
>>>>> Shouldn't pg_rewind ignore that failure of operation? If the file is not
>>>>> found in source server, the file doesn't need to be copied to destination
>>>>> server obviously. So ISTM that pg_rewind safely can skip copying that
>>>>> file.
>>>>> Thought?
>>>>
>>>> I think that you should fail. Let's imagine that the master to be
>>>> rewound has removed a relation file before being stopped cleanly after
>>>> its standby has been promoted that was here at the last checkpoint
>>>> before forking, and that the standby still has the relation file after
>>>> promotion. You should be able to copy it to be able to replay WAL on
>>>> it. If the standby has removed a file in the file map after taking the
>>>> file map, I guess that the best thing to do is fail because the file
>>>> that should be here for the rewound node cannot be fetched.
>>>
>>> In this case, why do you think that the file should exist in the old master?
>>> Even if it doesn't exist, ISTM that the old master can safely replay the WAL
>>> records related to the file when it restarts. So what's the problem
>>> if the file doesn't exist in the old master?
>>
>> Well, some user may want to rewind the master down to the point where
>> WAL forked, and then recover it immediately when a consistent point is
>> reached just at restart instead of replugging it into the cluster. In
>> this case I think that you need the relation file of the dropped
>> relation to get a consistent state. That's still cheaper than
>> recreating a node from a fresh base backup in some cases, particularly
>> if the last base backup taken is far in the past for this cluster.
>
> So it's the case where a user wants to recover old master up to the point
> BEFORE the file in question is deleted in new master. At that point,
> since the file must exist, pg_rewind should fail if the file cannot be copied
> from new master. Is my understanding right?

Yep. We are on the same line.

> As far as I read the code of pg_rewind, ISTM that your scenario never happens.
> Because pg_rewind sets the minimum recovery point to the latest WAL location
> in new master, i.e., AFTER the file is deleted. So old master cannot stop
> recovering before the file is deleted in new master. If the recovery stops
> at that point, it fails because the minimum recovery point is not reached yet.
>
> IOW, after pg_rewind runs, the old master has to replay the WAL records
> which were generated by the deletion of the file in the new master.
> So it's okay if the old master doesn't have the file after pg_rewind runs,
> I think.

Ah, right. I withdraw; indeed, what I thought cannot happen:
/*
 * Update control file of target. Make it ready to perform archive
 * recovery when restarting.
 *
 * minRecoveryPoint is set to the current WAL insert location in the
 * source server. Like in an online backup, it's important that we recover
 * all the WAL that was generated while we copied the files over.
 */
So a rewound node will replay WAL up to the current insert location of
the source, and will fail at recovery if recovery target is older than
this insert location.

You want to draft a patch? Should I? I think that we should have a
test case as well in pg_rewind/t/.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] skipping pg_log in basebackup (was Re: pg_basebackup and pg_stat_tmp directory)

2015-06-12 Thread Abhijit Menon-Sen
At 2015-06-11 14:38:03 +0900, langote_amit...@lab.ntt.co.jp wrote:
>
> > On the other hand, I don't like the idea of doing (3) by adding
> > command line arguments to pg_basebackup and adding a new option to
> > the command. I don't think that level of "flexibility" is justified;
> > it would also make it easier to end up with a broken base backup (by
> > inadvertently excluding more than you meant to).
> 
> Maybe a combination of (2) and part of (3). In the absence of any command
> line argument, the behavior is (2), to exclude. Provide an option to
> *include* it (-S/--serverlog).

I don't like that idea any more than having the command-line argument to
exclude pg_log. (And people who store torrented files in PGDATA may like
it even less.)

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers