Re: [HACKERS] 9.4 Proposal: Initdb creates a single table

2014-04-22 Thread Craig Ringer
On 04/23/2014 02:11 PM, Simon Riggs wrote:
> I propose we add a single table called Postgres when we Initdb
> 
>   CREATE TABLE Postgres (Id Integer, Data Jsonb);

Without particular comment on the need for the table, I'd be concerned
about calling it "postgres".

My personal impression from Stack Overflow etc has been that users are
readily confused by the fact that we have:

- Database engine/system "postgres"
- backend binary "postgres" (they see it in ps)
- unix user "postgres"
- Pg superuser "postgres"
- database "postgres"

Sure, there's an argument for running with the theme here, but I suspect
using the name "postgres" for a default table will just muddy the waters
a bit more.

Even "postgres_table" would help.

It *absolutely must* be lower case, whatever it is, IMO. If you're going
for newest-of-the-newbies, the last thing you want is to have them
dealing with a name that is just postgres in some places but has to be
quoted as "Postgres" in others.

Personally, I don't know that I'm convinced it's worth doing - but I
think it's silly to dismiss it without actually corralling some users
who are unfamiliar with Pg and watching them get started. I'd love to
see some properly conducted usability studies of Pg, and something like
this would fit in well.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] 9.4 Proposal: Initdb creates a single table

2014-04-22 Thread Tom Lane
Simon Riggs  writes:
> By now, some of you will be doubled over laughing as if this is an
> April fool joke.

Indeed.

regards, tom lane




[HACKERS] 9.4 Proposal: Initdb creates a single table

2014-04-22 Thread Simon Riggs
We start with a database called Postgres and a schema called Public.
Yet we don't start up with any usable tables.

I propose we add a single table called Postgres when we Initdb

  CREATE TABLE Postgres (Id Integer, Data Jsonb);
  COMMENT ON TABLE Postgres IS 'Single table for quick start usage -
design your database';

The purpose of this is to make the database immediately usable.
Including this table in the default initdb means that programs can rely
on its existence and begin working more quickly.

By now, some of you will be doubled over laughing as if this is an
April fool joke.  I don't mean it to be at all.

The idea is to have a stupidly obvious and easy table that will
potentially be usable by just about everyone, in any language.

If you don't like it, don't use it. If you really dislike it, drop it.

But new people coming to Postgres will have a data object to access,
and can begin using the database immediately. Their code will work,
their examples will work. OK, so they will need to go back and think
about the design, but at least they got it to work and will be
encouraged to do more.
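
A minimal first session against such a table might look like this (an
illustrative sketch, not part of the proposal; note that the unquoted
name folds to lower case):

  INSERT INTO postgres (id, data) VALUES (1, '{"greeting": "hello"}');
  SELECT data->>'greeting' FROM postgres WHERE id = 1;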

Remember when we didn't have a database called Postgres? Remember how
much simpler life is now? Remember that now.

We can also insert a single row, Id = 0 with "Postgres sample data" in
it, but that seems a step too far.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] How can we make beta testing better?

2014-04-22 Thread Jehan-Guillaume de Rorthais
On Thu, 17 Apr 2014 16:42:21 -0700
Josh Berkus  wrote:

> On 04/15/2014 09:53 PM, Rod Taylor wrote:
> > A documented beta test process/toolset which does the following would help:
> > 1) Enables full query logging
> > 2) Creates a replica of a production DB, record $TIME when it stops.
> > 3) Allow user to make changes (upgrade to 9.4, change hardware, change
> > kernel settings, ...)
> > 4) Plays queries from the CSV logs starting from $TIME mimicking actual
> > timing and transaction boundaries
> > 
> > If Pg can make it easy to duplicate activities currently going on in
> > production inside another environment, I would be pleased to fire a couple
> > billion queries through it over the next few weeks.
> > 
> > #4 should include reporting useful to the project, such as a sampling of
> > queries which performed significantly worse and a few relative performance
> > stats for overall execution time.
> 
> So we have some software we've been procrastinating on OSS'ing, which does:
> 
> 1) Takes full query CSV logs from a running postgres instance
> 2) Runs them against a target instance in parallel
> 3) Records response times for all queries
> 
> tsung and pgreplay also do this, but have some limitations which make
> them impractical for a general set of logs to replay.

I've been working on another tool able to replay scenarios recorded directly
from a network dump (see [pgshark]). It works, it can be totally transparent
from the application's point of view, the tcpdump can run anywhere, and
**ALL** the real traffic can be replayed... but it needs some more work for
reporting and for handling parallel sessions. The drawback of using libpcap
is that you can lose packets while capturing, and even a very large capture
buffer cannot keep you safe for hours of high-speed traffic. So it might
require multiple captures, and adjusting the buffer size, to capture 100% of
the traffic over the required period.

I tried to quickly write a simple proxy using Perl POE to capture ALL the
traffic safely. My POC did nothing but forward packets, and IIRC a 30s
stress test with 10 or 20 sessions using pgbench showed a drop of ~60% in
performance. But it was a very quick, single-process, single-threaded POC.

Maybe another path would be to generate this traffic dump from PostgreSQL
itself (which only has the application level to deal with), in a format we
can feed to pgbench.

> What it would need is:
> 
> A) scripting around coordinated backups
> B) Scripting for single-command runs, including changing pg.conf to
> record data.

Changing postgresql.conf is pretty easy with ALTER SYSTEM now. But I'm sure
we all have some scripts out there doing this (at least I do)
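
For example (a sketch; these particular settings are just for illustration,
turning on full statement logging for a capture):

  ALTER SYSTEM SET log_min_duration_statement = 0;  -- log every statement
  SELECT pg_reload_conf();                          -- apply without a restart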

> C) tools to *analyze* the output data, including error messages.

That's what I lack in pgshark so far.

[pgshark] https://github.com/dalibo/pgshark

Cheers,
-- 
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com




Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread YAMAMOTO Takashi
> - NetBSD: crashes under load; this could have been fixed but when I ran the
>   benchmarks in 2012 none of the developers seemed to care.

do you mean this?
https://mail-index.netbsd.org/tech-kern/2012/08/29/msg013918.html

YAMAMOTO Takashi




Re: [HACKERS] pg_upgrade and epoch

2014-04-22 Thread Sergey Konoplev
On Tue, Apr 22, 2014 at 8:08 PM, Sergey Burladyan  wrote:
> On Wed, Apr 23, 2014 at 6:38 AM, Sergey Konoplev  wrote:
>> BTW, I didn't manage to make a test case yet. Recently, when I was
>> migrating several servers to skytools3 and upgrading from 9.0 to 9.2,
>> I noticed that the epoch was copied and the timeline id was >0 after the upgrade, but
> ...

Sorry, just noticed a typo: * timeline id = next xid

> This is strange; if I am not mistaken, the XID is copied by copy_clog_xlog_xid(void):
> http://doxygen.postgresql.org/pg__upgrade_8c_source.html#l00398
> and there is no epoch (-e XIDEPOCH) in the pg_resetxlog call args

I can only guess here.

+ Bruce Momjian

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [HACKERS] pg_upgrade and epoch

2014-04-22 Thread Sergey Burladyan
On Wed, Apr 23, 2014 at 6:38 AM, Sergey Konoplev  wrote:

>
> BTW, I didn't manage to make a test case yet. Recently, when I was
> migrating several servers to skytools3 and upgrading from 9.0 to 9.2,
> I noticed that the epoch was copied and the timeline id was >0 after the upgrade, but
>
...

This is strange; if I am not mistaken, the XID is copied by copy_clog_xlog_xid(void):
http://doxygen.postgresql.org/pg__upgrade_8c_source.html#l00398
and there is no epoch (-e XIDEPOCH) in the pg_resetxlog call args.

I test it like this:

# initialize an old (9.2) and a new (9.3) cluster
~/inst/pg9.2/bin/initdb d9.2
/usr/lib/postgresql/9.3/bin/initdb d9.3
# start the old cluster once, then stop it with ^C
~/inst/pg9.2/bin/postgres -D d9.2 -p 5433
^C
# force the old cluster's xid epoch to 8
~/inst/pg9.2/bin/pg_resetxlog -e 8 d9.2/
~/inst/pg9.2/bin/postgres -D d9.2 -p 5433
psql -At -h localhost -p 5433 -c 'select txid_current()' postgres
34359739064
^C
# run pg_upgrade, then start the new cluster
/usr/lib/postgresql/9.3/bin/pg_upgrade -d d9.2 -D d9.3 -b ~/inst/pg9.2/bin \
    -B /usr/lib/postgresql/9.3/bin -k
/usr/lib/postgresql/9.3/bin/postgres -D d9.3 -p 5433 -k`pwd`/s
psql -At -h localhost -p 5433 -c 'select txid_current()' postgres
756

34359739064 switched to 756 after the upgrade.
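
(For reference, txid_current() returns epoch * 2^32 + xid: 8 * 4294967296 +
696 = 34359739064, so the pre-upgrade value carried epoch 8, while the
post-upgrade 756 implies the epoch was reset to 0.)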

-- 
Sergey Burladyan


Re: [HACKERS] pg_upgrade and epoch

2014-04-22 Thread Sergey Konoplev
On Tue, Apr 22, 2014 at 6:33 PM, Sergey Burladyan  wrote:
> pg_upgrade currently copies the XID into the new cluster, but not its epoch. Why?
>
> Without the epoch from the old cluster, txid_current() in the upgraded database
> returns a lower value than before the upgrade. This breaks, for example, PgQ,
> and must be fixed by hand after the upgrade with pg_resetxlog.
>
> PS: see
> http://lists.pgfoundry.org/pipermail/skytools-users/2014-April/001812.html

BTW, I didn't manage to make a test case yet. Recently, when I was
migrating several servers to skytools3 and upgrading from 9.0 to 9.2,
I noticed that the epoch was copied and the timeline id was >0 after the
upgrade, but skytools3 sometimes still didn't like it. Note the
"sometimes": in some cases everything was okay, but in some it wasn't.
I still can't explain this, but incrementing the timeline id always helped.

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com




Re: [HACKERS] Missing pfree in logical_heap_rewrite_flush_mappings()

2014-04-22 Thread Tom Lane
Bruce Momjian  writes:
> I had to revert this patch.  It causes a failure in the
> /contrib/test_decoding regression test.

On closer inspection, it was simply pfree'ing the wrong pointer.

I fixed that and also undid the allocation in a different memory
context, which didn't seem to be a particularly good idea, unless
you've got a specific reason why CurrentMemoryContext would be the
wrong place for a transient allocation.

regards, tom lane




Re: [HACKERS] Missing pfree in logical_heap_rewrite_flush_mappings()

2014-04-22 Thread Bruce Momjian
On Tue, Apr 22, 2014 at 06:05:53PM -0400, Bruce Momjian wrote:
> On Wed, Mar 26, 2014 at 06:29:38PM +0200, Ants Aasma wrote:
> > It seems to me that when flushing logical mappings to disk, each
> > mapping file leaks the buffer used to pass the mappings to XLogInsert.
> > Also, it seems consistent to allocate that buffer in the RewriteState
> > memory context. Patch attached.
> 
> Patch applied.

I had to revert this patch.  It causes a failure in the
/contrib/test_decoding regression test.


-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




[HACKERS] pg_upgrade and epoch

2014-04-22 Thread Sergey Burladyan
Hi All!

pg_upgrade currently copies the XID into the new cluster, but not its epoch. Why?

Without the epoch from the old cluster, txid_current() in the upgraded database
returns a lower value than before the upgrade. This breaks, for example, PgQ,
and must be fixed by hand after the upgrade with pg_resetxlog.

PS: see
http://lists.pgfoundry.org/pipermail/skytools-users/2014-April/001812.html

-- 
Sergey Burladyan


Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Mark Kirkwood

On 23/04/14 00:19, Andres Freund wrote:
> Hi,
>
> Attached you can find a short (compile-tested only) patch implementing
> a 'shared_memory_type' GUC, akin to 'dynamic_shared_memory_type'. It will
> only apply to 9.4, not 9.3, but it should be easy to convert for it.

I have just tried this out (on Ubuntu 14.04 rather than FreeBSD, as that is
what I happened to be running); it certainly works for me (a big shared
memory segment when I set it to 'sysv', only a tiny one when I use 'mmap').


The regression tests pass in both cases.
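
For anyone else trying it, switching implementations might look like this
(a sketch assuming the patch's GUC name; a server restart is needed for the
change to take effect):

  ALTER SYSTEM SET shared_memory_type = 'sysv';   -- or 'mmap'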

regards

Mark





Re: [HACKERS] Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-04-22 Thread Tom Lane
Bruce Momjian  writes:
> Where are we on the default JSONB opclass change?

Not sure.  I'm for changing it, I think, but it wasn't at all clear
that we had consensus on that.  We did not have a proposed new name
for the opclass either ...

regards, tom lane




Re: [HACKERS] [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

2014-04-22 Thread Tom Lane
[ redirecting to -hackers ]

Tomas Vondra  writes:
> So my plan was to do something like this:

>   sample_append(internal, anyelement, int) -> internal
>   sample_final(internal) -> anyarray

>   CREATE AGGREGATE sample_agg(anyelement, int) (
>   SFUNC = sample_append,
>   STYPE = internal,
>   FINALFUNC = sample_final
>   );

> However this leads to
>   ERROR:  cannot determine result data type
>   DETAIL:  A function returning a polymorphic type must have at least
>one polymorphic argument

> because 'sample_final' produces anyarray but has no polymorphic
> argument.

Yeah, this is a problem with trying to use internal stype for polymorphic
aggregates.

The same problem came up in connection with the "ordered set" aggregates
that were added recently, and that patch implemented an interesting
workaround: the final function for an OSA gets additional dummy arguments
of the same type as the aggregate inputs.  They are always passed as NULLs
at runtime, and have no real value except if the aggregate is polymorphic
--- but when it is, they provide a way to resolve the result type of a
polymorphic final function, even if the state type is "internal" or
otherwise non-polymorphic.

I thought at the time that maybe we should offer this feature for regular
aggregates as well as ordered-set ones, but didn't do anything about
it because there hadn't been demand.  If we did have it, you could solve
this problem with

  sample_append(internal, anyelement, int) -> internal
  sample_final(internal, anyelement, int) -> anyarray

  CREATE AGGREGATE sample_agg(anyelement, int) (
  SFUNC = sample_append,
  STYPE = internal,
  FINALFUNC = sample_final
  );

where sample_final would have to be declared non-strict (since it'd always
be getting some NULL arguments), but that's a small price to pay.

I think it'd be a pretty small adjustment to the already-committed
code to allow this to happen.  Basically we'd just have to decouple
the extra-arguments-to-finalfn behavior from ordered-set aggregates.

One potential issue though is that if sample_final existed in both
signatures it wouldn't be very clear which one got selected for the
aggregate.  Perhaps the best fix would be to invent a different
CREATE AGGREGATE keyword for finalfns with extra arguments?  If
so, that's something we ought to do *now*, not in 9.5, because
it'll be too late to redefine how to create OSAs once 9.4 ships.
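
As a concrete sketch (the keyword name below is purely invented for
illustration):

  CREATE AGGREGATE sample_agg(anyelement, int) (
  SFUNC = sample_append,
  STYPE = internal,
  FINALFUNC = sample_final,
  FINALFUNC_EXTRA   -- hypothetical: pass the aggregate inputs to the finalfn
  );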

Thoughts?

regards, tom lane




[HACKERS] What use case is make_tuple_indirect() supposed to illustrate?

2014-04-22 Thread Tom Lane
Pursuant to
http://www.postgresql.org/message-id/29007.1396038...@sss.pgh.pa.us
I've been working on a patch to prevent external toast pointers from
appearing in composite Datums.  I noticed that this patch completely
breaks the make_tuple_indirect() test case added by commit 36820250.
The regression test doesn't fail (meaning the test case fails to
actually prove that any indirection is happening), but it certainly
isn't doing what's intended, because the "indirect" pointers get
flattened out of the tuple returned by make_tuple_indirect() before
control ever leaves the function.

Even in the code as it stands in HEAD, such indirect pointers would
get flattened out of other container types such as arrays and ranges.

And for that matter, it's a bit silly to be testing make_tuple_indirect
in a BEFORE INSERT/UPDATE trigger, because even if the tuple gets out
of the trigger without being flattened, it will certainly get flattened
mere nanoseconds later before it gets written out to disk.  (If it did
not, the test case would fail altogether, since the indirect values
in memory only survive for the length of the current transaction.)

So I'm wondering exactly what use-case this test is supposed to represent.

Or is the whole thing just a toy anyway?  Because the more I look at that
patch, the less it looks like it could do anything useful, short of adding
a ton of infrastructure that's not there now.

regards, tom lane




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Josh Berkus
On 04/22/2014 05:07 PM, Alvaro Herrera wrote:
> If you want to make it easier to reproduce, you need to insert some
> pg_usleep() calls in carefully selected spots.  As Andres says, the
> window is small normally.

Yeah, but the whole point of this is that having
pg_stat_statements/auto_explain loaded should make the window much
bigger.  In the cases where we're hitting it, we're hitting it with a
fair degree of predictability.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Alvaro Herrera
Andres Freund wrote:
> On 2014-04-22 18:01:40 -0300, Alvaro Herrera wrote:
> > Thanks for the analysis and patches.  I've been playing with this on my
> > own a bit, and one thing that I just noticed is that at least for
> > heap_update I cannot reproduce a problem when the xmax is originally a
> > multixact, so AFAICT the number of places that need patched aren't as
> > many.
> 
> I am quite uncomfortable with that assumption. I don't immediately see a
> problem for some of the cases, but leaving them in a weaker state than
> 9.2 makes me uncomfortable.

That's true too.

I'm thinking about the comparison of the full infomask, as you propose,
instead of just the bits that we actually care about.  I think the only
thing that could cause a spurious failure (causing an extra execution of
the HeapTupleSatisfiesUpdate call and the stuff below) is somebody
setting HEAP_XMIN_COMMITTED concurrently; but that seems infrequent
enough that it should be pretty harmless.  However, should we worry about
possible future infomask bit changes that could negatively affect this
behavior?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Alvaro Herrera
Josh Berkus wrote:
> 
> >> In order to encounter this issue, I'd need to have two concurrent
> >> processes update the child records of the same parent record?  That is:
> >>
> >> A ---> B1
> >>   \---> B2
> >>
> >> ... and the issue should only happen if I update both B1 and B2
> >> concurrently in separate sessions?
> > 
> > I don't think that'll trigger it. You need rows that are first key share
> > locked and then updated by the locking transaction. Under
> > concurrency. And the timewindow really is rather small..
> 
> Well, currently I have a test which locks A and B1, then updates B1
> (twice, actually), and then updates A.  However, since there's a lock on
> A, there's no concurrent updating of B1 and B2. This is based on the
> behavior of the queue where I originally saw the problem, but it doesn't
> reproduce the bug.

If you want to make it easier to reproduce, you need to insert some
pg_usleep() calls in carefully selected spots.  As Andres says, the
window is small normally.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Andrew Dunstan


On 04/22/2014 06:43 PM, Mark Wong wrote:
> On Tue, Apr 22, 2014 at 10:06 AM, Joshua D. Drake wrote:
>> On 04/22/2014 08:26 AM, Andrew Dunstan wrote:
>>> I'm going away tomorrow for a few days R&R. When I'm back next week I
>>> will set up a demo client running this module. If you can have a
>>> machine prepped for this purpose by then so much the better, otherwise
>>> I will have to drag out a box I recently rescued and have been waiting
>>> for something to use it with. It's more important that it's stable
>>> (i.e. nothing else running on it) than that it's very powerful. It
>>> could be running Ubuntu or some Redhattish variant or, yes, even
>>> FreeBSD.
>>
>> This is best handled by Mark. Mark, can you help Andrew with this?
>> I assume we would use the DL385 with the MS70?
>
> Yeah, I can help.  But let me know if Alfred's offer is preferred.



I don't think they are mutually exclusive, but I'd rather start off with
one machine. I would find it easiest if it were on something like
CentOS 6.5.

When we have that running and reporting like we want, we can add a
FreeBSD server.

The idea is that these machines would be available for a long time,
ideally quite a few years. We want to build a stable time series of
performance data on them, so that when something disturbs the
performance it sticks out like a sore thumb.


cheers

andrew




Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Mark Wong
On Tue, Apr 22, 2014 at 10:06 AM, Joshua D. Drake wrote:

>
> On 04/22/2014 08:26 AM, Andrew Dunstan wrote:
>
>  I'm going away tomorrow for a few days R&R. when I'm back next week I
>> will set up a demo client running this module. If you can have a machine
>> prepped for this purpose by then so much the better, otherwise I will
>> have to drag out a box I recently rescued and have been waiting for
>> something to use it with. It's more important that it's stable (i.e.
>> nothing else running on it) than that it's very powerful. It could be
>> running Ubuntu or some Redhattish variant or, yes, even FreeBSD.
>>
>
> This is best handled by Mark. Mark can you help Andrew with this? I assume
> we would use the DL385 with the MS70?
>

Yeah, I can help.  But let me know if Alfred's offer is preferred.

Regards,
Mark


Re: [HACKERS] Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-04-22 Thread Peter Geoghegan
On Tue, Apr 22, 2014 at 3:32 PM, Bruce Momjian  wrote:
> Where are we on the default JSONB opclass change?

FWIW, I still don't have any strong opinion here. I defer to others on
this question.


-- 
Peter Geoghegan




Re: [HACKERS] [DOCS] [doc] EXPLAIN CREATE MATERIALIZED VIEW AS?

2014-04-22 Thread Bruce Momjian
On Wed, Apr 16, 2014 at 12:35:26PM +0900, Amit Langote wrote:
> Hi,
> 
> Attached adds CREATE MATERIALIZED VIEW AS to the list of statements
> that can be EXPLAINed.

Patch applied.  Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-04-22 Thread Bruce Momjian
On Wed, Apr  9, 2014 at 02:22:54PM -0400, Greg Stark wrote:
> On Wed, Apr 9, 2014 at 11:24 AM, Tom Lane  wrote:
> >> Maybe we should make *neither* of these the default opclass, and give
> >> *neither* the name json_ops.
> >
> > There's definitely something to be said for that.  Default opclasses are
> > sensible when there's basically only one behavior that's interesting for
> > most people.  We can already see that that's not going to be the case
> > for jsonb indexes, at least not with the currently available alternatives.
> >
> > Not having a default would force users to make decisions explicitly.
> > Is that what we want?
> 
> I don't like the idea of having no default opclass. I think there's a
> huge usability gain in being able to "just" create an index on a
> column and have it do something reasonable for most use cases.
> 
> I can get behind the idea of having separate index opclasses for paths
> and path-value pairs but I suspect the default should just be to index
> both in the same index. If we can have one default index opclass that
> supports containment and existence and then other opclasses that are
> smaller but only support a subset of the operators that would seem
> like the best compromise.
> 
> I'm a bit confused by Heikki's list though. I would expect path and
> path-value pair to be the only useful ones. I'm not clear what an
> index on keys or key-value would be -- it would index just the
> top-level keys and values without recursing?

Where are we on the default JSONB opclass change?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] MultiXactId error after upgrade to 9.3.4

2014-04-22 Thread Bruce Momjian
On Mon, Mar 31, 2014 at 09:36:03AM -0400, Stephen Frost wrote:
> Andres,
> 
> * Andres Freund (and...@2ndquadrant.com) wrote:
> > Without having looked at the code, IIRC this looks like some place
> > misses passing allow_old=true where it's actually required. Any chance
> > you can get a backtrace for the error message? I know you said somewhere
> > below that you'd worked around the problem, but maybe you have a copy of
> > the database somewhere?
>
>   Looks like your idea that it has to do w/ freezing is accurate...

Where are we on this?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] [patch] Adding EXTRA_REGRESS_OPTS to all pg_regress invocations

2014-04-22 Thread Bruce Momjian
On Thu, Mar 27, 2014 at 06:03:24PM +0100, Christoph Berg wrote:
> Re: Bruce Momjian 2013-12-04 <20131204151533.gb17...@momjian.us>
> > On Mon, May  6, 2013 at 11:51:47PM -0700, Christoph Berg wrote:
> > > "make check" supports EXTRA_REGRESS_OPTS to pass extra options to
> > > pg_regress, but all the other places where pg_regress is used do not
> > > allow this. The attached patch adds EXTRA_REGRESS_OPTS to
> > > Makefile.global.in (for contrib modules) and two more special
> > > Makefiles (isolation and pg_upgrade).
> > > 
> > > The use case here is that Debian needs to be able to redirect the unix
> > > socket directory used to /tmp, because /var/run/postgresql isn't
> > > writable for the buildd user. The matching part for this inside
> > > pg_regress is still in discussion here, but the addition of
> > > EXTRA_REGRESS_OPTS is an independent step that is also useful for
> > > others, so I'd like to propose it for inclusion.
> > 
> > Thanks, patch applied.  This will appear in PG 9.4.  I suppose we could
> > backpatch this but I would need community feedback on that.
> 
> Thanks for pushing this. In the meantime, a new bit has appeared:
> The new contrib/test_decoding checks make use of the
> pg_isolation_regress_check macros (which the isolation test itself
> doesn't). These macros also need EXTRA_REGRESS_OPTS, on top of
> 86ef4796f5120c55d1a48cfab52e51df8ed271b5:

Applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Missing pfree in logical_heap_rewrite_flush_mappings()

2014-04-22 Thread Bruce Momjian
On Wed, Mar 26, 2014 at 06:29:38PM +0200, Ants Aasma wrote:
> It seems to me that when flushing logical mappings to disk, each
> mapping file leaks the buffer used to pass the mappings to XLogInsert.
> Also, it seems consistent to allocate that buffer in the RewriteState
> memory context. Patch attached.

Patch applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Alfred Perlstein


On 4/22/14, 8:26 AM, Andrew Dunstan wrote:
> On 04/22/2014 01:36 AM, Joshua D. Drake wrote:
>> On 04/21/2014 06:19 PM, Andrew Dunstan wrote:
>>> If we never start we'll never get there.
>>>
>>> I can think of several organizations that might be approached to
>>> donate hardware.
>>
>> Like .Org?
>>
>> We have a hardware farm, a rack full of hardware and spindles. It
>> isn't the most current but it is there.
>
> I'm going away tomorrow for a few days R&R. When I'm back next week I
> will set up a demo client running this module. If you can have a
> machine prepped for this purpose by then so much the better, otherwise
> I will have to drag out a box I recently rescued and have been waiting
> for something to use it with. It's more important that it's stable
> (i.e. nothing else running on it) than that it's very powerful. It
> could be running Ubuntu or some Redhattish variant or, yes, even
> FreeBSD.
>
> cheers
>
> andrew




Hey folks, I just spoke with our director of netops, Tom Sparks, here at
Norse, and we have a vested interest in PostgreSQL. We can throw
together a cluster of 4 machines with specs approximately in the range
of dual quad-core Westmere with ~64GB of RAM, running FreeBSD 10 or 11.
We can also do an Ubuntu install, or another Linux distro. Please let me
know if this would be something the project could make use of.

We also have colo space and power, etc. So this would be the whole
deal. The cluster would be up for as long as needed.

Are the machine specs sufficient?  Any other things we should look for?

I've CC'd Tom on this email.

-Alfred




Re: [HACKERS] Patch for CREATE RULE sgml -- Was in: [DOCS]

2014-04-22 Thread Bruce Momjian
On Mon, Mar 24, 2014 at 09:51:07PM +0900, Fujii Masao wrote:
> On Sat, Mar 22, 2014 at 12:56 AM, Emanuel Calvo
>  wrote:
> > Hi guys,
> >
> > I realized that the synopsis of CREATE RULE does not give a detailed
> > breakdown of the "events" parameter.
> >
> > But the question here is that I'm not sure which format follow:
> >
> > { INSERT | UPDATE | DELETE | SELECT}
> >
> > or
> >
> > INSERT
> > UPDATE
> > DELETE
> > SELECT
> >
> >
> > I attach a patch for each one.
> 
> Though I'm not sure of the right policy for the format of the synopsis,
> ISTM that the following format is suitable in this case, i.e., when the
> value list is very simple.
> Patch attached.
> 
> SELECT | INSERT | UPDATE | DELETE

Patch applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Review: Patch FORCE_NULL option for copy COPY in CSV mode

2014-04-22 Thread Michael Paquier
On Wed, Apr 23, 2014 at 5:07 AM, Bruce Momjian  wrote:

> On Fri, Mar  7, 2014 at 05:08:54PM +0900, Michael Paquier wrote:
> > On Thu, Mar 6, 2014 at 12:09 AM, Tom Lane  wrote:
> > > Andrew Dunstan  writes:
> > >> On 03/05/2014 09:11 AM, Michael Paquier wrote:
> > >>> After testing this feature, I noticed that FORCE_NULL and
> > >>> FORCE_NOT_NULL can both be specified with COPY on the same column.
> > >
> > >> Strictly they are not actually contradictory, since FORCE NULL relates
> > >> to quoted null strings and FORCE NOT NULL relates to unquoted null
> > >> strings. Arguably the docs are slightly loose on this point. Still,
> > >> applying both FORCE NULL and FORCE NOT NULL to the same column would be
> > >> rather perverse, since it would result in a quoted null string becoming
> > >> null and an unquoted null string becoming not null.
> > >
> > > Given the remarkable lack of standardization of "CSV" output, who's
> > > to say that there might not be data sources out there for which this
> > > is the desired behavior?  It's weird, I agree, but I think throwing
> > > an error for the combination is not going to be helpful.  It's not
> > > like somebody might accidentally write both on the same column.
> > >
> > > +1 for clarifying the docs, though, more or less in the words you
> > > used above.
> > Following that, I have hacked the patch attached to update the docs
> > with an additional regression test (actually replaces a test that was
> > the same as the one before in copy2).
> >
> > I am attaching as well a second patch for file_fdw, to allow the use
> > of force_null and force_not_null on the same column, to be consistent
> > with COPY.
> > Regards,
>
> Correction, this is the patch applied, not the earlier version.
>
Thanks for taking the time to look at that.
-- 
Michael


Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Andres Freund
On 2014-04-22 14:49:00 -0700, Josh Berkus wrote:
> 
> >> In order to encounter this issue, I'd need to have two concurrent
> >> processes update the child records of the same parent record?  That is:
> >>
> >> A ---> B1
> >>   \---> B2
> >>
> >> ... and the issue should only happen if I update both B1 and B2
> >> concurrently in separate sessions?
> > 
> > I don't think that'll trigger it. You need rows that are first key share
> > locked and then updated by the locking transaction. Under
> > concurrency. And the timewindow really is rather small..
> 
> Well, currently I have a test which locks A and B1, then updates B1
> (twice, actually), and then updates A.  However, since there's a lock on
> A, there's no concurrent updating of B1 and B2. This is based on the
> behavior of the queue where I originally saw the problem, but it doesn't
> reproduce the bug.
> 
> I'm thinking I need to just lock B1, update B1, then A, while allowing a
> concurrent session to update B2 and A.  No?

I don't think this gets any easier to reproduce by introducing more than
one relation. Have one session acquire key share locks and then update,
and another one just doing updates.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Josh Berkus

>> In order to encounter this issue, I'd need to have two concurrent
>> processes update the child records of the same parent record?  That is:
>>
>> A ---> B1
>>   \---> B2
>>
>> ... and the issue should only happen if I update both B1 and B2
>> concurrently in separate sessions?
> 
> I don't think that'll trigger it. You need rows that are first key share
> locked and then updated by the locking transaction. Under
> concurrency. And the timewindow really is rather small..

Well, currently I have a test which locks A and B1, then updates B1
(twice, actually), and then updates A.  However, since there's a lock on
A, there's no concurrent updating of B1 and B2. This is based on the
behavior of the queue where I originally saw the problem, but it doesn't
reproduce the bug.

I'm thinking I need to just lock B1, update B1, then A, while allowing a
concurrent session to update B2 and A.  No?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Andres Freund
On 2014-04-22 18:01:40 -0300, Alvaro Herrera wrote:
> Thanks for the analysis and patches.  I've been playing with this on my
> own a bit, and one thing that I just noticed is that at least for
> heap_update I cannot reproduce a problem when the xmax is originally a
> multixact, so AFAICT the number of places that need to be patched isn't
> as large.

I am quite uncomfortable with that assumption. I don't immediately see a
problem for some of the cases, but leaving them in a weaker state than
9.2 makes me uncomfortable.

> For now I offer a cleaned up version of your patch to add the assertion
> that multis don't contain multiple updates.  I considered the idea of
> making this #ifdef USE_ASSERT_CHECKING, because it has to walk the
> complete array of members; and then have full elogs in MultiXactIdExpand
> and MultiXactIdCreate, which are lighter because they can check more
> easily.  But on second thoughts I refrained from doing that, because
> surely the arrays are not as large anyway, are they.

Yea, I think it's fine to do it where it's in the proposed patch.

> I think I should push this patch first, so that Andrew and Josh can try
> their respective test cases which should start throwing errors, then
> push the actual fixes.  Does that sound okay?

+1

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Andres Freund
On 2014-04-22 14:40:46 -0700, Josh Berkus wrote:
> On 04/22/2014 02:01 PM, Alvaro Herrera wrote:
> > Some testing later, I think the issue only occurs if we determine that
> > we don't need to wait for the xid/multi to complete, because otherwise
> > the wait itself saves us.  (It's easy to cause the problem by adding a
> > breakpoint in heapam.c:3325, i.e. just before re-acquiring the buffer
> > lock, and then having transaction A lock for key share, then transaction
> > B update the tuple which stops at the breakpoint, then transaction A
> > also update the tuple, and finally release transaction B).
> 
> So, trying to make my synthetic test work:
> 
> In order to encounter this issue, I'd need to have two concurrent
> processes update the child records of the same parent record?  That is:
> 
> A ---> B1
>   \---> B2
> 
> ... and the issue should only happen if I update both B1 and B2
> concurrently in separate sessions?

I don't think that'll trigger it. You need rows that are first key share
locked and then updated by the locking transaction. Under
concurrency. And the timewindow really is rather small..

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Andres Freund
On 2014-04-22 17:36:42 -0400, Andrew Dunstan wrote:
> 
> On 04/22/2014 05:20 PM, Josh Berkus wrote:
> >On 04/22/2014 02:01 PM, Alvaro Herrera wrote:
> >>I think I should push this patch first, so that Andrew and Josh can try
> >>their respective test cases which should start throwing errors, then
> >>push the actual fixes.  Does that sound okay?
> >Note that I have a limited ability to actually test my failing test case
> >-- I have to fire up the user's full application test suite to produce
> >it, and that needs to be scheduled.  However, Andrew is able to rerun
> >his test case repeatedly.
> >
> 
> Alvaro has access (as does Andres) to my setup, and instructions on how to
> run the test. I am going to be offline most of the time from tonight until
> next Tuesday.

My preliminary patch seems to have fixed the problem in your setup...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Josh Berkus
On 04/22/2014 02:01 PM, Alvaro Herrera wrote:
> Some testing later, I think the issue only occurs if we determine that
> we don't need to wait for the xid/multi to complete, because otherwise
> the wait itself saves us.  (It's easy to cause the problem by adding a
> breakpoint in heapam.c:3325, i.e. just before re-acquiring the buffer
> lock, and then having transaction A lock for key share, then transaction
> B update the tuple which stops at the breakpoint, then transaction A
> also update the tuple, and finally release transaction B).

So, trying to make my synthetic test work:

In order to encounter this issue, I'd need to have two concurrent
processes update the child records of the same parent record?  That is:

A ---> B1
  \---> B2

... and the issue should only happen if I update both B1 and B2
concurrently in separate sessions?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Andrew Dunstan


On 04/22/2014 05:20 PM, Josh Berkus wrote:

On 04/22/2014 02:01 PM, Alvaro Herrera wrote:

I think I should push this patch first, so that Andrew and Josh can try
their respective test cases which should start throwing errors, then
push the actual fixes.  Does that sound okay?

Note that I have a limited ability to actually test my failing test case
-- I have to fire up the user's full application test suite to produce
it, and that needs to be scheduled.  However, Andrew is able to rerun
his test case repeatedly.



Alvaro has access (as does Andres) to my setup, and instructions on how 
to run the test. I am going to be offline most of the time from tonight 
until next Tuesday.


cheers

andrew






Re: [HACKERS] Regression test errors

2014-04-22 Thread Bruce Momjian
On Sun, Mar  9, 2014 at 09:23:33AM -0300, Martín Marqués wrote:
> OK, I noticed how horrible this patch was (thanks for the heads-up from
> Jaime Casanova). This is what happens when, after a day of lots of work,
> you try to fetch changes made on a test copy back into a git repository:
> you just make very silly mistakes.
> 
> Well, now I have the changes right (I tested the patch, because silly
> changes should be tested as well ;)).

Patch applied to head.  Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Josh Berkus
On 04/22/2014 02:01 PM, Alvaro Herrera wrote:
> I think I should push this patch first, so that Andrew and Josh can try
> their respective test cases which should start throwing errors, then
> push the actual fixes.  Does that sound okay?

Note that I have a limited ability to actually test my failing test case
-- I have to fire up the user's full application test suite to produce
it, and that needs to be scheduled.  However, Andrew is able to rerun
his test case repeatedly.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] assertion failure 9.3.4

2014-04-22 Thread Alvaro Herrera
Andres Freund wrote:
> On 2014-04-21 19:43:15 -0400, Andrew Dunstan wrote:
> > 
> > On 04/21/2014 02:54 PM, Andres Freund wrote:
> > >Hi,
> > >
> > >I spent the last two hours poking arounds in the environment Andrew
> > >provided and I was able to reproduce the issue, find a assert to
> > >reproduce it much faster and find a possible root cause.
> > 
> > 
> > What's the assert that makes it happen faster? That might help a lot in
> > constructing a self-contained test.
> 
> Assertion and *preliminary*, *hacky* fix attached.

Thanks for the analysis and patches.  I've been playing with this on my
own a bit, and one thing that I just noticed is that at least for
heap_update I cannot reproduce a problem when the xmax is originally a
multixact, so AFAICT the number of places that need to be patched isn't
as large.

Some testing later, I think the issue only occurs if we determine that
we don't need to wait for the xid/multi to complete, because otherwise
the wait itself saves us.  (It's easy to cause the problem by adding a
breakpoint in heapam.c:3325, i.e. just before re-acquiring the buffer
lock, and then having transaction A lock for key share, then transaction
B update the tuple which stops at the breakpoint, then transaction A
also update the tuple, and finally release transaction B).
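
Spelled out as a sketch (table and column names invented; "breakpoint"
refers to the heapam.c:3325 stop described above):

  -- session A
  BEGIN;
  SELECT * FROM t WHERE id = 1 FOR KEY SHARE;

  -- session B, which stops at the breakpoint
  UPDATE t SET val = val + 1 WHERE id = 1;

  -- session A again, then finally release session B
  UPDATE t SET val = val + 1 WHERE id = 1;
  COMMIT;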

For now I offer a cleaned up version of your patch to add the assertion
that multis don't contain multiple updates.  I considered the idea of
making this #ifdef USE_ASSERT_CHECKING, because it has to walk the
complete array of members; and then have full elogs in MultiXactIdExpand
and MultiXactIdCreate, which are lighter because they can check more
easily.  But on second thoughts I refrained from doing that, because
surely the arrays are not as large anyway, are they.

I think I should push this patch first, so that Andrew and Josh can try
their respective test cases which should start throwing errors, then
push the actual fixes.  Does that sound okay?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 190,198  static const int MultiXactStatusLock[MaxMultiXactStatus + 1] =
  /* Get the LockTupleMode for a given MultiXactStatus */
  #define TUPLOCK_from_mxstatus(status) \
  			(MultiXactStatusLock[(status)])
- /* Get the is_update bit for a given MultiXactStatus */
- #define ISUPDATE_from_mxstatus(status) \
- 			((status) > MultiXactStatusForUpdate)
  
  /* 
   *		 heap support routines
--- 190,195 
*** a/src/backend/access/transam/multixact.c
--- b/src/backend/access/transam/multixact.c
***
*** 457,463  MultiXactIdExpand(MultiXactId multi, TransactionId xid, MultiXactStatus status)
  	for (i = 0, j = 0; i < nmembers; i++)
  	{
  		if (TransactionIdIsInProgress(members[i].xid) ||
! 			((members[i].status > MultiXactStatusForUpdate) &&
  			 TransactionIdDidCommit(members[i].xid)))
  		{
  			newMembers[j].xid = members[i].xid;
--- 457,463 
  	for (i = 0, j = 0; i < nmembers; i++)
  	{
  		if (TransactionIdIsInProgress(members[i].xid) ||
! 			(ISUPDATE_from_mxstatus(members[i].status) &&
  			 TransactionIdDidCommit(members[i].xid)))
  		{
  			newMembers[j].xid = members[i].xid;
***
*** 713,718  MultiXactIdCreateFromMembers(int nmembers, MultiXactMember *members)
--- 713,734 
  		return multi;
  	}
  
+ 	/* Verify that there is a single update Xid among the given members. */
+ 	{
+ 		int			i;
+ 		bool		has_update = false;
+ 
+ 		for (i = 0; i < nmembers; i++)
+ 		{
+ 			if (ISUPDATE_from_mxstatus(members[i].status))
+ 			{
+ if (has_update)
+ 	elog(ERROR, "new multixact has more than one updating member");
+ has_update = true;
+ 			}
+ 		}
+ 	}
+ 
  	/*
  	 * Assign the MXID and offsets range to use, and make sure there is space
  	 * in the OFFSETs and MEMBERs files.  NB: this routine does
*** a/src/include/access/multixact.h
--- b/src/include/access/multixact.h
***
*** 48,53  typedef enum
--- 48,57 
  
  #define MaxMultiXactStatus MultiXactStatusUpdate
  
+ /* does a status value correspond to a tuple update? */
+ #define ISUPDATE_from_mxstatus(status) \
+ 			((status) > MultiXactStatusForUpdate)
+ 
  
  typedef struct MultiXactMember
  {



Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Peter Geoghegan
On Tue, Apr 22, 2014 at 2:03 AM, Hannu Krosing  wrote:
> What is the depth of b-tree at this percentage ?

Well, this percentage of B-Tree pages that are leaf pages doesn't have
much to do with the depth. The percentage seems very consistent for
each B-Tree, irrespective of the total size of the B-Tree.
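
One way to measure the leaf-page fraction of a given index (a sketch using
the pgstattuple extension; the index name is just an example):

  CREATE EXTENSION pgstattuple;
  SELECT leaf_pages::float8 / (root_pages + internal_pages + leaf_pages)
         AS leaf_fraction
  FROM pgstatindex('pgbench_accounts_pkey');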

-- 
Peter Geoghegan




Re: [HACKERS] Review: Patch FORCE_NULL option for copy COPY in CSV mode

2014-04-22 Thread Bruce Momjian
On Fri, Mar  7, 2014 at 05:08:54PM +0900, Michael Paquier wrote:
> On Thu, Mar 6, 2014 at 12:09 AM, Tom Lane  wrote:
> > Andrew Dunstan  writes:
> >> On 03/05/2014 09:11 AM, Michael Paquier wrote:
> >>> After testing this feature, I noticed that FORCE_NULL and
> >>> FORCE_NOT_NULL can both be specified with COPY on the same column.
> >
> >> Strictly they are not actually contradictory, since FORCE NULL relates
> >> to quoted null strings and FORCE NOT NULL relates to unquoted null
> >> strings. Arguably the docs are slightly loose on this point. Still,
> >> applying both FORCE NULL and FORCE NOT NULL to the same column would be
> >> rather perverse, since it would result in a quoted null string becoming
> >> null and an unquoted null string becoming not null.
> >
> > Given the remarkable lack of standardization of "CSV" output, who's
> > to say that there might not be data sources out there for which this
> > is the desired behavior?  It's weird, I agree, but I think throwing
> > an error for the combination is not going to be helpful.  It's not
> > like somebody might accidentally write both on the same column.
> >
> > +1 for clarifying the docs, though, more or less in the words you
> > used above.
> Following that, I have hacked the patch attached to update the docs
> with an additional regression test (actually replaces a test that was
> the same as the one before in copy2).
> 
> I am attaching as well a second patch for file_fdw, to allow the use
> of force_null and force_not_null on the same column, to be consistent
> with COPY.
> Regards,

Correction, this is the patch applied, not the earlier version.
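
To spell out the combined behavior described above (an illustrative sketch
with an invented table; in CSV mode the default null string is the empty
string):

  COPY t (a, b) FROM STDIN WITH (FORMAT csv, FORCE_NULL (b), FORCE_NOT_NULL (b));
  1,""
  2,
  \.
  -- row 1: the quoted "" is matched against the null string anyway, so b is NULL
  -- row 2: the unquoted empty value is not treated as null, so b is an empty string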

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Review: Patch FORCE_NULL option for copy COPY in CSV mode

2014-04-22 Thread Bruce Momjian
On Wed, Mar  5, 2014 at 09:49:30PM +0900, Michael Paquier wrote:
> On Wed, Mar 5, 2014 at 7:44 AM, Andrew Dunstan  wrote:
> > I have picked this up and committed the patch. Thanks to all.
> Sorry for coming in after the battle, but while looking at what has been
> committed I noticed that copy2.sql actually runs the same test twice
> in a row:
> COPY forcetest (a, b, c) FROM STDIN WITH (FORMAT csv,
> FORCE_NOT_NULL(b), FORCE_NULL(c));
> 1,,""
> \.
> -- should succeed with no effect ("b" remains an empty string, "c" remains 
> NULL)
> COPY forcetest (a, b, c) FROM STDIN WITH (FORMAT csv,
> FORCE_NOT_NULL(b), FORCE_NULL(c));
> 2,,""
> \.
> 
> See? In both tests the quotes are placed on the same column, the 3rd.
> I think one of them should instead put the quotes on the 2nd
> column => 2,"",
> The attached patch corrects that... and a misplaced comment.
> Regards,

Thanks.  Patch applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Jim Nasby

On 4/21/14, 6:07 PM, David G Johnston wrote:
> Jim Nasby-2 wrote:
>>> I feel that if there is no memory pressure, frankly it doesn't matter
>>> much about what gets out and what not. The case I am specifically
>>> targeting is when the clocksweep gets to move about a lot i.e. high
>>> memory pressure workloads. Of course, I may be totally wrong here.
>>
>> Well, there's either memory pressure or there isn't. If there isn't then
>> it's all moot *because we're not evicting anything*.
>
> The trade-off I'm seeing here is between measuring when there is no memory
> pressure - and thus eating at performance while not actually evicting
> buffers - and not measuring but then encountering memory pressure and not
> having a clue as to what should be evicted.


Right. OSes handle this by keeping a certain ratio of active vs inactive pages, 
regardless of pressure for free pages. That way when you need more pages in the 
free list you can pull them from the inactive list knowing that you're making a 
good decision.

One of the really nice things about this approach is that if memory pressure is 
low enough that you don't need more pages on the inactive list you don't even 
need to run that clock.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] [PATCH] `pg_dump -Fd` doesn't check write return status...

2014-04-22 Thread Bruce Momjian
On Sat, Mar 1, 2014 at 12:27:19PM -0800, Sean Chittenden wrote:
> The attached patch fixes the case when `pg_dump -Fd …` is called
> on a partition where write(2) fails for some reason or another. In
> this case, backup jobs were returning with a successful exit code even
> though most of the files in the dump directory were all zero length.
>
> I haven’t tested this patch’s failure conditions but the fix seems
> simple enough: cfwrite() needs to have its return status checked
> everywhere and exit_horribly() upon any failure. In this case,
> callers of _WriteData() were not checking the return status and were
> discarding the negative return status (e.g. ENOSPC).
>
> I made a cursory pass over the code and found one other instance where
> write status wasn’t being checked and also included that.

As is often the case with pg_dump, the problems you saw are a small part
of a larger set of problems in that code --- read and write errors are
generally ignored.

I have developed a comprehensive patch that addresses all the issues I
could find.  Because functions are called both directly and through
function pointers, the fix is quite complex.  I ended up placing checks
at the lowest level and removing the checks that were sporadically
placed at higher layers.

Patch attached.  I have tested dump/restore of all four dump output
formats with the 9.3 regression database and all tests passed.  I
believe this patch is too complex to backpatch, and I don't know how it
could be fixed more simply.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_io.c
new file mode 100644
index 10bc3f0..bad21b5
*** a/src/bin/pg_dump/compress_io.c
--- b/src/bin/pg_dump/compress_io.c
*** static void InitCompressorZlib(Compresso
*** 86,99 
  static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
  	  bool flush);
  static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
! static size_t WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
  	   const char *data, size_t dLen);
  static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
  #endif
  
  /* Routines that support uncompressed data I/O */
  static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
! static size_t WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
  	   const char *data, size_t dLen);
  
  /*
--- 86,99 
  static void DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs,
  	  bool flush);
  static void ReadDataFromArchiveZlib(ArchiveHandle *AH, ReadFunc readF);
! static void WriteDataToArchiveZlib(ArchiveHandle *AH, CompressorState *cs,
  	   const char *data, size_t dLen);
  static void EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs);
  #endif
  
  /* Routines that support uncompressed data I/O */
  static void ReadDataFromArchiveNone(ArchiveHandle *AH, ReadFunc readF);
! static void WriteDataToArchiveNone(ArchiveHandle *AH, CompressorState *cs,
  	   const char *data, size_t dLen);
  
  /*
*** ReadDataFromArchive(ArchiveHandle *AH, i
*** 179,185 
  /*
   * Compress and write data to the output stream (via writeF).
   */
! size_t
  WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
     const void *data, size_t dLen)
  {
--- 179,185 
  /*
   * Compress and write data to the output stream (via writeF).
   */
! void
  WriteDataToArchive(ArchiveHandle *AH, CompressorState *cs,
     const void *data, size_t dLen)
  {
*** WriteDataToArchive(ArchiveHandle *AH, Co
*** 190,203 
  	{
  		case COMPR_ALG_LIBZ:
  #ifdef HAVE_LIBZ
! 			return WriteDataToArchiveZlib(AH, cs, data, dLen);
  #else
  			exit_horribly(modulename, "not built with zlib support\n");
  #endif
  		case COMPR_ALG_NONE:
! 			return WriteDataToArchiveNone(AH, cs, data, dLen);
  	}
! 	return 0;	/* keep compiler quiet */
  }
  
  /*
--- 190,205 
  	{
  		case COMPR_ALG_LIBZ:
  #ifdef HAVE_LIBZ
! 			WriteDataToArchiveZlib(AH, cs, data, dLen);
  #else
  			exit_horribly(modulename, "not built with zlib support\n");
  #endif
+ 			break;
  		case COMPR_ALG_NONE:
! 			WriteDataToArchiveNone(AH, cs, data, dLen);
! 			break;
  	}
! 	return;
  }
  
  /*
*** DeflateCompressorZlib(ArchiveHandle *AH,
*** 298,307 
   */
  size_t		len = cs->zlibOutSize - zp->avail_out;
  
! if (cs->writeF(AH, out, len) != len)
! 	exit_horribly(modulename,
!   "could not write to output file: %s\n",
!   strerror(errno));
  			}
  			zp->next_out = (void *) out;
  			zp->avail_out = cs->zlibOutSize;
--- 300,306 
   */
  size_t		len = cs->zlibOutSize - zp->avail_out;
  
! cs->writeF(AH, out, len);
  			}
  			zp->next_out = (void *) out;
  			zp->avail_out = cs->zlibOutSize

Re: [HACKERS] RFC: Async query processing

2014-04-22 Thread Claudio Freire
On Tue, Apr 22, 2014 at 3:45 PM, Florian Weimer  wrote:
> On 01/03/2014 06:06 PM, Claudio Freire wrote:
>
>> Per-query expectations could be such a thing. And it can even work with
>> PQexec:
>>
>> PQexec(con, "SELECT nextval('a_id_seq') FROM generate_series(1,10);");
>> --read--
>> PQexec(con, "SELECT nextval('b_id_seq') FROM generate_series(1,10);");
>> --read--
>> PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
>> PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
>> PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
>> PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
>> PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
>> PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
>> ... 9 times...
>> PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
>> PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC);
>> do {
>> // do something useful
>> } while (PQflush());
>>
>> Here, the PQASYNC flag would temporarily switch to non-blocking I/O,
>> and buffer what cannot be sent. PQASNC_CORK, would only buffer (only
>> send if the buffer is full). After any ASYNC call, PQflush would be
>> necessary (to flush the send queue and to consume the expected
>> responses), but I can imagine any synchronous call (PQexec,
>> PQsendQuery or whatever) could detect a non-empty buffer and just
>> blockingly flush right there.
>
>
> How would you consume results once they arrive?  I think this only covers
> the no-result case,

You could do PQEXPECT_ROWCOUNT for storing rowcounts (and supply a
pointer to a result buffer), or PQ_BUFFER_RESULTS and do the same. The
user would have to know beforehand the size of the result set (or an
upper bound of it), and readiness notification would also need to be
solved. There could also be PQEXPECT_DISCARD_RESULTS.

Alternatively, you could use a callback, node-style, and it would
solve everything (including readiness and error notification), but I'm
not sure about the portability of such a thing. Callbacks certainly
would be tricky when ABI compatibility has to be maintained. It would
however be a much better interface.

The pattern here is that the API needs to perform all the magic and
complex buffering and flushing; it should not be on the application
side.

> and it has since come to my attention that the Node.js
> folks are looking for general async response processing.

Node support would take a little more work. Specifically, for node to
work with this API, the non-blocking case has to be handled properly,
allowing node to wait on the FDs instead of requiring it to flush and
block on the event loop thread.

That means a buffer at least as large as the query parameters, which
should be no problem (but might be tricky to implement), and handling
the equivalent of EWOULDBLOCK at the PQexec(.., PQASYNC) calls.

In any case, on any specific connection, query processing is linear.
So you really need a "result callback queue" (however you implement
it, be it in the application or the API). What I propose is moving as
much as possible into the API, since it will be common to all users of
the async functionality, and it will make it possible to fix bugs in
that code centrally too.

My earlier examples were all about discarding results, because that's
what enables the most throughput, and it covers lots of cases. But, as
has been mentioned in previous posts, rowcounts at the very least have
to be handled as well, so there's that. I guess we can throw in
generic result callbacks (and errbacks) if the ABI allows it, and it
will be a net win in clarity and simplicity.
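
To make that concrete, here is a self-contained sketch of just the
expectation-queue mechanics. Every name in it is hypothetical -- none
of this is existing libpq API -- and real code would hang the queue off
the PGconn and feed it from PQgetResult:

/*
 * Sketch of a "result callback queue": every asynchronously sent query
 * enqueues an expectation; as results arrive (simulated here), they are
 * matched FIFO against the queue and discarded, counted, or handed to a
 * callback.  All names are hypothetical -- nothing here is libpq.
 */
#include <stdio.h>
#include <stdlib.h>

typedef enum { EXPECT_NO_RESULT, EXPECT_ROWCOUNT, EXPECT_CALLBACK } ExpectKind;

typedef struct Expectation
{
	ExpectKind	kind;
	long	   *rowcount_out;				/* EXPECT_ROWCOUNT */
	void	  (*cb)(long rows, void *arg);	/* EXPECT_CALLBACK */
	void	   *cb_arg;
	struct Expectation *next;
} Expectation;

typedef struct { Expectation *head, **tail; } ExpectQueue;

static void queue_init(ExpectQueue *q) { q->head = NULL; q->tail = &q->head; }

static void
expect(ExpectQueue *q, ExpectKind kind, long *out,
	   void (*cb)(long, void *), void *arg)
{
	Expectation *e = calloc(1, sizeof(*e));
	e->kind = kind; e->rowcount_out = out; e->cb = cb; e->cb_arg = arg;
	*q->tail = e; q->tail = &e->next;
}

/* Called once per received result, in send order (queries pipeline). */
static void
consume_result(ExpectQueue *q, long rows)
{
	Expectation *e = q->head;
	if (e == NULL) { fprintf(stderr, "unexpected result\n"); return; }
	q->head = e->next;
	if (q->head == NULL) q->tail = &q->head;

	switch (e->kind)
	{
		case EXPECT_NO_RESULT: break;						/* discard */
		case EXPECT_ROWCOUNT:  *e->rowcount_out = rows; break;
		case EXPECT_CALLBACK:  e->cb(rows, e->cb_arg); break;
	}
	free(e);
}

static void report(long rows, void *arg) { printf("%s: %ld rows\n", (char *) arg, rows); }

int
main(void)
{
	ExpectQueue q;
	long		inserted = 0;

	queue_init(&q);
	/* imagine three pipelined INSERTs were just sent */
	expect(&q, EXPECT_NO_RESULT, NULL, NULL, NULL);
	expect(&q, EXPECT_ROWCOUNT, &inserted, NULL, NULL);
	expect(&q, EXPECT_CALLBACK, NULL, report, "batch");

	consume_result(&q, 1);		/* results arrive in the same order */
	consume_result(&q, 10);
	consume_result(&q, 10);
	printf("rowcount captured: %ld\n", inserted);
	return 0;
}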


On Tue, Apr 22, 2014 at 3:49 PM, Florian Weimer  wrote:
> On 04/22/2014 07:03 PM, Claudio Freire wrote:
>>
>> On Tue, Apr 22, 2014 at 8:19 AM, Florian Weimer 
>> wrote:
>>>
>>> Feedback in this thread was, "we want something like this in libpq, but
>>> not
>>> the thing you proposed".  But there have been no concrete
>>> counter-proposals,
>>> and some of the responses did not take into account the inherent
>>> complexities of round-trip avoidance.  So I'm not sure how to move this
>>> topic forward.
>>
>>
>> What exactly do you mean by not taking into account?
>
>
> A couple of them were along the lines "let's just send batches of queries
> and sync between batches".  This does not seem very helpful to me because
> sizing the batches is difficult, and the sizes could be quite large.

Not at all.

The proposal certainly shows that in its examples, but the API can be
used with no explicit synchronization. That's what I was going for
when I suggested that other API calls could flush implicitly when
needed. If you never call flush, flush happens on its own when
necessary. If you never call the synchronous API, you never wait for
replies. Though when you need the results right now, you do have to
flush explicitly; there's no way around that.

Also, feel free to consider all this mere o

Re: [HACKERS] RFC: Async query processing

2014-04-22 Thread Florian Weimer

On 04/22/2014 07:03 PM, Claudio Freire wrote:

On Tue, Apr 22, 2014 at 8:19 AM, Florian Weimer  wrote:

Feedback in this thread was, "we want something like this in libpq, but not
the thing you proposed".  But there have been no concrete counter-proposals,
and some of the responses did not take into account the inherent
complexities of round-trip avoidance.  So I'm not sure how to move this
topic forward.


What exactly do you mean by not taking into account?


A couple of them were along the lines "let's just send batches of 
queries and sync between batches".  This does not seem very helpful to 
me because sizing the batches is difficult, and the sizes could be quite 
large.



I do not believe you responded to my proposed interface. I'm not
requesting a change in roundtrip avoidance measures, just the
interface.


I've sent a reply now, I think I missed this part.  Sorry.

--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: Async query processing

2014-04-22 Thread Florian Weimer

On 01/03/2014 06:06 PM, Claudio Freire wrote:


Per-query expectations could be such a thing. And it can even work with PQexec:

PQexec(con, "SELECT nextval('a_id_seq') FROM generate_series(1,10);");
--read--
PQexec(con, "SELECT nextval('b_id_seq') FROM generate_series(1,10);");
--read--
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
... 9 times...
PQexec(con, "INSERT INTO a (...);", PQEXPECT_NO_RESULT | PQASYNC_CORK);
PQexec(con, "INSERT INTO b (...);", PQEXPECT_NO_RESULT | PQASYNC);
do {
// do something useful
} while (PQflush());

Here, the PQASYNC flag would temporarily switch to non-blocking I/O,
and buffer what cannot be sent. PQASNC_CORK, would only buffer (only
send if the buffer is full). After any ASYNC call, PQflush would be
necessary (to flush the send queue and to consume the expected
responses), but I can imagine any synchronous call (PQexec,
PQsendQuery or whatever) could detect a non-empty buffer and just
blockingly flush right there.


How would you consume results once they arrive?  I think this only 
covers the no-result case, and it has since come to my attention that 
the Node.js folks are looking for general async response processing.


(In order to implement your proposal, we'd have to add quite a few 
additional APIs to include the flag argument, but that's a minor detail.)


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Merlin Moncure
On Mon, Apr 21, 2014 at 8:06 PM, Andrew Dunstan  wrote:
>
> On 04/21/2014 08:49 PM, Stephen Frost wrote:
>>
>> * Tatsuo Ishii (is...@postgresql.org) wrote:
>>>
>>> I observe performance degradation with PostgreSQL 9.3 vs 9.2 on Linux
>>> as well.  The hardware is HP DL980G7, 80 cores, 2TB mem, RHEL 6,
>>> pgbench is used (read only query), scale factor is 1,000 (DB size
>>> 15GB).
>>
>> Can you isolate the sysv-vs-mmap patch and see what happens with just
>> that change..?
>
>
>
> This is exactly why we need a benchfarm.
>
> I actually have a client working based on Greg Smith's pgbench tools.
>
> What we would need is a way to graph the results - that's something beyond
> my very rudimentary expertise in web programming. If anyone feels like
> collaborating I'd be glad to hear from them. (The web site is programmed in
> perl + TemplateToolkit, but even that's not immutable. I'm open to using,
> say, node.js plus one of its templating engines.)

Hm, you got me interested.  Is the data you want to visualize stored
in a database?  Got some example data? (this is pretty off topic, feel
free to contact off-list or on a new thread etc).

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Pavel Stehule
2014-04-22 19:02 GMT+02:00 Josh Berkus :

> On 04/22/2014 06:39 AM, Andrew Dunstan wrote:
> > I agree, and indeed that was something like my first reaction to hearing
> > about this development - FDW seems like a very odd way to handle this.
> > But the notion of builtin columnar storage suggests to me that we really
> > need first to tackle how various storage engines might be incorporated
> > into Postgres. I know this has been a bugbear for many years, but maybe
> > now with serious proposals for alternative storage engines on the
> > horizon we can no longer afford to put off the evil day when we grapple
> > with it.
>
> Yes.  *IF* PostgreSQL already supported alternate storage, then the
> Citus folks might have released their CStore as a storage plugin instead
> of an FDW.  However, if they'd waited for pluggable storage, they'd
> still be waiting.
>

I am sceptical - from what I know about OLAP column store databases,
they need a substantially different planner, so a new engine or storage
layer alone is not enough. VectorWise has been trying to merge Ingres
with the MonetDB engine for more than four years - and still has some
issues.

Our extensibility is probably the major barrier to fast OLAP - I see the
most realistic way forward as supporting better partitioning and moving
in the direction of higher parallelism and distribution - and maybe
map/reduce support.

At GoodData we successfully use Postgres for BI projects up to 20GB with
fast response times - the most painful gaps are missing MERGE, missing
fault-tolerant COPY, I/O-expensive updates of large tables with lots of
indexes, and missing simple massive partitioning. On the other hand,
Postgres works perfectly across thousands of databases with thousands of
tables, without errors and with terribly simple deployment in a cloud
environment.

Regards

Pavel


>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Joshua D. Drake


On 04/22/2014 08:26 AM, Andrew Dunstan wrote:


I'm going away tomorrow for a few days R&R. When I'm back next week I
will set up a demo client running this module. If you can have a machine
prepped for this purpose by then so much the better, otherwise I will
have to drag out a box I recently rescued and have been waiting for
something to use it with. It's more important that it's stable (i.e.
nothing else running on it) than that it's very powerful. It could be
running Ubuntu or some Redhattish variant or, yes, even FreeBSD.


This is best handled by Mark. Mark, can you help Andrew with this? I 
assume we would use the DL385 with the MS70?


JD

cheers

andrew

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
Political Correctness is for cowards.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: Async query processing

2014-04-22 Thread Claudio Freire
On Tue, Apr 22, 2014 at 8:19 AM, Florian Weimer  wrote:
> Feedback in this thread was, "we want something like this in libpq, but not
> the thing you proposed".  But there have been no concrete counter-proposals,
> and some of the responses did not take into account the inherent
> complexities of round-trip avoidance.  So I'm not sure how to move this
> topic forward.

What exactly do you mean by not taking into account?

I do not believe you responded to my proposed interface. I'm not
requesting a change in roundtrip avoidance measures, just the
interface.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implied BETWEEN from join quals

2014-04-22 Thread Simon Riggs
On 22 April 2014 17:00, Stephen Frost  wrote:
> Simon,
>
> This all looks good, and at the risk of being slightly off-topic for
> this thread, I just wanted to mention..
>
> * Simon Riggs (si...@2ndquadrant.com) wrote:
>> Current proposal ends there, but there is a further optimization that
>> allows us to remove the join altogether if
>> * There is a FK between Fact and Dim
>
> It'd be great if we could start by looking for the above requirement and
> then doing join-removal when it exists and no columns from Dim are
> referenced (outside of the FK reference, which must result in 'true' or
> we've screwed something up).
>
> I had been looking at this about a month ago and just ran out of time to
> play with it, but it doesn't seem like it'd be terribly difficult to
> do..

Yeh, it was on my list. I had a few optimizer changes I've been
looking to make for a while.

Hopefully 9.5 is the year I get the chance at a running start on that
since they're never as easy as you'd hope.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Josh Berkus
On 04/22/2014 06:39 AM, Andrew Dunstan wrote:
> I agree, and indeed that was something like my first reaction to hearing
> about this development - FDW seems like a very odd way to handle this.
> But the notion of builtin columnar storage suggests to me that we really
> need first to tackle how various storage engines might be incorporated
> into Postgres. I know this has been a bugbear for many years, but maybe
> now with serious proposals for alternative storage engines on the
> horizon we can no longer afford to put off the evil day when we grapple
> with it.

Yes.  *IF* PostgreSQL already supported alternate storage, then the
Citus folks might have released their CStore as a storage plugin instead
of an FDW.  However, if they'd waited for pluggable storage, they'd
still be waiting.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implied BETWEEN from join quals

2014-04-22 Thread Stephen Frost
Simon,

This all looks good, and at the risk of being slightly off-topic for
this thread, I just wanted to mention..

* Simon Riggs (si...@2ndquadrant.com) wrote:
> Current proposal ends there, but there is a further optimization that
> allows us to remove the join altogether if
> * There is a FK between Fact and Dim

It'd be great if we could start by looking for the above requirement and
then doing join-removal when it exists and no columns from Dim are
referenced (outside of the FK reference, which must result in 'true' or
we've screwed something up).

I had been looking at this about a month ago and just ran out of time to
play with it, but it doesn't seem like it'd be terribly difficult to
do..

Thanks,

Stephen


signature.asc
Description: Digital signature


[HACKERS] Implied BETWEEN from join quals

2014-04-22 Thread Simon Riggs
I was recently nudged to describe an optimisation of merge
joins/sorts. Rather than describe that, I've looked at the general
case:

There are some additional inferences we may make when joining
tables... a particularly interesting one is that

SELECT *
FROM Fact JOIN Dim on Fact.col = Dim.col

can be rewritten as

SELECT *
FROM Fact JOIN Dim on Fact.col = Dim.col
WHERE
(
Fact.col BETWEEN
 (SELECT min(col) FROM Dim)
AND
 (SELECT max(col) FROM Dim)
)
AND
(
Dim.col BETWEEN
 (SELECT min(col) FROM Fact)
AND
 (SELECT max(col) FROM Fact)
)

which also works similarly for anti/semi-joins.

If we have indexes on Fact.col and Dim.col then these additional quals
can be derived cheaply at run-time and could have an important effect
on optimisation.

1) With various kinds of index, we would be able to use these implied
quals to further restrict the scan. Perhaps that doesn't sound very
interesting, but it is very important when solving an "outside-in"
join on a star schema, such as...

SELECT count(*)
FROM Fact JOIN Dim on Fact.col = Dim.col
WHERE Dim.other = 1

since there is no qual that can be applied directly to the Fact table,
causing us to scan the entire table.

We can rewrite this query as

EXPLAIN (ANALYZE on, TIMING off, COSTS on)
SELECT count(*)
FROM Fact JOIN Dim on Fact.col = Dim.col
WHERE Dim.other = 1
AND
(
Fact.col BETWEEN
 (SELECT min(col) FROM Dim WHERE Dim.other = 1)
AND
 (SELECT max(col) FROM Dim WHERE Dim.other = 1)
)

Note that the implied qual on the Dim table has been dropped as
uninteresting. This is because we can calculate the cost and potential
benefit of applying the rewrite, allowing us to discard one or both
implied clauses.

Note also that this has nothing to do with join order. This is solely
about making inferences using the join quals between any two tables.

2) We calculate the join selectivity by comparing the MFVs of the join
columns on the tables being joined. ISTM that we could use the min()
and max() values to refine the selectivity estimate, which can
otherwise often be wrong.

- - -

The current planner doesn't add these predicates automatically, but if
it did, it would come up with the following slightly sub-optimal
plan...

EXPLAIN (ANALYZE on, TIMING off, COSTS on)
SELECT count(*)
FROM Fact JOIN Dim on Fact.col = Dim.col
WHERE Dim.other = 1
AND
(
Fact.col BETWEEN
 (SELECT min(col) FROM Dim WHERE Dim.other = 1)
AND
 (SELECT max(col) FROM Dim WHERE Dim.other = 1)
)

 Aggregate  (cost=31.79..31.80 rows=1 width=0) (actual rows=1 loops=1)
   InitPlan 1 (returns $0)
 ->  Aggregate  (cost=1.03..1.04 rows=1 width=4) (actual rows=1 loops=1)
   ->  Seq Scan on dim dim_1  (cost=0.00..1.02 rows=1 width=4)
(actual rows=1 loops=1)
 Filter: (other = 1)
 Rows Removed by Filter: 1
   InitPlan 2 (returns $1)
 ->  Aggregate  (cost=1.03..1.04 rows=1 width=4) (actual rows=1 loops=1)
   ->  Seq Scan on dim dim_2  (cost=0.00..1.02 rows=1 width=4)
(actual rows=1 loops=1)
 Filter: (other = 1)
 Rows Removed by Filter: 1
   ->  Merge Join  (cost=1.33..29.09 rows=250 width=0) (actual
rows=10 loops=1)
 Merge Cond: (dim.col = fact.col)
 ->  Sort  (cost=1.03..1.04 rows=1 width=4) (actual rows=1 loops=1)
   Sort Key: dim.col
   Sort Method: quicksort  Memory: 25kB
   ->  Seq Scan on dim  (cost=0.00..1.02 rows=1 width=4)
(actual rows=1 loops=1)
 Filter: (other = 1)
 Rows Removed by Filter: 1
 ->  Index Only Scan using fact_col_idx on fact
(cost=0.29..24.29 rows=500 width=4) (actual rows=10 loops=1)
   Index Cond: ((col >= $0) AND (col <= $1))
   Heap Fetches: 10

which is sub-optimal only because of the mis-estimation of the effect
of the min() and max(), so we will still benefit from resolving the
parameters to a constant before proceeding with the main query. A
better plan would be

EXPLAIN (ANALYZE on, TIMING off, COSTS on)
SELECT count(*)
FROM Fact JOIN Dim on Fact.col = Dim.col
WHERE Dim.other = 1
AND Fact.col BETWEEN 1 AND 1

 Aggregate  (cost=2944.04..2944.05 rows=1 width=0) (actual rows=1 loops=1)
   ->  Hash Join  (cost=1.04..2819.04 rows=5 width=0) (actual
rows=10 loops=1)
 Hash Cond: (fact.col = dim.col)
 ->  Seq Scan on fact  (cost=0.00..1943.00 rows=10
width=4) (actual rows=10 loops=1)
   Filter: ((col >= 1) AND (col <= 1))
 ->  Hash  (cost=1.02..1.02 rows=1 width=4) (actual rows=1 loops=1)
   Buckets: 1024  Batches: 1  Memory Usage: 1kB
   ->  Seq Scan on dim  (cost=0.00..1.02 rows=1 width=4)
(actual rows=1 loops=1)
 Filter: (other = 1)
 Rows Removed by Filter: 1

but we can probably live with that, as we do with other dynamic index plans.

So when can we use this?

The additional cost of adding this to the query is
* additional qual:  2 * cpu_operator_cost * rows
* ge

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Palle Girgensohn

On 22 Apr 2014, at 17:26, Andrew Dunstan  wrote:

> 
> On 04/22/2014 01:36 AM, Joshua D. Drake wrote:
>> 
>> On 04/21/2014 06:19 PM, Andrew Dunstan wrote:
>> 
>>> 
>>> If we never start we'll never get there.
>>> 
>>> I can think of several organizations that might be approached to donate
>>> hardware.
>> 
>> Like .Org?
>> 
>> We have a hardware farm, a rack full of hardware and spindles. It isn't the 
>> most current but it is there.
>> 
>> 
> 
> 
> I'm going away tomorrow for a few days R&R. When I'm back next week I will 
> set up a demo client running this module. If you can have a machine prepped 
> for this purpose by then so much the better, otherwise I will have to drag 
> out a box I recently rescued and have been waiting for something to use it 
> with. It's more important that it's stable (i.e. nothing else running on it) 
> than that it's very powerful. It could be running Ubuntu or some Redhattish 
> variant or, yes, even FreeBSD.

If you need help with the FreeBSD setup, I'm at you service.

Palle


signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Andrew Dunstan


On 04/22/2014 01:36 AM, Joshua D. Drake wrote:


On 04/21/2014 06:19 PM, Andrew Dunstan wrote:



If we never start we'll never get there.

I can think of several organizations that might be approached to donate
hardware.


Like .Org?

We have a hardware farm, a rack full of hardware and spindles. It 
isn't the most current but it is there.

I'm going away tomorrow for a few days R&R. When I'm back next week I 
will set up a demo client running this module. If you can have a machine 
prepped for this purpose by then so much the better, otherwise I will 
have to drag out a box I recently rescued and have been waiting for 
something to use it with. It's more important that it's stable (i.e. 
nothing else running on it) than that it's very powerful. It could be 
running Ubuntu or some Redhattish variant or, yes, even FreeBSD.


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-22 Thread Tom Lane
Noah Misch  writes:
> On Mon, Apr 21, 2014 at 10:57:34AM -0400, Tom Lane wrote:
>> I'm actually planning to set this patch on the shelf for a bit and go
>> investigate the other alternative, ie, not generating composite Datums
>> containing toast pointers in the first place.

> I maintain that the potential slowdown is too great to consider adopting that
> for the sake of a cleaner patch.  Your last message examined a 67% performance
> regression.  The strategy you're outlining now can slow a query by 1,000,000%.

[ shrug... ]  It could also speed up a query by similar factors.  I see
no good reason to suppose that it would be a net loss overall.  I agree
that it might change performance characteristics in a way that we'd
ideally not do in the back branches.  But the fact remains that we've
got a bad bug to fix, and absent a reasonably trustworthy functional fix,
arguing about performance characteristics is a waste of breath.  I can
make it arbitrarily fast if it's not required to give the right answer.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Stephen Frost
* Andrew Dunstan (and...@dunslane.net) wrote:
> I agree, and indeed that was something like my first reaction to
> hearing about this development - FDW seems like a very odd way to
> handle this. But the notion of builtin columnar storage suggests to
> me that we really need first to tackle how various storage engines
> might be incorporated into Postgres. I know this has been a bugbear
> for many years, but maybe now with serious proposals for alternative
> storage engines on the horizon we can no longer afford to put off
> the evil day when we grapple with it.

Agreed, and it goes beyond just columnar stores - I could see IOTs
(index-organized tables) being implemented using this notion of a
different 'storage engine'. But calling it a 'storage engine' makes it
sound like we want to change how we access files, and I don't think we
really want to change that, but rather come up with a way to have an
alternative heap..  Columnar storage or IOTs would still be page-based
and go through shared buffers, etc, I'd think..

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Store data in pg_toast for custom type fails (bug?)

2014-04-22 Thread Tom Lane
Honza  writes:
> On 03/28/2014 07:02 PM, Tom Lane wrote:
>> I believe I understand what's going on here, and it's not quite as
>> exciting as it first appears.  The issue is that we are failing to
>> honor the "toasting goes only one level deep" rule in the specific
>> case of arrays of composite type.  So while it's definitely a nasty
>> bug, it affects only narrow use cases, and doesn't call into question
>> our whole vacuuming strategy or anything like that.

> I would like to ask if there is anything new in this bug?

Still working on it --- haven't you been following the other thread?
http://www.postgresql.org/message-id/flat/29007.1396038...@sss.pgh.pa.us

We should have a fix in time for the next set of minor releases, whenever
that is (probably a month or two).  If you're feeling desperate for a
temporary solution, you could make use of one of the patches I already
posted, even though I now don't like those as a long-term answer.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] shm_mq inconsistent behavior of SHM_MQ_DETACHED

2014-04-22 Thread Petr Jelinek

Hi,

I was playing with shm_mq and found some slightly odd behavior when
detaching after sending messages.


The following sequence behaves as expected (the receiver gets 2 messages):
P1 -> set_sender
P1 -> attach
P2 -> set_receiver
P2 -> attach
P1 -> send
P2 -> receive
P1 -> send
P1 -> detach
P2 -> receive
P2 -> detach

But if the first receive happens after the detach, as in this sequence:
P1 -> set_sender
P1 -> attach
P2 -> set_receiver
P2 -> attach
P1 -> send
P1 -> send
P1 -> detach
P2 -> receive

I get SHM_MQ_DETACHED on the receiver even though there are messages in 
the ring buffer.


The reason for this behavior is that mqh_counterparty_attached is only
set by shm_mq_receive. This does not seem consistent - I would expect
either to always get SHM_MQ_DETACHED once the other party has detached,
or to always get all remaining messages that are in the queue (and I
would strongly prefer the latter).


Maybe shm_mq_get_bytes_written should be used to determine whether
there is something left for the receiver to read when we hit the
!mqh_counterparty_attached code path with a detached sender?
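
To illustrate the semantics I would prefer, here is a toy model -
deliberately not the real shm_mq code or API, just a self-contained
mock of a ring buffer with a detach flag - where MQ_DETACHED is only
returned once everything the sender wrote has been drained:

/*
 * Toy model (NOT the real shm_mq): detach-after-send should not lose
 * the messages that were already written.  MQ_DETACHED is returned
 * only when the sender has detached AND the buffer has been drained.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef enum { MQ_SUCCESS, MQ_WOULD_BLOCK, MQ_DETACHED } mq_result;

typedef struct
{
	char	buf[256];
	size_t	bytes_written;
	size_t	bytes_read;
	bool	sender_detached;
} toy_mq;

static mq_result
toy_send(toy_mq *mq, const char *msg)
{
	size_t	len = strlen(msg) + 1;

	if (mq->bytes_written + len > sizeof(mq->buf))
		return MQ_WOULD_BLOCK;
	memcpy(mq->buf + mq->bytes_written, msg, len);
	mq->bytes_written += len;
	return MQ_SUCCESS;
}

static mq_result
toy_receive(toy_mq *mq, const char **msg)
{
	if (mq->bytes_read == mq->bytes_written)
		/* Nothing queued: only now does a detached sender end the stream. */
		return mq->sender_detached ? MQ_DETACHED : MQ_WOULD_BLOCK;

	/* Drain queued messages even if the sender has already detached. */
	*msg = mq->buf + mq->bytes_read;
	mq->bytes_read += strlen(*msg) + 1;
	return MQ_SUCCESS;
}

int
main(void)
{
	toy_mq		mq = {0};
	const char *msg;

	toy_send(&mq, "one");			/* P1 -> send */
	toy_send(&mq, "two");			/* P1 -> send */
	mq.sender_detached = true;		/* P1 -> detach */

	while (toy_receive(&mq, &msg) == MQ_SUCCESS)	/* P2 -> receive */
		printf("got: %s\n", msg);
	printf("then: DETACHED\n");
	return 0;
}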



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Simon Riggs
On 22 April 2014 13:15, MauMau  wrote:

> Great!  I'm looking forward to seeing PostgreSQL evolve as an analytics
> database for data warehousing.  Is there any reason why in-memory database
> and MPP are not included?

Those ideas are valid; the features are bounded by resource
constraints of time and money, as well as by the technical skills and
capacities of my fellow developers. My analysis has been that
implementing parallelism has a lower benefit/cost ratio than other
features, as well as requiring more expensive servers (for MPP). I
expect MPP to be an eventual end goal of the BDR project.

> Are you planning to include the above features in 9.5 and 9.6?

Yes

> Are you
> recommending other developers not implement these features to avoid
> duplication of work with AXLE?

This was more to draw attention to the work so that all interested
parties can participate in producing something useful.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Andrew Dunstan


On 04/22/2014 08:04 AM, Simon Riggs wrote:

On 22 April 2014 00:24, Josh Berkus  wrote:

On 04/21/2014 03:41 PM, Simon Riggs wrote:

Storage Efficiency
* Compression
* Column Orientation

You might look at turning this:

http://citusdata.github.io/cstore_fdw/

... into a more integrated part of Postgres.

Of course I'm aware of that work - credit to them. Certainly, many
people feel that it is now time to do as you suggest and include
column store features within PostgreSQL.

As to turning it into a more integrated part of Postgres, we have a
few problems there

1. cstore_fdw code has an incompatible licence

2. I don't think FDWs are the right place for complex new
architectures such as column store, massively parallel processing or
sharding. The fact that it is probably the best place to implement it
in user space doesn't mean it transfers well into core code. That's a
shame and I don't know what to do about it, because it would be nice
to simply ask for change of licence and then integrate it, but it
seems more work than that (to me).





I agree, and indeed that was something like my first reaction to hearing 
about this development - FDW seems like a very odd way to handle this. 
But the notion of builtin columnar storage suggests to me that we really 
need first to tackle how various storage engines might be incorporated 
into Postgres. I know this has been a bugbear for many years, but maybe 
now with serious proposals for alternative storage engines on the 
horizon we can no longer afford to put off the evil day when we grapple 
with it.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Andrew Dunstan


On 04/22/2014 08:15 AM, MauMau wrote:



Are you planning to include the above features in 9.5 and 9.6? Are you 
recommending other developers not implement these features to avoid 
duplication of work with AXLE?

Without pointing any fingers, I should note that I have learned the hard
way to take such recommendations with a grain of salt. More than once I
have been stopped from working on something because someone else said
they were working on it, only for nothing to appear; and in the interests
of full disclosure I can think of two significant instances when I have
been similarly guilty, although the most serious of those has since been
rectified by someone else.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote:
> I didn't realize we had a guc for dynamic shared memory, must've missed
> that in the discussion about that one. I agree that if we have that, it
> makes perfect sense to have the same setting available for the main shared
> memory segment.

I recall the back-and-forth about the issues with dynamic shared memory,
but I had hoped it would result in a solution that didn't involve
having to support both..

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Hannu Krosing
On 04/22/2014 02:04 PM, Simon Riggs wrote:
> On 22 April 2014 00:24, Josh Berkus  wrote:
>> On 04/21/2014 03:41 PM, Simon Riggs wrote:
>>> Storage Efficiency
>>> * Compression
>>> * Column Orientation
>> You might look at turning this:
>>
>> http://citusdata.github.io/cstore_fdw/
>>
>> ... into a more integrated part of Postgres.
> Of course I'm aware of that work - credit to them. Certainly, many
> people feel that it is now time to do as you suggest and include
> column store features within PostgreSQL.
>
> As to turning it into a more integrated part of Postgres, we have a
> few problems there
>
> 1. cstore_fdw code has an incompatible licence
>
> 2. I don't think FDWs are the right place for complex new
> architectures such as column store, massively parallel processing or
> sharding. 
I agree that FDW is not an end-all solution for all these, but it is a
reasonable starting point and it just might be that the extra things
needed could be added to our FDW API instead of sewing it directly
into backend guts.


I recently tried to implement sharding at FDW level and the main
problem I ran into was a missing join type for efficiently using it
for certain queries.

The specific use case was queries of form

select l.*, r.*
from remotetable r
join localtable l
on l.key1 = r.id and l.n = N;

PostgreSQL offered only two options:

1) full scan on remote table

2) single id=$ selects

neither of which is what is actually needed, as the first performs badly
if there are more than a few rows in the remote table, and the second
performs badly if l.n = N returns more than a few rows.

When I manually rewrote the query to

select l.*, r.*
from remotetable r
join localtable l
on l.key1 = r.id and l.n = N
where r.id = ANY(ARRAY(select key1 from localtable where n = N));

it ran really well.

Unfortunately this is not something that PostgreSQL considers by itself
while optimising.

BTW, this kind of optimisation should also be a win for really large IN
queries, if we could have an indexed IN which would not start each
lookup from the index root, but rather would sort the IN contents and
do an index merge, skipping forward from the current position.
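
As a standalone illustration of that skipping idea (toy code, nothing
to do with the actual btree implementation): sort the probe keys first,
then let each lookup gallop forward from where the previous one ended
instead of descending from the root every time:

/*
 * Toy sketch of "sorted IN with skipping", not real index code: probes
 * are sorted, and each lookup resumes from the previous position using
 * exponential ("galloping") steps plus a bounded binary search.
 */
#include <stdio.h>
#include <stdlib.h>

static int
cmp_int(const void *a, const void *b)
{
	int x = *(const int *) a, y = *(const int *) b;
	return (x > y) - (x < y);
}

/* Gallop forward from 'start' to the first slot with idx[i] >= key. */
static size_t
gallop(const int *idx, size_t n, size_t start, int key)
{
	size_t	step = 1, lo = start, hi = start;

	while (hi < n && idx[hi] < key)
	{
		lo = hi;
		hi = (hi + step <= n) ? hi + step : n;
		step *= 2;
	}
	while (lo < hi)					/* binary search in [lo, hi) */
	{
		size_t	mid = lo + (hi - lo) / 2;
		if (idx[mid] < key) lo = mid + 1; else hi = mid;
	}
	return lo;
}

int
main(void)
{
	int		index[] = {1, 3, 4, 7, 9, 12, 15, 20, 22, 30};	/* "leaf level" */
	int		probes[] = {22, 4, 15, 1};						/* IN (...) list */
	size_t	n = sizeof(index) / sizeof(index[0]);
	size_t	pos = 0;

	qsort(probes, 4, sizeof(int), cmp_int);	/* sort the IN contents first */
	for (int i = 0; i < 4; i++)
	{
		pos = gallop(index, n, pos, probes[i]);	/* resume, don't restart */
		if (pos < n && index[pos] == probes[i])
			printf("found %d at slot %zu\n", probes[i], pos);
	}
	return 0;
}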


Cheers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Magnus Hagander
On Tue, Apr 22, 2014 at 8:26 AM, Mark Kirkwood <
mark.kirkw...@catalyst.net.nz> wrote:

> On 22/04/14 09:25, Andres Freund wrote:
>
>> On 2014-04-21 17:21:20 -0400, Bruce Momjian wrote:
>>
>>> On Mon, Apr 21, 2014 at 02:08:51PM -0700, Joshua Drake wrote:
>>>
 If the community had more *BSD presence I think it would be great
but it isn't all that viable at this point. I do know, however, that
no-one in this community would turn down a team of FreeBSD advocates
helping us make PostgreSQL awesome for FreeBSD.

>>>
>>> I don't think we would even implement a run-time control for Linux or
>>> Windows for this, so I don't even think it is a FreeBSD issue.
>>>
>>
>> I think some of the arguments in this thread are pretty damn absurd. We
>> have just introduced dynamic_shared_memory_type.
>>
>>
> +1
>
> I was just thinking the same thing...
>
>
I didn't realize we had a GUC for dynamic shared memory; I must've missed
that in the discussion about that one. I agree that if we have that, it
makes perfect sense to have the same setting available for the main shared
memory segment.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-22 Thread Andres Freund
Hi,

Attached you can find a short (compile-tested only) patch implementing
a 'shared_memory_type' GUC, akin to 'dynamic_shared_memory_type'. It
will only apply to 9.4, not 9.3, but it should be easy to adapt for 9.3.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From e090326d8b5933a162e8e503ccec714bde3a49b7 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 22 Apr 2014 14:17:34 +0200
Subject: [PATCH] Add shared_memory_type GUC.

---
 src/backend/port/sysv_shmem.c | 18 --
 src/backend/storage/ipc/ipci.c|  2 ++
 src/backend/utils/misc/guc.c  | 23 +++
 src/backend/utils/misc/postgresql.conf.sample |  7 ++-
 src/include/storage/pg_shmem.h| 21 +++--
 5 files changed, 62 insertions(+), 9 deletions(-)

diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 5e3850b..f6343d3 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -465,14 +465,20 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port,
 	 * developer use, this shouldn't be a big problem.
 	 */
 #ifndef EXEC_BACKEND
-	AnonymousShmem = CreateAnonymousSegment(&size);
-	AnonymousShmemSize = size;
+	if (shared_memory_type == SHMEM_TYPE_MMAP)
+	{
+		AnonymousShmem = CreateAnonymousSegment(&size);
+		AnonymousShmemSize = size;
 
-	/* Now we need only allocate a minimal-sized SysV shmem block. */
-	sysvsize = sizeof(PGShmemHeader);
-#else
-	sysvsize = size;
+		/* Now we need only allocate a minimal-sized SysV shmem block. */
+		sysvsize = sizeof(PGShmemHeader);
+	}
+	else
 #endif
+	{
+		Assert(shared_memory_type == SHMEM_TYPE_SYSV);
+		sysvsize = size;
+	}
 
 	/* Make sure PGSharedMemoryAttach doesn't fail without need */
 	UsedShmemSegAddr = NULL;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 4290d2d..c58f171 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -42,6 +42,8 @@
 #include "storage/sinvaladt.h"
 #include "storage/spin.h"
 
+/* GUCs */
+int			shared_memory_type = DEFAULT_SHARED_MEMORY_TYPE;
 
 shmem_startup_hook_type shmem_startup_hook = NULL;
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ea54d16..f6d9dc4 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -403,6 +403,19 @@ static const struct config_enum_entry huge_pages_options[] = {
 	{NULL, 0, false}
 };
 
+static struct config_enum_entry shared_memory_options[] = {
+#ifndef WIN32
+	{ "sysv", SHMEM_TYPE_SYSV, false},
+#endif
+#ifndef EXEC_BACKEND
+	{ "mmap", SHMEM_TYPE_MMAP, false},
+#endif
+#ifdef WIN32
+	{ "windows", SHMEM_TYPE_WINDOWS, false},
+#endif
+	{NULL, 0, false}
+};
+
 /*
  * Options for enum values stored in other modules
  */
@@ -3453,6 +3466,16 @@ static struct config_enum ConfigureNamesEnum[] =
 	},
 
 	{
+		{"shared_memory_type", PGC_POSTMASTER, RESOURCES_MEM,
+			gettext_noop("Selects the shared memory implementation used."),
+			NULL
+		},
+		&shared_memory_type,
+		DEFAULT_SHARED_MEMORY_TYPE, shared_memory_options,
+		NULL, NULL, NULL
+	},
+
+	{
 		{"wal_sync_method", PGC_SIGHUP, WAL_SETTINGS,
 			gettext_noop("Selects the method used for forcing WAL updates to disk."),
 			NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 70e5a51..1c5f02a 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -127,7 +127,12 @@
 #maintenance_work_mem = 64MB		# min 1MB
 #autovacuum_work_mem = -1		# min 1MB, or -1 to use maintenance_work_mem
 #max_stack_depth = 2MB			# min 100kB
-#dynamic_shared_memory_type = posix # the default is the first option
+#shared_memory_type = mmap		# the default is the first option
+	# supported by the operating system:
+	#   mmap
+	#   sysv
+	#   windows
+#dynamic_shared_memory_type = posix	# the default is the first option
 	# supported by the operating system:
 	#   posix
 	#   sysv
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index ab28ebe..53ea43a 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -41,8 +41,13 @@ typedef struct PGShmemHeader	/* standard header for all Postgres shmem */
 #endif
 } PGShmemHeader;
 
-/* GUC variable */
-extern int huge_pages;
+/* Possible values for shared_memory_type */
+typedef enum
+{
+	SHMEM_TYPE_WINDOWS,
+	SHMEM_TYPE_SYSV,
+	SHMEM_TYPE_MMAP
+} PGShmemType;
 
 /* Possible values for huge_pages */
 typedef enum
@@ -52,6 +57,10 @@ typedef enum
 	HUGE_PAGES_TRY
 } HugePagesType;
 
+/* GUC variables */
+extern int shared_memory_type;
+extern int huge_pages;
+
 #ifndef WIN32
 extern unsigned long UsedShmemSegID;
 #else
@@ -59,6 +68,14 @@ extern HANDLE UsedShmemSegID;
 #endif
 extern void *UsedShm

Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread MauMau

From: "Simon Riggs" 

Some areas of R&D are definitely on the roadmap, others are more
flexible. Some of this is in progress, other stuff is not even at the
design stage yet - just a few paragraphs along the lines of "we will
look at these topics". If we have room, it's possible we may
accommodate other topics; this is not carte blanche, but the reason
for posting here is so people know we will take input, following the
normal community process. Detailed in-person discussions at PGCon are
expected and the Wiki pages will be updated for each aspect.

BI-related Indexing
* MinMax indexes
* Bitmap indexes

Large Systems
* Freeze avoidance
* Storage management issues for very large systems

Storage Efficiency
* Compression
* Column Orientation

Optimisation
* Bulk loading speed improvements
* Bulk FK evaluation
* Executor tuning for very large queries

Query tuning
* Approximate queries, sampling
* Materialized Views


Great!  I'm looking forward to seeing PostgreSQL evolve as an analytics 
database for data warehousing.  Is there any reason why in-memory database 
and MPP are not included?


Are you planning to include the above features in 9.5 and 9.6?  Are you 
recommending other developers not implement these features to avoid 
duplication of work with AXLE?


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Hannu Krosing
On 04/22/2014 01:24 AM, Josh Berkus wrote:
> On 04/21/2014 03:41 PM, Simon Riggs wrote:
>> Storage Efficiency
>> * Compression
>> * Column Orientation
> You might look at turning this:
>
> http://citusdata.github.io/cstore_fdw/
>
> ... into a more integrated part of Postgres.
What would be of more general usefulness is probably
better planning and better performance of the FDW interface.

So instead of integrating one specific FDW, it would make more sense
to improve PostgreSQL so that it can use (properly written)
FDWs at native speeds.

Regards

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Simon Riggs
On 22 April 2014 10:42, Jov  wrote:

> what about runtime code generation using LLVM?
> http://blog.cloudera.com/blog/2013/02/inside-cloudera-impala-runtime-code-generation/
> http://llvm.org/devmtg/2013-11/slides/Wanderman-Milne-Cloudera.pdf

Those techniques have been in use for at least 20 years on various platforms.

The main issues PostgreSQL faces are supporting many platforms and
compilers, while at the same time supporting extensible data types.

I believe there is some research work into run-time compilation in
progress, but that seems unlikely to make it into Postgres core.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Simon Riggs
On 22 April 2014 00:24, Josh Berkus  wrote:
> On 04/21/2014 03:41 PM, Simon Riggs wrote:
>> Storage Efficiency
>> * Compression
>> * Column Orientation
>
> You might look at turning this:
>
> http://citusdata.github.io/cstore_fdw/
>
> ... into a more integrated part of Postgres.

Of course I'm aware of that work - credit to them. Certainly, many
people feel that it is now time to do as you suggest and include
column store features within PostgreSQL.

As to turning it into a more integrated part of Postgres, we have a
few problems there

1. cstore_fdw code has an incompatible licence

2. I don't think FDWs are the right place for complex new
architectures such as column store, massively parallel processing or
sharding. The fact that it is probably the best place to implement it
in user space doesn't mean it transfers well into core code. That's a
shame and I don't know what to do about it, because it would be nice
to simply ask for change of licence and then integrate it, but it
seems more work than that (to me).

cstore_fdw uses ORC, which interestingly stores "lightweight index"
values that look exactly like MinMax indexes, so at least PostgreSQL
should be getting that soon.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: Async query processing

2014-04-22 Thread Florian Weimer

On 01/05/2014 01:56 PM, Craig Ringer wrote:


JDBC also has a statement batching interface. Right now PgJDBC just
unwraps the batch and runs each query individually. Any async-support
improvements server-side should probably consider the need to execute
a batch. The batch might be one PreparedStatement with many different
parameters, or it might be a series of unrelated statements. A way for
PgJDBC to run the batch without syncing with the server after each query
would be really helpful.


The server already supports this because it supports pipelining.

Anyway, I have now switched my application to COPY FROM STDIN where 
possible, and the increase in throughput is phenomenal—even though it is 
running completely locally.  I could squeeze out more round-trips if I 
had an asynchronous query facility in libpq.
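
For anyone who has not tried it, a minimal libpq COPY loop looks
roughly like this sketch (error handling abbreviated, and the table
t (a int, b text) is just an assumed example):

/* Minimal libpq COPY FROM STDIN loop (error handling abbreviated).
 * Assumes an example table created as: CREATE TABLE t (a int, b text). */
#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("");		/* connection params from env */
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "%s", PQerrorMessage(conn));
		return 1;
	}

	res = PQexec(conn, "COPY t (a, b) FROM STDIN");
	if (PQresultStatus(res) != PGRES_COPY_IN)
	{
		fprintf(stderr, "%s", PQerrorMessage(conn));
		return 1;
	}
	PQclear(res);

	/* Many rows stream in one round trip -- this is where the win is. */
	for (int i = 0; i < 100000; i++)
	{
		char	line[64];
		int		len = snprintf(line, sizeof(line), "%d\trow %d\n", i, i);

		if (PQputCopyData(conn, line, len) != 1)
			break;
	}

	if (PQputCopyEnd(conn, NULL) == 1)
	{
		res = PQgetResult(conn);			/* final command status */
		if (PQresultStatus(res) != PGRES_COMMAND_OK)
			fprintf(stderr, "%s", PQerrorMessage(conn));
		PQclear(res);
	}

	PQfinish(conn);
	return 0;
}

The point is that rows stream without a per-row round trip; the server
only reports back once, at PQputCopyEnd.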


Feedback in this thread was, "we want something like this in libpq, but 
not the thing you proposed".  But there have been no concrete 
counter-proposals, and some of the responses did not take into account 
the inherent complexities of round-trip avoidance.  So I'm not sure how 
to move this topic forward.


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Store data in pg_toast for custom type fails (bug?)

2014-04-22 Thread Honza

On 03/28/2014 07:02 PM, Tom Lane wrote:
> I wrote:
>> Honza  writes:
>>> after a few months I've found time to make a test case for this bug, probably:
> 
>> Confirmed that this reproduces a problem on HEAD.  Will look into it,
>> thanks!
> 
> I believe I understand what's going on here, and it's not quite as
> exciting as it first appears.  The issue is that we are failing to
> honor the "toasting goes only one level deep" rule in the specific
> case of arrays of composite type.  So while it's definitely a nasty
> bug, it affects only narrow use cases, and doesn't call into question
> our whole vacuuming strategy or anything like that.

I would like to ask if there is anything new in this bug?

I've made a simple script which checks whether the daily backups are
complete. For a week I've been deleting a few records every day, hoping
the backup will succeed after that. The problem is that it's not
possible to read all the data from the table while making a backup with
pg_dump either (not only when selecting data from the table). I've
found only one way to temporarily work around it and get full backups -
delete the corrupted records.

Thanks for any news,

Jan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-22 Thread Jov
what about runtime code generation using LLVM?
http://blog.cloudera.com/blog/2013/02/inside-cloudera-impala-runtime-code-generation/
http://llvm.org/devmtg/2013-11/slides/Wanderman-Milne-Cloudera.pdf

Jov
blog: http://amutu.com/blog


2014-04-22 6:41 GMT+08:00 Simon Riggs :

> I've discussed 2ndQuadrant's involvement in the AXLE project a few
> times publicly, but never on this mailing list. The project relates to
> innovation and improvement in Business Intelligence for systems based
> upon PostgreSQL in the range of 10-100TB.
>
> Our work will span the 9.5 and 9.6 cycles. We're looking to make
> measurable improvements in a number of cases; one of those is TPC-H,
> since its a publicly accessible benchmark, another is a more private
> benchmark on healthcare data. In brief, this means speeding up the
> performance of large queries, data loading and looking at very large
> systems issues.
>
> Some areas of R&D are definitely on the roadmap, others are more
> flexible. Some of this is in progress, other stuff is not even at the
> design stage yet - just a few paragraphs along the lines of "we will
> look at these topics". If we have room, it's possible we may
> accommodate other topics; this is not carte blanche, but the reason
> for posting here is so people know we will take input, following the
> normal community process. Detailed in-person discussions at PGCon are
> expected and the Wiki pages will be updated for each aspect.
>
> BI-related Indexing
> * MinMax indexes
> * Bitmap indexes
>
> Large Systems
> * Freeze avoidance
> * Storage management issues for very large systems
>
> Storage Efficiency
> * Compression
> * Column Orientation
>
> Optimisation
> * Bulk loading speed improvements
> * Bulk FK evaluation
> * Executor tuning for very large queries
>
> Query tuning
> * Approximate queries, sampling
> * Materialized Views
>
> ...and possibly some other aspects.
>
> 2ndQuadrant is also assisting other researchers on GPU and FPGA
> topics, which may also yield work of interest to PostgreSQL project.
>
> A couple of points: the project is time-limited, so if work gets pushed
> back beyond that then we'll lose the opportunity to contribute. Please
> support our work with timely objections, assistance in defining the
> path forwards and limiting the scope to something that avoids wasting
> this opportunity. Further funding is possible if we don't squander
> this. We are being funded to make best efforts to contribute to open
> source PostgreSQL, not pay-for-commit.
>
> AXLE is funded by the EU under FP7 Grant Agreement 318633.  Further
> details are available here http://www.axleproject.eu/
>
> (There are also other 2ndQuadrant development projects in progress;
> this is just one of the larger ones).
>
> Best Regards
>
> --
>  Simon Riggs   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] A question about code in DefineRelation()

2014-04-22 Thread Etsuro Fujita
(2014/04/04 13:35), Etsuro Fujita wrote:
> If I understand correctly, foreign tables cannot have an OID column, but
> the following code in DefineRelation() assumes that foreign tables *can*
> have that column:

On second thought, I noticed that this makes CREATE FOREIGN TABLE wrongly
include an OID column in newly created foreign tables when the
default_with_oids parameter is set to on.  Please find a patch attached.
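
To illustrate, here is a minimal reproduction sketch (wrapper, server,
and table names are arbitrary; no handler is needed just to create the
table):

  SET default_with_oids = on;

  CREATE FOREIGN DATA WRAPPER dummy_fdw;
  CREATE SERVER dummy_server FOREIGN DATA WRAPPER dummy_fdw;
  CREATE FOREIGN TABLE ft (a integer) SERVER dummy_server;

  -- Expected: false.  Without the fix, the new foreign table wrongly
  -- ends up with an OID column here.
  SELECT relhasoids FROM pg_class WHERE relname = 'ft';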

Thanks,

Best regards,
Etsuro Fujita
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
*************** DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId)
*** 565,572 ****
  	descriptor = BuildDescForRelation(schema);
  
  	localHasOids = interpretOidsOption(stmt->options,
! 									   (relkind == RELKIND_RELATION ||
! 										relkind == RELKIND_FOREIGN_TABLE));
  	descriptor->tdhasoid = (localHasOids || parentOidCount > 0);
  
  	/*
--- 565,571 ----
  	descriptor = BuildDescForRelation(schema);
  
  	localHasOids = interpretOidsOption(stmt->options,
! 									   relkind == RELKIND_RELATION);
  	descriptor->tdhasoid = (localHasOids || parentOidCount > 0);
  
  	/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Hannu Krosing
On 04/17/2014 10:39 PM, Andres Freund wrote:
> On 2014-04-17 13:33:27 -0700, Peter Geoghegan wrote:
>> Just over 99.6% of pages (leaving aside the meta page) in the big 10
>> GB pgbench_accounts_pkey index are leaf pages.

What is the depth of the B-tree at this percentage?

Cheers
Hannu


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Atri Sharma
On Tue, Apr 22, 2014 at 12:59 PM, Albe Laurenz wrote:

> Jason Petersen wrote:
> > Yes, we obviously want a virtual clock. Focusing on the use of
> gettimeofday seems silly to me: it was
> > something quick for the prototype.
> >
> > The problem with the clocksweeps is they don’t actually track the
> progression of “time” within the
> > PostgreSQL system.
>
> Would it make sense to just cache the result of the latest gettimeofday()
> call
> and use that as an approximation for wall time?
> The busier the system is, the more accurate that should be.
>
>
That sounds... risky. How will the invalidation/update of the cache work?

How will we track the time window in which the cached value is still valid
and applicable?

These are just my first thoughts; I may be missing the point, though.

Regards,

Atri



-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-22 Thread Albe Laurenz
Jason Petersen wrote:
> Yes, we obviously want a virtual clock. Focusing on the use of gettimeofday 
> seems silly to me: it was
> something quick for the prototype.
> 
> The problem with the clocksweeps is they don’t actually track the progression 
> of “time” within the
> PostgreSQL system.

Would it make sense to just cache the result of the latest gettimeofday() call
and use that as an approximation for wall time?
The busier the system is, the more accurate that should be.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers