Re: [HACKERS] pg_regress writes into source tree

2014-12-17 Thread Andres Freund
Hi,

On 2014-12-11 22:02:26 -0500, Peter Eisentraut wrote:
> When using a vpath build pg_regress writes the processed input/*.source
> files into the *source* tree, which isn't supposed to happen.
> 
> This appears to be a thinko introduced in this patch:
> e3fc4a97bc8ee82a78605b5ffe79bd4cf3c6213b
> 
> The attached patch fixes it.

I've been annoyed by this more than once, specifically when trying to
run tests with different compilation settings at once...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Async execution of postgres_fdw.

2014-12-17 Thread Ashutosh Bapat
Hi Horiguchi-san,
Here are my comments on the patches, taken together.

Sanity
1. The patch applies cleanly but has trailing whitespace.
[ashutosh@ubuntu coderoot]git apply
/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch
/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:41: trailing whitespace.
entry->conn =
/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:44: trailing whitespace.

/mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:611: trailing
whitespace.

warning: 3 lines add whitespace errors.

2. The patches compile cleanly.
3. The regression tests pass cleanly, including contrib/postgres_fdw and
contrib/file_fdw.

Tests
---
We need tests to make sure that the logic remains intact even after further
changes in this area. A couple of tests that require multiple foreign scans
within the same query, each fetching more rows than the fetch size (100),
would be required. Also, some DMLs involving multiple foreign scans would
test the sanity of the logic when UPDATE/DELETE interleave with such scans.
By multiple foreign scans I mean both multiple scans on a single foreign
server and scans spread across multiple foreign servers.
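
As an illustration only (the foreign table and server names below are
hypothetical, not taken from the patch), tests along these lines would cover
two scans on one foreign server plus a scan on a second server, and a DML
interleaving with such scans; each remote table is assumed to hold well over
the 100-row fetch size:

-- two foreign tables on server loopback1, one on loopback2
SELECT count(*)
  FROM ft1_loopback1 a
  JOIN ft2_loopback1 b ON a.id = b.id
  JOIN ft1_loopback2 c ON a.id = c.id;

-- an UPDATE whose foreign scans share connections with the above
UPDATE ft1_loopback1 t
   SET val = s.val
  FROM ft2_loopback1 s
 WHERE t.id = s.id;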

Code
---
Because the previous "conn" is now replaced by "conn->pgconn", the double
indirection makes the code a bit ugly and prone to segfaults (conn being
NULL or an invalid pointer). Can we minimize such code or wrap it in a
macro?
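
Something along these lines is roughly what I have in mind (the macro name
is only illustrative, not taken from the patch; AssertMacro keeps the NULL
check free in non-assert builds):

/* hypothetical accessor for the wrapped libpq connection */
#define PGFDW_PGCONN(fconn) \
	(AssertMacro((fconn) != NULL), (fconn)->pgconn)

so that call sites read PQstatus(PGFDW_PGCONN(entry->conn)) instead of
dereferencing entry->conn->pgconn directly.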

We need some comments on the definition of PgFdwConn explaining the purpose
of the structure and of each of its members.

The same goes for the enum PgFdwConnState. In fact, the state diagram of a
connection has become more complicated with the async connections, so it
might be better to explain that state diagram in one place in the code
(through comments). The definition of the enum might be a good place to do
that. Otherwise, the logic of connection maintenance is spread across
multiple places and is difficult to understand by reading the code.

In function GetConnection(), at line
elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
-entry->conn, server->servername);
+entry->conn->pgconn, server->servername);

entry->conn->pgconn may not necessarily be a new connection and may be NULL
(as the next line checks it for being NULL). So, I think this line should be
moved inside the following if block, after pgconn has been initialised with
the new connection (see the sketch after the quoted block below).
+   if (entry->conn->pgconn == NULL)
+   {
+   entry->conn->pgconn = connect_pg_server(server, user);
+   entry->conn->nscans = 0;
+   entry->conn->async_state = PGFDW_CONN_IDLE;
+   entry->conn->async_scan = NULL;
+   }
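
In other words, something like this (just a sketch of the rearrangement,
built from the two hunks quoted above):

	if (entry->conn->pgconn == NULL)
	{
		entry->conn->pgconn = connect_pg_server(server, user);
		entry->conn->nscans = 0;
		entry->conn->async_state = PGFDW_CONN_IDLE;
		entry->conn->async_scan = NULL;

		/* the connection really is new only in this branch */
		elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
			 entry->conn->pgconn, server->servername);
	}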

The condition if (entry->conn == NULL) in GetConnection() used to track
whether there is a PGconn active for the given entry; now it tracks whether
the entry has a PgFdwConn.

Please see more comments inline.

On Mon, Dec 15, 2014 at 2:39 PM, Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp> wrote:

>
>
> * Outline of this patch
>
> From some consideration after the previous discussion and
> comments from others, I judged the original (WIP) patch was
> overdone as the first step. So I reduced the patch to minimal
> function. The new patch does the following,
>
> - Wrapping PGconn by PgFdwConn in order to handle multiple scans
>   on one connection.
>
> - The core async logic was added in fetch_more_data().
>

It might help if you can explain this logic in this mail as well as in code
(as per my comment above).


>
> - Invoking remote commands asynchronously in ExecInitForeignScan.
>

> - Canceling async invocation if any other foreign scans,
>   modifies, deletes use the same connection.
>

> Cancellation is done by immediately fetching the return of the
> already-invoked async command.
>

> * Where this patch will be effective.
>
> With the upcoming inheritance-partition feature, this patch enables
> starting and running foreign scans asynchronously. It will be more
> effective for longer TAT or remote startup times, and for a larger
> number of foreign servers. It has no negative performance effect in
> other situations.
>
>
AFAIU, this logic sends only the first query in an asynchronous manner, not
all of them. Is that right? If yes, I think it's a severe limitation of the
feature. For a query involving multiple foreign scans, only the first one
will be done in async fashion and not the rest. Sorry if my understanding is
wrong.

I think we need some data showing the speed-up from this patch. You may
construct a case where a single query involves multiple foreign scans, and
we can check what speed-up is obtained as the number of foreign scans
grows.



> * Concerns about this patch.
>
> - This breaks the assumption that scan starts at ExecForeignScan,
>   not ExecInitForeignScan, which might cause some problem.
>
>
This should be fine as long as it doesn't have any side effects like
sending the query during EXPLAIN (which has been taken care of in this
patch). Do you think we need any special handling for PREPAREd statements?


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-17 Thread Michael Paquier
On Wed, Dec 17, 2014 at 11:33 PM, Rahila Syed 
wrote:

> I had a look at the code. I have a few minor points,
>
Thanks!

+   bkpb.fork_flags |= BKPBLOCK_HAS_IMAGE;
> +
> +   if (is_compressed)
> {
> -   rdt_datas_last->data = page;
> -   rdt_datas_last->len = BLCKSZ;
> +   /* compressed block information */
> +   bimg.length = compress_len;
> +   bimg.extra_data = hole_offset;
> +   bimg.extra_data |= XLR_BLCK_COMPRESSED_MASK;
>
> For consistency with the existing code, how about renaming the macro
> XLR_BLCK_COMPRESSED_MASK to BKPBLOCK_HAS_COMPRESSED_IMAGE, along the lines
> of BKPBLOCK_HAS_IMAGE?
>
OK, why not...


> +   blk->hole_offset = extra_data & ~XLR_BLCK_COMPRESSED_MASK;
> Here, I think that having the mask as BKPBLOCK_HOLE_OFFSET_MASK will be
> more indicative of the fact that the lower 15 bits of the extra_data field
> comprise the hole_offset value. This suggestion is also just to achieve
> consistency with the existing BKPBLOCK_FORK_MASK for the fork_flags field.
>
Yeah that seems clearer, let's define it as ~XLR_BLCK_COMPRESSED_MASK
though.
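
A sketch of what I mean (the 0x8000 value is only an assumption taken from
the description above, where the high bit of extra_data flags compression
and the lower 15 bits carry the hole offset):

#define XLR_BLCK_COMPRESSED_MASK	0x8000	/* assumed bit position */
#define BKPBLOCK_HOLE_OFFSET_MASK	(~XLR_BLCK_COMPRESSED_MASK & 0xFFFF)

	/* encoding side: hole_offset must not collide with the flag bit */
	Assert((hole_offset & XLR_BLCK_COMPRESSED_MASK) == 0);
	bimg.extra_data = hole_offset;
	if (is_compressed)
		bimg.extra_data |= XLR_BLCK_COMPRESSED_MASK;

	/* decoding side */
	blk->hole_offset = extra_data & BKPBLOCK_HOLE_OFFSET_MASK;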

> And a comment typo:
> +* First try to compress block, filling in the page hole with zeros
> +* to improve the compression of the whole. If the block is considered
> +* as incompressible, complete the block header information as if
> +* nothing happened.
>
> As the hole is no longer being compressed, this needs to be changed.
>
Fixed, as well as an additional comment block further down.

A couple of things noticed on the fly:
- Fixed pg_xlogdump not reporting the FPW information completely correctly
- Fixed a couple of typos and malformed sentences
- Added an assertion to check that the hole offset value does not use the
bit reserved for the compression status
- Reworked the docs, mentioning as well that wal_compression is off by
default
- Removed the stuff in pg_controldata and XLOG_PARAMETER_CHANGE (mentioned
by Fujii-san)

Regards,
-- 
Michael


20141218_fpw_compression_v8.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] TABLESAMPLE patch

2014-12-17 Thread Petr Jelinek

On 17/12/14 19:55, Alvaro Herrera wrote:

I noticed that this makes REPEATABLE a reserved keyword, which is
currently an unreserved one.  Can we avoid that?



I added it because I did not find any other way to fix the shift/reduce 
conflicts that bison complained about. I am no bison expert though.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] NUMERIC private methods?

2014-12-17 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 >> Hmm. You'd want to make add_var, mul_var etc. non-static,
 >> possibly with more meaningful names?

 Tom> -1 for that.

 Tom> If you're concerned about arithmetic performance, there is a
 Tom> very obvious fix here: use double.

Independently of this specific example, the obvious issue there is that
there are applications for which double is simply not acceptable.

As it stands, no extension can use the numeric type in any non-trivial
way without paying a large penalty for repeated pallocs and data copies.
Given that the ability to write C extensions easily is one of pg's great
strengths, this is a defect that should be corrected.

 Tom> (It would still be orders of magnitude slower, no matter how
 Tom> much we were willing to destroy numeric.c's modularity
 Tom> boundary.)

There is no need to expose any details of NumericVar's implementation;
it would suffice to provide an interface to allocate NumericVars, and
access to the functions.
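
As a rough sketch of the kind of interface I mean (these names are entirely
hypothetical, chosen only to illustrate the shape of it, not an existing
API):

/* opaque to callers; the definition stays private to numeric.c */
typedef struct NumericVar NumericVar;

extern NumericVar *numericvar_alloc(void);
extern void numericvar_free(NumericVar *var);
extern void numericvar_from_numeric(NumericVar *dst, Numeric num);
extern Numeric numericvar_to_numeric(const NumericVar *var);
extern void numericvar_add(const NumericVar *a, const NumericVar *b,
						   NumericVar *result);
extern void numericvar_mul(const NumericVar *a, const NumericVar *b,
						   NumericVar *result);

An extension could then keep one NumericVar per aggregate state and avoid
the repeated palloc/copy cycle on every call.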

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-17 Thread Michael Paquier
On Thu, Dec 18, 2014 at 1:05 PM, Fujii Masao  wrote:
> On Wed, Dec 17, 2014 at 1:34 AM, Michael Paquier
>  wrote:
> I think that neither pg_control nor xl_parameter_change need to have the info
> about WAL compression because each backup block has that entry.
>
> Will review the remaining part later.
I got to wondering about the utility of this part earlier this morning, as
it is a remnant of when wal_compression was set as PGC_POSTMASTER.
Will remove.
-- 
Michael




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-17 Thread Fujii Masao
On Wed, Dec 17, 2014 at 1:34 AM, Michael Paquier
 wrote:
>
>
> On Wed, Dec 17, 2014 at 12:00 AM, Michael Paquier
>  wrote:
>>
>> Actually, the original length of the compressed block is saved in
>> PGLZ_Header, so if we are fine with not checking the size of the
>> decompressed block when decoding WAL, we can do without the hole filled
>> with zeros, and use only 1 bit to see if the block is compressed or not.
>
> And... After some more hacking, I have been able to come up with a patch
> that is able to compress blocks without the page hole, and that keeps the
> WAL record length the same as HEAD when the compression switch is off. The
> numbers are pretty good: CPU is saved in the same proportions as with the
> previous patches when compression is enabled, and there is zero delta with
> HEAD when the compression switch is off.
>
> Here are the actual numbers:
>test_name   | pg_size_pretty | user_diff | system_diff
> ---++---+-
>  FPW on + 2 bytes, ffactor 50  | 582 MB | 42.391894 |0.807444
>  FPW on + 2 bytes, ffactor 20  | 229 MB | 14.330304 |0.729626
>  FPW on + 2 bytes, ffactor 10  | 117 MB |  7.335442 |0.570996
>  FPW off + 2 bytes, ffactor 50 | 746 MB | 25.330391 |1.248503
>  FPW off + 2 bytes, ffactor 20 | 293 MB | 10.537475 |0.755448
>  FPW off + 2 bytes, ffactor 10 | 148 MB |  5.762775 |0.763761
>  FPW on + 0 bytes, ffactor 50  | 582 MB | 42.174297 |0.790596
>  FPW on + 0 bytes, ffactor 20  | 229 MB | 14.424233 |0.770459
>  FPW on + 0 bytes, ffactor 10  | 117 MB |  7.057195 |0.584806
>  FPW off + 0 bytes, ffactor 50 | 746 MB | 25.261998 |1.054516
>  FPW off + 0 bytes, ffactor 20 | 293 MB | 10.589888 |0.860207
>  FPW off + 0 bytes, ffactor 10 | 148 MB |  5.827191 |0.874285
>  HEAD, ffactor 50  | 746 MB | 25.181729 |1.133433
>  HEAD, ffactor 20  | 293 MB |  9.962242 |0.765970
>  HEAD, ffactor 10  | 148 MB |  5.693426 |0.775371
>  Record, ffactor 50| 582 MB | 54.904374 |0.678204
>  Record, ffactor 20| 229 MB | 19.798268 |0.807220
>  Record, ffactor 10| 116 MB |  9.401877 |0.668454
> (18 rows)
>
> The new tests of this patch are "FPW off + 0 bytes". Patches as well as
> results are attached.

I think that neither pg_control nor xl_parameter_change need to have the info
about WAL compression because each backup block has that entry.

Will review the remaining part later.

Regards,

-- 
Fujii Masao




[HACKERS] Minor improvement to explain.c

2014-12-17 Thread Etsuro Fujita
Hi,

The attached patch just removes one bad-looking blank line in the
comments at the top of a function in explain.c.

Thanks,

Best regards,
Etsuro Fujita
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 332f04a..064f880 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -568,7 +568,6 @@ ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc)
 
 /*
  * ExplainPrintTriggers -
-
  *	  convert a QueryDesc's trigger statistics to text and append it to
  *	  es->str
  *



Re: [HACKERS] Combining Aggregates

2014-12-17 Thread Kouhei Kaigai
Hi Atri,

> So are you proposing not calling transfuncs at all and just use combined
> functions?
> 
No. It is at the discretion of the software component that distributes an
aggregate into multiple partial aggregates.

> That sounds counterintuitive to me. I am not able to see why you would want
> to avoid transfns totally even for the case of pushing down aggregates that
> you mentioned.
> 
> From Simon's example mentioned upthread:
> 
> PRE-AGGREGATED PLAN
> Aggregate
> -> Join
>  -> PreAggregate (doesn't call finalfn)
>   -> Scan BaseTable1
>  -> Scan BaseTable2
> 
> 
> 
> finalfn wouldn't be called. Instead, the combined function would be
> responsible for getting the preaggregate results and combining them
> (unless, of course, I am missing something).
> 
In this case, the aggregate distributor is responsible for marking the
Aggref correctly. The Aggref in the Aggregate node will call the combined
function, then the final function, to generate the expected result, but no
transition function shall be called because we expect state data as its
input stream.
On the other hand, the Aggref in the PreAggregate node will call the
transition function to accumulate its state data, but does not call the
final function because it is expected to return the state data as is.

> Special casing transition state updating in Aggref seems like a bad idea
> to me. I would think that it would be better if we made it more explicit
> i.e. add a new node on top that does the combination (it would be primarily
> responsible for calling combined function).
> 
I'm neutral towards the above idea. There is no difference in the amount of
information, is there?
If we define explicit node types instead of special flags, it seems to me
the following new nodes are needed:
 - an Aggref that takes individual rows and populates state data (trans +
no final)
 - an Aggref that takes state data and populates state data (combined +
no final)
 - an Aggref that takes state data and produces a final result (combined +
final)

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei 




Re: [HACKERS] no test programs in contrib

2014-12-17 Thread Michael Paquier
On Sat, Nov 29, 2014 at 5:54 AM, Alvaro Herrera 
wrote:
>
> I also attach some changes for the MSVC build stuff.  I tested it and it
> builds fine AFAICT, but it doesn't install because Install.pm wants to
> install contrib modules from contrib/ (which seems reasonable) but my
> hack adds the src/test/modules/ as contrib modules also, so Install.pm
> goes bonkers.  I'm not even sure *what* we're supposed to build -- there
> is no distinction in these programs as there is in the makefiles about
> what to install.  So if some Windows developer can look into this, I'd
> appreciate it.
>

It would be good to be consistent on Windows with what is now done on other
platforms: those modules should not be installed by default, but install.pm
could be made a bit smarter with, for example, an option "full", aka install
server + client + test modules. Building them is worth it in any case as
they can be used with modulescheck.
My 2c.
-- 
Michael


Re: [HACKERS] inherit support for foreign tables

2014-12-17 Thread Etsuro Fujita
(2014/12/18 7:04), Tom Lane wrote:
> Etsuro Fujita  writes:
>> Attached are updated patches.  Patch fdw-inh-5.patch has been created on
>> top of patch fdw-chk-5.patch.  Patch fdw-chk-5.patch is basically the
>> same as the previous one fdw-chk-4.patch, but I slightly modified that
>> one.  The changes are the following.
>> * added to foreign_data.sql more tests for your comments.
>> * revised docs on ALTER FOREIGN TABLE a bit further.
> 
> I've committed fdw-chk-5.patch with some minor further adjustments;
> the most notable one was that I got rid of the error check prohibiting
> NO INHERIT, which did not seem to me to have any value.  Attaching such
> a clause won't have any effect, but so what?
> 
> Have not looked at the other patch yet.

Thanks!

I added the error check because the other patch, fdw-inh-5.patch, doesn't
allow foreign tables to be inherited, so it seemed more consistent, at least
to me, to do so.

Best regards,
Etsuro Fujita




Re: [HACKERS] Compiling C++ extensions on MSVC using scripts in src/tools

2014-12-17 Thread Michael Paquier
On Thu, Dec 18, 2014 at 4:55 AM, Heikki Linnakangas
 wrote:
> If .lpp and .ypp files are supposed to be Bison and Flex files with C++
> code in them, it wouldn't work anyway, because the rules elsewhere in the
> MSVC scripts just check for /\.y$/ to decide whether to run bison on it.
>
> I committed Andrew's suggestion:
> /^(.*)\\([^\\]+)\.(c|cpp|y|l|rc)$/
Thanks, that's enough for my stuff.
-- 
Michael




Re: [HACKERS] moving from contrib to bin

2014-12-17 Thread Michael Paquier
On Thu, Dec 18, 2014 at 4:02 AM, Alvaro Herrera
 wrote:
>
> I know this is how it currently works, but it looks way too messy to me:
>
> +   my $pgarchivecleanup = AddSimpleFrontend('pg_archivecleanup');
> +   my $pgstandby = AddSimpleFrontend('pg_standby');
> +   my $pgtestfsync = AddSimpleFrontend('pg_test_fsync');
> +   my $pgtesttiming = AddSimpleFrontend('pg_test_timing');
> +   my $pgbench = AddSimpleFrontend('pgbench', 1);
>
> ISTM we should do something like
>
> for each $elem in src/bin/Makefile:$(SUBDIRS)
> AddSimpleFrontend($elem)
>
> and avoid having to list the modules one by one.

If we take this road, I'd like to avoid a huge if/elseif chain scanning the
names of the submodules to do the necessary adjustments (some need FRONTEND
defined, others ws2_32, etc.). Also, there is the case of pg_basebackup,
where multiple binaries are built from one directory: pg_basebackup,
pg_recvlogical and pg_receivexlog. So I think that we'd need something
similar to what contrib does, aka:
my @frontend_excludes = ('pg_basebackup', 'pg_dump', 'pg_dumpall',
'pg_xlogdump', 'initdb' ...);
my %frontend_extralibs = ('pgbench' => 'ws2_32.lib');
my @frontend_uselibpq = ('pgbench', 'pg_ctl', 'pg_upgrade');
And for each excluded frontend name we would have an individual project
declaration with its own exceptions. With this approach, when a new frontend
is added to src/bin it will be compiled automatically by default. How does
that sound?
-- 
Michael




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-17 Thread Peter Geoghegan
On Wed, Dec 17, 2014 at 1:12 PM, Heikki Linnakangas
 wrote:
> It looks like we are close to reaching consensus on the syntax. Phew! Thanks
> for maintaining the wiki pages and the documentation. All of the below is
> based on those, I haven't looked at the patch itself yet.

Great, thanks!

Yes, I am relieved that we appeared to have agreed on a syntax.

> The one thing that I still feel uneasy about is the Unique Index Inference
> thing.

> The documentation says that:
>
>> Omitting the specification indicates a total indifference to where
>> any would-be uniqueness violation could occur, which isn't always
>> appropriate;

> Some questions:
>
> 1. Does that mean that if you leave out the key columns, the insertion is
> IGNOREd if it violates *any* unique key constraint?

Yes. This is particularly important for the implementation of things
like IGNORE's updatable view support. More generally, for various ETL
use cases it's possible to imagine the user simply not caring.

> 2. If you do specify the key columns, then the IGNORE path is taken only if
> the insertion violates a unique key constraint on those particular columns.
> Otherwise an error is thrown. Right?

That's right.

> Now, let's imagine a table like this:
>
> CREATE TABLE persons (
>   username text unique,
>   real_name text unique,
>   data text
> );
>
> Is there any way to specify both of those constraints, so that the insertion
> is IGNOREd if it violates either one of them? If you try to do:
>
> INSERT INTO persons(username, real_name)
> VALUES('foobar', 'foo bar')
> ON CONFLICT (username, real_name) IGNORE;
>
> It will fail because there is no unique index on (username, real_name). In
> this particular case, you could leave out the specification, but if there
> was a third constraint that you're not expecting to conflict with, you would
> want violations of that constraint to still throw an error. And you can't
> leave out the specification with ON CONFLICT UPDATE anyway.

Good point.

For the IGNORE case: I guess the syntax just isn't that flexible. I
agree that that isn't ideal.

For the UPDATE case: Suppose your example was an UPDATE where we
simply assigned the excluded.data value to the data column in the
auxiliary UPDATE's targetlist. What would the user really be asking
for with that command, at a really high level? It seems like they
might actually want to run two UPSERT commands (one for username, the
other for real_name), or rethink their indexing strategy - in
particular, whether it's appropriate that there isn't a composite
unique constraint on (username, real_name).
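
For instance, with the persons table from your example, that would mean
something like (a sketch only):

ALTER TABLE persons ADD UNIQUE (username, real_name);

With such a composite unique constraint in place, the earlier statement has
an index that ON CONFLICT (username, real_name) can infer as its arbiter.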

Now, suppose that by accident or by convention it will always be
possible for a composite unique index to be built on (username,
real_name) - no dup violations would be raised if it was attempted,
but it just hasn't been and won't be. In other words, it's generally
safe to actually pretend that there is one. Then, surely it doesn't
matter if the user picks one or the other unique index. It'll all work
out when the user assigns to both in the UPDATE targetlist, because of
the assumed convention that I think is implied by the example. If the
convention is violated, at least you get a dup violation letting you
know (iff you bothered to assign). But I wouldn't like to encourage
that pattern.

I think that the long and the short of it is that you really ought to
have one unique index as an arbiter in mind when writing a DML
statement for the UPDATE variant. Relying on this type of convention
is possible, I suppose, but ill-advised.

> 3. Why is the specification required with ON CONFLICT UPDATE, but not with
> ON CONFLICT IGNORE?

That was a fairly recent decision, taken mainly to keep Kevin happy --
although TBH I don't recall that he was particularly insistent on that
restriction. I could still go either way on that question.

The idea, as I mentioned, is that it's legitimate to not care where a
dup violation might occur for certain ETL use cases. For UPSERT,
though, the only argument for not making it mandatory is that that's
something extra to type, and lazy people would prefer not to bother.
This is because we assume that the first dup violation is the only
possible one without the unique index inference clause. If we don't
have the index expressions with which to infer an arbiter unique index
(as with MySQL's ON DUPLICATE KEY UPDATE), you'd better be sure that
you accounted for all possible sources of would-be duplicate
violations - otherwise a random row will be updated!

That isn't a fantastic argument for not making a unique index
inference clause mandatory, but it might be an okay one.

> 4. What happens if there are multiple unique indexes with identical columns,
> and you give those columns in the inference specification? Doesn't matter
> which index you use, I guess, if they're all identical, but see next
> question.

It doesn't really matter which one you pick, but right now, at the
urging of Robert, we cost the list of candidates and pick the cheapest one.

Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-17 Thread Josh Berkus
On 12/17/2014 01:12 PM, Heikki Linnakangas wrote:
> 3. Why is the specification required with ON CONFLICT UPDATE, but not
> with ON CONFLICT IGNORE?

Well, UPDATE has to know which row to lock, no?  IGNORE does not.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] inherit support for foreign tables

2014-12-17 Thread Tom Lane
Etsuro Fujita  writes:
> Attached are updated patches.  Patch fdw-inh-5.patch has been created on 
> top of patch fdw-chk-5.patch.  Patch fdw-chk-5.patch is basically the 
> same as the previous one fdw-chk-4.patch, but I slightly modified that 
> one.  The changes are the following.
> * added to foreign_data.sql more tests for your comments.
> * revised docs on ALTER FOREIGN TABLE a bit further.

I've committed fdw-chk-5.patch with some minor further adjustments;
the most notable one was that I got rid of the error check prohibiting
NO INHERIT, which did not seem to me to have any value.  Attaching such
a clause won't have any effect, but so what?

Have not looked at the other patch yet.

regards, tom lane




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-17 Thread Heikki Linnakangas
It looks like we are close to reaching consensus on the syntax. Phew! 
Thanks for maintaining the wiki pages and the documentation. All of the 
below is based on those, I haven't looked at the patch itself yet.


The one thing that I still feel uneasy about is the Unique Index 
Inference thing. Per the syntax example from the wiki page, the UPSERT 
statement looks like this:


INSERT INTO upsert(key, val) VALUES(1, 'insert')
   ON CONFLICT (key) IGNORE;

With ON CONFLICT IGNORE, the list of key columns can also be left out:

INSERT INTO upsert(key, val) VALUES(1, 'insert')
   ON CONFLICT IGNORE;

The documentation says that:


Omitting the specification indicates a total indifference to where
any would-be uniqueness violation could occur, which isn't always
appropriate; at times, it may be desirable for ON CONFLICT IGNORE to
not suppress a duplicate violation within an index where that isn't
explicitly anticipated. Note that ON CONFLICT UPDATE assignment may
result in a uniqueness violation, just as with a conventional
UPDATE.


Some questions:

1. Does that mean that if you leave out the key columns, the insertion 
is IGNOREd if it violates *any* unique key constraint?


2. If you do specify the key columns, then the IGNORE path is taken only 
if the insertion violates a unique key constraint on those particular 
columns. Otherwise an error is thrown. Right? Now, let's imagine a table 
like this:


CREATE TABLE persons (
  username text unique,
  real_name text unique,
  data text
);

Is there any way to specify both of those constraints, so that the 
insertion is IGNOREd if it violates either one of them? If you try to do:


INSERT INTO persons(username, real_name)
VALUES('foobar', 'foo bar')
ON CONFLICT (username, real_name) IGNORE;

It will fail because there is no unique index on (username, real_name). 
In this particular case, you could leave out the specification, but if 
there was a third constraint that you're not expecting to conflict with, 
you would want violations of that constraint to still throw an error. 
And you can't leave out the specification with ON CONFLICT UPDATE anyway.


3. Why is the specification required with ON CONFLICT UPDATE, but not 
with ON CONFLICT IGNORE?


4. What happens if there are multiple unique indexes with identical 
columns, and you give those columns in the inference specification? 
Doesn't matter which index you use, I guess, if they're all identical, 
but see next question.


5. What if there are multiple unique indexes with the same columns, but 
different operator classes?


6. Why are partial unique indexes not supported as arbitrators?

- Heikki





Re: [HACKERS] Combining Aggregates

2014-12-17 Thread Simon Riggs
On 17 December 2014 at 19:06, David Rowley  wrote:

> Standalone calls to the combine/merge functions I don't think would be
> testing anything new.

Guess it's a simple enough API; it doesn't really need a specific test.

> That's the reason I thought it wasn't really acceptable until we have a use
> for this. That's why I posted on the thread about parallel seqscan. I hoped
> that Amit could add something which needed it and I could get it committed
> that way.

My view is that it's infrastructure to cover a variety of possibilities.

It's best if it goes in with enough time to get some use cases soon.
If not, it's there at the start of the cycle for the next release.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] SSL information view

2014-12-17 Thread Heikki Linnakangas

On 11/19/2014 02:36 PM, Magnus Hagander wrote:

+   /* Create or attach to the shared SSL status buffers */
+   size = mul_size(NAMEDATALEN, MaxBackends);
+   BackendSslVersionBuffer = (char *)
+   ShmemInitStruct("Backend SSL Version Buffer", size, &found);
+
+   if (!found)
+   {
+   MemSet(BackendSslVersionBuffer, 0, size);
+
+   /* Initialize st_ssl_version pointers. */
+   buffer = BackendSslVersionBuffer;
+   for (i = 0; i < MaxBackends; i++)
+   {
+   BackendStatusArray[i].st_ssl_version = buffer;
+   buffer += NAMEDATALEN;
+   }
+   }
+
+   size = mul_size(NAMEDATALEN, MaxBackends);
+   BackendSslCipherBuffer = (char *)
+   ShmemInitStruct("Backend SSL Cipher Buffer", size, &found);
+
+   if (!found)
+   {
+   MemSet(BackendSslCipherBuffer, 0, size);
+
+   /* Initialize st_ssl_cipher pointers. */
+   buffer = BackendSslCipherBuffer;
+   for (i = 0; i < MaxBackends; i++)
+   {
+   BackendStatusArray[i].st_ssl_cipher = buffer;
+   buffer += NAMEDATALEN;
+   }
+   }
+
+   size = mul_size(NAMEDATALEN, MaxBackends);
+   BackendSslClientDNBuffer = (char *)
+   ShmemInitStruct("Backend SSL Client DN Buffer", size, &found);
+
+   if (!found)
+   {
+   MemSet(BackendSslClientDNBuffer, 0, size);
+
+   /* Initialize st_ssl_clientdn pointers. */
+   buffer = BackendSslClientDNBuffer;
+   for (i = 0; i < MaxBackends; i++)
+   {
+   BackendStatusArray[i].st_ssl_clientdn = buffer;
+   buffer += NAMEDATALEN;
+   }
+   }


This pattern gets a bit tedious. We do that already for 
application_names, client hostnames, and activity status but this adds 
three more such strings. Why are these not just regular char arrays in 
PgBackendStatus struct, anyway? The activity status is not, because its 
size is configurable with the pgstat_track_activity_query_size GUC, but 
all those other things are fixed-size.


Also, it would be nice if you didn't allocate the memory for all those
SSL strings when SSL is disabled altogether. Perhaps put the SSL-related
information into a separate struct:


struct
{
	/* Information about SSL connection */
	int		st_ssl_bits;
	bool	st_ssl_compression;
	char	st_ssl_version[NAMEDATALEN];	/* MUST be null-terminated */
	char	st_ssl_cipher[NAMEDATALEN];		/* MUST be null-terminated */
	char	st_ssl_clientdn[NAMEDATALEN];	/* MUST be null-terminated */
} PgBackendSSLStatus;

Those structs could be allocated like you allocate the string buffers 
now, with a pointer to that struct from PgBackendStatus. When SSL is 
disabled, the structs are not allocated and the pointers in 
PgBackendStatus structs are NULL.
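
Roughly like this (an untested sketch; the buffer and field names here are
assumptions following the suggestion above, and the "SSL enabled" check is
left as a placeholder):

#ifdef USE_SSL
	if (ssl_enabled)			/* placeholder for the real check */
	{
		size = mul_size(sizeof(PgBackendSSLStatus), MaxBackends);
		BackendSslStatusBuffer = (PgBackendSSLStatus *)
			ShmemInitStruct("Backend SSL Status Buffer", size, &found);

		if (!found)
		{
			MemSet(BackendSslStatusBuffer, 0, size);
			for (i = 0; i < MaxBackends; i++)
				BackendStatusArray[i].st_sslstatus = &BackendSslStatusBuffer[i];
		}
	}
	else
#endif
	{
		for (i = 0; i < MaxBackends; i++)
			BackendStatusArray[i].st_sslstatus = NULL;
	}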


- Heikki





Re: [HACKERS] Compiling C++ extensions on MSVC using scripts in src/tools

2014-12-17 Thread Heikki Linnakangas

On 11/27/2014 06:39 AM, Michael Paquier wrote:

On Thu, Nov 27, 2014 at 1:40 AM, Tom Lane  wrote:

Andrew Dunstan  writes:

This doesn't seem to me to be terribly well expressed (I know it's not your 
fault, quite possibly it's mine.) Perhaps we should replace
 [r]?[cyl](pp)?
with
 (c|cpp|y|l|rc)


+1 ... the original coding is illegible already, not to mention wrong
since it will match stuff it shouldn't.

Yes, even the older code could find matches like ry or rl. Except that lpp
and ypp could be present as well. OK, the chances of them being present in a
Postgres extension are low (I don't have such extensions myself), but still
they could be. So I think that this expression should be written like this
instead:
(c|cpp|l|lpp|y|ypp|rc)
Updated patch is attached.


If .lpp and .ypp files are supposed to be Bison and Flex files with C++
code in them, it wouldn't work anyway, because the rules elsewhere in
the MSVC scripts just check for /\.y$/ to decide whether to run bison
on it.


I committed Andrew's suggestion:

/^(.*)\\([^\\]+)\.(c|cpp|y|l|rc)$/

- Heikki





Re: [HACKERS] parallel mode and parallel contexts

2014-12-17 Thread Robert Haas
On Wed, Dec 17, 2014 at 9:16 AM, Simon Riggs  wrote:
> On 12 December 2014 at 22:52, Robert Haas  wrote:
>
>> I would be remiss if I failed to mention that this patch includes work
>> by my colleagues Amit Kapila, Rushabh Lathia, and Jeevan Chalke, as
>> well as my former colleague Noah Misch; and that it would not have
>> been possible without the patient support of EnterpriseDB management.
>
> Noted and thanks to all.
>
> I will make this my priority for review, but regrettably not until New
> Year, about 2.5-3 weeks away.

OK!

In the meantime, I had a good chat with Heikki on IM yesterday which
gave me some new ideas on how to fix up the transaction handling in
here, which I am working on implementing.  So hopefully I will have
that by then.

I am also hoping Amit will be adapting his parallel seq-scan patch to
this framework.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] On partitioning

2014-12-17 Thread Robert Haas
On Wed, Dec 17, 2014 at 1:53 PM, Josh Berkus  wrote:
> On 12/16/2014 07:35 PM, Robert Haas wrote:
>> On Tue, Dec 16, 2014 at 9:01 PM, Josh Berkus  wrote:
>>> Anyway, what I'm saying is that I personally regard the inability to
>>> handle even moderately complex expressions a major failing of our
>>> existing partitioning scheme (possibly its worst single failing), and I
>>> would regard any new partitioning feature which didn't address that
>>> issue as suspect.
>>
>> I understand, but I think you need to be careful not to stonewall all
>> progress in the name of getting what you want.  Getting the
>> partitioning metadata into the system catalogs in a suitable format
>> will be a huge step forward regardless of whether it solves this
>> particular problem right away or not, because it will make it possible
>> to solve this problem in a highly-efficient way, which is quite hard
>> to do right now.
>
> Sure.  But there's a big difference between "we're going to take these
> steps and that problem will be fixable eventually" and "we're going to
> retain features of the current partitioning system which make that
> problem impossible to fix."  The drift of discussion on this thread
> *sounded* like the latter, and I've been calling attention to the issue
> in an effort to make sure that it's not.
>
> Last week, I wrote a longish email listing out the common problems users
> have with our current partitioning as a way of benchmarking the plan for
> new partitioning.  While some people responded to that post, absolutely
> nobody discussed the list of issues I gave.  Is that because there's
> universal agreement that I got the major issues right?  Seems doubtful.

I agreed with many of the things you listed but not all of them.
However, I don't think it's realistic to burden whatever patch Amit
writes with the duty of, for example, making global indexes work.
That's a huge problem all of its own.  Now, conceivably, we could try
to solve that as part of the next patch by insisting that the
"partitions" have to really be block number ranges within a single
relfilenode rather than separate relfilenodes as they are today ...
but I think that's a bad design which we would likely regret bitterly.
I also think that it would likely make what's being talked about here
so complicated that it will never go anywhere.  I think it's better
that we focus on solving one problem really well - storing metadata
for partition boundaries in the catalog so that we can do efficient
tuple routing and partition pruning - and leave the other problems for
later.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] On partitioning

2014-12-17 Thread Josh Berkus
On 12/17/2014 11:19 AM, Heikki Linnakangas wrote:
> On 12/17/2014 08:53 PM, Josh Berkus wrote:
>> Last week, I wrote a longish email listing out the common problems users
>> have with our current partitioning as a way of benchmarking the plan for
>> new partitioning.  While some people responded to that post, absolutely
>> nobody discussed the list of issues I gave.  Is that because there's
>> universal agreement that I got the major issues right?  Seems doubtful.
> 
> That was a good list.

;-)

Ok, that made my morning.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] On partitioning

2014-12-17 Thread Heikki Linnakangas

On 12/17/2014 08:53 PM, Josh Berkus wrote:

Last week, I wrote a longish email listing out the common problems users
have with our current partitioning as a way of benchmarking the plan for
new partitioning.  While some people responded to that post, absolutely
nobody discussed the list of issues I gave.  Is that because there's
universal agreement that I got the major issues right?  Seems doubtful.


That was a good list.

- Heikki





Re: [HACKERS] moving from contrib to bin

2014-12-17 Thread Alvaro Herrera
I know this is how it currently works, but it looks way too messy to me:

+   my $pgarchivecleanup = AddSimpleFrontend('pg_archivecleanup');
+   my $pgstandby = AddSimpleFrontend('pg_standby');
+   my $pgtestfsync = AddSimpleFrontend('pg_test_fsync');
+   my $pgtesttiming = AddSimpleFrontend('pg_test_timing');
+   my $pgbench = AddSimpleFrontend('pgbench', 1);

ISTM we should do something like

for each $elem in src/bin/Makefile:$(SUBDIRS)
AddSimpleFrontend($elem)

and avoid having to list the modules one by one.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] TABLESAMPLE patch

2014-12-17 Thread Alvaro Herrera
I noticed that this makes REPEATABLE a reserved keyword, which is
currently an unreserved one.  Can we avoid that?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] btree_gin and ranges

2014-12-17 Thread Heikki Linnakangas

On 10/22/2014 01:55 PM, Teodor Sigaev wrote:

The suggested patch adds GIN support for the <@ (contained-by-range)
operator on a scalar column.

It allows a more effective GIN scan. Currently, queries like
SELECT * FROM test_int4 WHERE i <= 1 AND i >= -1
will be executed by GIN with two scans: one from minus infinity to 1 and
another from -1 to plus infinity. That's because GIN is "generalized" and it
doesn't know the semantics of the operation.

With the patch it's possible to rewrite the query with ranges
SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
and the GIN index will support this query with a single scan from -1 to 1.

The patch provides index support only for the existing range types: int4,
int8, numeric, date, and timestamp with and without time zone.


I started to look at this, but very quickly got carried away into 
refactoring away the macros. Defining long functions as macros makes 
debugging quite difficult, and it's not nice to read or edit the source 
code either.


I propose the attached refactoring (it doesn't include your range-patch 
yet, just refactoring the existing code). It turns the bulk of the 
macros into static functions. GIN_SUPPORT macro still defines the 
datatype-specific functions, but they are now very thin wrappers that 
just call the corresponding generic static functions.


It's annoying that we need the concept of a leftmost value, for the < 
and <= queries. Without that, we could have the support functions look 
up the required datatype information from the type cache, based on the 
datatype of the passed argument. Then you could easily use the btree_gin 
support functions also for user-defined datatypes. But that needs bigger 
changes, and this is a step in the right direction anyway.
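
(For reference, the per-type leftmost-value callbacks are tiny; the int4 one
is essentially just the following, shown here only to illustrate what the
datatype-specific support has to supply:)

static Datum
leftmostvalue_int4(void)
{
	return Int32GetDatum(INT_MIN);
}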


- Heikki

>From fc96b3842e0695f8b4e39fc870ad0c3d628db834 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Wed, 17 Dec 2014 19:57:39 +0200
Subject: [PATCH] Turn much of the btree_gin macros into real functions.

This makes the functions much nicer to read and edit, and also makes
debugging easier.

diff --git a/contrib/btree_gin/btree_gin.c b/contrib/btree_gin/btree_gin.c
index 87d23e0..80521fb 100644
--- a/contrib/btree_gin/btree_gin.c
+++ b/contrib/btree_gin/btree_gin.c
@@ -17,34 +17,30 @@
 
 PG_MODULE_MAGIC;
 
-typedef struct TypeInfo
-{
-	bool		is_varlena;
-	Datum		(*leftmostvalue) (void);
-	Datum		(*typecmp) (FunctionCallInfo);
-} TypeInfo;
-
 typedef struct QueryInfo
 {
 	StrategyNumber strategy;
 	Datum		datum;
+	bool		is_varlena;
+	Datum		(*typecmp) (FunctionCallInfo);
 } QueryInfo;
 
-#define  GIN_EXTRACT_VALUE(type)			\
-PG_FUNCTION_INFO_V1(gin_extract_value_##type);\
-Datum		\
-gin_extract_value_##type(PG_FUNCTION_ARGS)	\
-{			\
-	Datum		datum = PG_GETARG_DATUM(0);	\
-	int32	   *nentries = (int32 *) PG_GETARG_POINTER(1);	\
-	Datum	   *entries = (Datum *) palloc(sizeof(Datum));	\
-			\
-	if ( TypeInfo_##type.is_varlena )		\
-		datum = PointerGetDatum(PG_DETOAST_DATUM(datum));	\
-	entries[0] = datum;		\
-	*nentries = 1;			\
-			\
-	PG_RETURN_POINTER(entries);\
+
+/*** GIN support functions shared by all datatypes ***/
+
+static Datum
+gin_btree_extract_value(FunctionCallInfo fcinfo, bool is_varlena)
+{
+	Datum		datum = PG_GETARG_DATUM(0);
+	int32	   *nentries = (int32 *) PG_GETARG_POINTER(1);
+	Datum	   *entries = (Datum *) palloc(sizeof(Datum));
+
+	if (is_varlena)
+		datum = PointerGetDatum(PG_DETOAST_DATUM(datum));
+	entries[0] = datum;
+	*nentries = 1;
+
+	PG_RETURN_POINTER(entries);
 }
 
 /*
@@ -55,49 +51,51 @@ gin_extract_value_##type(PG_FUNCTION_ARGS)	\
  * key, and work forward until the supplied query datum (which must be
  * sent along inside the QueryInfo structure).
  */
+static Datum
+gin_btree_extract_query(FunctionCallInfo fcinfo,
+		bool is_varlena,
+		Datum (*leftmostvalue) (void),
+		Datum (*typecmp) (FunctionCallInfo))
+{
+	Datum		datum = PG_GETARG_DATUM(0);
+	int32	   *nentries = (int32 *) PG_GETARG_POINTER(1);
+	StrategyNumber strategy = PG_GETARG_UINT16(2);
+	bool	  **partialmatch = (bool **) PG_GETARG_POINTER(3);
+	Pointer   **extra_data = (Pointer **) PG_GETARG_POINTER(4);
+	Datum	   *entries = (Datum *) palloc(sizeof(Datum));
+	QueryInfo  *data = (QueryInfo *) palloc(sizeof(QueryInfo));
+	bool	   *ptr_partialmatch;
+
+	*nentries = 1;
+	ptr_partialmatch = *partialmatch = (bool *) palloc(sizeof(bool));
+	*ptr_partialmatch = false;
+	if (is_varlena)
+		datum = PointerGetDatum(PG_DETOAST_DATUM(datum));
+	data->strategy = strategy;
+	data->datum = datum;
+	data->is_varlena = is_varlena;
+	data->typecmp = typecmp;
+	*extra_data = (Pointer *) palloc(sizeof(Pointer));
+	**extra_data = (Pointer) data;
+
+	switch (strategy)
+	{
+		case BTLessStrategyNumber:
+		case BTLessEqualStrategyNumber:
+			entries[0] = leftmostvalue();
+			*ptr_partialmatch = true;
+			break;
+		case BTGreaterEqualStrategyNumber:
+		case BTG

Re: [HACKERS] Combining Aggregates

2014-12-17 Thread David Rowley
On 18 December 2014 at 02:48, Simon Riggs  wrote:
>
> On 17 December 2014 at 12:35, Kouhei Kaigai  wrote:
>
> > Its concept is good to me. I think the new combined function should be
> > responsible for taking a state data type as argument and updating the
> > state object of the aggregate function. In other words, a combined
> > function performs like a transition function but can update the state
> > object according to the summary of multiple rows. Right?
>
> That wasn't how I defined it, but I think your definition is better.
>
> > It also needs some enhancement around the Aggref/AggrefExprState
> > structure to inform which function shall be called at execution time.
> > Combined functions are usually no-thank-you: AggrefExprState updates its
> > internal state using the transition function row-by-row. However, once
> > someone pushes down an aggregate function across table joins, combined
> > functions have to be called instead of transition functions.
> > I'd like to suggest Aggref have a new flag to indicate this aggregate
> > expects a state object instead of a scalar value.
> >
> > Also, I'd like to suggest one other flag in Aggref to not generate the
> > final result, and return the state object instead.
> >
> > Let me use David's example, but a little bit adjusted.
> >
> > original)
> > SELECT p.name, AVG(s.qty)
> >   FROM sales s INNER JOIN product p ON p.product_id = s.product_id
> >   GROUP BY p.name;
> >
> > modified)
> > SELECT p.name, AVG(qty)
> >   FROM (SELECT product_id, AVG(qty) AS qty FROM sales
> >         GROUP BY product_id) s
> >        INNER JOIN product p ON p.product_id = s.product_id
> >   GROUP BY p.name;
> >
> > Let's assume the modifier set a flag of use_combine_func on the AVG(qty)
> > of the main query, and also set a flag of not_generate_final on the
> > AVG(qty) of the sub-query.
> > It shall work as we expected.
>
> That matches my thinking exactly.
>
> David, if you can update your patch with some docs to explain the
> behaviour, it looks complete enough to think about committing it in
> early January, to allow other patches that depend upon it to stand a
> chance of getting into 9.5. (It is not yet ready, but I see it could
> be).
>
>
Yes I'll add something to it and post here.


> The above example is probably the best description of the need, since
> user defined aggregates must also understand this.
>
> Could we please call these "combine functions" or other? MERGE is an
> SQL Standard statement type that we will add later, so it will be
> confusing if we use the "merge" word in this context.
>
>
Yeah I think you're right, combine may help remove some confusion when we
get MERGE.


> David, your patch avoids creating any mergefuncs for existing
> aggregates. We would need to supply working examples for at least a
> few of the builtin aggregates, so we can test the infrastructure. We
> can add examples for all cases later.
>
>
I added merge/combine functions for all the aggregates I could do by making
use of existing functions. I did all the MAX() and MIN() ones and
bit_and(), bit_or(), and a few sum() ones that didn't have a final function.

It felt a bit premature to add new functions to support avg and stddev,
since that's probably the bulk of the work.


> Is there a way of testing this in existing code? Or do we need to add
> something to test it?
>
>
I can't think of any way to test it, apart from the CREATE AGGREGATE syntax
test I also added.

Standalone calls to the combine/merge functions I don't think would be
testing anything new.

That's the reason I thought it wasn't really acceptable until we have a use
for this. That's why I posted on the thread about parallel seqscan. I hoped
that Amit could add something which needed it and I could get it committed
that way.

Regards

David Rowley


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-17 Thread Andrew Dunstan


On 12/17/2014 01:02 PM, Tom Lane wrote:

Andrew Dunstan  writes:

On 12/17/2014 11:34 AM, Andrew Dunstan wrote:

Oh, darn, I thought we had a version check. Will fix.

OK, I have committed a fix. Revised script is at


Pulled into dromedary, thanks.


I have set crake to do runs that should clear the errors:
 cd root && for f in REL*; do touch $f/crake.force-one-run; done

Cool, was just wondering what was the easiest way to force that.
Maybe this should be documented on the buildfarm how-to page?




Done.

cheers

andrew




Re: [HACKERS] Commitfest problems

2014-12-17 Thread Stephen Frost
* Jaime Casanova (ja...@2ndquadrant.com) wrote:
> On Tue, Dec 16, 2014 at 11:32 AM, Robert Haas  wrote:
> >
> > It has been proposed that we do a general list of people at the bottom
> > of the release notes who helped review during that cycle.  That would
> > be less intrusive and possibly a good idea, but would we credit the
> > people who did a TON of reviewing?  Everyone who reviewed even one
> > patch?  Somewhere in between? Would committers be excluded because "we
> > just expect them to help" or included because credit is important to
> > established community members too?  To what extent would this be
> > duplicative of http://www.postgresql.org/community/contributors/ ?
> 
> Not much really. I tried to get my name on that list a couple of years
> ago, when I reviewed more than I do now, and never got it.
> And while my name is in a couple of commit messages, that is a lot harder
> to show to people...

Having your name in a list of other names at the bottom of the release
notes page, without any indication of what you helped with, would work
better?  Perhaps it would but I tend to doubt it.

If the release notes were expanded to include more than just a list of
reviewer names, and more along the lines of the way patch authors are
credited, that would be more valuable for the reviewers but I'm not
excited about such an increase in the size of the release notes.  To
play it out a bit though, we could do something like:

* Add Magic PostgreSQL feature (Author: John Doe, Reviewed By: Jane Doe)

With the commit history now including that information, generally, this
would hopefully not require too much additional work for the release
notes authors.  As for the 'rules' about who is considered a reviewer
for these purposes, I'd suggest it just go off of whatever is mentioned
in the commit history but exclude committers (and perhaps major
contributors..?  they are already acknowledged on the website, after
all..).  That would help keep the lists of reviewers from getting too
large, hopefully.

I'm pretty sure this was brought up previously and shot down but perhaps
feelings have changed about it now, especially as the information is
generally in the commit history and there is already some amount of
"policy" regarding who gets what credit as it's more-or-less up to the
committers.  I definitely think that, in general, reviews need to be
more than "it compiles" to get that credit.

> you know, it's kind of frustrating when some not-yet customers ask for
> certified engineers, and there isn't any official (as in "from the
> community") certificate, so you need to prove you're a contributor by
> pointing at these random commit messages...

Another thought I had was to suggest we consider *everyone* to be a
contributor and implement a way to tie together the mailing list
archives with the commit history and perhaps the commitfest app and make
it searchable and indexed on some website.  eg:

contributors.postgresql.org/sfrost
  - Recent commits
  - Recent commit mentions
  - Recent emails to any list
  - Recent commitfest app activity
  - Recent wiki page updates
...

Ideally with a way for individuals to upload a photo, provide a company
link, etc, similar to what the existing Major Contributors have today.
Obviously, this is not a small task to develop and there is some risk of
abuse (which I expect the other folks on the infra team will point out
and likely tar and feather me for suggesting this at all..) but it might
be along the same lines as Bruce's PgLife..

Thanks,

Stephen




Re: [HACKERS] Combining Aggregates

2014-12-17 Thread David Rowley
On 18 December 2014 at 01:31, Simon Riggs  wrote:
>
> On 17 December 2014 at 10:20, David Rowley  wrote:
> > On 17 December 2014 at 22:53, Simon Riggs  wrote:
> >>
> >> KaiGai, David Rowley and myself have all made mention of various ways
> >> we could optimize aggregates.
> >>
> >> Following WIP patch adds an extra function called a "combining
> >> function", that is intended to allow the user to specify a
> >> semantically correct way of breaking down an aggregate into multiple
> >> steps.
> >>
> >> Gents, is this what you were thinking? If not...
> >>
> >
> > Very much so! You must have missed my patch.
> >
> >
> http://www.postgresql.org/message-id/CAApHDvrZG5Q9rNxU4WOga8AgvAwQ83bF83CFvMbOQcCg8vk=z...@mail.gmail.com
>
> Very strange that you should post an otherwise unrelated patch on
> someone else's thread AND not add the patch to the CommitFest.
>
> Stealth patch submission is a new one on me.
>
>
Apologies about that; it was a bad decision.

I had thought that it's a bit of a chicken and the egg problem... This is
the egg, we just need a chicken to come and lay it.

I had imagined that it would be weird to commit something that is dead code
and not all that testable until someone adds some other code to utilise it.

Regards

David Rowley


Re: [HACKERS] On partitioning

2014-12-17 Thread Josh Berkus
On 12/16/2014 07:35 PM, Robert Haas wrote:
> On Tue, Dec 16, 2014 at 9:01 PM, Josh Berkus  wrote:
>> Anyway, what I'm saying is that I personally regard the inability to
>> handle even moderately complex expressions a major failing of our
>> existing partitioning scheme (possibly its worst single failing), and I
>> would regard any new partitioning feature which didn't address that
>> issue as suspect.
> 
> I understand, but I think you need to be careful not to stonewall all
> progress in the name of getting what you want.  Getting the
> partitioning metadata into the system catalogs in a suitable format
> will be a huge step forward regardless of whether it solves this
> particular problem right away or not, because it will make it possible
> to solve this problem in a highly-efficient way, which is quite hard
> to do right now.

Sure.  But there's a big difference between "we're going to take these
steps and that problem will be fixable eventually" and "we're going to
retain features of the current partitioning system which make that
problem impossible to fix."  The drift of discussion on this thread
*sounded* like the latter, and I've been calling attention to the issue
in an effort to make sure that it's not.

Last week, I wrote a longish email listing out the common problems users
have with our current partitioning as a way of benchmarking the plan for
new partitioning.  While some people responded to that post, absolutely
nobody discussed the list of issues I gave.  Is that because there's
universal agreement that I got the major issues right?  Seems doubtful.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Log inability to lock pages during vacuum

2014-12-17 Thread Alvaro Herrera
Jim Nasby wrote:
> On 12/17/14, 10:20 AM, Heikki Linnakangas wrote:
> >>
> >>>* Change the new bit in the errdetail. "could not acquire cleanup lock"
> >>>sounds too much like an error to me. "skipped %u pinned pages" maybe?
> >>
> >>Seems reasonable.
> >
> >Well, that's not always true either; when freezing, it doesn't skip the 
> >pinned pages, it waits for them.
> 
> Oops. :(
> 
> >I.e. this just says how many pages were pinned, without saying what was done 
> >about them. That's not very meaningful to an average DBA, but that's true 
> >for many of the numbers printed in vacuum verbose.
> 
> In this case it'll mean people will be less likely to report that this is 
> happening, but maybe that's OK. At least if someone comes to us with a 
> problem we'll be able to get some info from them. I'll separately look into 
> the vacuum docs and see if we can do a better job explaining the verbose 
> output.
> 
> BTW, what is it about a dynamic message that makes it untranslatable? Doesn't 
> the translation happen down-stream, so that at most we'd just need two 
> translation messages? Or worst case we could have two separate elog calls, if 
> we wanted to go that route.

Since each part is a complete sentence, you can paste them together.
For example you could do something like

msg1 = _("Could not acquire cleanup lock.");
msg2 = _("Skipped %u pinned pages.");

ereport(INFO,
(errcode(ERRCODE_SOME_STUFF),
 errmsg_internal("%s\n%s", msg1, msg2)));

The use of errmsg_internal() is to avoid having it be translated again.

FWIW I think the vacuum summary INFO message is pretty terrible already.
I would support a patch that changed it to be separately translatable
units (as long as each unit is a complete sentence.)

-- 
 Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Log inability to lock pages during vacuum

2014-12-17 Thread Heikki Linnakangas

On 12/17/2014 08:02 PM, Jim Nasby wrote:

BTW, what is it about a dynamic message that makes it untranslatable?
Doesn't the translation happen down-stream, so that at most we'd just
need two translation messages?


I'm not sure what you mean by down-stream. There is some explanation on 
this in the localization guide in the manual:


http://www.postgresql.org/docs/devel/static/nls-programmer.html#NLS-GUIDELINES

printf("Files were %s.\n", flag ? "copied" : "removed");

The translator will get three strings to translate: "Files were %s.\n", 
"copied", and "removed".



Or worst case we could have two
separate elog calls, if we wanted to go that route.


Yeah, that's one option. I.e:

if (flag)
  printf("Files were copied.\n");
else
  printf("Files were removed.\n");

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres messages error

2014-12-17 Thread Tom Lane
Alvaro Herrera  writes:
> A quick grep reveals also
> initdb.po:msgid "%s: could not to allocate SIDs: error code %lu\n"

Good catch, I'll get that one too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres messages error

2014-12-17 Thread Alvaro Herrera
Tom Lane wrote:
> Adam Brightwell  writes:
> > Martin,
> > #: libpq/auth.c:1593
> >> #, fuzzy, c-format
> >> msgid "could not to look up local user ID %ld: %s"
> >> 
> >> It looks like there is an extra *to* there , so the string should be:
> >> 
> >> "could not look up local user ID %ld: %s"
> 
> > I think you are right.  FWIW, I have attached a patch that fixes it for
> > consideration if others concur.
> 
> I agree, that good English is not.  Will commit.

A quick grep reveals also

initdb.po:msgid "%s: could not to allocate SIDs: error code %lu\n"

-- 
 Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres messages error

2014-12-17 Thread Tom Lane
Adam Brightwell  writes:
> Martin,
> #: libpq/auth.c:1593
>> #, fuzzy, c-format
>> msgid "could not to look up local user ID %ld: %s"
>> 
>> It looks like there is an extra *to* there , so the string should be:
>> 
>> "could not look up local user ID %ld: %s"

> I think you are right.  FWIW, I have attached a patch that fixes it for
> consideration if others concur.

I agree, that good English is not.  Will commit.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Log inability to lock pages during vacuum

2014-12-17 Thread Jim Nasby

On 12/17/14, 10:20 AM, Heikki Linnakangas wrote:



* Change the new bit in the errdetail. "could not acquire cleanup lock"
sounds too much like an error to me. "skipped %u pinned pages" maybe?


Seems reasonable.


Well, that's not always true either; when freezing, it doesn't skip the pinned 
pages, it waits for them.


Oops. :(


I.e. this just says how many pages were pinned, without saying what was done 
about them. That's not very meaningful to an average DBA, but that's true for 
many of the numbers printed in vacuum verbose.


In this case it'll mean people will be less likely to report that this is 
happening, but maybe that's OK. At least if someone comes to us with a problem 
we'll be able to get some info from them. I'll separately look into the vacuum 
docs and see if we can do a better job explaining the verbose output.

BTW, what is it about a dynamic message that makes it untranslatable? Doesn't 
the translation happen down-stream, so that at most we'd just need two 
translation messages? Or worst case we could have two separate elog calls, if 
we wanted to go that route.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-17 Thread Tom Lane
Andrew Dunstan  writes:
> On 12/17/2014 11:34 AM, Andrew Dunstan wrote:
>> Oh, darn, I thought we had a version check. Will fix.

> OK, I have committed a fix. Revised script is at 
> 

Pulled into dromedary, thanks.

> I have set crake to do runs that should clear the errors:
> cd root && for f in REL*; do touch $f/crake.force-one-run; done

Cool, was just wondering what was the easiest way to force that.
Maybe this should be documented on the buildfarm how-to page?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


hash_create API changes (was Re: [HACKERS] speedup tidbitmap patch: hash BlockNumber)

2014-12-17 Thread Tom Lane
Teodor Sigaev  writes:
>> -hash_ctl.hash = oid_hash; /* a bit more efficient than tag_hash */
>> +hash_ctl.hash = tag_hash; /* a bit more efficient than tag_hash */
>> 
>> I think the comment may need removed here.

> Thank you, fixed

I looked at this patch.  It's not right at all here:

+if (hashp->hash == sizeof(int32) && info->hash == tag_hash)
+hashp->hash = int32_hash;
+else
+hashp->hash = info->hash;

hashp->hash is not defined at this point, and if it were defined it would
not be the size of the key, and replacing that with info->keysize wouldn't
be exactly the right thing either since info->keysize isn't necessarily
set (there is a default, though I suspect it's mostly unused).

I hacked up a solution to that and was about to commit when I had second
thoughts about the whole approach.  The thing that's worrying me about
this is that it depends on the assumption that the passed-in info->hash
pointer is identical to what dynahash.c thinks is the address of tag_hash.
I'm not convinced that that's necessarily true on all platforms; in
particular, I'm worried that a call from a loadable module (such as
plperl) might be passing the address of a trampoline or thunk function
that is a jump to tag_hash and not tag_hash itself.  I don't see this
happening on my Linux box but it might well happen on, say, Windows.

There are already some comparisons of the hash function pointer in
dynahash.c, so you might think this concern is moot, but if you look
closely they're just comparisons to the address of string_hash ---
which, in normal use-cases, would be supplied by dynahash.c itself.
So I don't think the existing coding proves this will work.

Now, we could do it like this anyway, and just ignore the fact that we
might not get the optimization on every platform for hash tables created
by loadable modules.  However, there's another way we could attack this,
which is to invent a new hash option flag bit that says "pick a suitable
hash function for me, assuming that all bits of the specified key size are
significant".  So instead of

ctl.keysize = sizeof(...);
ctl.entrysize = sizeof(...);
ctl.hash = tag_hash;
tab = hash_create("...", ..., &ctl,
  HASH_ELEM | HASH_FUNCTION);

you'd write

ctl.keysize = sizeof(...);
ctl.entrysize = sizeof(...);
tab = hash_create("...", ..., &ctl,
  HASH_ELEM | HASH_BLOBS);

which would save some code space and is arguably cleaner than the
current approach of specifying some support functions and not others.

This would be more invasive than the current patch, since we'd have
to run around and touch code that currently mentions tag_hash as
well as the places that currently mention oid_hash.  But the patch
is already pretty invasive just doing the latter.

Thoughts?  Anyone have a decent suggestion for what to call the
flag bit?  I'm not enamored of "HASH_BLOBS", it's just the first
thing that came to mind.

Also, depending on how much we want to annoy third-party authors,
we could consider making a clean break from the Old Way here and say
that callers must specify all or none of the hash support functions,
thus letting us get rid of the very ad-hoc logic in hash_create()
that fills in defaults for some functions based on what you said
for others.  So the rule would be something like:

No flag mentioned: use functions suitable for null-terminated strings
HASH_BLOBS: use functions suitable for arbitrary fixed-size tags
HASH_FUNCTIONS: caller must supply all three support functions

and we'd remove the existing flag bits HASH_FUNCTION, HASH_COMPARE,
HASH_KEYCOPY.  But this would just be in the line of making the API
simpler/more logical, it wouldn't buy us performance as such.  I'm
not sure whether it's a good idea to go this far.

Comments?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres messages error

2014-12-17 Thread Adam Brightwell
Martin,

#: libpq/auth.c:1593
> #, fuzzy, c-format
> msgid "could not to look up local user ID %ld: %s"
>
> It looks like there is an extra *to* there , so the string should be:
>
> "could not look up local user ID %ld: %s"
>

I think you are right.  FWIW, I have attached a patch that fixes it for
consideration if others concur.

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
new file mode 100644
index 9ad99ce..2b2dbb3
*** a/src/backend/libpq/auth.c
--- b/src/backend/libpq/auth.c
*** auth_peer(hbaPort *port)
*** 1590,1596 
  	if (!pw)
  	{
  		ereport(LOG,
! (errmsg("could not to look up local user ID %ld: %s",
  		   (long) uid, errno ? strerror(errno) : _("user does not exist"))));
  		return STATUS_ERROR;
  	}
--- 1590,1596 
  	if (!pw)
  	{
  		ereport(LOG,
! (errmsg("could not look up local user ID %ld: %s",
  		   (long) uid, errno ? strerror(errno) : _("user does not exist"))));
  		return STATUS_ERROR;
  	}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-17 Thread Andrew Dunstan


On 12/17/2014 11:34 AM, Andrew Dunstan wrote:


On 12/17/2014 10:23 AM, Tom Lane wrote:

Andrew Dunstan  writes:

On 12/16/2014 04:44 PM, Tom Lane wrote:
I've put this in dromedary as well (though the HEAD build that's running
right this moment is still using the 4.13 script).  I take it I don't need
to adjust the configuration file?

Nope, no config changes required.

As seems blindingly obvious in hindsight, both crake and dromedary are
now red in every branch but HEAD.





Oh, darn, I thought we had a version check. Will fix.





OK, I have committed a fix. Revised script is at 



I have set crake to do runs that should clear the errors:

   cd root && for f in REL*; do touch $f/crake.force-one-run; done

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP PRIVILEGES OWNED BY

2014-12-17 Thread Marko Tiikkaja

On 12/17/14 5:37 PM, Heikki Linnakangas wrote:

On 12/15/2014 02:43 AM, Marko Tiikkaja wrote:

The syntax is:

 DROP PRIVILEGES OWNED BY role [, ...]


DROP seems like the wrong verb here. DROP is used for deleting objects,
while REVOKE is used for removing permissions from them. REVOKE already
has something similar:

REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA public FROM heikki;

Following that style, how about making the syntax:

REVOKE ALL PRIVILEGES FROM <role>;


I don't have a problem with that.  It would probably work, too, since 
FROM is already fully reserved.



.marko


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP PRIVILEGES OWNED BY

2014-12-17 Thread Heikki Linnakangas

On 12/15/2014 02:43 AM, Marko Tiikkaja wrote:

This week I had a problem where I wanted to drop only the privileges a
certain role had in the system, while keeping all the objects.  I
couldn't figure out a reasonable way to do that, so I've attached a
patch for this to this email.  Please consider it for inclusion into
9.5.  The syntax is:

DROP PRIVILEGES OWNED BY role [, ...]

I at some point decided to implement it as a new command instead of
changing DropOwnedStmt, and I think that might have been a mistake.  It
might have made more sense to instead teach DROP OWNED to accept a
specification of which things to drop.  But the proposal is more
important than such details, I think.


DROP seems like the wrong verb here. DROP is used for deleting objects, 
while REVOKE is used for removing permissions from them. REVOKE already 
has something similar:


REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA public FROM heikki;

Following that style, how about making the syntax:

REVOKE ALL PRIVILEGES ON ALL OBJECTS FROM <role>

or just:

REVOKE ALL PRIVILEGES FROM <role>;

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-17 Thread Andrew Dunstan


On 12/17/2014 10:23 AM, Tom Lane wrote:

Andrew Dunstan  writes:

On 12/16/2014 04:44 PM, Tom Lane wrote:

I've put this in dromedary as well (though the HEAD build that's running
right this moment is still using the 4.13 script).  I take it I don't need
to adjust the configuration file?

Nope, no config changes required.

As seems blindingly obvious in hindsight, both crake and dromedary are
now red in every branch but HEAD.





Oh, darn, I thought we had a version check. Will fix.

cheers

andrew






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Log inability to lock pages during vacuum

2014-12-17 Thread Heikki Linnakangas

On 12/01/2014 08:55 PM, Jim Nasby wrote:

On 12/1/14, 11:57 AM, Andres Freund wrote:

On 2014-11-30 20:46:51 -0600, Jim Nasby wrote:

On 11/10/14, 7:52 PM, Tom Lane wrote:

On the whole, I'm +1 for just logging the events and seeing what we learn
that way.  That seems like an appropriate amount of effort for finding out
whether there is really an issue.


Attached is a patch that does this.


Unless somebody protests I plan to push this soon. I'll change two
things:

* Always use the same message, independent of scan_all. For one Jim's
version would be untranslatable, for another it's not actually
correct. In most cases we'll *not* wait, even if scan_all is
true as we'll often just balk after !lazy_check_needs_freeze().


Good point.


* Change the new bit in the errdetail. "could not acquire cleanup lock"
sounds too much like an error to me. "skipped %u pinned pages" maybe?


Seems reasonable.


Well, that's not always true either; when freezing, it doesn't skip the 
pinned pages, it waits for them.


How about the attached (I also edited the comments a bit)? It looks like 
this:


postgres=# vacuum verbose foo;
INFO:  vacuuming "public.foo"
INFO:  "foo": found 0 removable, 0 nonremovable row versions in 0 out of 
7256 pages

DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
1 pages were pinned.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "foo": stopping truncate due to conflicting lock request
INFO:  vacuuming "pg_toast.pg_toast_16384"
INFO:  index "pg_toast_16384_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

and for autovacuum log:

LOG:  automatic vacuum of table "postgres.public.foo": index scans: 0
pages: 0 removed, 7256 remain, 0 pinned
tuples: 79415 removed, 513156 remain, 0 are dead but not yet removable
buffer usage: 14532 hits, 6 misses, 6241 dirtied
avg read rate: 0.003 MB/s, avg write rate: 3.413 MB/s
system usage: CPU 0.00s/0.30u sec elapsed 14.28 sec

I.e. this just says how many pages were pinned, without saying what was 
done about them. That's not very meaningful to an average DBA, but 
that's true for many of the numbers printed in vacuum verbose.


- Heikki
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 450c94f..5755753 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -252,6 +252,7 @@ DETAIL:  CPU 0.01s/0.06u sec elapsed 0.07 sec.
 INFO:  "onek": found 3000 removable, 1000 nonremovable tuples in 143 pages
 DETAIL:  0 dead tuples cannot be removed yet.
 There were 0 unused item pointers.
+0 pages were pinned.
 0 pages are entirely empty.
 CPU 0.07s/0.39u sec elapsed 1.56 sec.
 INFO:  analyzing "public.onek"
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 6db6c5c..af36486 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -105,6 +105,7 @@ typedef struct LVRelStats
 	BlockNumber old_rel_pages;	/* previous value of pg_class.relpages */
 	BlockNumber rel_pages;		/* total number of pages */
 	BlockNumber scanned_pages;	/* number of pages we examined */
+	BlockNumber	pinned_pages;	/* # of pages we could not initially lock */
 	double		scanned_tuples; /* counts only tuples on scanned pages */
 	double		old_rel_tuples; /* previous value of pg_class.reltuples */
 	double		new_rel_tuples; /* new estimated total # of tuples */
@@ -345,7 +346,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 			}
 			ereport(LOG,
 	(errmsg("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"
-			"pages: %d removed, %d remain\n"
+			"pages: %d removed, %d remain, %d pinned\n"
 			"tuples: %.0f removed, %.0f remain, %.0f are dead but not yet removable\n"
 			"buffer usage: %d hits, %d misses, %d dirtied\n"
 	  "avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"
@@ -356,6 +357,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 			vacrelstats->num_index_scans,
 			vacrelstats->pages_removed,
 			vacrelstats->rel_pages,
+			vacrelstats->pinned_pages,
 			vacrelstats->tuples_deleted,
 			vacrelstats->new_rel_tuples,
 			vacrelstats->new_dead_tuples,
@@ -611,6 +613,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 		/* We need buffer cleanup lock so that we can prune HOT chains. */
 		if (!ConditionalLockBufferForCleanup(buf))
 		{
+			vacrelstats->pinned_pages++;
+
 			/*
 			 * If we're not scanning the whole relation to guard against XID
 			 * wraparound, it's OK to skip vacuuming a page.  The next vacuum
@@ -1101,10 +1105,12 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 	vacrelstats->scanned_pages, nblocks),
 			 errdetail("%.0f dead row versions cannot be removed yet.\n"
 	   "There were %.0f unused i

Re: [HACKERS] speedup tidbitmap patch: cache page

2014-12-17 Thread Teodor Sigaev

You could well be right, but it would be good to compare the numbers just so we
know this for sure.

I wasn't right :(

# set work_mem='64kB';
# set enable_seqscan  = off;
Patched: 1194.094 ms
Master:  1765.338 ms

> Are you seeing the same?
Fixed too; the mistake was the assumption that the current page becomes lossy in 
tbm_lossify(). That's not always true, because tbm_lossify() may lossify only part 
of the pages, and we don't know which ones. Thank you very much.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


tbm_cachepage-2.2.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WALWriter active during recovery

2014-12-17 Thread didier
Hi

On Wed, Dec 17, 2014 at 2:39 PM, Alvaro Herrera
 wrote:
> didier wrote:
>
>> On many Linux systems it may not do that much (2.6.32 and 3.2 are bad,
>> 3.13 is better but still it slows the fsync).
>>
>> If there's a fsync in progress WALReceiver will:
>> 1- slow the fsync because its writes to the same file are grabbed by the 
>> fsync
>> 2- stall until the end of fsync.
>
> Is this behavior filesystem-dependent?
I don't know. I only tested ext4.

Attached is the trivial code I used; there's a lot of junk in it.

Didier
/*
* Compile with: gcc testf.c -Wall -W -O0 
*/
 
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <time.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/file.h>
 
static long long microseconds(void) {
   struct timeval tv;
   long long mst;
 
   gettimeofday(&tv, NULL);
   mst = ((long long)tv.tv_sec)*1000000;
   mst += tv.tv_usec;
   return mst;
}
int out= 0;
//#define FLOCK(a,b) flock(a,b)
#define FLOCK(a,b) (0)
 
//==
// fsync  
void child(void) {
  int fd, retval;
  long long start;
 
  while(1) {
 fd = open("/tmp/foo.txt",O_RDONLY);
 //usleep(3000);
 usleep(500);
 FLOCK(fd, LOCK_EX);
 if (out) {
   printf("Start sync\n");
   fflush(stdout);
   start = microseconds();
 }
 retval = fsync(fd);
 FLOCK(fd, LOCK_UN);
 if (out) {
   printf("Sync in %lld microseconds (%d)\n", microseconds()-start,retval);
   fflush(stdout);
 }  
 close(fd);
   }
   exit(0);
}

char buf[8*1024];
#define f_size (2lu*1024*1024*1024)

//==
// read
void child2(void) {
   int fd, retval;
   long long start;
   off_t lfsr;

   fd = open("/tmp/foo.txt",O_RDWR /*|O_CREAT | O_SYNC*/,0644);
   srandom(2000 +time(NULL));
   while(1) {
  if (out) {
start = microseconds();
  }
  lfsr = random()/sizeof(buf);
  if (pread (fd, buf, sizeof(buf), sizeof(buf)*lfsr) == -1) {
 perror("read");
 exit(1);
  }
  // posix_fadvise(fd, sizeof(buf)*lfsr, sizeof(buf), POSIX_FADV_DONTNEED);
  if (out) {
printf("read %lu in %lld microseconds\n", lfsr *sizeof(buf), microseconds()-start);
fflush(stdout);
  }
  usleep(500);
   }
   close(fd);
   exit(0);
}

//==
void child3(int end) {
   int fd, retval;
   long long start;
   off_t lfsr;
   int i;
   int j = 2;

   fd = open("/tmp/foo.txt",O_RDWR /*|O_CREAT | O_SYNC*/,0644);
   for (i = 0; i < 131072/j; i++) {
  int u;
  lseek(fd, sizeof(buf)*(i*j), SEEK_SET);
  write(fd, buf , sizeof(buf));  
}
   close(fd);
   if (end)
 exit(0);
   sleep(60);
}

 
int main(void) {
   int fd0 = open("/tmp/foo.txt",O_RDWR |O_CREAT /*| O_SYNC*/,0644);
   int fd1 = open("/tmp/foo1.txt",O_RDWR |O_CREAT /*| O_SYNC*/,0644);
   
   int fd;
   long long start;
   long long end = 0;
   off_t lfsr = 0;
   memset(buf, 'a', sizeof(buf));
   ftruncate(fd0, f_size);
   ftruncate(fd1, f_size);
   printf ("%d\n",RAND_MAX);
//   child3(0);
   
   if (!fork()) {
 child();
 exit(1);
   }
   
#if 0
   if (!fork()) {
 child2();
 exit(1);
   }
   if (!fork()) {
 child3(1);
 exit(1);
   }
#endif   
   srandom(1000+time(NULL));
   while(1) {
  fd = fd0;
  if (FLOCK(fd, LOCK_EX| LOCK_NB) == -1) {
 if (errno == EWOULDBLOCK)
fd =fd1;
  }
  lfsr = random()/sizeof(buf);
  if (out) {
 start = microseconds();
  }
//  if (pwrite(fd ,buf ,sizeof(buf), sizeof(buf)*lfsr) == -1) {
  lseek(fd, sizeof(buf)*lfsr, SEEK_SET);
  if (write(fd,buf,sizeof(buf)) == -1) {
 perror("write");
 exit(1);
  }
  if (out) {
printf("Write %lu in %lld microseconds\n", lfsr *sizeof(buf), microseconds()-start);
fflush(stdout);
  }
  if (fd == fd0) { 
 FLOCK(fd, LOCK_UN);
  }
  usleep(2);
   }
   close(fd);
   exit(0);
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REVIEW: Track TRUNCATE via pgstat

2014-12-17 Thread Alex Shulgin

Alvaro Herrera  writes:

> Alex Shulgin wrote:
>
>> OK, I think I have now all bases covered, though the updated patch is
>> not that "pretty".
>> 
>> The problem is that we don't know in advance if the (sub)transaction is
>> going to succeed or abort, and in case of aborted truncate we need to
>> use the stats gathered prior to truncate.  Thus the need to track
>> insert/update/deletes that happened before first truncate separately.
>
> Ugh, this is messy indeed.  I grant that TRUNCATE is a tricky case to
> handle correctly, so some complexity is expected.  Can you please
> explain in detail how this works?

The main idea is that aborted transaction can leave dead tuples behind
(that is every insert and update), but when TRUNCATE is issued we need
to reset insert/update/delete counters to 0: otherwise we won't get
accurate live and dead counts at commit time.

If the transaction that issued TRUNCATE is instead aborted, the
insert/update counters that we were incrementing *after* truncate are
not relevant to accurate calculation of dead tuples in the original
relfilenode we are now back to due to abort.  We need the insert/update
counts that happened *before* the first TRUNCATE, hence the need for
separate counters.
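
To make that concrete, here is a small standalone model of the bookkeeping
(illustrative only -- the field and function names mirror the patch, but this
is not the patch code itself):

#include <stdint.h>
#include <stdbool.h>

typedef struct XactTabStats
{
    int64_t  inserted, updated, deleted;                   /* counters for this xact */
    int64_t  ins_pre_trunc, upd_pre_trunc, del_pre_trunc;  /* saved at first TRUNCATE */
    bool     truncated;
} XactTabStats;

/* At TRUNCATE: remember the pre-truncate counts, then start counting from zero. */
static void
count_truncate(XactTabStats *t)
{
    if (!t->truncated)
    {
        t->ins_pre_trunc = t->inserted;
        t->upd_pre_trunc = t->updated;
        t->del_pre_trunc = t->deleted;
        t->truncated = true;
    }
    t->inserted = t->updated = t->deleted = 0;
}

/* At abort: only the changes made before the first TRUNCATE still describe
 * tuples that exist in the surviving relfilenode. */
static void
restore_pre_truncate(XactTabStats *t)
{
    if (t->truncated)
    {
        t->inserted = t->ins_pre_trunc;
        t->updated  = t->upd_pre_trunc;
        t->deleted  = t->del_pre_trunc;
    }
}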

>> To the point of making a dedicated pgstat testing tool: let's have
>> another TODO item?
>
> Sure.

Added one.

--
Alex


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] explain sortorder

2014-12-17 Thread Tom Lane
Heikki Linnakangas  writes:
> I would suggest just adding the information to the Sort Key line. As 
> long as you don't print the modifiers when they are defaults (ASC and 
> NULLS LAST), we could print the information even in non-VERBOSE mode.

+1.  I had assumed without looking that that was what it did already,
else I'd have complained too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] explain sortorder

2014-12-17 Thread Heikki Linnakangas

On 12/15/2014 06:49 PM, Mike Blackwell wrote:

QUERY PLAN

  Sort
Output: n1, n2, ((n1)::character(1))
Sort Key: sortordertest.n1, sortordertest.n2
Sort Order:  ASC NULLS LAST,  ASC NULLS LAST
->  Seq Scan on public.sortordertest
  Output: n1, n2, n1
(6 rows)


I don't like this output. If there are a lot of sort keys, it gets 
difficult to match the right ASC/DESC element to the sort key it applies 
to. (Also, there seems to be double-spaces in the list)


I would suggest just adding the information to the Sort Key line. As 
long as you don't print the modifiers when they are defaults (ASC and 
NULLS LAST), we could print the information even in non-VERBOSE mode. So 
it would look something like:


Sort Key: sortordertest.n1 NULLS FIRST, sortordertest.n2 DESC

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] POLA violation with \c service=

2014-12-17 Thread David Fetter
On Wed, Dec 17, 2014 at 08:14:04AM -0500, Andrew Dunstan wrote:
> 
> On 12/17/2014 04:11 AM, Heikki Linnakangas wrote:
> >On 12/17/2014 10:03 AM, Albe Laurenz wrote:
> >>David Fetter wrote:
> >>>I've noticed that psql's  \c function handles service= requests in a
> >>>way that I can only characterize as broken.
> >>>
> >>>This came up in the context of connecting to a cloud hosting service
> >>>named after warriors or a river or something, whose default hostnames
> >>>are long, confusing, and easy to typo, so I suspect that service= may
> >>>come up more often going forward than it has until now.
> >>>
> >>>For example, when I try to use
> >>>
> >>>\c "service=foo"
> >>>
> >>>It will correctly figure out which database I'm trying to connect to,
> >>>but fail to notice that it's on a different host, port, etc., and
> >>>hence fail to connect with a somewhat unhelpful error message.
> >>>
> >>>I can think of a few approaches for fixing this:
> >>>
> >>>0.  Leave it broken.
> >>>1.  Disable "service=" requests entirely in \c context, and error out
> >>>if attempted.
> >>>2.  Ensure that \c actually uses all of the available information.
> >>>
> >>>Is there another one I missed?
> >>>
> >>>If not, which of the approaches seems reasonable?
> >>
> >>#2 is the correct solution, #1 a band aid.
> >
> >It would be handy, if \c "service=foo" actually worked. We should do #3.
> >If the database name is actually a connection string, or a service
> >specification, it should not re-use the hostname and port from previous
> >connection, but use the values from the connection string or service file.
> 
> 
> Yeah, that's the correct solution. It should not be terribly difficult to
> create a test for a conninfo string in the dbname parameter. That's what
> libpq does after all. We certainly don't want psql to have to try to
> interpret the service file. psql just needs to let libpq do its work in this
> situation.

letting libpq handle this is the only sane plan for fixing it.  I'm
looking into that today.
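
For reference, the kind of test Andrew describes is roughly the following
sketch (a hypothetical helper name, not actual libpq or psql code; libpq
treats the dbname as a connection string when it contains "=" or starts with
a postgresql:// or postgres:// URI prefix):

#include <stdbool.h>
#include <string.h>

/* Return true if "dbname" is really a connection string or URI, in which
 * case host/port/user from the previous connection must not be re-used. */
static bool
looks_like_connstring(const char *dbname)
{
    if (dbname == NULL)
        return false;
    if (strncmp(dbname, "postgresql://", strlen("postgresql://")) == 0 ||
        strncmp(dbname, "postgres://", strlen("postgres://")) == 0)
        return true;                        /* connection URI */
    return strchr(dbname, '=') != NULL;     /* keyword=value list, e.g. "service=foo" */
}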

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-17 Thread Tom Lane
Andrew Dunstan  writes:
> On 12/16/2014 04:44 PM, Tom Lane wrote:
>> I've put this in dromedary as well (though the HEAD build that's running
>> right this moment is still using the 4.13 script).  I take it I don't need
>> to adjust the configuration file?

> Nope, no config changes required.

As seems blindingly obvious in hindsight, both crake and dromedary are
now red in every branch but HEAD.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] REVIEW: Track TRUNCATE via pgstat

2014-12-17 Thread Alvaro Herrera
Alex Shulgin wrote:

> OK, I think I have now all bases covered, though the updated patch is
> not that "pretty".
> 
> The problem is that we don't know in advance if the (sub)transaction is
> going to succeed or abort, and in case of aborted truncate we need to
> use the stats gathered prior to truncate.  Thus the need to track
> insert/update/deletes that happened before first truncate separately.

Ugh, this is messy indeed.  I grant that TRUNCATE is a tricky case to
handle correctly, so some complexity is expected.  Can you please
explain in detail how this works?

> To the point of making a dedicated pgstat testing tool: let's have
> another TODO item?

Sure.

-- 
 Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing lock strength of adding foreign keys

2014-12-17 Thread Simon Riggs
On 25 October 2014 19:00, Noah Misch  wrote:

>> So I tentatively propose (and with due regard for the possibility
>> others may see dangers that I've missed) that a reasonable goal would
>> be to lower the lock strength required for both CREATE TRIGGER and ADD
>> FOREIGN KEY from AccessExclusiveLock to ShareRowExclusiveLock,
>> allowing concurrent SELECT and SELECT FOR SHARE against the tables,
>> but not any actual write operations.
>
> +1

All of this is just a replay of the earlier conversations about
reducing lock levels.

My original patch did reduce lock levels for CREATE TRIGGER, but we
stepped back from doing that in 9.4 until we had feedback as to
whether the whole idea of reducing lock levels was safe, which Robert,
Tom and others were slightly uncertain about.

Is there anything different here from work in my original patch series?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-12-17 Thread Rahila Syed
Hello,

>Patches as well as results are attached.

I had a look at the code. I have a few minor points:

+   bkpb.fork_flags |= BKPBLOCK_HAS_IMAGE;
+
+   if (is_compressed)
{
-   rdt_datas_last->data = page;
-   rdt_datas_last->len = BLCKSZ;
+   /* compressed block information */
+   bimg.length = compress_len;
+   bimg.extra_data = hole_offset;
+   bimg.extra_data |= XLR_BLCK_COMPRESSED_MASK;

For consistency with the existing code, how about renaming the macro
XLR_BLCK_COMPRESSED_MASK to BKPBLOCK_HAS_COMPRESSED_IMAGE, along the lines of
BKPBLOCK_HAS_IMAGE?

+   blk->hole_offset = extra_data & ~XLR_BLCK_COMPRESSED_MASK;
Here, I think that having the mask as BKPBLOCK_HOLE_OFFSET_MASK would make it
clearer that the lower 15 bits of the extra_data field carry the hole_offset
value. This suggestion is also just to achieve consistency with the existing
BKPBLOCK_FORK_MASK for the fork_flags field.
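
For what it's worth, a small sketch of the bit layout being discussed (the
0x8000/0x7FFF values assume a 16-bit extra_data field and are illustrative;
the macro names are the ones proposed above, not existing ones):

#include <stdint.h>

#define BKPBLOCK_HAS_COMPRESSED_IMAGE  0x8000  /* proposed name for XLR_BLCK_COMPRESSED_MASK */
#define BKPBLOCK_HOLE_OFFSET_MASK      0x7FFF  /* lower 15 bits carry the hole offset */

static uint16_t
pack_extra_data(uint16_t hole_offset, int is_compressed)
{
    uint16_t extra = hole_offset & BKPBLOCK_HOLE_OFFSET_MASK;

    if (is_compressed)
        extra |= BKPBLOCK_HAS_COMPRESSED_IMAGE;
    return extra;
}

static uint16_t
extract_hole_offset(uint16_t extra_data)
{
    return extra_data & BKPBLOCK_HOLE_OFFSET_MASK;
}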

And a comment typo:
+* First try to compress block, filling in the page hole with zeros
+* to improve the compression of the whole. If the block is considered
+* as incompressible, complete the block header information as if
+* nothing happened.

As the hole is no longer being compressed, this needs to be changed.

Thank you,
Rahila Syed







On Tue, Dec 16, 2014 at 10:04 PM, Michael Paquier  wrote:
>
>
>
> On Wed, Dec 17, 2014 at 12:00 AM, Michael Paquier <
> michael.paqu...@gmail.com> wrote:
>>
>> Actually, the original length of the compressed block is saved in
>> PGLZ_Header, so if we are fine to not check the size of the block
>> decompressed when decoding WAL we can do without the hole filled with
>> zeros, and use only 1 bit to see if the block is compressed or not.
>>
> And.. After some more hacking, I have been able to come up with a patch
> that is able to compress blocks without the page hole, and that keeps the
> WAL record length the same as HEAD when compression switch is off. The
> numbers are pretty good, CPU is saved in the same proportions as previous
> patches when compression is enabled, and there is zero delta with HEAD when
> compression switch is off.
>
> Here are the actual numbers:
>test_name   | pg_size_pretty | user_diff | system_diff
> ---++---+-
>  FPW on + 2 bytes, ffactor 50  | 582 MB | 42.391894 |0.807444
>  FPW on + 2 bytes, ffactor 20  | 229 MB | 14.330304 |0.729626
>  FPW on + 2 bytes, ffactor 10  | 117 MB |  7.335442 |0.570996
>  FPW off + 2 bytes, ffactor 50 | 746 MB | 25.330391 |1.248503
>  FPW off + 2 bytes, ffactor 20 | 293 MB | 10.537475 |0.755448
>  FPW off + 2 bytes, ffactor 10 | 148 MB |  5.762775 |0.763761
>  FPW on + 0 bytes, ffactor 50  | 582 MB | 42.174297 |0.790596
>  FPW on + 0 bytes, ffactor 20  | 229 MB | 14.424233 |0.770459
>  FPW on + 0 bytes, ffactor 10  | 117 MB |  7.057195 |0.584806
>  FPW off + 0 bytes, ffactor 50 | 746 MB | 25.261998 |1.054516
>  FPW off + 0 bytes, ffactor 20 | 293 MB | 10.589888 |0.860207
>  FPW off + 0 bytes, ffactor 10 | 148 MB |  5.827191 |0.874285
>  HEAD, ffactor 50  | 746 MB | 25.181729 |1.133433
>  HEAD, ffactor 20  | 293 MB |  9.962242 |0.765970
>  HEAD, ffactor 10  | 148 MB |  5.693426 |0.775371
>  Record, ffactor 50| 582 MB | 54.904374 |0.678204
>  Record, ffactor 20| 229 MB | 19.798268 |0.807220
>  Record, ffactor 10| 116 MB |  9.401877 |0.668454
> (18 rows)
>
> The new tests of this patch are "FPW off + 0 bytes". Patches as well as
> results are attached.
> Regards,
> --
> Michael
>


Re: [HACKERS] REVIEW: Track TRUNCATE via pgstat

2014-12-17 Thread Alex Shulgin
Alvaro Herrera  writes:
>
>> Even with the current approach of checking the stats after every
>> isolated case it sometimes takes quite a bit more than a glance
>> to verify correctness due to side-effects of rollback (ins/upd/del
>> counters are still updated), and the way stats are affecting the dead
>> tuples counter.
>
> Honestly I think pg_regress is not the right tool to test stat counter
> updates.  It kind-of works today, but only because we don't stress it
> too much.  If you want to create a new test framework for pgstats, and
> include some tests for truncate, be my guest.

OK, I think I have now all bases covered, though the updated patch is
not that "pretty".

The problem is that we don't know in advance if the (sub)transaction is
going to succeed or abort, and in case of aborted truncate we need to
use the stats gathered prior to truncate.  Thus the need to track
insert/update/deletes that happened before first truncate separately.

To the point of making a dedicated pgstat testing tool: let's have
another TODO item?

--
Alex

>From 0b3161191a3ddb999cd9d0da08e1b6088ce07a84 Mon Sep 17 00:00:00 2001
From: Alex Shulgin 
Date: Tue, 9 Dec 2014 16:35:14 +0300
Subject: [PATCH] WIP: track TRUNCATEs in pgstat transaction stats.

The n_live_tup and n_dead_tup counters need to be set to 0 after a
TRUNCATE on the relation.  We can't issue a special message to the stats
collector because the xact might be later aborted, so we track the fact
that the relation was truncated during the xact (and reset this xact's
insert/update/delete counters).  When xact is committed, we use the
`truncated' flag to reset the n_live_tup and n_dead_tup counters.
---
 src/backend/commands/tablecmds.c |  3 +
 src/backend/postmaster/pgstat.c  | 98 ++--
 src/include/pgstat.h | 14 ++--
 src/test/regress/expected/prepared_xacts.out | 50 ++
 src/test/regress/expected/stats.out  | 63 ++
 src/test/regress/sql/prepared_xacts.sql  | 27 
 src/test/regress/sql/stats.sql   | 68 +++
 7 files changed, 315 insertions(+), 8 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
new file mode 100644
index 1e737a0..4f0e3d8
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 71,76 
--- 71,77 
  #include "parser/parse_type.h"
  #include "parser/parse_utilcmd.h"
  #include "parser/parser.h"
+ #include "pgstat.h"
  #include "rewrite/rewriteDefine.h"
  #include "rewrite/rewriteHandler.h"
  #include "rewrite/rewriteManip.h"
*** ExecuteTruncate(TruncateStmt *stmt)
*** 1224,1229 
--- 1225,1232 
  			 */
  			reindex_relation(heap_relid, REINDEX_REL_PROCESS_TOAST);
  		}
+ 
+ 		pgstat_count_heap_truncate(rel);
  	}
  
  	/*
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
new file mode 100644
index f71fdeb..9d19cf9
*** a/src/backend/postmaster/pgstat.c
--- b/src/backend/postmaster/pgstat.c
*** typedef struct TwoPhasePgStatRecord
*** 199,206 
--- 199,210 
  	PgStat_Counter tuples_inserted;		/* tuples inserted in xact */
  	PgStat_Counter tuples_updated;		/* tuples updated in xact */
  	PgStat_Counter tuples_deleted;		/* tuples deleted in xact */
+ 	PgStat_Counter inserted_pre_trunc;	/* tuples inserted prior to truncate */
+ 	PgStat_Counter updated_pre_trunc;	/* tuples updated prior to truncate */
+ 	PgStat_Counter deleted_pre_trunc;	/* tuples deleted prior to truncate */
  	Oid			t_id;			/* table's OID */
  	bool		t_shared;		/* is it a shared catalog? */
+ 	bool		t_truncated;	/* was the relation truncated? */
  } TwoPhasePgStatRecord;
  
  /*
*** pgstat_count_heap_delete(Relation rel)
*** 1863,1868 
--- 1867,1919 
  	}
  }
  
+ static void
+ record_xact_truncate_stats(PgStat_TableXactStatus *trans)
+ {
+ 	if (!trans->truncated)
+ 	{
+ 		trans->inserted_pre_trunc = trans->tuples_inserted;
+ 		trans->updated_pre_trunc = trans->tuples_updated;
+ 		trans->deleted_pre_trunc = trans->tuples_deleted;
+ 	}
+ }
+ 
+ static void
+ restore_xact_truncate_stats(PgStat_TableXactStatus *trans)
+ {
+ 	if (trans->truncated)
+ 	{
+ 		trans->tuples_inserted = trans->inserted_pre_trunc;
+ 		trans->tuples_updated = trans->updated_pre_trunc;
+ 		trans->tuples_deleted = trans->deleted_pre_trunc;
+ 	}
+ }
+ 
+ /*
+  * pgstat_count_heap_truncate - update tuple counters due to truncate
+  */
+ void
+ pgstat_count_heap_truncate(Relation rel)
+ {
+ 	PgStat_TableStatus *pgstat_info = rel->pgstat_info;
+ 
+ 	if (pgstat_info != NULL)
+ 	{
+ 		/* We have to log the effect at the proper transactional level */
+ 		int			nest_level = GetCurrentTransactionNestLevel();
+ 
+ 		if (pgstat_info->trans == NULL ||
+ 			pgstat_info->trans->nest_level != nest_level)
+ 			add_tabstat_xact_level(pgstat_info, nest_level);
+ 
+ 		record_xact_truncate_stats(pgstat_info->tr

Re: [HACKERS] parallel mode and parallel contexts

2014-12-17 Thread Simon Riggs
On 12 December 2014 at 22:52, Robert Haas  wrote:

> I would be remiss if I failed to mention that this patch includes work
> by my colleagues Amit Kapila, Rushabh Lathia, and Jeevan Chalke, as
> well as my former colleague Noah Misch; and that it would not have
> been possible without the patient support of EnterpriseDB management.

Noted and thanks to all.

I will make this my priority for review, but regrettably not until New
Year, about 2.5-3 weeks away.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Combining Aggregates

2014-12-17 Thread Atri Sharma
On Wed, Dec 17, 2014 at 7:18 PM, Simon Riggs  wrote:
>
> On 17 December 2014 at 12:35, Kouhei Kaigai  wrote:
>
> > Its concept is good to me. I think, the new combined function should be
> > responsible to take a state data type as argument and update state object
> > of the aggregate function. In other words, combined function performs
> like
> > transition function but can update state object according to the summary
> > of multiple rows. Right?
>
> That wasn't how I defined it, but I think your definition is better.
>
> > It also needs some enhancement around Aggref/AggrefExprState structure to
> > inform which function shall be called on execution time.
> > Combined functions are usually no-thank-you. AggrefExprState updates its
> > internal state using transition function row-by-row. However, once
> someone
> > push-down aggregate function across table joins, combined functions have
> > to be called instead of transition functions.
> > I'd like to suggest Aggref has a new flag to introduce this aggregate
> expects
> > state object instead of scalar value.
> >
> > Also, I'd like to suggest one other flag in Aggref not to generate final
> > result, and returns state object instead.
> >
> > Let me use David's example but little bit adjusted.
> >
> > original)
> > SELECT p.name, AVG(s.qty)
> >   FROM sales s INNER JOIN product p ON s.product_id = s.product_id
> >   GROUP BY p.name;
> >
> > modified)
> > SELECT p.name, AVG(qty)
> >   FROM (SELECT product_id, AVG(qty) AS qty FROM sales GROUP BY
> product_id) s
> >INNER JOIN product p
> >   ON p.product_id = s.product_id GROUP BY p_name;
> >
> > Let's assume the modifier set a flag of use_combine_func on the AVG(qty)
> of
> > the main query, and also set a flag of not_generate_final on the
> AVG(qty) of
> > the sub-query.
> > It shall work as we expected.
>
> That matches my thinking exactly.
>
> David, if you can update your patch with some docs to explain the
> behaviour, it looks complete enough to think about committing it in
> early January, to allow other patches that depend upon it to stand a
> chance of getting into 9.5. (It is not yet ready, but I see it could
> be).
>
> The above example is probably the best description of the need, since
> user defined aggregates must also understand this.
>
> Could we please call these "combine functions" or other? MERGE is an
> SQL Standard statement type that we will add later, so it will be
> confusing if we use the "merge" word in this context.
>
> David, your patch avoids creating any mergefuncs for existing
> aggregates. We would need to supply working examples for at least a
> few of the builtin aggregates, so we can test the infrastructure. We
> can add examples for all cases later.
>
>
>
I am still missing how and why we skip transfns. What am I missing, please?


Re: [HACKERS] GiST kNN search queue (Re: KNN-GiST with recheck)

2014-12-17 Thread Heikki Linnakangas

On 12/15/2014 03:14 PM, Andres Freund wrote:

If we add another heap implementation we probably should at least hint
at the different advantages somewhere.


How about adding a src/backend/lib/README for that, per attached?

- Heikki

diff --git a/src/backend/lib/Makefile b/src/backend/lib/Makefile
index 327a1bc..b24ece6 100644
--- a/src/backend/lib/Makefile
+++ b/src/backend/lib/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/lib
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-OBJS = ilist.o binaryheap.o stringinfo.o
+OBJS = ilist.o binaryheap.o pairingheap.o stringinfo.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/lib/README b/src/backend/lib/README
new file mode 100644
index 000..49cf99a
--- /dev/null
+++ b/src/backend/lib/README
@@ -0,0 +1,21 @@
+This directory contains a general purpose data structures, for use anywhere
+in the backend:
+
+binaryheap.c - a binary heap
+
+pairingheap.c - a pairing heap
+
+ilist.c - single and double-linked lists.
+
+stringinfo.c - an extensible string type
+
+
+Aside from the inherent characteristics of the data structures, there are a
+few practical differences between the binary heap and the pairing heap. The
+binary heap is fully allocated at creation, and cannot be expanded beyond the
+allocated size. The pairing heap on the other hand has no inherent maximum
+size, but the caller needs to allocate each element being stored in the heap,
+while the binary heap works with plain Datums or pointers.
+
+In addition to these, there is an implementation of a Red-Black tree in
+src/backend/utils/adt/rbtree.c.
diff --git a/src/backend/lib/pairingheap.c b/src/backend/lib/pairingheap.c
new file mode 100644
index 000..a7e8901
--- /dev/null
+++ b/src/backend/lib/pairingheap.c
@@ -0,0 +1,235 @@
+/*-
+ *
+ * pairingheap.c
+ *	  A Pairing Heap implementation
+ *
+ * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/lib/pairingheap.c
+ *
+ *-
+ */
+
+#include "postgres.h"
+
+#include "lib/pairingheap.h"
+
+static pairingheap_node *merge(pairingheap *heap, pairingheap_node *a, pairingheap_node *b);
+static pairingheap_node *merge_children(pairingheap *heap, pairingheap_node *children);
+
+/*
+ * pairingheap_allocate
+ *
+ * Returns a pointer to a newly-allocated heap, with the heap property
+ * defined by the given comparator function, which will be invoked with the
+ * additional argument specified by 'arg'.
+ */
+pairingheap *
+pairingheap_allocate(pairingheap_comparator compare, void *arg)
+{
+	pairingheap *heap;
+
+	heap = (pairingheap *) palloc(sizeof(pairingheap));
+	heap->ph_compare = compare;
+	heap->ph_arg = arg;
+
+	heap->ph_root = NULL;
+
+	return heap;
+}
+
+/*
+ * pairingheap_free
+ *
+ * Releases memory used by the given pairingheap.
+ *
+ * Note: The items in the heap are not released!
+ */
+void
+pairingheap_free(pairingheap *heap)
+{
+	pfree(heap);
+}
+
+
+/* A helper function to merge two subheaps into one. */
+static pairingheap_node *
+merge(pairingheap *heap, pairingheap_node *a, pairingheap_node *b)
+{
+	if (a == NULL)
+		return b;
+	if (b == NULL)
+		return a;
+
+	/* Put the larger of the items as a child of the smaller one */
+	if (heap->ph_compare(a, b, heap->ph_arg) < 0)
+	{
+		pairingheap_node *tmp;
+
+		tmp = a;
+		a = b;
+		b = tmp;
+	}
+
+	if (a->first_child)
+		a->first_child->prev_or_parent = b;
+	b->prev_or_parent = a;
+	b->next_sibling = a->first_child;
+	a->first_child = b;
+	return a;
+}
+
+/*
+ * pairingheap_add
+ *
+ * Adds the given datum to the heap in O(1) time.
+ */
+void
+pairingheap_add(pairingheap *heap, pairingheap_node *d)
+{
+	d->first_child = NULL;
+
+	/* Link the new item as a new tree */
+	heap->ph_root = merge(heap, heap->ph_root, d);
+}
+
+/*
+ * pairingheap_first
+ *
+ * Returns a pointer to the first (root, topmost) node in the heap
+ * without modifying the heap. The caller must ensure that this
+ * routine is not used on an empty heap. Always O(1).
+ */
+pairingheap_node *
+pairingheap_first(pairingheap *heap)
+{
+	Assert(!pairingheap_is_empty(heap));
+	return heap->ph_root;
+}
+
+/*
+ * pairingheap_remove_first
+ *
+ * Removes the first (root, topmost) node in the heap and returns a
+ * pointer to it after rebalancing the heap. The caller must ensure
+ * that this routine is not used on an empty heap. O(log n) amortized.
+ */
+pairingheap_node *
+pairingheap_remove_first(pairingheap *heap)
+{
+	pairingheap_node *result;
+	pairingheap_node *children;
+
+	Assert(!pairingheap_is_empty(heap));
+
+	/* Remove the smallest root. */
+	result = heap->ph_root;
+	children = result->first_child;
+
+	heap->ph_root = merge_children(heap, children);
+
+	return result;
+}
+
+/*
+ * Merge a list of subheaps into a single heap.
+ *
+ * This implements the basic two-pass merging str

Re: [HACKERS] Combining Aggregates

2014-12-17 Thread Simon Riggs
On 17 December 2014 at 12:35, Kouhei Kaigai  wrote:

> Its concept is good to me. I think, the new combined function should be
> responsible to take a state data type as argument and update state object
> of the aggregate function. In other words, combined function performs like
> transition function but can update state object according to the summary
> of multiple rows. Right?

That wasn't how I defined it, but I think your definition is better.

> It also needs some enhancement around Aggref/AggrefExprState structure to
> inform which function shall be called on execution time.
> Combined functions are usually no-thank-you. AggrefExprState updates its
> internal state using transition function row-by-row. However, once someone
> push-down aggregate function across table joins, combined functions have
> to be called instead of transition functions.
> I'd like to suggest Aggref has a new flag to introduce this aggregate expects
> state object instead of scalar value.
>
> Also, I'd like to suggest one other flag in Aggref not to generate final
> result, and returns state object instead.
>
> Let me use David's example but little bit adjusted.
>
> original)
> SELECT p.name, AVG(s.qty)
>   FROM sales s INNER JOIN product p ON s.product_id = s.product_id
>   GROUP BY p.name;
>
> modified)
> SELECT p.name, AVG(qty)
>   FROM (SELECT product_id, AVG(qty) AS qty FROM sales GROUP BY product_id) s
>INNER JOIN product p
>   ON p.product_id = s.product_id GROUP BY p_name;
>
> Let's assume the modifier set a flag of use_combine_func on the AVG(qty) of
> the main query, and also set a flag of not_generate_final on the AVG(qty) of
> the sub-query.
> It shall work as we expected.

That matches my thinking exactly.

David, if you can update your patch with some docs to explain the
behaviour, it looks complete enough to think about committing it in
early January, to allow other patches that depend upon it to stand a
chance of getting into 9.5. (It is not yet ready, but I see it could
be).

The above example is probably the best description of the need, since
user defined aggregates must also understand this.

Could we please call these "combine functions" or other? MERGE is an
SQL Standard statement type that we will add later, so it will be
confusing if we use the "merge" word in this context.

David, your patch avoids creating any mergefuncs for existing
aggregates. We would need to supply working examples for at least a
few of the builtin aggregates, so we can test the infrastructure. We
can add examples for all cases later.
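
For illustration only (this is not from David's patch), a combine function for
an aggregate whose transition state is a plain int8 -- e.g. sum(int4) -- could
be as simple as the following sketch; the function name is made up:

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(int8_combine_states);   /* only needed if built as a module */

/* Merge two partial int8 states by adding them together. */
Datum
int8_combine_states(PG_FUNCTION_ARGS)
{
    int64   state1 = PG_GETARG_INT64(0);    /* partial state from one group of rows */
    int64   state2 = PG_GETARG_INT64(1);    /* partial state from another group */

    PG_RETURN_INT64(state1 + state2);
}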

Is there a way of testing this in existing code? Or do we need to add
something to test it?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] exitArchiveRecovery woes

2014-12-17 Thread Heikki Linnakangas
At the end of archive recovery, we copy the last segment from the old 
timeline, to initialize the first segment on the new timeline. For 
example, if the timeline switch happens in the middle of WAL segment 
00010005, the whole 00010005 segment is 
copied to become 00020005. The copying is necessary, so 
that the new segment contains valid data up to the switch point.


However, we wouldn't really need to copy the whole segment, copying up 
to the switch point would be enough. In fact, copying the whole segment 
is a bad idea, because the copied WAL looks valid on the new timeline 
too. When we read the WAL at crash recovery, we rely on a number of 
things to determine if the next WAL record is valid. Most importantly, 
the checksum, and the prev-pointer. The checksum protects any random 
data from appearing valid, and the prev-pointer makes sure that a WAL 
record copied from another location in the WAL is not mistaken as valid. 
The prev-pointer is particularly important when we recycle old WAL 
segments as new, because the old segment contains valid WAL records with 
checksums and all. When we copy a WAL segment with the same segment 
number, the prev pointer doesn't protect us, as there can be WAL records 
at the exact same locations in both segments. There is a timeline ID on 
the page header, but we could still be mistaken within the page. Also, 
we are lenient with the TLI at start of WAL recovery, when we read the 
first WAL record after the checkpoint. There are further safeguards, 
like the fact that when writing WAL, we always write full blocks. But 
the write could still be torn at the OS or disk level, if you crash 
after writing the WAL, but before fsyncing it.


This is largely academic, but I was able to craft a test case where WAL 
recovery mistakenly starts to replay the WAL copied from the old 
timeline, as if it was on the new timeline. Attached is a shell script I 
used. It's very sensitive to the lengths of the WAL records, so probably 
only works on a similar platform as mine (x86_64 Linux). Running 
pitr-test.sh ends with this:


S LOG:  database system was interrupted; last known up at 2014-12-17 
15:15:42 EET
S LOG:  database system was not properly shut down; automatic recovery 
in progress

S LOG:  redo starts at 0/50A2018
S PANIC:  heap_insert_redo: invalid max offset number
S CONTEXT:  xlog redo Heap/INSERT: off 28
S LOG:  startup process (PID 10640) was terminated by signal 6: Aborted
S LOG:  aborting startup due to startup process failure

That PANIC happens because it tries to apply WAL from different 
timeline, and it doesn't work because it missed an earlier change to the 
same page it modifies. (If you were unlucky, you could get silent 
corruption instead, if the WAL record happens to apply without an error)


A simple way to avoid this is to copy the old WAL segment only up to the 
point of the timeline switch, and zero the rest.
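
To illustrate the approach, here is a minimal sketch (not the actual
XLogFileCopy() change; the helper name and the hard-coded 16MB segment size
are assumptions, and error handling is reduced to an early return where the
real code would ereport(ERROR)):

#include <string.h>
#include <unistd.h>

#define SEG_SIZE    (16 * 1024 * 1024)    /* assumes default XLOG segment size */

static void
copy_segment_prefix(int srcfd, int dstfd, int upto)
{
    char    buf[8192];
    int     off;

    for (off = 0; off < SEG_SIZE; off += (int) sizeof(buf))
    {
        int     toread = 0;

        if (off < upto)
        {
            toread = upto - off;
            if (toread > (int) sizeof(buf))
                toread = (int) sizeof(buf);
        }

        /* everything past the switch point is written as zeros */
        memset(buf, 0, sizeof(buf));
        if (toread > 0 && read(srcfd, buf, toread) != (ssize_t) toread)
            return;
        if (write(dstfd, buf, sizeof(buf)) != (ssize_t) sizeof(buf))
            return;
    }
}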



Another thing I noticed is that we copy the last old WAL segment on the 
new timeline, even if the timeline switch happens at a segment boundary. 
In that case, the copied WAL segment is 100% identical to the old 
segment; it contains no records belonging to the new timeline. I guess 
that's not wrong per se, but it seems pointless and confusing.


Attached is a patch that addresses both of those issues. This doesn't 
seem worth the risk to back-patch, but let's fix these in master.



PS. The "if (endTLI != ThisTimeLineID)" test in exitArchiveRecovery was 
always true, because we always switch to a new timeline after archive 
recovery. I turned that into an Assert.


- Heikki


pitr-test.sh
Description: Bourne shell script
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 2923,2934  XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
   * srcTLI, srclog, srcseg: identify segment to be copied (could be from
   *		a different timeline)
   *
   * Currently this is only used during recovery, and so there are no locking
   * considerations.  But we should be just as tense as XLogFileInit to avoid
   * emplacing a bogus file.
   */
  static void
! XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno)
  {
  	char		path[MAXPGPATH];
  	char		tmppath[MAXPGPATH];
--- 2923,2937 
   * srcTLI, srclog, srcseg: identify segment to be copied (could be from
   *		a different timeline)
   *
+  * upto: how much of the source file to copy? (the rest is filled with zeros)
+  *
   * Currently this is only used during recovery, and so there are no locking
   * considerations.  But we should be just as tense as XLogFileInit to avoid
   * emplacing a bogus file.
   */
  static void
! XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
! 			 int upto)
  {
  	char		path[MAXPGPATH];
  	char		tmppath[MAXPGPATH];
***
*** 2967,2982  XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo 

Re: [HACKERS] WALWriter active during recovery

2014-12-17 Thread Alvaro Herrera
didier wrote:

> On many Linux systems it may not do that much (2.6.32 and 3.2 are bad,
> 3.13 is better but still it slows the fsync).
> 
> If there's a fsync in progress WALReceiver will:
> 1- slow the fsync because its writes to the same file are grabbed by the fsync
> 2- stall until the end of fsync.

Is this behavior filesystem-dependent?


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Combining Aggregates

2014-12-17 Thread Atri Sharma
On Wed, Dec 17, 2014 at 6:05 PM, Kouhei Kaigai  wrote:
>
> Simon,
>
> The concept looks good to me. I think the new combined function should be
> responsible for taking a state data type as an argument and updating the
> state object of the aggregate function. In other words, a combined function
> performs like a transition function, but can update the state object
> according to a summary of multiple rows. Right?
>
> It also needs some enhancement around the Aggref/AggrefExprState structures
> to indicate which function shall be called at execution time.
> Combined functions are usually not needed: AggrefExprState updates its
> internal state using the transition function row-by-row. However, once
> someone pushes down an aggregate function across table joins, combined
> functions have to be called instead of transition functions.
> I'd like to suggest a new flag in Aggref indicating that this aggregate
> expects a state object instead of a scalar value.
>
> Also, I'd like to suggest another flag in Aggref to skip generating the
> final result and return the state object instead.
>
>
>
So are you proposing not calling transfuncs at all and just using combined
functions?

That sounds counterintuitive to me. I am not able to see why you would want
to avoid transfns entirely, even for the aggregate push-down case that you
mentioned.

From Simon's example mentioned upthread:

PRE-AGGREGATED PLAN
Aggregate
-> Join
 -> PreAggregate (doesn't call finalfn)
  -> Scan BaseTable1
 -> Scan BaseTable2

finalfn wouldn't be called. Instead, the combined function would be responsible
for taking the pre-aggregated results and combining them (unless, of course, I
am missing something).

Special-casing transition state updates in Aggref seems like a bad idea to
me. I would think it better to make this more explicit, i.e. add a new node
on top that does the combination (it would primarily be responsible for
calling the combined function).

Not a good source of inspiration, but seeing how SQL Server does it
(Exchange operator + Stream Aggregate) seems intuitive to me, and having the
combination operation as a separate top node might be a cleaner way.

I may be wrong though.

Regards,

Atri


Re: [HACKERS] POLA violation with \c service=

2014-12-17 Thread Andrew Dunstan


On 12/17/2014 04:11 AM, Heikki Linnakangas wrote:

On 12/17/2014 10:03 AM, Albe Laurenz wrote:

David Fetter wrote:

I've noticed that psql's  \c function handles service= requests in a
way that I can only characterize as broken.

This came up in the context of connecting to a cloud hosting service
named after warriors or a river or something, whose default hostnames
are long, confusing, and easy to typo, so I suspect that service= may
come up more often going forward than it has until now.

For example, when I try to use

\c "service=foo"

It will correctly figure out which database I'm trying to connect to,
but fail to notice that it's on a different host, port, etc., and
hence fail to connect with a somewhat unhelpful error message.

I can think of a few approaches for fixing this:

0.  Leave it broken.
1.  Disable "service=" requests entirely in \c context, and error 
out if attempted.

2.  Ensure that \c actually uses all of the available information.

Is there another one I missed?

If not, which of the approaches seems reasonable?


#2 is the correct solution, #1 a band aid.


It would be handy if \c "service=foo" actually worked. We should do
#3. If the database name is actually a connection string, or a service
specification, it should not re-use the hostname and port from the
previous connection, but use the values from the connection string or
service file.






Yeah, that's the correct solution. It should not be terribly difficult 
to create a test for a conninfo string in the dbname parameter. That's 
what libpq does after all. We certainly don't want psql to have to try 
to interpret the service file. psql just needs to let libpq do its work 
in this situation.
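
For illustration, a check along these lines would be enough (a hypothetical
helper, not existing psql or libpq code; it assumes the usual libpq rule that
an '=' or a postgres URI prefix marks a connection string):

#include <stdbool.h>
#include <string.h>

static bool
looks_like_conninfo(const char *dbname)
{
    if (dbname == NULL)
        return false;
    /* keyword=value syntax, e.g. "service=foo" or "host=x dbname=y" */
    if (strchr(dbname, '=') != NULL)
        return true;
    /* connection URI syntax */
    if (strncmp(dbname, "postgresql://", strlen("postgresql://")) == 0 ||
        strncmp(dbname, "postgres://", strlen("postgres://")) == 0)
        return true;
    return false;
}

When it returns true, \c could pass the whole string through to libpq
untouched instead of inheriting the old host and port.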


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] postgres messages error

2014-12-17 Thread Martín Marqués
Hi there,

I was doing some translation on postgres.po and found a string that
looks wrong.

#: libpq/auth.c:1593
#, fuzzy, c-format
msgid "could not to look up local user ID %ld: %s"

It looks like there is an extra *to* there, so the string should be:

"could not look up local user ID %ld: %s"

Cheers,

-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [alvhe...@2ndquadrant.com: Re: [HACKERS] no test programs in contrib]

2014-12-17 Thread Alvaro Herrera
Michael Paquier wrote:
> On Wed, Dec 17, 2014 at 6:02 AM, Alvaro Herrera 
> wrote:
> >
> > Andrew Dunstan wrote:

> > > Any brave buildfarm owners on *nix can try it by replacing their copy of
> > > run_build.pl with the bleeding edge version. We can't put it in a client
> > > release until we fix up the MSVC side of things.
> >
> > I guess I will have to find someone to assist me in architecting a
> > solution for the MSVC stuff.
> >
> What's the matter here? Do we need to extend vcregress.pl with a new mode
> to test stuff in src/test/modules?

Please see my message
http://www.postgresql.org/message-id/20141128205453.ga1...@alvh.no-ip.org

The -msvc patch attached there adds some support to Mkvcbuild.pm and
vcregress.pl, but it doesn't completely work (fails to install).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WALWriter active during recovery

2014-12-17 Thread Simon Riggs
On 17 December 2014 at 11:27, didier  wrote:

> If there's a fsync in progress WALReceiver will:
> 1- slow the fsync because its writes to the same file are grabbed by the fsync
> 2- stall until the end of fsync.

PostgreSQL already fsyncs files while they are being written to. Are
you saying we should stop doing that?

It would be possible to synchronize processes so that we don't write
to a file while it is being fsynced.

fsyncs are also made once the whole 16MB has been written, so in those
cases there is no simultaneous action.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] speedup tidbitmap patch: hash BlockNumber

2014-12-17 Thread Teodor Sigaev

-hash_ctl.hash = oid_hash; /* a bit more efficient than tag_hash */
+hash_ctl.hash = tag_hash; /* a bit more efficient than tag_hash */

I think the comment may need to be removed here.

Thank you, fixed

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


tbm_blocknumber-4.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Combining Aggregates

2014-12-17 Thread Kouhei Kaigai
Simon,

The concept looks good to me. I think the new combined function should be
responsible for taking a state data type as an argument and updating the
state object of the aggregate function. In other words, a combined function
performs like a transition function, but can update the state object
according to a summary of multiple rows. Right?

It also needs some enhancement around the Aggref/AggrefExprState structures
to indicate which function shall be called at execution time.
Combined functions are usually not needed: AggrefExprState updates its
internal state using the transition function row-by-row. However, once
someone pushes down an aggregate function across table joins, combined
functions have to be called instead of transition functions.
I'd like to suggest a new flag in Aggref indicating that this aggregate
expects a state object instead of a scalar value.

Also, I'd like to suggest another flag in Aggref to skip generating the
final result and return the state object instead.

Let me use David's example, slightly adjusted.

original)
SELECT p.name, AVG(s.qty)
  FROM sales s INNER JOIN product p ON p.product_id = s.product_id
  GROUP BY p.name;

modified)
SELECT p.name, AVG(qty)
  FROM (SELECT product_id, AVG(qty) AS qty FROM sales GROUP BY product_id) s
   INNER JOIN product p
  ON p.product_id = s.product_id GROUP BY p.name;

Let's assume the modifier sets a use_combine_func flag on the AVG(qty) of
the main query, and a not_generate_final flag on the AVG(qty) of the
sub-query. It should then work as expected.
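
A rough sketch of what those two flags might look like (the field names are
invented here for illustration only, not taken from any patch):

typedef struct Aggref
{
    /* ... existing fields (aggfnoid, args, and so on) ... */
    bool    use_combine_func;   /* inputs are partial states; call the
                                 * combine function, not the transition
                                 * function, for each input row */
    bool    not_generate_final; /* skip the final function and return the
                                 * state object itself */
} Aggref;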

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei 


> -Original Message-
> From: Simon Riggs [mailto:si...@2ndquadrant.com]
> Sent: Wednesday, December 17, 2014 6:53 PM
> To: Kaigai Kouhei(海外 浩平); David Rowley; PostgreSQL-development; Amit
> Kapila
> Subject: Combining Aggregates
> 
> KaiGai, David Rowley and myself have all made mention of various ways we
> could optimize aggregates.
> 
> Following WIP patch adds an extra function called a "combining function",
> that is intended to allow the user to specify a semantically correct way
> of breaking down an aggregate into multiple steps.
> 
> Gents, is this what you were thinking? If not...
> 
> --
>  Simon Riggs   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Combining Aggregates

2014-12-17 Thread Simon Riggs
On 17 December 2014 at 10:20, David Rowley  wrote:
> On 17 December 2014 at 22:53, Simon Riggs  wrote:
>>
>> KaiGai, David Rowley and myself have all made mention of various ways
>> we could optimize aggregates.
>>
>> Following WIP patch adds an extra function called a "combining
>> function", that is intended to allow the user to specify a
>> semantically correct way of breaking down an aggregate into multiple
>> steps.
>>
>> Gents, is this what you were thinking? If not...
>>
>
> Very much so! You must have missed my patch.
>
> http://www.postgresql.org/message-id/CAApHDvrZG5Q9rNxU4WOga8AgvAwQ83bF83CFvMbOQcCg8vk=z...@mail.gmail.com

Very strange that you should post an otherwise unrelated patch on
someone else's thread AND not add the patch to the CommitFest.

Stealth patch submission is a new one on me.

Looks like great minds think alike, at least. Or fools seldom differ.

Please add your patch to the CF app.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] speedup tidbitmap patch: cache page

2014-12-17 Thread David Rowley
On 17 December 2014 at 05:25, Teodor Sigaev  wrote:
>
> I've been having a look at this and I'm wondering about a certain scenario:
>>
>> In tbm_add_tuples, if tbm_page_is_lossy() returns true for a given block,
>> and on
>> the next iteration of the loop we have the same block again, have you
>> benchmarked any caching code to store if tbm_page_is_lossy() returned
>> true for
>> that block on the previous iteration of the loop? This would save from
>> having to
>> call tbm_page_is_lossy() again for the same block. Or are you just
>> expecting
>> that tbm_page_is_lossy() returns true so rarely that you'll end up
>> caching the
>> page most of the time, and gain on skipping both hash lookups on the next
>> loop,
>> since page will be set in this case?
>>
> I believe that if we fall into lossy pages then tidbitmap will not have a
> significant impact on performance, because postgres will spend a lot of
> time rechecking tuples on the page. If work_mem is too small to keep an
> exact tidbitmap then postgres will slow down significantly. I implemented
> it (v2.1 in the attachments), but
> I don't think it is an improvement, at least not a significant one.
>
>
You could well be right, but it would be good to compare the numbers just
so we know this for sure.

There seems to be a problem with v2.1. The code does not look quite
right; I'd have expected something more like:

if (lossy_page == blk)
continue; /* whole page is already marked */

if (page == NULL || page->blockno != blk)

where you have the lossy_page check within the 2nd if test. I'd imagine
this will never be true as page->blockno != blk, or perhaps it'll only be
true when tbm_lossify() is called and you set the page to NULL.
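
For clarity, the loop shape being suggested is roughly this (a simplified
sketch, not the actual tidbitmap.c code; cached_lossy_blk is an invented
variable name):

/* inside tbm_add_tuples(), iterating over the tids array */
BlockNumber cached_lossy_blk = InvalidBlockNumber;
PagetableEntry *page = NULL;

for (i = 0; i < ntids; i++)
{
    BlockNumber blk = ItemPointerGetBlockNumber(tids + i);

    if (blk == cached_lossy_blk)
        continue;               /* whole page already known to be lossy */

    if (page == NULL || page->blockno != blk)
    {
        if (tbm_page_is_lossy(tbm, blk))
        {
            cached_lossy_blk = blk;     /* skip both lookups next time */
            continue;
        }
        page = tbm_get_pageentry(tbm, blk);
    }

    /* ... set the bit for this tuple's offset in "page" as usual ... */
}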

There's also something weird going on using your original test case:

postgres=# set work_mem = '256MB';
SET
Time: 0.450 ms
postgres=# select count(*) from t where i >= 0;
  count
-
 500
(1 row)

That's ok

Time: 1369.562 ms
postgres=# set work_mem = '64kB';
SET
Time: 0.425 ms
postgres=# select count(*) from t where i >= 0;
  count
-
 498
(1 row)


Time: 892.726 ms

Notice the row counts are not the same.

create table t_result (i int not null);

postgres=# select a.a,a.b,b.a,b.b from (select i a,count(*) b from t group
by i) a full outer join (select i a,count(*) b from t_result group by i) b
on a.a = b.a where
 b.b <> a.b;
   a    | b  |   a    | b
--------+----+--------+---
 315744 | 10 | 315744 | 8
(1 row)

postgres=# select ctid from t where i = 315744;
    ctid
-------------
 (13970,220)
 (13970,221)
 (13970,222)
 (13970,223)
 (13970,224)
 (13970,225)
 (13970,226)
 (13971,1)
 (13971,2)
 (13971,3)
(10 rows)

This only seems to be broken in the v2.1.

Are you seeing the same?


>> It would be nice to see a comment to explain why it might be a good idea
>> to
>> cache the page lookup. Perhaps something like:
>>
>>
> added, see attachment (v2)
>
>
Thanks. That looks better. It would be good to compare performance of v2
and v2.1, but I think the problem with v2.1 needs to be fixed before that
can happen.


>
>> I also wondered if there might be a small slowdown in the case where the
>> index
>> only finds 1 matching tuple. So I tried the following:
>> avg.  2372.4456  2381.909   99.6%
>> med.  2371.224   2359.494  100.5%
>>
>> It appears that if it does, then it's not very much.
>>
>
> I believe that's unmeasurable, because the standard deviation of your tests
> is about 2%, which is greater than the difference between the patched and
> master versions.
>
>
Agreed.

Regards

David Rowley


Re: [HACKERS] WALWriter active during recovery

2014-12-17 Thread didier
Hi,

On Tue, Dec 16, 2014 at 6:07 PM, Simon Riggs  wrote:
> On 16 December 2014 at 14:12, Heikki Linnakangas
>  wrote:
>> On 12/15/2014 08:51 PM, Simon Riggs wrote:
>>>
>>> Currently, WALReceiver writes and fsyncs data it receives. Clearly,
>>> while we are waiting for an fsync we aren't doing any other useful
>>> work.
>>>
>>> Following patch starts WALWriter during recovery and makes it
>>> responsible for fsyncing data, allowing WALReceiver to progress other
>>> useful actions.
On many Linux systems it may not do that much (2.6.32 and 3.2 are bad,
3.13 is better but still it slows the fsync).

If there's a fsync in progress WALReceiver will:
1- slow the fsync because its writes to the same file are grabbed by the fsync
2- stall until the end of fsync.

from 'stracing' a test program simulating this pattern:
two processes, one writing to a file while the second fsyncs it.

20279 11:51:24.037108 fsync(5 
20278 11:51:24.053524 <... nanosleep resumed> NULL) = 0 <0.020281>
20278 11:51:24.053691 lseek(3, 1383612416, SEEK_SET) = 1383612416 <0.000119>
20278 11:51:24.053965 write(3, ""...,
8192) = 8192 <0.000111>
20278 11:51:24.054190 nanosleep({0, 2000}, NULL) = 0 <0.020243>

20278 11:51:24.404386 lseek(3, 194772992, SEEK_SET 
20279 11:51:24.754123 <... fsync resumed> ) = 0 <0.716971>
20279 11:51:24.754202 close(5 
20278 11:51:24.754232 <... lseek resumed> ) = 194772992 <0.349825>

Yes that's a 300ms lseek...
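
For reference, a minimal sketch of that kind of test program (an assumed
reconstruction, not the original source): one process keeps appending 8kB
blocks with a short sleep between writes while the other repeatedly fsyncs
the same file.

#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int
main(void)
{
    char    buf[8192];
    struct timespec pause = {0, 20 * 1000 * 1000};  /* ~20ms between writes */
    int     fd = open("testfile", O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return 1;
    memset(buf, 'x', sizeof(buf));

    if (fork() == 0)
    {
        /* writer: keeps appending, like the WAL receiver */
        for (;;)
        {
            write(fd, buf, sizeof(buf));
            nanosleep(&pause, NULL);
        }
    }

    /* syncer: periodically fsyncs the same file */
    for (;;)
    {
        fsync(fd);
        sleep(1);
    }
}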

>>
>>
>> What other useful actions can WAL receiver do while it's waiting? It doesn't
>> do much else than receive WAL, and fsync it to disk.
>
> So now it will only need to do one of those two things.
>

Regards
Didier


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] analyze_new_cluster.bat and delete_old_cluster.bat not ignored with vcregress upgradecheck

2014-12-17 Thread Magnus Hagander
On Tue, Dec 16, 2014 at 7:25 AM, Michael Paquier 
wrote:
>
> Hi all,
>
> As mentioned in $subject, I noticed that those automatically-generated
> files are not ignored in the tree when running vcregress on Windows,
> though we do ignore their .sh versions. I think it would be good to
> back-patch the attached patch to prevent the inclusion of those files
> in the future.
>

Applied, thanks.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Combining Aggregates

2014-12-17 Thread David Rowley
On 17 December 2014 at 22:53, Simon Riggs  wrote:
>
> KaiGai, David Rowley and myself have all made mention of various ways
> we could optimize aggregates.
>
> Following WIP patch adds an extra function called a "combining
> function", that is intended to allow the user to specify a
> semantically correct way of breaking down an aggregate into multiple
> steps.
>
> Gents, is this what you were thinking? If not...
>
>
Very much so! You must have missed my patch.

http://www.postgresql.org/message-id/CAApHDvrZG5Q9rNxU4WOga8AgvAwQ83bF83CFvMbOQcCg8vk=z...@mail.gmail.com

The cases that I think may benefit from such infrastructure are:

1. Parallel queries, where each worker could work on a portion of the
tuples being aggregated and the combine/merge function is then called at
the end to produce the final aggregated result. We don't yet have parallel
query, so we can't really commit anything for that case yet.

2. Queries such as:

SELECT p.name, SUM(s.qty) FROM sales s INNER JOIN product p ON p.product_id
= s.product_id GROUP BY p.name;

Such a query could be transformed into:

SELECT p.name,SUM(qty) FROM (SELECT product_id,SUM(qty) AS qty FROM sales
GROUP BY product_id) s
INNER JOIN product p ON p.product_id = s.product_id GROUP BY p.name;

Of course, the outer query's SUM and GROUP BY would not be required if there
happened to be a UNIQUE index on product(name), but assuming there isn't,
the above should produce the results faster. This works fine for SUM(), but
for something like AVG() or STDDEV() the combine/merge aggregate functions
would be required to process the intermediate aggregate results produced by
the sub-query.
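
As a hypothetical illustration of what the combine step has to do (the types
and function names here are invented, not the actual catalog entries): for
AVG the partial state carries a count and a sum, combining two partial
states is element-wise addition, and the division is deferred to the final
function.

typedef struct AvgState
{
    long    count;
    double  sum;
} AvgState;

/* combinefn: merge one partial state into another */
static AvgState
avg_combine(AvgState a, AvgState b)
{
    AvgState    result;

    result.count = a.count + b.count;
    result.sum = a.sum + b.sum;
    return result;
}

/* finalfn: divide only once at the end (a real one would handle count == 0) */
static double
avg_final(AvgState s)
{
    return s.sum / s.count;
}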

Regards

David Rowley


Re: [HACKERS] Combining Aggregates

2014-12-17 Thread Simon Riggs
On 17 December 2014 at 10:02, Atri Sharma  wrote:
>
>
> On Wed, Dec 17, 2014 at 3:23 PM, Simon Riggs  wrote:
>>
>> KaiGai, David Rowley and myself have all made mention of various ways
>> we could optimize aggregates.
>>
>> Following WIP patch adds an extra function called a "combining
>> function", that is intended to allow the user to specify a
>> semantically correct way of breaking down an aggregate into multiple
>> steps.
>>
>> Gents, is this what you were thinking? If not...
>>
>>
>
> A quick look at the patch makes me assume that the patch does not handle the
> problem of combining transvals or move at all in that direction (which is
> fine, just reconfirming).

There are no applications of the new function at present. Each would
be similar, but I am unsure whether they would be identical.

> So, essentially, we are adding a "grand total" on top of individual sum() or
> count() operations, right?

The idea is to be able to do aggregation in multiple parts. For
example, allowing parallel query to have a plan like this

Aggregate
-> Gather (subPlan is repeated N times on each parallel worker)
   -> Aggregate
      -> ParallelSeqScan (scans a distinct portion of a table)

or to allow a serial plan where an aggregate was pushed down below a
join, like this

CURRENT PLAN
Aggregate
-> Join
 -> Scan BaseTable1
 -> Scan BaseTable2

PRE-AGGREGATED PLAN
Aggregate
-> Join
 -> PreAggregate (doesn't call finalfn)
  -> Scan BaseTable1
 -> Scan BaseTable2

and also allow the above plan to be replaced by a Matview plan like this

Aggregate
-> Join
 -> Scan BaseTable1.matviewA
 -> Scan BaseTable2

where we would assume that the contents of matviewA are
pre-aggregations that could be further combined.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Combining Aggregates

2014-12-17 Thread Petr Jelinek

On 17/12/14 11:02, Atri Sharma wrote:



On Wed, Dec 17, 2014 at 3:23 PM, Simon Riggs  wrote:

KaiGai, David Rowley and myself have all made mention of various ways
we could optimize aggregates.

Following WIP patch adds an extra function called a "combining
function", that is intended to allow the user to specify a
semantically correct way of breaking down an aggregate into multiple
steps.


Also, should we not have a sanity check for the user function provided?


Looking at the code, yes, there seems to be an XXX comment about that.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some modes of vcregress not mentioned in MSVC scripts

2014-12-17 Thread Magnus Hagander
Applied, thanks.

Also backpatched to 9.4 so the docs changes go into the /current/ docs (or
what will be current later this week).

//Magnus

On Tue, Dec 16, 2014 at 7:10 AM, Michael Paquier 
wrote:
>
> Hi all,
>
> While looking at the Windows docs and the MSVC scripts, I noticed that
> the following run modes of vcregress.pl are either not mentioned in
> the WIndows installation docs or in the help output of vcregress.pl:
> - ecpgcheck
> - isolationcheck
> - upgradecheck
> Attached is a patch fixing that.
> Regards,
> --
> Michael
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Combining Aggregates

2014-12-17 Thread Atri Sharma
On Wed, Dec 17, 2014 at 3:23 PM, Simon Riggs  wrote:
>
> KaiGai, David Rowley and myself have all made mention of various ways
> we could optimize aggregates.
>
> Following WIP patch adds an extra function called a "combining
> function", that is intended to allow the user to specify a
> semantically correct way of breaking down an aggregate into multiple
> steps.
>
> Gents, is this what you were thinking? If not...
>
>
>
A quick look at the patch makes me assume that the patch does not handle
the problem of combining transvals or move at all in that direction (which
is fine, just reconfirming).

So, essentially, we are adding a "grand total" on top of individual sum()
or count() operations, right?

Also, should we not have a sanity check for the user function provided?


[HACKERS] Combining Aggregates

2014-12-17 Thread Simon Riggs
KaiGai, David Rowley and myself have all made mention of various ways
we could optimize aggregates.

Following WIP patch adds an extra function called a "combining
function", that is intended to allow the user to specify a
semantically correct way of breaking down an aggregate into multiple
steps.

Gents, is this what you were thinking? If not...

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


aggregate_combining_fn.v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-12-17 Thread Simon Riggs
On 16 December 2014 at 21:45, Stephen Frost  wrote:
> * Simon Riggs (si...@2ndquadrant.com) wrote:
>> On 16 December 2014 at 18:28, Stephen Frost  wrote:
>>
>> > For the sake of the archives, the idea I had been trying to propose is
>> > to use a role's permissions as a mechanism to define what should be
>> > audited.  An example is:
>>
>> My understanding is that was done.
>
> Based on the discussion I had w/ Abhijit on IRC, and what I saw him
> commit, it's not the same.  I've been trying to catch up with him on IRC
> to get clarification but haven't managed to yet.
>
> Abhijit, could you comment on the above (or, well, my earlier email
> which had the details)?  It's entirely possible that I've completely
> misunderstood as I haven't dug into the code yet but rather focused on
> the documentation.

Abhijit added the "roles" idea on Nov 27 after speaking with you.

AFAIK, it is intended to perform as you requested.

Please review...

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] POLA violation with \c service=

2014-12-17 Thread Heikki Linnakangas

On 12/17/2014 10:03 AM, Albe Laurenz wrote:

David Fetter wrote:

I've noticed that psql's  \c function handles service= requests in a
way that I can only characterize as broken.

This came up in the context of connecting to a cloud hosting service
named after warriors or a river or something, whose default hostnames
are long, confusing, and easy to typo, so I suspect that service= may
come up more often going forward than it has until now.

For example, when I try to use

\c "service=foo"

It will correctly figure out which database I'm trying to connect to,
but fail to notice that it's on a different host, port, etc., and
hence fail to connect with a somewhat unhelpful error message.

I can think of a few approaches for fixing this:

0.  Leave it broken.
1.  Disable "service=" requests entirely in \c context, and error out if 
attempted.
2.  Ensure that \c actually uses all of the available information.

Is there another one I missed?

If not, which of the approaches seems reasonable?


#2 is the correct solution, #1 a band aid.


It would be handy if \c "service=foo" actually worked. We should do #3.
If the database name is actually a connection string, or a service
specification, it should not re-use the hostname and port from the previous
connection, but use the values from the connection string or service file.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-12-17 Thread Simon Riggs
On 15 December 2014 at 20:26, Jeff Janes  wrote:

> I still get the compiler error in contrib:
>
> pgstattuple.c: In function 'pgstat_heap':
> pgstattuple.c:279: error: too few arguments to function
> 'heap_beginscan_strat'
>
> Should it pass false for the always_prune?

Yes.

New version attached.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


hot_disable.v8.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Decoding follows timelines

2014-12-17 Thread Simon Riggs
On 16 December 2014 at 21:17, Simon Riggs  wrote:

>>> This patch is a WIP version of doing that, but only currently attempts

>> With the patch, XLogSendLogical uses the same logic to calculate SendRqstPtr
>> that XLogSendPhysical does. It would be good to refactor that into a common
>> function, rather than copy-paste.
>
> Some of the logic is similar, but not all.
>
>> SendRqstPtr isn't actually used for anything in XLogSendLogical.
>
> It exists to allow the call which resets TLI.
>
> I'll see if I can make it exactly identical; I didn't think so when I
> first looked, will look again.

Yes, that works. New version attached

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


logical_timeline_following.v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] POLA violation with \c service=

2014-12-17 Thread Albe Laurenz
David Fetter wrote:
> I've noticed that psql's  \c function handles service= requests in a
> way that I can only characterize as broken.
> 
> This came up in the context of connecting to a cloud hosting service
> named after warriors or a river or something, whose default hostnames
> are long, confusing, and easy to typo, so I suspect that service= may
> come up more often going forward than it has until now.
> 
> For example, when I try to use
> 
> \c "service=foo"
> 
> It will correctly figure out which database I'm trying to connect to,
> but fail to notice that it's on a different host, port, etc., and
> hence fail to connect with a somewhat unhelpful error message.
> 
> I can think of a few approaches for fixing this:
> 
> 0.  Leave it broken.
> 1.  Disable "service=" requests entirely in \c context, and error out if 
> attempted.
> 2.  Ensure that \c actually uses all of the available information.
> 
> Is there another one I missed?
> 
> If not, which of the approaches seems reasonable?

#2 is the correct solution, #1 a band aid.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers