Re: Getting rid of "tuple concurrently updated" elog()s with concurrent DDLs (at least ALTER TABLE)

2017-12-26 Thread Michael Paquier
On Tue, Dec 26, 2017 at 10:47:59PM -0800, Robert Haas wrote:
> On Tue, Dec 26, 2017 at 4:14 AM, Michael Paquier
>  wrote:
>> On Tue, Dec 26, 2017 at 4:30 PM, Andres Freund  wrote:
>>> You're proposing to lock the entire relation against many forms of 
>>> concurrent DDL, just to get rid of that error? That seems unacceptable.
>>> Isn't the canonical way to solve this to take object locks?
>>
>> Sure. That's where things in lmgr.c come into play, like
>> LockSharedObject(), and you could hold with an exclusive lock on a
>> given object until the end of a transaction before opening the catalog
>> relation with heap_open(), however with those you need to know the
>> object OID before taking a lock on the parent relation, right? So you
>> face problems with lock upgrades, or race conditions show up more
>> easily. I have to admit that I have not dug much into the problem yet,
>> it is easy enough to have isolation tests by the way, and I just
>> noticed that ALTER DATABASE SET can equally trigger the error.
> 
> I don't understand what you mean by "you need to know the object OID
> before taking a lock on the parent relation, right?".  That seems
> wrong.

Well, the point I am trying to outline here is that it is essential to
take a lock on the object ID before opening pg_authid with heap_open()
with a low-level lock, to avoid any concurrency issues when working on
the opened catalog relation.

> I think you might need something like what was done in
> b3ad5d02c9cd8a4c884cd78480f221afe8ce5590; if, after we look up the
> name and before we acquire a lock on the OID, we accept any
> invalidation messages, recheck that the object we've locked is still
> the one associated with that name.

Indeed, this bit is important, and RangeVarGet does so. Looking at
get_object_address(), it also handles locking of the object. Using that
as the interface to address all the concurrency problems could be appealing,
and less invasive as a final result, since many DDLs are visibly
impacted (I still need to make a list here). What do you think?
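
For reference, the lookup/lock/recheck pattern referred to above looks roughly
like this (a sketch modeled on what RangeVarGetRelidExtended() does for
relations; get_object_oid_by_name() is a hypothetical helper, not code from any
posted patch):

    Oid
    lookup_and_lock_shared_object(const char *name, LOCKMODE lockmode)
    {
        for (;;)
        {
            /* resolve the name to an OID without holding the object lock yet */
            Oid     objectId = get_object_oid_by_name(name);    /* hypothetical */

            if (!OidIsValid(objectId))
                elog(ERROR, "object \"%s\" does not exist", name);

            /* may block behind a concurrent DDL holding a conflicting lock */
            LockSharedObject(AuthIdRelationId, objectId, 0, lockmode);

            /* absorb invalidation messages generated while we were waiting */
            AcceptInvalidationMessages();

            /* if the name still maps to the OID we locked, we are done */
            if (get_object_oid_by_name(name) == objectId)
                return objectId;

            /* the object was dropped or recreated concurrently: release and retry */
            UnlockSharedObject(AuthIdRelationId, objectId, 0, lockmode);
        }
    }
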
--
Michael




Re: [HACKERS] pow support for pgbench

2017-12-26 Thread Fabien COELHO


Bonjour Michaël,


And my 2c on the matter is that switching silently from one version to
the other would be unwelcome. The user should be aware if a test is
overflowing a number when specifying an integer.


This whole integer pow version is becoming unduly complicated and ugly.

For me, the rationale for having the ipow implementation was to have a
precise integer result when possible. Now that it is cast to double and
the precision is lost, the whole point of ipow is moot, even if there is
some performance gain.


So given that committers do not want the int/double version because it is
slightly different from the numeric/double version of SQL (obviously), and
that the integer version is becoming overcomplicated with checks and
warnings or errors, I'm now in favor of just dropping it and providing the
double version only.


Too bad for integer computations on keys, which are the core of the pgbench
use case, but I can't help it.


--
Fabien.

Re: Add hint about replication slots when nearing wraparound

2017-12-26 Thread Feike Steenbergen
On 23 December 2017 at 11:58, Michael Paquier  wrote:
> On Fri, Dec 22, 2017 at 07:55:19AM +0100, Feike Steenbergen wrote:
>> On 21 December 2017 at 05:32, Michael Paquier  
>> wrote:
>>
>> > Don't you want to put that in its own  block? That's rather
>> > important not to miss for administrators.
>>
>> I didn't want to add yet another block on that documentation page,
>> as it already has 2, however it may be good to upgrade the
>> note to a caution, similar to the prepared transaction caution.
>
> Yes, I agree with this position.

Changed the block from a note to a caution,

regards,

Feike


0003-Add-hint-about-replication-slots-for-wraparound.patch
Description: Binary data


Re: Using ProcSignal to get memory context stats from a running backend

2017-12-26 Thread Craig Ringer
On 22 December 2017 at 23:19, Maksim Milyutin  wrote:

> On 22.12.2017 16:56, Craig Ringer wrote:
>
> On 22 December 2017 at 20:50, Maksim Milyutin 
> wrote:
>
>> On 19.12.2017 16:54, Pavel Stehule wrote:
>>
>> Sorry for a small off-topic question: could this mechanism be used to log
>> the executed plan or the full query?
>>
>>
> That's a really good idea. I'd love to be able to pg_explain_backend(...)
>
> I left the mechanism as a generic diagnostic signal exactly so that we
> could add other things we wanted to be able to ask backends. I think a
> follow-on patch that adds support for dumping explain-format plans would be
> great, if it's practical to do that while a query's already running.
>
>
> Noticing the interest in calling some routines on a remote backend
> through signals, in a parallel thread [1] I have proposed the possibility of
> defining user-defined signal handlers in extensions. There is a patch taken
> from the pg_query_state module.
>
> 1. https://www.postgresql.org/message-id/3f905f10-cf7a-d4e0-
> 64a1-7fd9b8351592%40gmail.com
>
>
Haven't read the patch, but the idea seems useful if a bit hairy in
practice. It'd be done on a "use at your own risk, and if it breaks you get
to keep the pieces" basis, where no guarantees are made by Pg about how and
when the function is called so it must be utterly defensive. The challenge
there being that you can't always learn enough about the state of the
system without doing things that might break it, so lots of things you
could do would be fragile.
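
As a rough illustration only, a SQL-callable trigger for such a request could
be wired on top of the existing ProcSignal machinery along these lines
(PROCSIG_EXPLAIN_BACKEND and pg_explain_backend() are hypothetical names, not
part of any posted patch; only SendProcSignal() and BackendPidGetProc() are
existing infrastructure):

    Datum
    pg_explain_backend(PG_FUNCTION_ARGS)
    {
        int         pid = PG_GETARG_INT32(0);
        PGPROC     *proc = BackendPidGetProc(pid);

        if (proc == NULL)
            ereport(ERROR,
                    (errmsg("PID %d is not a PostgreSQL server process", pid)));

        /* ask the target backend to dump its current plan at the next safe point */
        if (SendProcSignal(pid, PROCSIG_EXPLAIN_BACKEND, proc->backendId) < 0)
            ereport(WARNING,
                    (errmsg("could not signal backend with PID %d", pid)));

        PG_RETURN_BOOL(true);
    }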

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: AS OF queries

2017-12-26 Thread Craig Ringer
On 25 December 2017 at 15:59, Konstantin Knizhnik  wrote:

>
>
> On 25.12.2017 06:26, Craig Ringer wrote:
>
> On 24 December 2017 at 04:53, konstantin knizhnik <
> k.knizh...@postgrespro.ru> wrote:
>
>>
>>
>> But what if I just forbid to change recent_global_xmin?
>> If it is stalled at FirstNormalTransactionId and never changed?
>> Will it protect all versions from being deleted?
>>
>
> That's totally impractical, you'd have unbounded bloat and a nonfunctional
> system in no time.
>
> You'd need a mechanism - akin to what we have with replication slots - to
> set a threshold for age.
>
>
> Well, there are systems with "never delete" and "append only" semantics.
> For example, I have participated in the SciDB project: a database for
> scientific applications.
> One of the key requirements of scientific research is reproducibility.
> From the database point of view it means that we need to store all raw
> data and never delete it.
>

PostgreSQL can't cope with that for more than 2^31 xacts; you have to
"forget" details of which xacts created/updated tuples and the contents of
deleted tuples, or you exceed our xid limit. You'd need 64-bit XIDs, or a
redo-buffer based heap model (like the zheap stuff) with redo buffers
marked with an xid epoch, or something like that.


> I am not sure that it should be similar to logical replication slots.
>
> Here the semantics are quite clear: we preserve WAL segments until they are
> replicated to the subscribers.
>

Er, what?

This isn't to do with restart_lsn. That's why I mentioned *logical*
replication slots.

I'm talking about how they interact with GetOldestXmin using their xmin and
catalog_xmin.

You probably won't want to re-use slots, but you'll want something akin to
that, a transaction age threshold. Otherwise your system has a finite end
date where it can no longer function due to xid count, or if you solve
that, it'll slowly choke on table bloat etc. I guess if you're willing to
accept truly horrible performance...


> With time travel the situation is less obscure: we may specify some threshold
> for age - for example, keep data for one year.
>

Sure. You'd likely do that by mapping commit timestamps => xids and using
an xid threshold though.


> But unfortunately the trick with the snapshot (no matter how we set up the
> oldest xmin horizon) affects all tables.
>

You'd need to be able to pass more info into HeapTupleSatisfiesMVCC etc. I
expect you'd probably add a new snapshot type (like logical decoding did
with historic snapshots), that has a new Satisfies function. But you'd have
to be able to ensure all snapshot Satisfies callers had the required extra
info - like maybe a Relation - which could be awkward for some call sites.
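
Purely as an illustration of the shape such a Satisfies routine could take (it
glosses over hint bits, in-progress and frozen transactions, and multixacts,
assumes track_commit_timestamp is on, and snapshot->asof_ts is a hypothetical
field, not an existing one):

    static bool
    HeapTupleSatisfiesAsOf(HeapTuple htup, Snapshot snapshot, Buffer buffer)
    {
        HeapTupleHeader tuple = htup->t_data;
        TransactionId   xmin = HeapTupleHeaderGetXmin(tuple);
        TransactionId   xmax = HeapTupleHeaderGetRawXmax(tuple);
        TimestampTz     ts;

        /* the inserting transaction must have committed at or before the AS OF point */
        if (!TransactionIdGetCommitTsData(xmin, &ts, NULL) || ts > snapshot->asof_ts)
            return false;

        /* and the deleting transaction, if any, must not have committed by then */
        if (TransactionIdIsNormal(xmax) &&
            TransactionIdGetCommitTsData(xmax, &ts, NULL) &&
            ts <= snapshot->asof_ts)
            return false;

        return true;
    }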

The user would have to be responsible for ensuring sanity of FK
relationships etc when specifying different snapshots for different
relations.

Per-relation time travel doesn't seem totally impractical so long as you
can guarantee that there is some possible snapshot for which the catalogs
defining all the relations and types are simultaneously valid, i.e. there's
no disjoint set of catalog changes. Avoiding messy performance implications
with normal queries might not even be too bad if you use a separate
snapshot model, so long as you can avoid callsites having to do extra work
in the normal case.

Dealing with dropped columns and rewrites would be a pain though. You'd
have to preserve the dropped column data when you re-projected the rewrite
tuples.


> There is a similar (but not the same) problem with logical replication:
> assume that we need to replicate only one small table. But we have to pin
> in WAL all updates of another huge table which is not involved in logical
> replication at all.
>

I don't really see how that's similar. It's concerned with WAL, whereas what
you're looking at is heaps and bloat from old versions. Completely
different, unless you propose to somehow reconstruct data from old WAL to
do historic queries, which would be o_O ...


> Well, I am really not sure about users' demands for time travel. This is
> one of the reasons for initiating this discussion on hackers... Maybe it is
> not the best place for such a discussion, because there are mostly Postgres
> developers and not users here...
> At least, from the experience of a few SciDB customers, I can tell that we
> didn't have problems with schema evolution: mostly the schema is simple,
> static and well defined.
> There were problems with incorrect import of data (this is why we had to
> add real delete), and with splitting data into chunks (partitioning)...
>

Every system I've ever worked with that has a "static" schema has landed up
not being so static after all.

I'm sure there are exceptions, but if you can't cope with catalog changes
you've excluded the immense majority of users. Even the ones who promise
they don't ever need to change anything ... land up changing things.


> The question is how we should handle such catalog changes if they
>> happen. Ideally we should not allow moving

Re: Getting rid of "tuple concurrently updated" elog()s with concurrent DDLs (at least ALTER TABLE)

2017-12-26 Thread Robert Haas
On Tue, Dec 26, 2017 at 4:14 AM, Michael Paquier
 wrote:
> On Tue, Dec 26, 2017 at 4:30 PM, Andres Freund  wrote:
>> You're proposing to lock the entire relation against many forms of 
>> concurrent DDL, just to get rid of that error? That seems unacceptable.
>> Isn't the canonical way to solve this to take object locks?
>
> Sure. That's where things in lmgr.c come into play, like
> LockSharedObject(), and you could hold with an exclusive lock on a
> given object until the end of a transaction before opening the catalog
> relation with heap_open(), however with those you need to know the
> object OID before taking a lock on the parent relation, right? So you
> face problems with lock upgrades, or race conditions show up more
> easily. I have to admit that I have not dug much into the problem yet,
> it is easy enough to have isolation tests by the way, and I just
> noticed that ALTER DATABASE SET can equally trigger the error.

I don't understand what you mean by "you need to know the object OID
before taking a lock on the parent relation, right?".  That seems
wrong.

I think you might need something like what was done in
b3ad5d02c9cd8a4c884cd78480f221afe8ce5590; if, after we look up the
name and before we acquire a lock on the OID, we accept any
invalidation messages, recheck that the object we've locked is still
the one associated with that name.

I think Andres is certainly right that locking all of pg_authid is a nonstarter.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Observations in Parallel Append

2017-12-26 Thread Robert Haas
On Sun, Dec 24, 2017 at 8:37 PM, Amit Kapila  wrote:
> On Sun, Dec 24, 2017 at 12:06 PM, Robert Haas  wrote:
>> On Fri, Dec 22, 2017 at 6:18 AM, Amit Kapila  wrote:
>>
>>> Also, don't we need to use parallel_divisor for partial paths instead
>>> of non-partial paths as those will be actually distributed among
>>> workers?
>>
>> Uh, that seems backwards to me.  We're trying to estimate the average
>> number of rows per worker.
>
> Okay, but is it appropriate to use the parallel_divisor?  The
> parallel_divisor means the contribution of all the workers (+
> leader_contribution) whereas for non-partial paths there will always be
> only a subset of workers operating on them.
> Consider a case with one non-partial subpath and five partial subpaths
> with six as parallel_divisor, now the current code will try to divide
> the rows of non-partial subpath with respect to six workers.  However,
> in reality, there will always be one worker which will execute that
> path.

That's true, of course, but if five processes each return 0 rows and
the sixth process returns 600 rows, the average number of rows per
process is 100, not anything else.

Here's one way to look at it.  Suppose there is a table with 1000
partitions.  If we do a Parallel Append over a Parallel Seq Scan per
partition, we will come up with a row estimate by summing the
estimated row count across all partitions and dividing by the
parallel_divisor.  This will give us some answer.  If we instead do a
Parallel Append over a Seq Scan per partition, we should really come
up with the *same* estimate.  The only way to do that is to also
divide by the parallel_divisor in this case.
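
For reference, the arithmetic being described mirrors what get_parallel_divisor()
in costsize.c computes; a sketch (the leader-contribution discount shown
reflects the default behavior and is an approximation, not the exact planner
code):

    static double
    parallel_divisor_sketch(int parallel_workers)
    {
        double      divisor = parallel_workers;

        /* the leader also consumes tuples, but less and less as workers are added */
        double      leader_contribution = 1.0 - (0.3 * parallel_workers);

        if (leader_contribution > 0)
            divisor += leader_contribution;

        return divisor;
    }

    /*
     * Estimated rows per process = summed subpath rows / parallel_divisor_sketch(n),
     * applied the same way whether a subpath is partial or not: 600 rows across
     * 6 processes give ~100 rows per process, even if one process returns them all.
     */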

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Should we nonblocking open FIFO files in COPY?

2017-12-26 Thread Robert Haas
On Tue, Dec 26, 2017 at 7:51 PM, Michael Paquier
 wrote:
>> > Hmm.  What about the case where we try to open a plain file that's on
>> > an inaccessible filesystem, e.g. due to a disk failure?  Allowing
>> > cancel to work just for FIFOs would be OK, I guess, but allowing it
>> > for other open() calls that hang would be better.  I'm not sure if we
>> > can make it work that way, but it would be nice if we could.
>>
>> That is doable, just stat() and check before open().
>
> I think TOCTOU when I read such things.. The data folder is a trusted
> environment but any patches doing things like that ought to be careful.

Yeah.  I was more wondering whether an ostensibly non-blocking open()
would nevertheless block on an inaccessible file.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] taking stdbool.h into use

2017-12-26 Thread Michael Paquier
On Tue, Dec 26, 2017 at 02:00:47PM -0500, Peter Eisentraut wrote:
> On 12/25/17 00:32, Michael Paquier wrote:
> >> So here is a minimal patch set to perhaps wrap this up for the time
> >> being.  I have added static assertions that check the sizes of
> >> GinNullCategory and GinTernaryValue, which I think are the two critical
> >> places that require compatibility with bool.  And then we include
> >>  only if its bool has size 1.
> > 
> > +   /*
> > +* now we can use the nullFlags as category codes
> > +*/
> > +   StaticAssertStmt(sizeof(GinNullCategory) == sizeof(bool),
> > +"sizes of GinNullCategory and bool are not equal");
> > *categories = (GinNullCategory *) nullFlags;
> > 
> > Hm. So on powerpc compilation is going to fail with this patch as
> > sizeof(char) is 1, no?
> 
> Yes, but right now it would (probably) just fail in mysterious ways, so
> the static assertion adds safety.
> 
> > Wouldn't it be better to just allocate an array
> > for GinNullCategory entries and then just fill in the values one by
> > one with GIN_CAT_NORM_KEY or GIN_CAT_NULL_KEY by scanning nullFlags?
> 
> Yeah, initially I thought making another loop through the array would be
> adding more overhead.  But reading the code again, we already loop
> through the array anyway to set the nullFlags to the right bit patterns.
>  So I think we can just drop that and build a proper GinNullCategory
> array instead.  I think that would be much cleaner.  Patch attached.

Thanks for the updated patch. This looks saner to me.

It would be nice to do something like that for GinTernaryValue in
tsginidx.c by mapping directly to GIN_FALSE and GIN_TRUE depending on
the input coming into gin_tsquery_consistent(). The fix is more trivial
there. Do you think that those two things should be back-patched?
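
A minimal sketch of that idea for gin_tsquery_consistent() (the names check and
nkeys follow the existing function's parameters; this is not the actual
tsginidx.c code):

    GinTernaryValue *tern;
    int         i;

    /*
     * Build an explicit GinTernaryValue array instead of reinterpreting the
     * bool check[] array through a cast.
     */
    tern = (GinTernaryValue *) palloc(nkeys * sizeof(GinTernaryValue));
    for (i = 0; i < nkeys; i++)
        tern[i] = check[i] ? GIN_TRUE : GIN_FALSE;
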
--
Michael




Re: Should we nonblocking open FIFO files in COPY?

2017-12-26 Thread Michael Paquier
On Wed, Dec 27, 2017 at 10:18:03AM +0800, Adam Lee wrote:
> On Tue, Dec 26, 2017 at 11:48:58AM -0800, Robert Haas wrote:
> > On Thu, Dec 21, 2017 at 10:10 PM, Adam Lee  wrote:
> > > I have an issue that COPY from a FIFO, which has no writers, could not be
> > > canceled, because COPY invokes AllocateFile() -> fopen() -> blocking 
> > > open().
> > 
> > Hmm.  What about the case where we try to open a plain file that's on
> > an inaccessible filesystem, e.g. due to a disk failure?  Allowing
> > cancel to work just for FIFOs would be OK, I guess, but allowing it
> > for other open() calls that hang would be better.  I'm not sure if we
> > can make it work that way, but it would be nice if we could.
> 
> That is doable, just stat() and check before open().

I think TOCTOU when I read such things.. The data folder is a trusted
environment but any patches doing things like that ought to be careful.
--
Michael




Re: [HACKERS] [PATCH] Tap test support for backup with tablespace mapping

2017-12-26 Thread Michael Paquier
On Wed, Dec 27, 2017 at 12:58:28PM +1100, Vaishnavi Prabakaran wrote:
> I have added support in Postgres TAP test framework to backup a data
> directory with tablespace mapping. Also added support to move the backup
> directory contents to standby node, because current option to init the
> standby from backup does not support copying softlinks, which is needed
> when tablespace mapping is involved in backup.
> 
> Added a new test to existing streaming replication tap test to demonstrate
> the usage of these new APIs.
> 
> Attached the patch, Hope this enhancement is useful.


+sub backup_withtablespace
+{
+	my ($self, $backup_name, $old_tablespacedir, $new_tablespacedir) = @_;
+	my $backup_path = $self->backup_dir . '/' . $backup_name;
+	my $port        = $self->port;
+	my $name        = $self->name;
+
+	print "# Taking pg_basebackup $backup_name from node \"$name\"\n";
+	TestLib::system_or_bail('pg_basebackup', '-D', $backup_path, '-p', $port,
+		'--no-sync', '-T', "$old_tablespacedir=$new_tablespacedir");
+	print "# Backup with tablespace option finished\n";
+}
Let's not reinvent the wheel and instead let's extend the existing
PostgresNode::backup with a new option set. Please be careful that this
new option needs to be able to handle multiple sets of old/new
directories as pg_basebackup is able to handle that as well.

+=cut
+
+sub convert_backup_to_standby
+{
This is a copy/paste of the existing PostgresNode::init_from_backup,
except that you forcibly remove a backup. I don't understand what you
actually gain from that. A new routine which *removes* a past backup
could have some value; still, those are automatically wiped out at the
end of the tests if the person who designed the test is smart enough to
use the default paths.

+$node_master->safe_psql('postgres',
+	"CREATE TABLESPACE testtblspc location '$master_tablespacedir';");
+$node_master->safe_psql('postgres',
+	"CREATE TABLE test_table(id integer,name varchar(20)) TABLESPACE testtblspc;");
Those tests overlap with the existing 010_pg_basebackup.pl.
--
Michael




Re: AS OF queries

2017-12-26 Thread David Fetter
On Tue, Dec 26, 2017 at 03:43:36PM -0700, legrand legrand wrote:
> would actual syntax
> 
> WITH old_foo AS
> (select * from foo as of '')
> select * from foo except select * from old_foo;
> 
> work in replacement for
> 
> select * from foo except select * from foo as old_foo as of '';
> 
> ?

If there has to be a WITH, or (roughly) equivalently, a sub-select for
each relation, the queries get very hairy very quickly.  It would
nevertheless be better for the people who need the feature to have it
this way than not to have it at all.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Runtime Partition Pruning

2017-12-26 Thread David Rowley
Hi,

Please find attached my 4th version this patch.

This is now based on v17 of Amit's faster partition pruning patch [1].
It also now includes Beena's tests which I've done some mostly
cosmetic changes to.

I've also fixed a few bugs, one in a case where I was not properly
handling zero matching partitions in nodeAppend.c.

Another change I've made is to now perform the partition pruning at
run-time using a new memory context that's reset each time we
redetermine the matching partitions. This was required since we're
calling a planner function which might not be too careful about
pfreeing memory it allocates. A test case I was running before making
this change ended up failing to palloc memory due to OOM.
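
A minimal sketch of the reset-per-repruning pattern described above (the
context field and helper names are illustrative, not taken from the attached
patch):

    static Bitmapset *
    redetermine_matching_partitions(AppendState *node)
    {
        MemoryContext oldcxt;
        Bitmapset  *matching;

        /*
         * node->prune_cxt is created once at node initialization with
         * AllocSetContextCreate(..., ALLOCSET_DEFAULT_SIZES); resetting it here
         * reclaims anything the planner code leaked during the previous call.
         */
        MemoryContextReset(node->prune_cxt);
        oldcxt = MemoryContextSwitchTo(node->prune_cxt);

        matching = determine_matching_partitions(node);     /* hypothetical helper */

        MemoryContextSwitchTo(oldcxt);
        return matching;            /* valid until the next reset */
    }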

I've not done anything about reducing the cost of the Append path when
runtime pruning is enabled. I'm still thinking over the best way to
handle that.

[1] 
https://www.postgresql.org/message-id/58c3e20a-a964-4fdb-4e7d-bd833e9be...@lab.ntt.co.jp

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


runtime_prune_drowley_v4.patch
Description: Binary data


Re: Should we nonblocking open FIFO files in COPY?

2017-12-26 Thread Adam Lee
On Tue, Dec 26, 2017 at 11:48:58AM -0800, Robert Haas wrote:
> On Thu, Dec 21, 2017 at 10:10 PM, Adam Lee  wrote:
> > I have an issue that COPY from a FIFO, which has no writers, could not be
> > canceled, because COPY invokes AllocateFile() -> fopen() -> blocking open().
> 
> Hmm.  What about the case where we try to open a plain file that's on
> an inaccessible filesystem, e.g. due to a disk failure?  Allowing
> cancel to work just for FIFOs would be OK, I guess, but allowing it
> for other open() calls that hang would be better.  I'm not sure if we
> can make it work that way, but it would be nice if we could.

That is doable, just stat() and check before open().
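
For illustration, a sketch of the FIFO-specific variant under an assumed helper
name (not the patch itself); note that a stat()-before-open() check carries the
TOCTOU hazard raised elsewhere in the thread, whereas passing O_NONBLOCK to the
open() itself does not:

    #include <fcntl.h>
    #include <sys/stat.h>

    /* open a COPY source; for FIFOs, avoid blocking forever on a missing writer */
    static int
    open_copy_source(const char *path)
    {
        struct stat st;
        int         fd;
        int         flags = O_RDONLY;

        if (stat(path, &st) == 0 && S_ISFIFO(st.st_mode))
            flags |= O_NONBLOCK;    /* open() returns immediately even with no writer */

        fd = open(path, flags);
        if (fd >= 0 && (flags & O_NONBLOCK))
        {
            /* restore blocking reads once the descriptor has been obtained */
            int         fl = fcntl(fd, F_GETFL);

            if (fl >= 0)
                fcntl(fd, F_SETFL, fl & ~O_NONBLOCK);
        }
        return fd;
    }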

-- 
Adam Lee



[HACKERS] [PATCH] Tap test support for backup with tablespace mapping

2017-12-26 Thread Vaishnavi Prabakaran
Hi All,

I have added support in Postgres TAP test framework to backup a data
directory with tablespace mapping. Also added support to move the backup
directory contents to standby node, because current option to init the
standby from backup does not support copying softlinks, which is needed
when tablespace mapping is involved in backup.

Added a new test to existing streaming replication tap test to demonstrate
the usage of these new APIs.

Attached the patch, Hope this enhancement is useful.

Thanks & Regards,
Vaishnavi,
Fujitsu Australia.


0001-Tap-test-support-for-backup-with-tablespace-mapping.patch
Description: Binary data


Re: [HACKERS] replace GrantObjectType with ObjectType

2017-12-26 Thread Michael Paquier
On Tue, Dec 26, 2017 at 02:15:18PM -0500, Peter Eisentraut wrote:
> On 12/20/17 22:01, Stephen Frost wrote:
> > There's some downsides to this approach though: we do an initial set of
> > checks in ExecGrantStmt, but we can't do all of them because we don't
> > know if it's a sequence or not, so we end up with some additional
> > special checks to see if the GRANT is valid down in ExecGrant_Relation
> > after we figure out what kind of relation it is.
> 
> I think that allowing a sequence to be treated like a table in GRANT
> (and other places) is a historical wart that we won't easily be able to
> get rid of.  I don't think the object address system should be bent to
> accommodate that.  I'd rather see the warts in the code explicitly.

Yes, I agree with that. GRANT without an object defined works fine for
sequences, so you want to keep an abstraction level where a relation
means more than a simple table.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] pow support for pgbench

2017-12-26 Thread Michael Paquier
On Tue, Dec 26, 2017 at 11:26:58PM +0100, Fabien COELHO wrote:
> > This version looks good to me, except that I wonder if we should try to
> > switch to the floating-point version if the integer version would/does
> > overflow.
> 
> My 0.02€ is that it is under the control of the user, who provides either ints
> or doubles as arguments. So I do not think that we should bother, for what my
> opinion is worth.
> 
> If this is a new requirement, detecting the integer overflow is probably
> possible with some testing, eg unexpected changes of sign, but that would
> probably add two tests per round, and probably double the computation cost.

And my 2c on the matter is that switching silently from one version to
the other would be unwelcome. The user should be aware if a test is
overflowing a number when specifying an integer.
--
Michael




Re: [HACKERS] pow support for pgbench

2017-12-26 Thread Raúl Marín Rodríguez
Hi,

I've implemented the overflow checks and made some benchmarks and the
ipow() version became slower except with some specific inputs (base 0 for
example). It's true that the new auxiliary functions could be optimized,
but I don't think it makes sense to keep working on them just to match
pow() speed.

I'm attaching both patches in case someone wants to have a look but I would
go with the simpler solution (pgbench_pow_v10.patch).
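
For reference, one way the overflow check described above can be written for
the integer path (a sketch assuming a non-negative exponent and GCC/Clang
checked-arithmetic builtins; the attached patches may do it differently):

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Exponentiation by squaring; returns false instead of overflowing,
     * so the caller can fall back to the double pow().
     */
    static bool
    ipow_checked(int64_t base, int64_t exp, int64_t *result)
    {
        int64_t     r = 1;

        while (exp > 0)
        {
            if ((exp & 1) && __builtin_mul_overflow(r, base, &r))
                return false;
            exp >>= 1;
            if (exp > 0 && __builtin_mul_overflow(base, base, &base))
                return false;
        }
        *result = r;
        return true;
    }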

Regards,
-- 

*Raúl Marín Rodríguez *carto.com
From 28bc71f657fa967bafc03c015e5e2405c427a711 Mon Sep 17 00:00:00 2001
From: Raul Marin 
Date: Thu, 21 Dec 2017 20:28:02 +0100
Subject: [PATCH] Add pow() support to pgbench

---
 doc/src/sgml/ref/pgbench.sgml|  7 +++
 src/bin/pgbench/exprparse.y  |  6 ++
 src/bin/pgbench/pgbench.c| 16 
 src/bin/pgbench/pgbench.h|  3 ++-
 src/bin/pgbench/t/001_pgbench_with_server.pl | 22 +-
 5 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 4431fc3eb7..1519fe78ef 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1069,6 +1069,13 @@ pgbench  options  d
pi()
3.14159265358979323846
   
+  
+   pow(x, y), power(x, y)
+   double
+   exponentiation
+   pow(2.0, 10), power(2.0, 10)
+   1024.0
+  
   
random(lb, ub)
integer
diff --git a/src/bin/pgbench/exprparse.y b/src/bin/pgbench/exprparse.y
index 25d5ad48e5..74ffe5e7a7 100644
--- a/src/bin/pgbench/exprparse.y
+++ b/src/bin/pgbench/exprparse.y
@@ -194,6 +194,12 @@ static const struct
 	{
 		"random_zipfian", 3, PGBENCH_RANDOM_ZIPFIAN
 	},
+	{
+		"pow", 2, PGBENCH_POW
+	},
+	{
+		"power", 2, PGBENCH_POW
+	},
 	/* keep as last array element */
 	{
 		NULL, 0, 0
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 7ce6f607f5..5b444765dd 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -1850,6 +1850,22 @@ evalFunc(TState *thread, CState *st,
 return true;
 			}
 
+		case PGBENCH_POW:
+			{
+PgBenchValue *lval = &vargs[0];
+PgBenchValue *rval = &vargs[1];
+
+Assert(nargs == 2);
+
+if (!coerceToDouble(lval, &ld) ||
+	!coerceToDouble(rval, &rd))
+	return false;
+
+setDoubleValue(retval, pow(ld, rd));
+
+return true;
+			}
+
 		default:
 			/* cannot get here */
 			Assert(0);
diff --git a/src/bin/pgbench/pgbench.h b/src/bin/pgbench/pgbench.h
index 83fee1ae74..0e92882a4c 100644
--- a/src/bin/pgbench/pgbench.h
+++ b/src/bin/pgbench/pgbench.h
@@ -76,7 +76,8 @@ typedef enum PgBenchFunction
 	PGBENCH_RANDOM,
 	PGBENCH_RANDOM_GAUSSIAN,
 	PGBENCH_RANDOM_EXPONENTIAL,
-	PGBENCH_RANDOM_ZIPFIAN
+	PGBENCH_RANDOM_ZIPFIAN,
+	PGBENCH_POW
 } PgBenchFunction;
 
 typedef struct PgBenchExpr PgBenchExpr;
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index e3cdf28628..9cbeb2fc11 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -232,7 +232,17 @@ pgbench(
 		qr{command=19.: double 19\b},
 		qr{command=20.: double 20\b},
 		qr{command=21.: int 9223372036854775807\b},
-		qr{command=23.: int [1-9]\b}, ],
+		qr{command=23.: int [1-9]\b},
+		qr{command=24.: double -27\b},
+		qr{command=25.: double 1024\b},
+		qr{command=26.: double 1\b},
+		qr{command=27.: double 1\b},
+		qr{command=28.: double -0.125\b},
+		qr{command=29.: double -0.125\b},
+		qr{command=30.: double -0.00032\b},
+		qr{command=31.: double 8.50705917302346e\+37\b},
+		qr{command=32.: double 1e\+30\b},
+	],
 	'pgbench expressions',
 	{   '001_pgbench_expressions' => q{-- integer functions
 \set i1 debug(random(1, 100))
@@ -264,6 +274,16 @@ pgbench(
 \set i1 0
 -- yet another integer function
 \set id debug(random_zipfian(1, 9, 1.3))
+--- pow and power
+\set poweri debug(pow(-3,3))
+\set powerd debug(pow(2.0,10))
+\set poweriz debug(pow(0,0))
+\set powerdz debug(pow(0.0,0.0))
+\set powernegi debug(pow(-2,-3))
+\set powernegd debug(pow(-2.0,-3.0))
+\set powernegd2 debug(power(-5.0,-5.0))
+\set powerov debug(pow(9223372036854775807, 2))
+\set powerov2 debug(pow(10,30))
 } });
 
 # backslash commands
-- 
2.15.1

From 6cb8fb1a3eaffe680b85913b0acf0068dcb0d2a7 Mon Sep 17 00:00:00 2001
From: Raul Marin 
Date: Thu, 21 Dec 2017 20:28:02 +0100
Subject: [PATCH] Add pow() support to pgbench

---
 doc/src/sgml/ref/pgbench.sgml|   7 ++
 src/bin/pgbench/exprparse.y  |   6 ++
 src/bin/pgbench/pgbench.c| 100 +++
 src/bin/pgbench/pgbench.h|   3 +-
 src/bin/pgbench/t/001_pgbench_with_server.pl |  22 +-
 5 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 4431fc3eb7..1519fe78ef 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b

Re: [JDBC] [HACKERS] Channel binding support for SCRAM-SHA-256

2017-12-26 Thread Michael Paquier
On Tue, Dec 26, 2017 at 03:28:09PM -0500, Peter Eisentraut wrote:
> On 12/22/17 03:10, Michael Paquier wrote:
> > Second thoughts on 0002 as there is actually no need to move around
> > errorMessage if the PGconn* pointer is saved in the SCRAM status data
> > as both are linked. The attached simplifies the logic even more.
> > 
> 
> That all looks pretty reasonable.

Thanks for the review. Don't you think that the refactoring
simplifications should be done first, though? This would result in
producing the patch set in reverse order. I'd be fine producing them
if need be.

> I'm working through patch 0001 now.  I haven't found any documentation
> on the function OBJ_find_sigid_algs().  What does it do?  One might
> think that the nid returned by X509_get_signature_nid() is already the
> algo_nid we want to use, but there appears to be more to this.

All the objects returned by X509_get_signature_nid() are listed in
crypto/objects/obj_dat.h, which may include more information than just
the algorithm type, like for example whether RSA encryption is used or
not. I found out about the low-level OBJ_find_sigid_algs() to actually get
the real hashing algorithm after diving into the X509* information. And by
looking at X509_signature_print() I found out that it returns the
information we are looking for. This has the damn advantage that we rely
on a minimal list of algorithms and we don't need to worry about any
future options linked with X509_get_signature_nid(), so this simplifies
the Postgres code as well as long-term maintenance.
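
To make the above concrete, the OpenSSL sequence being described looks roughly
like this (a sketch; the RFC 5929 MD5/SHA-1-to-SHA-256 mapping for
tls-server-end-point is included for completeness and may be handled elsewhere
in the actual patch):

    #include <openssl/objects.h>
    #include <openssl/x509.h>

    /* return the NID of the digest to hash the certificate with, or NID_undef */
    static int
    cert_signature_hash_nid(X509 *cert)
    {
        int         sig_nid = X509_get_signature_nid(cert);
        int         hash_nid = NID_undef;

        /* strip the encryption part (e.g. RSA), keeping only the digest algorithm */
        if (!OBJ_find_sigid_algs(sig_nid, &hash_nid, NULL))
            return NID_undef;

        /* RFC 5929: MD5 and SHA-1 signatures map to SHA-256 for tls-server-end-point */
        if (hash_nid == NID_md5 || hash_nid == NID_sha1)
            hash_nid = NID_sha256;

        return hash_nid;
    }
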
--
Michael




Re: AS OF queries

2017-12-26 Thread legrand legrand
would actual syntax

WITH old_foo AS
(select * from foo as of '')
select * from foo except select * from old_foo;

work in replacement for

select * from foo except select * from foo as old_foo as of '';

?

Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: [HACKERS] pow support for pgbench

2017-12-26 Thread Fabien COELHO


Hello Robert,


If a double is always returned, I'm wondering whether keeping the ipow
version makes much sense: In case of double loss of precision, the precision
is lost, too bad, and casting back to int won't bring it back.


I've kept it because knowing that both are ints enables not making a lot of
checks (done in math.h pow) so it's way faster. In my system it's 2-3ns vs
~40ns. I'm willing to settle for using just pow() if that makes it clearer.


This version looks good to me, except that I wonder if we should try to 
switch to the floating-point version if the integer version would/does 
overflow.


My 0.02€ is that it is under the control of the user, who provides either ints
or doubles as arguments. So I do not think that we should bother, for what my
opinion is worth.


If this is a new requirement, detecting the integer overflow is probably 
possible with some testing, eg unexpected changes of sign, but that would 
probably add two tests per round, and probably double the computation 
cost.


--
Fabien.

Re: AS OF queries

2017-12-26 Thread Jeff Janes
On Thu, Dec 21, 2017 at 6:00 AM, Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:


> There is still one significant difference between my prototype implementation
> and the SQL standard: it associates the timestamp with the select statement,
> not with a particular table.
> It seems to be more difficult to support, and I am not sure that joining
> tables from different timelines makes much sense.
> But certainly it can also be fixed.


I think the main use I would find for this feature is something like:

select * from foo except select * from foo as old_foo as of '';

So I would be grateful if you can make that work.  Also, I think conforming
to the standards is pretty important where it is feasible to do that.

Cheers,

Jeff


Re: [HACKERS] LDAPS

2017-12-26 Thread Peter Eisentraut
This patch looks reasonable to me.  I have also seen occasional requests
for this in the field.

If someone could test this on Windows, I think we could move ahead with it.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [JDBC] [HACKERS] Channel binding support for SCRAM-SHA-256

2017-12-26 Thread Peter Eisentraut
On 12/22/17 03:10, Michael Paquier wrote:
> On Fri, Dec 22, 2017 at 11:59 AM, Michael Paquier
>  wrote:
>> I have looked at how things could be done in symmetry for both the frontend
>> and backend code, and I have produced the attached patch 0002, which
>> can be applied on top of 0001 implementing tls-server-end-point. This
>> simplifies the interfaces to initialize the SCRAM status data by saving
>> into scram_state and fe_scram_state respectively Port* and PGconn* which
>> holds most of the data needed for the exchange. With this patch, cbind_data
>> is generated only if a specific channel binding type is used with the
>> appropriate data. So if no channel binding is used there is no additional
>> SSL call done to get the TLS finished data or the server certificate hash.
>>
>> 0001 has no real changes compared to the last versions.
> 
> Second thoughts on 0002 as there is actually no need to move around
> errorMessage if the PGconn* pointer is saved in the SCRAM status data
> as both are linked. The attached simplifies the logic even more.
> 

That all looks pretty reasonable.

I'm working through patch 0001 now.  I haven't found any documentation
on the function OBJ_find_sigid_algs().  What does it do?  One might
think that the nid returned by X509_get_signature_nid() is already the
algo_nid we want to use, but there appears to be more to this.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Should we nonblocking open FIFO files in COPY?

2017-12-26 Thread Robert Haas
On Thu, Dec 21, 2017 at 10:10 PM, Adam Lee  wrote:
> I have an issue that COPY from a FIFO, which has no writers, could not be
> canceled, because COPY invokes AllocateFile() -> fopen() -> blocking open().

Hmm.  What about the case where we try to open a plain file that's on
an inaccessible filesystem, e.g. due to a disk failure?  Allowing
cancel to work just for FIFOs would be OK, I guess, but allowing it
for other open() calls that hang would be better.  I'm not sure if we
can make it work that way, but it would be nice if we could.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pow support for pgbench

2017-12-26 Thread Robert Haas
On Fri, Dec 22, 2017 at 12:46 AM, Raúl Marín Rodríguez
 wrote:
>> If a double is always returned, I'm wondering whether keeping the ipow
>> version makes much sense: In case of double loss of precision, the precision
>> is lost, too bad, and casting back to int won't bring it back.
>
> I've kept it because knowing that both are ints enables not making a lot of
> checks (done in math.h pow) so it's way faster. In my system it's 2-3ns vs
> ~40ns. I'm willing to settle for using just pow() if that makes it clearer.

This version looks good to me, except that I wonder if we should try
to switch to the floating-point version if the integer version
would/does overflow.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] replace GrantObjectType with ObjectType

2017-12-26 Thread Peter Eisentraut
On 12/20/17 22:01, Stephen Frost wrote:
> There's some downsides to this approach though: we do an initial set of
> checks in ExecGrantStmt, but we can't do all of them because we don't
> know if it's a sequence or not, so we end up with some additional
> special checks to see if the GRANT is valid down in ExecGrant_Relation
> after we figure out what kind of relation it is.

I think that allowing a sequence to be treated like a table in GRANT
(and other places) is a historical wart that we won't easily be able to
get rid of.  I don't think the object address system should be bent to
accommodate that.  I'd rather see the warts in the code explicitly.

> Further, anything which handles an OBJECT_TABLE or OBJECT_VIEW or
> OBJECT_SEQUENCE today might be expected to now be able to handle an
> OBJECT_RELATION instead, though it doesn't look like this patch makes
> any attempt to address that.

To some extent 0002 does that.

> In the end, I'd personally prefer refactoring ExecGrantStmt and friends
> to move the GRANT-type checks down into the individual functions and,
> ideally, avoid having to have OBJECT_RELATION at all, though that seems
> like it'd be a good bit of work.

I'm not sure I follow that, since it appears to be the opposite of what
you said earlier, i.e., we should have OBJECT_RELATION so as to avoid
using OBJECT_TABLE when we don't really know yet whether something is a
table.

> My second preference would be to at least add some commentary about
> OBJECT_RELATION vs. OBJECT_(TABLE|VIEW|SEQUENCE), when which should be
> used and why, and review functions that accept objects of those types
> to see if they should be extended to also accept OBJECT_RELATION.

I can look into that.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] taking stdbool.h into use

2017-12-26 Thread Peter Eisentraut
On 12/25/17 00:32, Michael Paquier wrote:
>> So here is a minimal patch set to perhaps wrap this up for the time
>> being.  I have added static assertions that check the sizes of
>> GinNullCategory and GinTernaryValue, which I think are the two critical
>> places that require compatibility with bool.  And then we include
>>  only if its bool has size 1.
> 
> +   /*
> +* now we can use the nullFlags as category codes
> +*/
> +   StaticAssertStmt(sizeof(GinNullCategory) == sizeof(bool),
> +"sizes of GinNullCategory and bool are not equal");
> *categories = (GinNullCategory *) nullFlags;
> 
> Hm. So on powerpc compilation is going to fail with this patch as
> sizeof(char) is 1, no?

Yes, but right now it would (probably) just fail in mysterious ways, so
the static assertion adds safety.

> Wouldn't it be better to just allocate an array
> for GinNullCategory entries and then just fill in the values one by
> one with GIN_CAT_NORM_KEY or GIN_CAT_NULL_KEY by scanning nullFlags?

Yeah, initially I thought making another loop through the array would be
adding more overhead.  But reading the code again, we already loop
through the array anyway to set the nullFlags to the right bit patterns.
 So I think we can just drop that and build a proper GinNullCategory
array instead.  I think that would be much cleaner.  Patch attached.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From dc47c84286345eae48582df42bd977e369b4f341 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 26 Dec 2017 13:47:18 -0500
Subject: [PATCH] Don't cast between GinNullCategory and bool

The original idea was that we could use an isNull-style bool array
directly as a GinNullCategory array.  However, the existing code already
acknowledges that that doesn't really work, because of the possibility
that bool as currently defined can have arbitrary bit patterns for true
values.  So it has to loop through the nullFlags array to set each bool
value to an acceptable value.  But if we are looping through the whole
array anyway, we might as well build a proper GinNullCategory array
instead and abandon the type casting.  That makes the code much safer in
case bool is ever changed to something else.
---
 src/backend/access/gin/ginscan.c | 19 ---
 src/backend/access/gin/ginutil.c | 18 --
 src/include/access/ginblock.h|  7 +--
 3 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index 7ceea7a741..77c0c577b5 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -295,6 +295,7 @@ ginNewScanKey(IndexScanDesc scan)
bool   *partial_matches = NULL;
Pointer*extra_data = NULL;
bool   *nullFlags = NULL;
+   GinNullCategory *categories;
int32   searchMode = GIN_SEARCH_MODE_DEFAULT;
 
/*
@@ -346,15 +347,12 @@ ginNewScanKey(IndexScanDesc scan)
}
 
 		/*
-		 * If the extractQueryFn didn't create a nullFlags array, create one,
-		 * assuming that everything's non-null.  Otherwise, run through the
-		 * array and make sure each value is exactly 0 or 1; this ensures
-		 * binary compatibility with the GinNullCategory representation. While
-		 * at it, detect whether any null keys are present.
+		 * Create GinNullCategory representation.  If the extractQueryFn
+		 * didn't create a nullFlags array, we assume everything is non-null.
+		 * While at it, detect whether any null keys are present.
 		 */
-		if (nullFlags == NULL)
-			nullFlags = (bool *) palloc0(nQueryValues * sizeof(bool));
-		else
+		categories = (GinNullCategory *) palloc0(nQueryValues * sizeof(GinNullCategory));
+		if (nullFlags)
{
int32   j;
 
@@ -362,17 +360,16 @@ ginNewScanKey(IndexScanDesc scan)
{
if (nullFlags[j])
{
-					nullFlags[j] = true;	/* not any other nonzero value */
+   categories[j] = GIN_CAT_NULL_KEY;
hasNullQuery = true;
}
}
}
-   /* now we can use the nullFlags as category codes */
 
ginFillScanKey(so, skey->sk_attno,
   skey->sk_strategy, searchMode,
   skey->sk_argument, nQueryValues,
-  queryValues, (GinNullCategory *) 
nullFl

Re: [PROPOSAL] Shared Ispell dictionaries

2017-12-26 Thread Alvaro Herrera
Arthur Zakirov wrote:

> On Tue, Dec 26, 2017 at 01:55:57PM -0300, Alvaro Herrera wrote:
> > So what are you going to use instead?
>
> [ ... ]
>
> To allocate IspellDict in this case it is necessary to calculate needed
> memory size. I think arrays mentioned above will be built first then
> memcpy'ed into IspellDict, if it won't take much time.

OK, that sounds sensible on first blush.  If there are many processes
concurrently doing text searches, then the amount of memory saved may
be large enough to justify the additional processing (more so if it's
just one more memcpy cycle).

I hope that there is some way to cope with the ispell data changing
underneath -- maybe you'll need some sort of RCU?

> > So this will be a large patch not submitted to 2018-01?  Depending on
> > size/complexity I'm not sure it's OK to submit 2018-03 only -- it may be
> > too late.
> 
> Oh, I see. I try to prepare the patch while 2018-01 is open.

It isn't necessary that the patch to present to 2018-01 is final and
complete (so don't kill yourself to achieve that) -- a preliminary patch
that reviewers can comment on is enough, as long as the final patch you
present to 2018-03 is not *too* different.  But any medium-large patch
whose first post is to the last commitfest of a cycle is likely to be
thrown out to the next cycle's first commitfest very quickly.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PROPOSAL] Shared Ispell dictionaries

2017-12-26 Thread Arthur Zakirov
Thank you for your feedback.

On Tue, Dec 26, 2017 at 01:55:57PM -0300, Alvaro Herrera wrote:
> So what are you going to use instead?

For example, AffixNode and AffixNodeData represent the prefix tree of an
affix list. They are currently accessed through the Suffix and Prefix
pointers of the IspellDict struct. Instead, all affix nodes should be placed
into an array and accessed by an offset. The Suffix array goes first, the
Prefix array goes after it. AffixNodeData will access a child node by an
offset too.

The AffixNodeData struct has an array of pointers to AFFIX structs. This
array with all the AFFIX data can be stored within AffixNodeData. Or
AffixNodeData can have an array of indexes into a single AFFIX array,
which is stored within IspellDict before or after Suffix and Prefix.

The same goes for the prefix tree of a word list, represented by the SPNode
struct. It might be stored as an array after the Prefix array.

AffixData and CompoundAffix arrays go after them.

To allocate IspellDict in this case it is necessary to calculate the needed
memory size. I think the arrays mentioned above will be built first and then
memcpy'ed into IspellDict, if it doesn't take much time.

Hope it makes sense and is reasonable.
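
As a purely illustrative sketch of the pointer-to-offset idea (the type and
field names below are assumptions, not the actual IspellDict layout):

    /*
     * A tree node laid out for shared memory: children are referenced by
     * offsets from the start of the mapped segment instead of by pointers,
     * so the structure is valid at whatever address the segment is mapped.
     */
    typedef struct SharedAffixNode
    {
        uint32      nchildren;
        uint32      children[FLEXIBLE_ARRAY_MEMBER];   /* byte offsets into the segment */
    } SharedAffixNode;

    static inline SharedAffixNode *
    node_at(char *segment_base, uint32 offset)
    {
        return (SharedAffixNode *) (segment_base + offset);
    }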

> 
> So this will be a large patch not submitted to 2018-01?  Depending on
> size/complexity I'm not sure it's OK to submit 2018-03 only -- it may be
> too late.
> 

Oh, I see. I try to prepare the patch while 2018-01 is open.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [PROPOSAL] Shared Ispell dictionaries

2017-12-26 Thread Pavel Stehule
2017-12-26 17:55 GMT+01:00 Alvaro Herrera :

> Arthur Zakirov wrote:
>
> > Implementation
> > --
> >
> > It is necessary to change all structures related with IspellDict:
> > SPNode, AffixNode, AFFIX, CMPDAffix, IspellDict itself. They all
> > shouldn't use pointers for this reason. Others are used only during
> > dictionary building.
>
> So what are you going to use instead?
>
> > It would be good to store the StopList struct in shared memory too.
>
> Sure (probably a separate patch though).
>
> > All fields of IspellDict struct, which are used only during dictionary
> > building, will be moved into new IspellDictBuild to decrease needed
> > shared memory size. And they are going to be released by buildCxt.
> >
> > Each dictionary will be stored in its own dsm segment.
>
> All that sounds reasonable.
>
> > The patch will be ready and added into the 2018-03 commitfest.
>
> So this will be a large patch not submitted to 2018-01?  Depending on
> size/complexity I'm not sure it's OK to submit 2018-03 only -- it may be
> too late.
>

Tomas had some workable patches related to this topic

Regards

Pavel

>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>


Re: General purpose hashing func in pgbench

2017-12-26 Thread Fabien COELHO


Bonjour Daniel,


Most "permutation" functions are really cryptographic cyphers which are
quite expensive, and require powers of two, which is not what is needed.
ISTM that there are some constructs to deal with arbitrary sizes based on
cryptographic functions, but that would make it too expensive for the
purpose.


FWIW, you might have a look at the permuteseq extension:
 https://pgxn.org/dist/permuteseq
It permutes an arbitrary range of 64-bit integers into itself,
with a user-supplied key as the seed.
Outputs are coerced into the desired range by using the
smallest possible power of two for the Feistel cypher's
block size, and then cycle-walking over the results.


Thanks for the pointer.

I must admit that I do not much like the iterative "cycle walking"
approach because it can be much more expensive for some values, and it
makes the cost non-uniform. Even leaving that aside, the overall encryption
looks quite costly.
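
For readers unfamiliar with the technique, here is a small self-contained toy
showing what cycle walking does: a power-of-two-wide bijective mixer is applied
repeatedly until the output falls back inside the arbitrary target range (the
4-round mixer is a stand-in, not a real cipher, and the cost is indeed
non-uniform, as noted above):

    #include <stdint.h>
    #include <stdio.h>

    /* one bijective mixing round on [0, 2^bits): odd-multiplier step plus xorshift */
    static uint64_t
    mix_round(uint64_t v, uint64_t key, int bits)
    {
        uint64_t    mask = (bits >= 64) ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;

        v = (v * UINT64_C(6364136223846793005) + key) & mask;
        v ^= v >> (bits / 2 + 1);
        return v & mask;
    }

    /* permute v within [0, size): "cycle walk" a power-of-two permutation */
    static uint64_t
    permute_in_range(uint64_t v, uint64_t size, uint64_t key)
    {
        int         bits = 1;

        while (bits < 64 && (UINT64_C(1) << bits) < size)
            bits++;
        do
        {
            for (int r = 0; r < 4; r++)
                v = mix_round(v, key + r, bits);
        } while (v >= size);        /* off-range outputs are simply fed back in */
        return v;
    }

    int
    main(void)
    {
        for (uint64_t i = 0; i < 10; i++)
            printf("%llu -> %llu\n", (unsigned long long) i,
                   (unsigned long long) permute_in_range(i, 10, 42));
        return 0;
    }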


For testing purposes, I'm looking for "pretty cheap" and "good enough", 
and for that I'm ready to forsake "cryptographic":-)


Thanks again, it made an interesting read!

--
Fabien.



Re: Deadlock in multiple CIC.

2017-12-26 Thread Jeff Janes
On Tue, Dec 26, 2017 at 8:31 AM, Alvaro Herrera 
wrote:

> Jeff Janes wrote:
> > c3d09b3bd23f5f6 fixed it so concurrent CIC would not deadlock (or at
> least
> > not as reliably as before) by dropping its own snapshot before waiting
> for
> > all the other ones to go away.
> >
> > With commit 8aa3e47510b969354ea02a, concurrent CREATE INDEX CONCURRENTLY
> on
> > different tables in the same database started deadlocking against each
> > other again quite reliably.
> >
> > I think the solution is simply to drop the catalog snapshot as well, as
> in
> > the attached.
>
> Thanks for the analysis -- it sounds reasonable to me.  However, I'm
> wondering why you used the *Conditionally() version instead of plain
> InvalidateCatalogSnapshot().


My thinking was that if for some reason there was another snapshot hanging
around, dropping the catalog snapshot unconditionally would be a
correctness bug, while doing it conditionally would just fail to avoid a
theoretically avoidable deadlock.  So it seemed safer.
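
For context, the shape of the change being discussed, at the point where
CREATE INDEX CONCURRENTLY releases its own snapshot before waiting out everyone
else's (a sketch, not the exact hunk from the attached patch):

    /*
     * In DefineIndex(), after the final validation pass: let go of our own
     * snapshots so that other sessions (including other CICs) stop waiting
     * on us as an old-snapshot holder.
     */
    PopActiveSnapshot();
    UnregisterSnapshot(snapshot);

    /*
     * The fix under discussion: also drop the catalog snapshot, either
     * conditionally (as in the posted patch) or unconditionally.
     */
    InvalidateCatalogSnapshotConditionally();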


>   I think they must have the same effect in
> practice (the assumption being that you can't run CIC in a transaction
> that may have other snapshots) but the theory seems simpler when calling
> the other routine: just do away with the snapshot always, period.
>

That is probably true.  But I never even knew that catalog snapshots
existed until yesterday, so I didn't want to make assumptions about what
else might exist, to avoid introducing new bugs similar to the one that
8aa3e47510b969354ea02a fixed.


>
> This is back-patchable to 9.4, first branch which has MVCC catalog
> scans.  It's strange that this has gone undetected for so long.


Since the purpose of CIC is to build an index with minimal impact on other
users, I think wanting to use it in concurrent cases might be rather rare.
In a maintenance window, I wouldn't want to use CIC because it is slower
and I'd rather just hold the stronger lock and do it fast, and as a hot-fix
outside a maintenance window I usually wouldn't want to hog the CPU with
concurrent builds when I could do them sequentially instead.  Also, since
deadlocks are "expected" errors rather than "should never happen" errors,
and since the docs don't promise that you can do parallel CIC without
deadlocks, many people would probably shrug it off (which I initially did)
rather than report it as a bug.  I was looking into it as an enhancement
rather than a bug until I discovered that it was already enhanced and then
undone.

Cheers,

Jeff


Re: [PROPOSAL] Shared Ispell dictionaries

2017-12-26 Thread Alvaro Herrera
Arthur Zakirov wrote:

> Implementation
> --
> 
> It is necessary to change all structures related with IspellDict:
> SPNode, AffixNode, AFFIX, CMPDAffix, IspellDict itself. They all
> shouldn't use pointers for this reason. Others are used only during
> dictionary building.

So what are you going to use instead?

> It would be good to store the StopList struct in shared memory too.

Sure (probably a separate patch though).

> All fields of IspellDict struct, which are used only during dictionary
> building, will be moved into new IspellDictBuild to decrease needed
> shared memory size. And they are going to be released by buildCxt.
>
> Each dictionary will be stored in its own dsm segment.

All that sounds reasonable.

> The patch will be ready and added into the 2018-03 commitfest.

So this will be a large patch not submitted to 2018-01?  Depending on
size/complexity I'm not sure it's OK to submit 2018-03 only -- it may be
too late.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



[PROPOSAL] Shared Ispell dictionaries

2017-12-26 Thread Arthur Zakirov
Hello, hackers!

Introduction


I'm going to implement a patch which will store Ispell dictionaries in shared
memory.

There is an extension, shared_ispell [1], developed by Tomas Vondra. But it is a
bad candidate for inclusion into contrib, because it has to know a lot about the
IspellDict struct in order to copy it into shared memory.

Why
---

A shared Ispell dictionary gives the following improvements:
- it consumes less memory - currently an Ispell dictionary is loaded into memory
for every backend and, for some dictionaries, requires more than 100MB
- there is no overhead during the first call of a full text search function (such
as to_tsvector() or to_tsquery())

Implementation
--

It is necessary to change all structures related with IspellDict: SPNode, 
AffixNode, AFFIX, CMPDAffix, IspellDict itself. They all shouldn't use pointers 
for this reason. Others are used only during dictionary building.
It would be good to store the StopList struct in shared memory too.

All fields of IspellDict struct, which are used only during dictionary 
building, will be moved into new IspellDictBuild to decrease needed shared 
memory size. And they are going to be released by buildCxt.

Each dictionary will be stored in its own dsm segment. Structures for regular
expressions won't be stored in shared memory; they are compiled for every
backend.

The patch will be ready and added into the 2018-03 commitfest.

Thank you for your attention. Any thoughts?


1 - github.com/tvondra/shared_ispell or github.com/postgrespro/shared_ispell

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: Deadlock in multiple CIC.

2017-12-26 Thread Alvaro Herrera
Jeff Janes wrote:
> c3d09b3bd23f5f6 fixed it so concurrent CIC would not deadlock (or at least
> not as reliably as before) by dropping its own snapshot before waiting for
> all the other ones to go away.
> 
> With commit 8aa3e47510b969354ea02a, concurrent CREATE INDEX CONCURRENTLY on
> different tables in the same database started deadlocking against each
> other again quite reliably.
> 
> I think the solution is simply to drop the catalog snapshot as well, as in
> the attached.

Thanks for the analysis -- it sounds reasonable to me.  However, I'm
wondering why you used the *Conditionally() version instead of plain
InvalidateCatalogSnapshot().  I think they must have the same effect in
practice (the assumption being that you can't run CIC in a transaction
that may have other snapshots) but the theory seems simpler when calling
the other routine: just do away with the snapshot always, period.

This is back-patchable to 9.4, first branch which has MVCC catalog
scans.  It's strange that this has gone undetected for so long.  I
wonder if there's an interaction with logical decoding and its
historical snapshot stuff here.  I pinged Jeremy on the other thread
about your fix.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] static assertions in C++

2017-12-26 Thread Peter Eisentraut
On 12/20/17 10:51, Tom Lane wrote:
> Peter Eisentraut  writes:
>> On 12/20/17 00:57, Tom Lane wrote:
>>> I do not have a well-informed opinion on whether
>>> #if defined(__cpp_static_assert) && __cpp_static_assert >= 200410
>>> is an appropriate test for static_assert() being available, but I'm
>>> pretty suspicious of it because none of my C++ compilers seem to
>>> take that path, not even recent stuff like clang 9.0.0.
> 
>> For clang, you apparently need to pass -std=c++11 or higher.  With g++
>>> =6, that's the default.
> 
> Ah.  I did try -std=c++0x on one machine, but forgot to do so with
> the newer clang.  That does seem to make it take the static_assert
> path on recent macOS.

committed
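
For reference, the C++ side of the guard dispatches roughly like this (a
paraphrased sketch, not the exact committed wording in c.h):

#ifdef __cplusplus
#if defined(__cpp_static_assert) && __cpp_static_assert >= 200410
/* C++11 or later with the feature-test macro: use the native keyword */
#define StaticAssertStmt(condition, errmessage) \
	static_assert(condition, errmessage)
#else
/* older C++ compilers fall back to a negative-width-bitfield style trick */
#endif
#endif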

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2017-12-26 Thread Dmitry Dolgov
> On 25 December 2017 at 18:40, Tomas Vondra 
wrote:
> The attached v3 fixes this issue, and also a couple of other thinkos

Thank you for the patch, it looks quite interesting. After a quick look at it
(mostly the first one so far, but I'm going to continue) I have a few questions:

> + * XXX With many subtransactions this might be quite slow, because we'll have
> + * to walk through all of them. There are some options how we could improve
> + * that: (a) maintain some secondary structure with transactions sorted by
> + * amount of changes, (b) not looking for the entirely largest transaction,
> + * but e.g. for transaction using at least some fraction of the memory limit,
> + * and (c) evicting multiple transactions at once, e.g. to free a given portion
> + * of the memory limit (e.g. 50%).

Do you want to address these possible alternatives somehow in this patch, or do
you want to leave them outside of it? Maybe it makes sense to apply some
combination of them, e.g. maintain a secondary structure with relatively large
transactions, and then start evicting them. If that's not enough, then start to
evict multiple transactions at once (option "c").

> + /*
> + * We clamp manually-set values to at least 64kB. The maintenance_work_mem
> + * uses a higher minimum value (1MB), so this is OK.
> + */
> + if (*newval < 64)
> + *newval = 64;
> +

I'm not sure what the recommended practice is here, but maybe it makes sense to
emit a warning when the value is clamped to 64kB? Otherwise it can be
unexpected.
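
As a sketch of one alternative (check_logical_work_mem is a hypothetical hook
name; the patch might instead prefer an explicit WARNING), the check hook could
reject too-small values rather than clamping them silently:

static bool
check_logical_work_mem(int *newval, void **extra, GucSource source)
{
	/* refuse values below 64kB instead of silently raising them */
	if (*newval < 64)
	{
		GUC_check_errdetail("logical_work_mem must be at least 64kB.");
		return false;
	}
	return true;
}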


Re: [HACKERS] [PATCH] Lockable views

2017-12-26 Thread Alvaro Herrera
Yugo Nagata wrote:
> On Fri, 27 Oct 2017 07:11:14 +0200
> Robert Haas  wrote:
> 
> > On Wed, Oct 11, 2017 at 11:36 AM, Yugo Nagata  wrote:
> > > > In the attached patch, only automatically-updatable views that do not have
> > > > INSTEAD OF rules or INSTEAD OF triggers are lockable. It is assumed that
> > > > such a view's definition has only one base relation. When an auto-updatable
> > > > view is locked, its base relation is also locked. If the base relation is a
> > > > view again, base relations are processed recursively. For locking a view,
> > > > the view owner has to have the privilege to lock the base relation.
> > 
> > Why is this the right behavior?
> > 
> > I would have expected LOCK TABLE v to lock the view and nothing else.

> That discussion was about 7 years ago, when automatically-updatable views
> were not supported. Since 9.3, simple views can be updated as well as tables,
> so now I think it is reasonable that LOCK TABLE on a view locks its base
> tables.

I agree with Yugo Nagata -- LOCK TABLE is in some cases necessary to
provide the right isolation so that an operation can be carried out
without interference from other processes that want to process the same
data -- and if a view is provided on top of existing tables, preventing
concurrent changes to the data returned by the view is done by locking
the view and, recursively, the tables that the view is built on, as if
the view were a table.  This is why LOCK TABLE is the right command to
do it.

Also, if an application is designed using a table and concurrent changes
are prevented via LOCK TABLE, then when the underlying schema is changed
and the table is replaced by a view, the application continues to work
unchanged; not only syntactically (no error because of table-locking a
view) but also semantically because new application code that modifies
data in underlying tables from paths other than the view will need to
compete with those through the view, which is correct.

> > See 
> > http://postgr.es/m/AANLkTi=kupesjhrdevgfbt30au_iyro6zwk+fwwy_...@mail.gmail.com
> > for previous discussion of this topic.

> If we want to lock only the view, it seems to me that LOCK VIEW syntax is 
> good. 
> However, to realize this, changing the syntax to avoid a shift/reduce
> conflict will be needed, as discussed in the "LOCK for non-tables" thread.

+1 on making TABLE mandatory in LOCK [TABLE], since that will support
this new LOCK VIEW thing as well as locking other object types.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Logical Decoding and HeapTupleSatisfiesVacuum assumptions

2017-12-26 Thread Nikhil Sontakke
Hi,

As of today, in the existing code in HeapTupleSatisfiesVacuum, if a
row is inserted via an aborted transaction, then it's deemed to be
dead immediately and can be cleaned up for re-use by HOT or vacuum,
etc.

With logical decoding, there might arise a case that such a row, if it
belongs to a system catalog table or even a user catalog table
(https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=66abc2608c7c00fcd449e00a9e23f13f02e65d04),
then it might be being used for decoding at the very same moment that
the abort came in. If the row is re-claimed immediately, then it's
possible that the decoding happening alongside might face issues.

The main issue here is that HeapTupleSatisfiesVacuum *assumes* that
rows belonging to an aborted transaction are not visible to anyone
else. As mentioned above, that assumption needs to change in the case
when logical decoding is active and the row belongs to catalog tables.
This was not a problem before, but now there are at least two patches
out there which challenge these assumptions. One patch is for logical
decoding of 2PC, in which case a PREPARED transaction might be getting
decoded and a concurrent ROLLBACK PREPARED can happen alongside.
Another patch is about streaming logical changes before commit/abort.

The provided patch changes that assumption. We now pass in an
additional argument to HeapTupleSatisfiesVacuum() to indicate if the
current row belongs to a catalog table in the logical decoding
environment. We call the existing
RelationIsAccessibleInLogicalDecoding() function to ascertain if it's
a catalog table worth considering for this case. If so, we return
HEAPTUPLE_RECENTLY_DEAD if the xmin of the row is newer than OldestXmin.
As a result, for the following case we will now return
HEAPTUPLE_RECENTLY_DEAD where earlier we would have returned
HEAPTUPLE_DEAD:

BEGIN;
ALTER subscribed_table ADD COLUMN;

ABORT;
VACUUM pg_attribute;

This is essentially the same logic that is used today for rows that
got deleted (via update or delete). We return HEAPTUPLE_RECENTLY_DEAD
for such deleted rows if the xmax of the row is newer than OldestXmin
because some open transactions can/could still see the tuple.
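
As a simplified sketch of that rule (a hypothetical helper, not the attached
patch's actual control flow inside HeapTupleSatisfiesVacuum):

#include "postgres.h"
#include "access/htup_details.h"
#include "access/transam.h"
#include "utils/tqual.h"

/*
 * Visibility of a tuple whose inserting transaction aborted: if the tuple
 * belongs to a catalog table usable by logical decoding and its xmin is not
 * older than OldestXmin, keep it around as "recently dead".
 */
static HTSV_Result
aborted_insert_visibility(HeapTupleHeader tuple,
						  TransactionId OldestXmin,
						  bool catalog_in_logical_decoding)
{
	if (catalog_in_logical_decoding &&
		TransactionIdFollowsOrEquals(HeapTupleHeaderGetRawXmin(tuple),
									 OldestXmin))
		return HEAPTUPLE_RECENTLY_DEAD;

	return HEAPTUPLE_DEAD;		/* unchanged behavior otherwise */
}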

With changes to HeapTupleSatisfiesVacuum, we also had to tweak the
logic in heap_prune_chain(). That function assumes that only updated
tuples can return HEAPTUPLE_RECENTLY_DEAD. That assumption no longer
holds, so we now check explicitly that the HeapTupleHeaderGetUpdateXid
value is valid before further processing.

The patch had to make changes in all locations where
HeapTupleSatisfiesVacuum gets called and I have done multiple "make
check-world" runs with cassert enabled and things run fine. Also,
since it only sets the flag for system/user catalog tables and that
too ONLY in the logical decoding environment, it does not cause any
performance issues in normal circumstances.

Regards,
Nikhils
-- 
 Nikhil Sontakke   http://www.2ndQuadrant.com/
 PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services


HeapTupleSatisfiesVacuum_Logical_Decoding_26_12_17.patch
Description: Binary data


Re: Ethiopian calendar year(DATE TYPE) are different from the Gregorian calendar year

2017-12-26 Thread Pavel Stehule
2017-12-26 14:46 GMT+01:00 Lelisa Diriba :

> Yes, it works on my machine; it converts from the Gregorian calendar year to
> the Ethiopian calendar year.
> Locally we have apps in Swift, C# and Java to convert it, but I can't get the
> code from them, as they are commercial. Oracle DB also localizes it, but only
> commercially.
> I want to localize it in PostgreSQL. It would work as an EXTENSION; how can it
> be localized in PostgreSQL?
>

Sorry - I don't understand well. Can you show the extension?

regards

Pavel


Re: BUGFIX: standby disconnect can corrupt serialized reorder buffers

2017-12-26 Thread Masahiko Sawada
On Tue, Dec 26, 2017 at 10:03 PM, Petr Jelinek
 wrote:
> On 26/12/17 11:13, Masahiko Sawada wrote:
>> On Tue, Dec 26, 2017 at 12:49 AM, Petr Jelinek
>>  wrote:
>>

 It's not a problem on crash restart because StartupReorderBuffer already
 does the required delete.

 ReorderBufferSerializeTXN, which spills the txns to disk, doesn't appear
 to have any guard to ensure that the segment files don't already exist
 when it goes to serialize a snapshot. Adding one there would probably be
 expensive; we don't know the last lsn of the txn yet, so to be really
 safe we'd have to do a directory listing and scan for any txn-$OURXID-*
 entries.

 So to fix, I suggest that we should do a
 slot-specific StartupReorderBuffer-style deletion when we start a new
 decoding session on a slot, per attached patch.

 It might be nice to also add a hook on proc exit, so we don't have stale
 buffers lying around until next decoding session, but I didn't want to
 add new global state to reorderbuffer.c so I've left that for now.
>>>
>>>
>>> Hmm, can't we simply call the new cleanup function in
>>> ReplicationSlotRelease()? That's called during process exit and we know
>>> there about slot so no extra global variables are needed.
>>>
>>
>> I guess that ReplicationSlotRelease() currently might not get called
>> if walsender exits via proc_exit(). ReplicationSlotRelease() can be
>> called by some functions such as WalSndErrorCleanup(), but at least in
>> the case where the walsender exits due to failing to write data to the
>> socket, ReplicationSlotRelease() didn't get called as far as I tested.
>>
>
> Are you sure about that testing? Did you test it with replication slot
> active? ReplicationSlotRelease() is called from ProcKill() if the
> process is using a slot and should be called for any kind of exit except
> for outright crash (the kind that makes postgres kill all backends). If
> it wasn't we'd never unlock the replication slot used by the exiting
> walsender.
>

Ah, sorry, I was wrong. WalSndErrorCleanup() didn't get called, but
ReplicationSlotRelease() did get called. I agree that the cleanup function
can be called from ReplicationSlotRelease().
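
For illustration, a sketch of how that could look (ReorderBufferCleanupSerializedTXNs
is a placeholder name for the slot-specific cleanup the patch introduces):

void
ReplicationSlotRelease(void)
{
	ReplicationSlot *slot = MyReplicationSlot;

	Assert(slot != NULL && slot->active_pid != 0);

	/* drop any serialized reorderbuffer files left behind by this slot */
	if (SlotIsLogical(slot))
		ReorderBufferCleanupSerializedTXNs(NameStr(slot->data.name));

	/* ... existing release logic continues unchanged ... */
}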

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Ethiopian calendar year(DATE TYPE) are different from the Gregorian calendar year

2017-12-26 Thread Lelisa Diriba
Yes, it works on my machine; it converts from the Gregorian calendar year to
the Ethiopian calendar year.
Locally we have apps in Swift, C# and Java to convert it, but I can't get the
code from them, as they are commercial. Oracle DB also localizes it, but only
commercially.
I want to localize it in PostgreSQL. It would work as an EXTENSION; how can it
be localized in PostgreSQL?


Re: [HACKERS] [PATCH] Lockable views

2017-12-26 Thread Michael Paquier
On Tue, Dec 26, 2017 at 06:37:06PM +0900, Yugo Nagata wrote:
> I have created a new entry in CF-2017-1 and registered this thread again.

Fine for me. Thanks for the update. And I guess that you are planning to
send a new version before the beginning of the next commit fest using
the feedback provided?
--
Michael


signature.asc
Description: PGP signature


Re: BUGFIX: standby disconnect can corrupt serialized reorder buffers

2017-12-26 Thread Petr Jelinek
On 26/12/17 11:13, Masahiko Sawada wrote:
> On Tue, Dec 26, 2017 at 12:49 AM, Petr Jelinek
>  wrote:
> 
>>>
>>> It's not a problem on crash restart because StartupReorderBuffer already
>>> does the required delete.
>>>
>>> ReorderBufferSerializeTXN, which spills the txns to disk, doesn't appear
>>> to have any guard to ensure that the segment files don't already exist
>>> when it goes to serialize a snapshot. Adding one there would probably be
>>> expensive; we don't know the last lsn of the txn yet, so to be really
>>> safe we'd have to do a directory listing and scan for any txn-$OURXID-*
>>> entries.
>>>
>>> So to fix, I suggest that we should do a
>>> slot-specific StartupReorderBuffer-style deletion when we start a new
>>> decoding session on a slot, per attached patch.
>>>
>>> It might be nice to also add a hook on proc exit, so we don't have stale
>>> buffers lying around until next decoding session, but I didn't want to
>>> add new global state to reorderbuffer.c so I've left that for now.
>>
>>
>> Hmm, can't we simply call the new cleanup function in
>> ReplicationSlotRelease()? That's called during process exit and we know
>> there about slot so no extra global variables are needed.
>>
> 
> I guess that ReplicationSlotRelease() currently might not get called
> if walsender exits via proc_exit(). ReplicationSlotRelease() can be
> called by some functions such as WalSndErrorCleanup(), but at least in
> the case where the walsender exits due to failing to write data to the
> socket, ReplicationSlotRelease() didn't get called as far as I tested.
> 

Are you sure about that testing? Did you test it with a replication slot
active? ReplicationSlotRelease() is called from ProcKill() if the
process is using a slot and should be called for any kind of exit except
an outright crash (the kind that makes postgres kill all backends). If
it weren't, we'd never unlock the replication slot used by the exiting
walsender.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: Getting rid of "tuple concurrently updated" elog()s with concurrent DDLs (at least ALTER TABLE)

2017-12-26 Thread Michael Paquier
On Tue, Dec 26, 2017 at 4:30 PM, Andres Freund  wrote:
> You're proposing to lock the entire relation against many forms of concurrent 
> DDL, just to get rid of that error? That seems unacceptable.
> Isn't the canonical way to solve this to take object locks?

Sure. That's where things in lmgr.c come into play, like
LockSharedObject(), and you could hold with an exclusive lock on a
given object until the end of a transaction before opening the catalog
relation with heap_open(), however with those you need to know the
object OID before taking a lock on the parent relation, right? So you
face problems with lock upgrades, or race conditions show up more
easily. I have to admit that I have not dug much into the problem yet,
it is easy enough to have isolation tests by the way, and I just
noticed that ALTER DATABASE SET can equally trigger the error.
-- 
Michael



Re: [HACKERS] Flexible configuration for full-text search

2017-12-26 Thread Aleksandr Parfenov
On Tue, 26 Dec 2017 13:51:03 +0300
Arthur Zakirov  wrote:

> On Mon, Dec 25, 2017 at 05:15:07PM +0300, Aleksandr Parfenov wrote:
> 
> As I understand it, users need to rewrite their configurations if they
> use the unaccent dictionary, for example. That is not good, I think. Users
> will be upset about that if they only use the old configuration and they
> don't need the new one.
> 
> From my point of view it is necessary to keep old configuration
> syntax.

I see your point. I will rework the patch to keep backward
compatibility and restore the TSL_FILTER entry in the documentation.

> > I decided to rename the newly added column to 'configuration' and keep
> > the column 'dictionaries' with an array of all dictionaries used in the
> > configuration (no matter how). Also, I fixed a bug in the 'command'
> > output of ts_debug in some cases.
> 
> Maybe it would be better to keep the 'dictionary' column name? Is
> there a reason why it was renamed to 'command'?

I changed the name because it may contain more than one dictionary
interconnected via operators (e.g. 'english_stem UNION simple') and
the word 'dictionary' doesn't fully describe the content of the column
anymore. Also, the type of the column was changed from regdictionary to
text in order to put operators into the output.

-- 
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] Replication status in logical replication

2017-12-26 Thread Tels
Moin,

On Tue, December 26, 2017 5:26 am, Masahiko Sawada wrote:
> On Tue, Dec 26, 2017 at 6:19 PM, Tels 
> wrote:
>> Moin,
>>
>> On Mon, December 25, 2017 7:26 pm, Masahiko Sawada wrote:
>>> On Tue, Dec 26, 2017 at 1:10 AM, Petr Jelinek
>>>  wrote:
 On 21/11/17 22:06, Masahiko Sawada wrote:
>
> After investigation, I found out that my previous patch was wrong
> direction. I should have changed XLogSendLogical() so that we can
> check the read LSN and set WalSndCaughtUp = true even after read a
> record without wait. Attached updated patch passed 'make
> check-world'.
> Please review it.
>

 Hi,

 This version looks good to me and seems to be in line with what we do
 in
 physical replication.

 Marking as ready for committer.
>>
>> (Sorry Masahiko, you'll get this twice, as I fumbled the reply button.)
>>
>> I have not verified that comment and/or code are correct, just a grammar
>> fix:
>>
>> +/*
>> + * If we've sent a record is at or beyond the flushed
>> point, then
>> + * we're caught up.
>>
>> That should read more like this:
>>
>> "If we've sent a record that is at or beyond the flushed point, we have
>> caught up."
>>
>
> Thank you for reviewing the patch!
> Actually, that comment is inspired by the comment just below it.
> ISTM it's better to fix both if their grammar is not appropriate.

Oh yes. Your attached version reads fine to me.

All the best,

Tels



PathNameCreateTemporaryDir() vs concurrency

2017-12-26 Thread Thomas Munro
Hi hackers,

While testing parallel hash join today, I saw a couple of errors like this:

2017-12-26 23:34:37.402 NZDT [13082] ERROR:  cannot create temporary
subdirectory "base/pgsql_tmp/pgsql_tmp13080.0.sharedfileset": File
exists

There is a thinko in PathNameCreateTemporaryDir(), a new function
added in commit dc6c4c9d.  It was designed to tolerate directories
existing already but forgot to handle it in the third mkdir call, so
it fails with unlucky timing.  Please see attached.
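
Roughly, the fix amounts to letting the last mkdir tolerate EEXIST as well
(a sketch of the idea, not the attached patch text):

	/* the path may already have been created by a concurrent backend */
	if (mkdir(directory, S_IRWXU) < 0 && errno != EEXIST)
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("cannot create temporary subdirectory \"%s\": %m",
						directory)));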

-- 
Thomas Munro
http://www.enterprisedb.com


fix-named-tempdir-create.patch
Description: Binary data


Re: General purpose hashing func in pgbench

2017-12-26 Thread Daniel Verite
Fabien COELHO wrote:

> Most "permutation" functions are really cryptographic cyphers which are 
> quite expensive, and require powers of two, which is not what is needed. 
> ISTM that there are some constructs to deal with arbitrary sizes based on 
> cryptographic functions, but that would make it too expensive for the 
> purpose.

FWIW, you might have a look at the permuteseq extension:
  https://pgxn.org/dist/permuteseq
It permutes an arbitrary range of 64-bit integers into itself,
with a user-supplied key as the seed.
Outputs are coerced into the desired range by using the
smallest possible power of two for the Feistel cypher's
block size, and then cycle-walking over the results.
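
For illustration, the cycle-walking step looks roughly like this
(feistel_permute is a placeholder for the cipher over the next power-of-two
domain):

#include <stdint.h>

/* permute 'value' within [0, range) using a cipher over [0, 2^k), 2^k >= range */
static uint64_t
permute_in_range(uint64_t value, uint64_t range, uint64_t key)
{
	uint64_t	result = value;

	do
	{
		result = feistel_permute(result, key);	/* hypothetical cipher call */
	} while (result >= range);	/* re-encrypt until we land inside the range */

	return result;
}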


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: [HACKERS] Flexible configuration for full-text search

2017-12-26 Thread Arthur Zakirov
On Mon, Dec 25, 2017 at 05:15:07PM +0300, Aleksandr Parfenov wrote:
> 
> In the current version of the patch, configurations written in old
> syntax are rewritten into the same configuration in the new syntax.
> Since new syntax doesn't support a TSL_FILTER, it was removed from the
> documentation. It is possible to store configurations written in old
> syntax in a special way and simulate a TSL_FILTER behavior for them.
> But it will lead to maintenance of two different behavior of the FTS
> depends on a version of the syntax used during configuration. Do you
> think we should keep both behaviors?

As I understand it, users need to rewrite their configurations if they use the
unaccent dictionary, for example.
That is not good, I think. Users will be upset about that if they only use the
old configuration and they don't need the new one.

From my point of view it is necessary to keep the old configuration syntax.

> 
> Columns' 'dictionaries' and 'dictionary' type were changed to text
> because after the patch the configuration may not be a plain array of
> dictionaries but a complex expression tree. In the column
> 'dictionaries' the result is textual representation of configuration
> and it is the same as a result of \dF+ description of the configuration.

Oh, I understood.

> 
> I decided to rename the newly added column to 'configuration' and keep
> the column 'dictionaries' with an array of all dictionaries used in the
> configuration (no matter how). Also, I fixed a bug in the 'command' output
> of ts_debug in some cases.

Maybe it would be better to keep the 'dictionary' column name? Is there a 
reason why it was renamed to 'command'?

> 
> Additionally, I added some examples to documentation regarding
> multilingual search and combination of exact and linguistic-aware
> search and fixed typos.

Great!


-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] Replication status in logical replication

2017-12-26 Thread Masahiko Sawada
On Tue, Dec 26, 2017 at 6:19 PM, Tels  wrote:
> Moin,
>
> On Mon, December 25, 2017 7:26 pm, Masahiko Sawada wrote:
>> On Tue, Dec 26, 2017 at 1:10 AM, Petr Jelinek
>>  wrote:
>>> On 21/11/17 22:06, Masahiko Sawada wrote:

 After investigation, I found out that my previous patch was wrong
 direction. I should have changed XLogSendLogical() so that we can
 check the read LSN and set WalSndCaughtUp = true even after read a
 record without wait. Attached updated patch passed 'make check-world'.
 Please review it.

>>>
>>> Hi,
>>>
>>> This version looks good to me and seems to be in line with what we do in
>>> physical replication.
>>>
>>> Marking as ready for committer.
>
> (Sorry Masahiko, you'll get this twice, as I fumbled the reply button.)
>
> I have not verified that comment and/or code are correct, just a grammar fix:
>
> +/*
> + * If we've sent a record is at or beyond the flushed
> point, then
> + * we're caught up.
>
> That should read more like this:
>
> "If we've sent a record that is at or beyond the flushed point, we have
> caught up."
>

Thank you for reviewing the patch!
Actually, that comment is inspired by the comment just below it.
ISTM it's better to fix both if their grammar is not appropriate.

+
+  /*
+* If we've sent a record that is at or beyond the flushed point, we
+* have caught up.
+*/
+   if (sentPtr >= GetFlushRecPtr())
+   WalSndCaughtUp = true;
   }
   else
   {
   /*
* If the record we just wanted read is at or beyond the flushed
* point, then we're caught up.
*/
   if (logical_decoding_ctx->reader->EndRecPtr >= GetFlushRecPtr())
   {

Attached an updated patch. Please review it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


logical_repl_caught_up_v3.patch
Description: Binary data


Re: BUGFIX: standby disconnect can corrupt serialized reorder buffers

2017-12-26 Thread Masahiko Sawada
On Tue, Dec 26, 2017 at 12:49 AM, Petr Jelinek
 wrote:
> Hi,
>
> thanks for writing the patch.
>
> On 05/12/17 06:58, Craig Ringer wrote:
>> Hi all
>>
>> [...]
>>> The cause appears to be that walsender.c's ProcessRepliesIfAny writes a
>> LOG for unexpected EOF then calls proc_exit(0). But  serialized txn
>> cleanup is done by
>> ReorderBufferRestoreCleanup, as called by ReorderBufferCleanupTXN, which
>> is invoked from the PG_CATCH() in ReorderBufferCommit() and on various
>> normal exits. It won't get called if we proc_exit() without an ERROR, so
>> we leave stale data lying around.

Thank you for the details. I could reproduce this bug. This bug also
happens if pq_flush_if_writable(), called by WalSndWriteData(), cannot
write data (for example, in the case where the replicated data violates a
unique constraint on the subscriber).

>>
>> It's not a problem on crash restart because StartupReorderBuffer already
>> does the required delete.
>>
>> ReorderBufferSerializeTXN, which spills the txns to disk, doesn't appear
>> to have any guard to ensure that the segment files don't already exist
>> when it goes to serialize a snapshot. Adding one there would probably be
>> expensive; we don't know the last lsn of the txn yet, so to be really
>> safe we'd have to do a directory listing and scan for any txn-$OURXID-*
>> entries.
>>
>> So to fix, I suggest that we should do a
>> slot-specific StartupReorderBuffer-style deletion when we start a new
>> decoding session on a slot, per attached patch.
>>
>> It might be nice to also add a hook on proc exit, so we don't have stale
>> buffers lying around until next decoding session, but I didn't want to
>> add new global state to reorderbuffer.c so I've left that for now.
>
>
> Hmm, can't we simply call the new cleanup function in
> ReplicationSlotRelease()? That's called during process exit and we know
> there about slot so no extra global variables are needed.
>

I guess that ReplicationSlotRelease() currently might not get called
if walsender exits via proc_exit(). ReplicationSlotRelease() can be
called by some functions such as WalSndErrorCleanup(), but at least in
the case where the walsender exits due to failing to write data to the
socket, ReplicationSlotRelease() didn't get called as far as I tested.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] [PATCH] Lockable views

2017-12-26 Thread Yugo Nagata
On Sat, 23 Dec 2017 09:44:30 +0900
Michael Paquier  wrote:

> On Fri, Dec 22, 2017 at 04:19:46PM +0900, Yugo Nagata wrote:
> > I was busy and could not work on this patch. After reading the
> > previous discussion, I still think the behavior of this patch would
> > be right. So, I would like to reregister to CF 2018-1. Do I need to
> > create a new entry on CF? or should I change the status to
> > "Moved to next CF"?
> 
> This is entirely up to you. It may make sense as well to spawn a new thread
> and mark the new patch set as v2, based on the feedback received for v1, as
> well as it could make sense to continue with this thread. If the move involves
> a complete patch rewrite with a rather different concept, a new thread is more
> adapted in my opinion.

Thank you for your comment.

I have created a new entry in CF-2017-1 and registered this thread again.

> -- 
> Michael


-- 
Yugo Nagata 



Re: Protect syscache from bloating with negative cache entries

2017-12-26 Thread Kyotaro HORIGUCHI
At Fri, 22 Dec 2017 13:47:16 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20171222.134716.88479707.horiguchi.kyot...@lab.ntt.co.jp>
> Anyway, I think we have reached a consensus that the
> time-tick-based expiration is promising. So I'll work on that
> approach as the first step.

So this is the patch. It gets simpler.

# I have come to think that the second step is not needed.

I'm not sure that no syscache access happens outside a statement,
but the operations that lead to the bloat seem to be performed
while processing a statement. So the statement timestamp seems
sufficient as the aging clock.

At first I tried the simple strategy of removing entries that
have been left alone for 30 minutes or more, but I still want to
alleviate the quick bloat (by non-reused entries), so I also
introduced a clock-sweep-like aging mechanism. An entry is created
with naccessed = 0, which is then incremented up to 2 each time the
entry is accessed. The removal side decrements naccessed for entries
older than 600 seconds and actually removes an entry once it reaches 0.
Entries that are created and never used will go away after 600 seconds,
and entries that have been accessed several times get a grace period of
1800 seconds after the last access.

We could shrink the bucket array as well, but I didn't, since it is
not so large and is prone to grow back to the same size shortly
anyway.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
***
*** 733,738  void
--- 733,741 
  SetCurrentStatementStartTimestamp(void)
  {
  	stmtStartTimestamp = GetCurrentTimestamp();
+ 
+ 	/* Set this timestamp as the approximate current time */
+ 	SetCatCacheClock(stmtStartTimestamp);
  }
  
  /*
*** a/src/backend/utils/cache/catcache.c
--- b/src/backend/utils/cache/catcache.c
***
*** 74,79 
--- 74,82 
  /* Cache management header --- pointer is NULL until created */
  static CatCacheHeader *CacheHdr = NULL;
  
+ /* Timestamp used for any operation on caches. */
+ TimestampTz	catcacheclock = 0;
+ 
  static inline HeapTuple SearchCatCacheInternal(CatCache *cache,
  	   int nkeys,
  	   Datum v1, Datum v2,
***
*** 866,875  InitCatCache(int id,
--- 869,969 
  	 */
  	MemoryContextSwitchTo(oldcxt);
  
+ 	/* initialize the catcache reference clock if not done yet */
+ 	if (catcacheclock == 0)
+ 		catcacheclock = GetCurrentTimestamp();
+ 
  	return cp;
  }
  
  /*
+  * Remove entries that haven't been accessed for a certain time.
+  *
+  * Sometimes catcache entries are left unremoved for various reasons. We
+  * cannot allow them to eat up the usable memory, and from the viewpoint of
+  * the memory/performance ratio it is better to remove entries that are no
+  * longer accessed. Unfortunately we cannot predict that, but we can assume
+  * that entries that have not been accessed for a long time no longer
+  * contribute to performance.
+  */
+ static bool
+ CatCacheCleanupOldEntries(CatCache *cp)
+ {
+ 	int			i;
+ 	int			nremoved = 0;
+ #ifdef CATCACHE_STATS
+ 	int			ntotal = 0;
+ 	int			tm[] = {30, 60, 600, 1200, 1800, 0};
+ 	int			cn[6] = {0, 0, 0, 0, 0};
+ 	int			cage[3] = {0, 0, 0};
+ #endif
+ 
+ 	/* Scan all hash buckets and remove entries that have expired. */
+ 	for (i = 0; i < cp->cc_nbuckets; i++)
+ 	{
+ 		dlist_mutable_iter iter;
+ 
+ 		dlist_foreach_modify(iter, &cp->cc_bucket[i])
+ 		{
+ 			CatCTup*ct = dlist_container(CatCTup, cache_elem, iter.cur);
+ 			long s;
+ 			int us;
+ 
+ 
+ 			TimestampDifference(ct->lastaccess, catcacheclock, &s, &us);
+ 
+ #ifdef CATCACHE_STATS
+ 			{
+ int j;
+ 
+ ntotal++;
+ for (j = 0 ; tm[j] != 0 && s > tm[j] ; j++);
+ if (tm[j] == 0) j--;
+ cn[j]++;
+ 			}
+ #endif
+ 
+ 			/*
+ 			 * Remove entries that are older than 600 seconds and not recently
+ 			 * used. Entries that are never accessed after creation are removed
+ 			 * after 600 seconds, and entries that have been used several times
+ 			 * are removed after 30 minutes of inactivity. We don't try to
+ 			 * shrink the buckets since they are not the major part of syscache
+ 			 * bloat and they are expected to fill up again shortly.
+ 			 */
+ 			if (s > 600)
+ 			{
+ #ifdef CATCACHE_STATS
+ Assert (ct->naccess >= 0 && ct->naccess <= 2);
+ cage[ct->naccess]++;
+ #endif
+ if (ct->naccess > 0)
+ 	ct->naccess--;
+ else
+ {
+ 	if (!ct->c_list || ct->c_list->refcount == 0)
+ 	{
+ 		CatCacheRemoveCTup(cp, ct);
+ 		nremoved++;
+ 	}
+ }
+ 			}
+ 		}
+ 	}
+ 
+ #ifdef CATCACHE_STATS
+ 	ereport(DEBUG2,
+ 			(errmsg ("removed %d/%d, age(-30s:%d, -60s:%d, -600s:%d, -1200s:%d, -1800:%d) naccessed(0:%d, 1:%d, 2:%d)",
+ nremoved, ntotal,
+ cn[0], cn[1], cn[2], cn[3], cn[4],
+ cage[0], cage[1], cage[2]),
+ 			 errhidestmt(true)));
+ #endif
+ 
+ 	return nremoved > 0;
+ }
+ 
+ /*
   * Enlarge a catcache, doubling the number of buckets.
   */
  static void
***
*** 1282,1287  SearchCatCacheInternal(CatCache 

Re: [HACKERS] Replication status in logical replication

2017-12-26 Thread Tels
Moin,

On Mon, December 25, 2017 7:26 pm, Masahiko Sawada wrote:
> On Tue, Dec 26, 2017 at 1:10 AM, Petr Jelinek
>  wrote:
>> On 21/11/17 22:06, Masahiko Sawada wrote:
>>>
>>> After investigation, I found out that my previous patch was wrong
>>> direction. I should have changed XLogSendLogical() so that we can
>>> check the read LSN and set WalSndCaughtUp = true even after read a
>>> record without wait. Attached updated patch passed 'make check-world'.
>>> Please review it.
>>>
>>
>> Hi,
>>
>> This version looks good to me and seems to be in line with what we do in
>> physical replication.
>>
>> Marking as ready for committer.

(Sorry Masahiko, you'll get this twice, as I fumbled the reply button.)

I have not verified that comment and/or code are correct, just a grammar fix:

+/*
+ * If we've sent a record is at or beyond the flushed
point, then
+ * we're caught up.

That should read more like this:

"If we've sent a record that is at or beyond the flushed point, we have
caught up."

All the best,

Tels




Re: Ethiopian calendar year(DATE TYPE) are different from the Gregorian calendar year

2017-12-26 Thread Pavel Stehule
Hi

2017-12-26 8:23 GMT+01:00 Lelisa Diriba :

> In Ethiopia the year has 13 months, but in the Gregorian calendar the year
> has 12 months.
> Ethiopian society wants to use the Ethiopian calendar year; I have
> the algorithm,
> but should I add it to the PostgreSQL source code as an EXTENSION, or to the
> backend (the core)?
> The 13th month has 5 days, and every fourth year it has 6 days; that
> means 1 year has 365.25 days. My algorithm converts the Gregorian calendar
> year to the Ethiopian calendar year.
>

It can be done via an extension - you should implement your own date and
timestamp types with their own input and output functions.
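
For illustration, a minimal sketch of the C half of such an extension (the
function names and conversion helpers are hypothetical, and the SQL half that
does CREATE TYPE with these input/output functions is omitted):

#include "postgres.h"
#include "fmgr.h"
#include "utils/date.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(ethiopian_date_in);
PG_FUNCTION_INFO_V1(ethiopian_date_out);

Datum
ethiopian_date_in(PG_FUNCTION_ARGS)
{
	char	   *str = PG_GETARG_CSTRING(0);

	/* parse an Ethiopian-calendar date string and convert it to a DateADT */
	PG_RETURN_DATEADT(ethiopian_str_to_dateadt(str));	/* hypothetical helper */
}

Datum
ethiopian_date_out(PG_FUNCTION_ARGS)
{
	DateADT		date = PG_GETARG_DATEADT(0);

	/* render the stored day number using the Ethiopian calendar */
	PG_RETURN_CSTRING(dateadt_to_ethiopian_str(date));	/* hypothetical helper */
}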

Regards

Pavel