[HACKERS] Forbid use of LF and CR characters in database and role names

2016-08-11 Thread Michael Paquier
Hi all,

As CVE-2016-5424 recently brought to light, using LF and CR in
database and role names can lead to unexpected problems in the way
those names are handled in logical backups or generated command
lines. There is also a comment in the code, in
fe_utils/string_utils.c, mentioning a potential restriction:
+ * Forbid LF or CR characters, which have scant practical use beyond designing
+ * security breaches.  The Windows command shell is unusable as a conduit for
+ * arguments containing LF or CR characters.  A future major release should
+ * reject those characters in CREATE ROLE and CREATE DATABASE, because use
+ * there eventually leads to errors here.

Note that pg_dump[all] and pg_upgrade already have safeguards against
those characters, via the same routines that quote strings for
execution as psql commands or shell arguments. So attached is a patch
implementing this restriction in the backend, which I am adding to
the next CF for 10.0. Also attached is a script that can trigger
those errors.
Thoughts?
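
For illustration, the kind of backend-side check I have in mind would
look roughly like this; the helper name and the error wording are just
a sketch, not the actual patch:

static void
check_no_crlf(const char *objtype, const char *objname)
{
	/* LF and CR are the characters CVE-2016-5424 is about */
	if (strpbrk(objname, "\n\r") != NULL)
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_NAME),
				 errmsg("%s name \"%s\" cannot contain newline or carriage return",
						objtype, objname)));
}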
-- 
Michael


forbid-cr-lf.patch
Description: invalid/octet-stream
#!/usr/bin/perl
use strict;
use warnings;

# Generate a string made of the given range of ASCII characters
sub generate_ascii_string
{
	my ($from_char, $to_char) = @_;
	my $res = '';

	for my $i ($from_char .. $to_char)
	{
		$res .= sprintf("%c", $i);
	}
	return $res;
}

# Strings whose last character is LF (0x0A) and CR (0x0D), respectively
my $lf_str = generate_ascii_string(7, 10);
my $cr_str = generate_ascii_string(11, 13);

system('createdb', $lf_str);
system('createdb', $cr_str);
system('createuser', $lf_str);
system('createuser', $cr_str);



Re: [HACKERS] Add hint for function named "is"

2016-08-11 Thread Jim Nasby

On 8/11/16 4:54 PM, Tom Lane wrote:

which probably contributes to Jim's confusion.  I think what is happening
in the trouble case is that since IS has lower precedence than Op, the
grammar decides it ought to resolve || as a postfix operator, and then
it effectively has
('x' ||) IS ...


FWIW, is() || is() blows up in the same way.


I'm not sure there's much we can do about this.  Even if we had control of
what Bison prints for syntax errors, which we don't really, it's hard to
see what condition we could trigger the hint on that wouldn't result in
false positives at least as often as something helpful.  (Note that the
grammar's behavior can't really depend on whether a function named is()
actually exists in the catalogs.)


Is there a place in the error reporting path where we'd still have 
access to the 'is' token, and have enough control to look for a relevant 
function?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] new autovacuum criterion for visible pages

2016-08-11 Thread Jim Nasby

On 8/11/16 10:59 AM, Jeff Janes wrote:

On Thu, Aug 11, 2016 at 8:32 AM, Amit Kapila  wrote:

On Thu, Aug 11, 2016 at 2:09 AM, Jeff Janes  wrote:

I wanted to create a new relopt named something like
autovacuum_vacuum_pagevisible_factor which would cause autovacuum to
vacuum a table once less than a certain fraction of the relation's
pages are marked allvisible.



Why would it be more convenient for a user to set such a parameter as
compared to the existing parameters (autovacuum_vacuum_threshold +
autovacuum_vacuum_scale_factor)?


Insertions and HOT-updates clear vm bits but don't increment the
counters that those existing parameters are compared to.

Also, the relationship between the number of updated/deleted rows and the
number of hint-bits cleared can be hard to predict due to possible
clustering of the updates into the same blocks.  So it would be hard
to know what to set the values to.


I'm wondering if also creating the same options for all-frozen pages 
would be worthwhile. I don't see an obvious use for that, but maybe 
someone else does (and adding both at once would presumably be the least 
amount of work...)

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] condition variables

2016-08-11 Thread Robert Haas
On Thu, Aug 11, 2016 at 8:44 PM, Thomas Munro
 wrote:
> In contrast, this proposal leaves it up to client code to get that
> right, similarly to the way you need to do things in a certain order
> when waiting for state changes with latches.  You could say that it's
> more error prone: I think there have been a few cases of incorrectly
> coded latch/state-change wait loops in the past.  On the other hand,
> it places no requirements on the synchronisation mechanism the client
> code uses for the related shared state.  pthread_cond_wait requires
> you to pass in a pointer to the related pthread_mutex_t, whereas with
> this proposal client code is free to use atomic ops, lwlocks,
> spinlocks or any other mutual exclusion mechanism to coordinate state
> changes and deal with cache coherency.

I think you have accurately stated the pros and cons of this approach.
On the whole I think it's a good trade-off.  In particular, not being
locked into a specific synchronization method seems like a good idea
from here.  If you had to pick just one, it would be hard to decide
between spinlocks and LWLocks, and "atomics" isn't a sufficiently
specific thing to code to.

> Then there is the question of what happens when the backend that is
> supposed to be doing the signalling dies or aborts, which Tom Lane
> referred to in his reply.  In those other libraries there is no such
> concern: it's understood that these are low level thread
> synchronisation primitives and if you're waiting for something that
> never happens, you'll be waiting forever.  I don't know what the
> answer is in general for Postgres condition variables, but...

As I noted in my reply to Tom, for parallel query, we're going to kill
all the workers at the same time, so the problem doesn't arise.  When
we use this mechanism outside that context, we just have to do it
correctly.  I don't think that's especially hard, but could somebody
mess it up?  Sure.

> The thing that I personally am working on currently that is very
> closely related and could use this has a more specific set of
> circumstances:  I want "join points" AKA barriers.  Something like
> pthread_barrier_t.  (I'm saying "join point" rather than "barrier" to
> avoid confusion with compiler and memory barriers, barrier.h etc.)
> Join points let you wait for all workers in a known set to reach a
> given point, possibly with a phase number or at least sense (one bit
> phase counter) to detect synchronisation bugs.  They also select one
> worker arbitrarily to receive a different return value when releasing
> workers from a join point, for cases where a particular phase of
> parallel work needs to be done by exactly one worker while the others
> sit on the bench: for example initialisation, cleanup or merging (CF
> PTHREAD_BARRIER_SERIAL_THREAD).  Clearly a join point could be not
> much more than a condition variable and some state tracking arrivals
> and departures, but I think that this higher level synchronisation
> primitive might have an advantage over raw condition variables in the
> abort case: it can know the total set of workers that it's waiting for,
> if they are somehow registered with it first, and registration can
> include arranging for cleanup hooks to do the right thing.  It's
> already a requirement for a join point to know which workers exist (or
> at least how many).  The deal would then be that when you call
> joinpoint_join(_joinpoint, phase), it will return only when all
> peers have joined or detached, where the latter happens automatically
> if they abort or die.  Not at all sure of the details yet...  but I
> suspect join points are useful for a bunch of things like parallel
> sort, parallel hash join (my project), and anything else involving
> phases or some form of "fork/join" parallelism.

If I'm right that the abort/die case doesn't really need any special
handling here, then I think this gets a lot simpler.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] condition variables

2016-08-11 Thread Robert Haas
On Thu, Aug 11, 2016 at 6:37 PM, Peter Geoghegan  wrote:
> I notice that you acquire a spinlock within the implementation of
> condition variables. Is it worth any effort to consolidate the number
> of spinlock acquisitions? In other words, maybe the most common idioms
> should be baked into the ConditionVariable interface, which could save
> callers from having to use their own mutex variable.

One thing to keep in mind is that spinlocks are extremely fast as long
as you don't have too many processes contending for them.  With
parallel groups (or I/O-in-progress wait queues) of a single-digit
number of processes, I doubt that consolidating the spinlock
acquisitions will produce any measurable benefit.  If we get to the
point of having parallel groups containing scores of processes, that
could change.

Also, right now we don't have enough users of the CV interface to know
what the most common idioms will be.  We could speculate, but if we
do, I'll start off the speculation by guessing that there will be a lot
of diversity, and not too much that keeps getting repeated.  If that
proves to be wrong, of course, we can always go back and change it
later.  We're not writing this on stone tablets.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] condition variables

2016-08-11 Thread Robert Haas
On Thu, Aug 11, 2016 at 6:12 PM, Tom Lane  wrote:
>> But if we replace the io_in_progress locks with
>> condition variables, then that doesn't happen any more.  Nobody is
>> "holding" the condition variable, so it doesn't get "released" when
>> the process doing I/O aborts.  Instead, they just keep sleeping until
>> the aborting process reaches AbortBufferIO, and then it broadcasts on
>> the condition variable and wakes everybody up, which seems a good deal
>> nicer.
>
> Hmm.  I fear the only reason you see an advantage there is that you don't
> (yet) have any general-purpose mechanism for an aborting transaction to
> satisfy its responsibilities vis-a-vis waiters on condition variables.

In the specific case of parallel query, this problem doesn't really
arise: if an error happens, it will be propagated back to the leader
(unless it occurs in the leader as an initial matter) and the leader
will then kill off all of the other workers.  Since waiting for a
condition variable doesn't block interrupts, the workers will all stop
waiting and die like the obedient lemmings they are.

> Instead, this wins specifically because you stuck some bespoke logic into
> AbortBufferIO.  OK ... but that sounds like we're going to end up with
> every single condition variable that ever exists in the system needing to
> be catered for separately and explicitly during transaction abort cleanup.
> Which does not sound promising from a reliability standpoint.  On the
> other hand, I don't know what the equivalent rule to "release all LWLocks
> during abort" might look like for condition variables, so I don't know
> if it's even possible to avoid that.

I don't think it is possible to avoid that.  Certainly the fact that
LWLocks are automatically released is in most scenarios a big
advantage, but for content locks it isn't.  I note that I didn't so
much insert bespoke logic there as replace the existing bespoke logic
with less-ugly bespoke logic.

One way to conceptualize a condition variable is that it is the wait
queue for an LWLock without the lock itself.  The "lock modes" are
defined by the interaction between the condition checked by waiters
before sleeping and the shared memory updates performed before
signalling or broadcasting.  This separation of concerns allows for
enormous flexibility - you could use CVs to implement an LWLock-like
construct with an arbitrary set of lock modes and an arbitrary
conflict matrix, for example.  But I think that in most cases it will
be best to leave mutual exclusion to LWLocks, and use CVs when what we
need is not mutual exclusion but rather waiting for an operation
performed by some other backend to complete (successfully or
otherwise).  The IO-in-progress example is just such a case: the
current spinning behavior arises from the fact that the buffer's
I/O-in-progress flag does not get cleared until AFTER processes have
been dropped off the wait queue.  The normal pattern with a CV will be
(1) perform a shared memory update that will, if seen by other
processes, confirm that the condition has been met and then (2)
broadcast on the CV.  There's no way for transaction abort to know
what (1) looks like, even if it were somehow made aware of the CV so that
it could do (2).
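
Concretely, the signalling side of that normal pattern might look
about like this (the shared structure and its field names are made up
for illustration):

	/* (1) update shared state under its own mutual exclusion mechanism */
	SpinLockAcquire(&shared->mutex);
	shared->operation_done = true;
	SpinLockRelease(&shared->mutex);

	/* (2) wake all waiters; each rechecks the condition for itself */
	ConditionVariableBroadcast(&shared->cv);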

> I encourage you to pursue this, because indeed LWLocks aren't always
> an ideal solution, but I think it requires some careful thought about
> what transaction aborts will do with condition variables.

Thanks.  As stated above, I believe the right answer is in fact
"nothing", because this facility is too low-level to permit a
categorical answer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] condition variables

2016-08-11 Thread Thomas Munro
On Fri, Aug 12, 2016 at 9:47 AM, Robert Haas  wrote:
> https://en.wikipedia.org/wiki/Monitor_(synchronization)#Condition_variables_2
>
> Basically, a condition variable has three operations: you can wait for
> the condition variable; you can signal the condition variable to wake
> up one waiter; or you can broadcast on the condition variable to wake
> up all waiters.  Atomically with entering the wait, you must be able
> to check whether the condition is satisfied.  So, in my
> implementation, a condition variable wait loop looks like this:
>
> for (;;)
> {
> ConditionVariablePrepareToSleep(cv);
> if (condition for which we are waiting is satisfied)
> break;
> ConditionVariableSleep();
> }
> ConditionVariableCancelSleep();
>
> To wake up one waiter, another backend can call
> ConditionVariableSignal(cv); to wake up all waiters,
> ConditionVariableBroadcast(cv).

It is interesting to compare this interface with Wikipedia's
description, POSIX's pthread_cond_t and C++'s std::condition_variable.

In those interfaces, the wait operation takes a mutex which must
already be held by the caller.  It unlocks the mutex and begins
waiting atomically.  Then when it returns, the mutex is automatically
reacquired.  This approach avoids race conditions as long as the
shared state change you are awaiting is protected by that mutex.  If
you check that state before waiting and while still holding the lock,
you can be sure not to miss any change signals, and then when it
returns you can check the state again and be sure that no one can be
concurrently changing it.

In contrast, this proposal leaves it up to client code to get that
right, similarly to the way you need to do things in a certain order
when waiting for state changes with latches.  You could say that it's
more error prone: I think there have been a few cases of incorrectly
coded latch/state-change wait loops in the past.  On the other hand,
it places no requirements on the synchronisation mechanism the client
code uses for the related shared state.  pthread_cond_wait requires
you to pass in a pointer to the related pthread_mutex_t, whereas with
this proposal client code is free to use atomic ops, lwlocks,
spinlocks or any other mutual exclusion mechanism to coordinate state
changes and deal with cache coherency.

Then there is the question of what happens when the backend that is
supposed to be doing the signalling dies or aborts, which Tom Lane
referred to in his reply.  In those other libraries there is no such
concern: it's understood that these are low level thread
synchronisation primitives and if you're waiting for something that
never happens, you'll be waiting forever.  I don't know what the
answer is in general for Postgres condition variables, but...

The thing that I personally am working on currently that is very
closely related and could use this has a more specific set of
circumstances:  I want "join points" AKA barriers.  Something like
pthread_barrier_t.  (I'm saying "join point" rather than "barrier" to
avoid confusion with compiler and memory barriers, barrier.h etc.)
Join points let you wait for all workers in a known set to reach a
given point, possibly with a phase number or at least sense (one bit
phase counter) to detect synchronisation bugs.  They also select one
worker arbitrarily to receive a different return value when releasing
workers from a join point, for cases where a particular phase of
parallel work needs to be done by exactly one worker while the others
sit on the bench: for example initialisation, cleanup or merging (CF
PTHREAD_BARRIER_SERIAL_THREAD).  Clearly a join point could be not
much more than a condition variable and some state tracking arrivals
and departures, but I think that this higher level synchronisation
primitive might have an advantage over raw condition variables in the
abort case: it can know the total set of workers that it's waiting for,
if they are somehow registered with it first, and registration can
include arranging for cleanup hooks to do the right thing.  It's
already a requirement for a join point to know which workers exist (or
at least how many).  The deal would then be that when you call
joinpoint_join(_joinpoint, phase), it will return only when all
peers have joined or detached, where the latter happens automatically
if they abort or die.  Not at all sure of the details yet...  but I
suspect join points are useful for a bunch of things like parallel
sort, parallel hash join (my project), and anything else involving
phases or some form of "fork/join" parallelism.
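
To make that a bit more concrete, here is a rough sketch of such a
join point built on top of the proposed condition variables.  The
struct, the spinlock bookkeeping, and the function are all
hypothetical, and abort/die handling (the hard part discussed above)
is deliberately omitted:

typedef struct JoinPoint
{
	slock_t		mutex;			/* protects the fields below */
	int			participants;	/* workers registered with this join point */
	int			arrived;		/* workers waiting at the current phase */
	int			phase;			/* current phase number */
	ConditionVariable cv;
} JoinPoint;

/* Returns true in exactly one backend per phase (cf. PTHREAD_BARRIER_SERIAL_THREAD). */
static bool
joinpoint_join(JoinPoint *jp)
{
	int			start_phase;

	SpinLockAcquire(&jp->mutex);
	start_phase = jp->phase;
	if (++jp->arrived == jp->participants)
	{
		/* last to arrive: advance the phase and release everyone */
		jp->arrived = 0;
		jp->phase++;
		SpinLockRelease(&jp->mutex);
		ConditionVariableBroadcast(&jp->cv);
		return true;			/* the "serial" worker for this phase */
	}
	SpinLockRelease(&jp->mutex);

	for (;;)
	{
		bool		released;

		ConditionVariablePrepareToSleep(&jp->cv);
		SpinLockAcquire(&jp->mutex);
		released = (jp->phase != start_phase);
		SpinLockRelease(&jp->mutex);
		if (released)
			break;
		ConditionVariableSleep();
	}
	ConditionVariableCancelSleep();
	return false;
}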

Or perhaps that type of thinking about error handling should be pushed
down to the condition variable.  How would that look: all potential
signallers would have to register to deliver a goodbye signal in their
abort and shmem exit paths?  Then what happens if you die before
registering?  I think even if you find a way to do that I'd still need
to do similar extra work on top for my join 

Re: [HACKERS] [Patch] New psql prompt substitution %r (m = master, r = replica)

2016-08-11 Thread Michael Paquier
On Fri, Aug 12, 2016 at 12:51 AM, Tom Lane  wrote:
> Peter Eisentraut  writes:
>> On 8/11/16 6:04 AM, Aleksander Alekseev wrote:
>>> The suggested patch introduces a %r substitution in psql's prompt. This
>>> substitution allows displaying whether the user is connected to a master
>>> or a replica right in the prompt.
>
>> In the near future, there will (probably) be a lot more variants about
>> what it means to be a master or a replica.  There will be logical
>> replication, where you could be a publisher of something and a consumer
>> of something else.  You could even be a logical consumer but a physical
>> master.  So a global binary facility is probably not very forward
>> looking and will lead to confusion.
>
> Also, the patch as given is broken since it fails to account for the
> server being promoted while a psql session is open.

 {
+{"is_master", PGC_INTERNAL, UNGROUPED,
+gettext_noop("Shows whether the current instance is master or replica."),
+NULL,
+GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+},
+_is_master,
+false,
+NULL, NULL, NULL
+},
Having a GUC for that purpose is not a fruitful approach. And it seems
to me that this patch is dead in the water because it makes prompt
parsing dependent on a status known only by the server, which in the
worst case would require issuing an SQL query based on, for example,
pg_is_in_recovery() in get_prompt(); all the other fields using libpq
routines fetch values defined when the connection is established, like
the session PID or the database name.
-- 
Michael




Re: [HACKERS] new autovacuum criterion for visible pages

2016-08-11 Thread Tom Lane
Michael Paquier  writes:
> In short, autovacuum will need to scan by itself the VM of each
> relation and decide based on that.

That seems like a worthwhile approach to pursue.  The VM is supposed to be
small, and if you're worried it isn't, you could sample a few pages of it.
I do not think any of the ideas proposed so far for tracking the
visibility percentage on-the-fly are very tenable.

regards, tom lane




Re: [HACKERS] [Patch] New psql prompt substitution %r (m = master, r = replica)

2016-08-11 Thread David Fetter
On Thu, Aug 11, 2016 at 01:04:19PM +0300, Aleksander Alekseev wrote:
> Hello.
> 
> The suggested patch introduces a %r substitution in psql's prompt. This
> substitution allows displaying whether the user is connected to a master
> or a replica right in the prompt.

This is a neat idea, but there are some issues.

- There's a new GUC.  This feature is probably not compelling enough to justify that.

- The check, if I understand correctly, is only done on connect, even though it
  could change during a session.

How about something that:

- Allowed setting a query to be executed after each command.

- Put the results of that query into a psql variable.

This differs from \gset in that it would be executed silently at the
end of each command.
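
For comparison, something close to this can already be done by hand
today with \gset plus prompt variable substitution; it just doesn't
refresh after each command, which is the gap this would close:

SELECT CASE WHEN pg_is_in_recovery()
            THEN 'replica' ELSE 'master' END AS noderole \gset
\set PROMPT1 '%:noderole: %/%R%# '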

What say?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] new autovacuum criterion for visible pages

2016-08-11 Thread Michael Paquier
On Thu, Aug 11, 2016 at 4:21 PM, Michael Paquier
 wrote:
> On Thu, Aug 11, 2016 at 3:29 PM, Michael Paquier
>  wrote:
>> On Thu, Aug 11, 2016 at 5:39 AM, Jeff Janes  wrote:
>>> I wanted to create a new relopt named something like
>>> autovacuum_vacuum_pagevisible_factor which would cause autovacuum to
>>> vacuum a table once less than a certain fraction of the relation's
>>> pages are marked allvisible.
>>
>> Interesting idea.
>>
>>> 1) One issue is that pg_class.relpages and pg_class.relallvisible are
>>> themselves only updated by vacuum/analyze.  In the absence of manual
>>> vacuum or analyze, this means that if the new criterion uses those
>>> field, it could only kick in after an autoanalyze has already been
>>> done, which means that autovacuum_vacuum_pagevisible_factor could not
>>> meaningfully be set lower than autovacuum_analyze_scale_factor.
>>>
>>> Should relallvisible be moved/copied from pg_class to
>>> pg_stat_all_tables, so that it is maintained by the stats collector?
>>> Or should the autovacuum worker just walk the vm of every table with a
>>> defined autovacuum_vacuum_pagevisible_factor each time it is launched
>>> to get an up-to-date count that way?
>>
>> relation_needs_vacanalyze has access to Form_pg_class, so it is not a
>> problem to use the value of relallvisible there to decide if a
>> vacuum/analyze should be run.
>
> Doh. I missed your point. One idea perhaps would be to have an
> additional field that updates the number of pages having their VM bits
> cleared, or just decrement relallvisible when that happens, and use
> that in relation_needs_vacanalyze to do the decision-making. But that
> would require updating stats each time a VM bit is cleared in heap
> operations, which would be really costly...
>
> The optimizer does not depend directly on pgstat when fetching the
> estimation information it needs, so it may be wiser not to add this
> dependency.  Also, one can disable pgstat_track_counts, so moving this
> information out of pg_class is not a good idea.

With a somewhat fresher mind...

The main issue with this proposal can be summarized as follows:
track_counts can be disabled by users, so moving relallvisible into
pgstat cannot be done, unless I am missing something. Cleared VM bits
would need to be tracked either by decrementing
pg_class.relallvisible, with a different counter in pg_class, or with
a completely different mechanism. Still, I am scared of the overall
performance impact, because VM bit clearing is quite spread out, so
pg_class, or the new relation where this is tracked, would become
really bloated.

In short, autovacuum will need to scan the VM of each relation by
itself and decide based on that. I would not expect much performance
impact, and disabling the behavior by default would ensure no impact
on existing deployments.
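
(For context, the stale per-relation numbers discussed above can
already be inspected like this, keeping in mind that they are only
refreshed by vacuum/analyze:)

SELECT relname,
       relallvisible::float8 / GREATEST(relpages, 1) AS allvisible_fraction
  FROM pg_class
 WHERE relkind = 'r'
 ORDER BY allvisible_fraction;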
-- 
Michael




Re: [HACKERS] condition variables

2016-08-11 Thread Peter Geoghegan
On Thu, Aug 11, 2016 at 2:47 PM, Robert Haas  wrote:
> Another approach to the problem is to use a latch wait loop.  That
> almost works.  Interrupts can be serviced, and you can recheck shared
> memory to see whether the condition for proceeding is satisfied after
> each iteration of the loop.  There's only one problem: when you do
> something that might cause the condition to be satisfied for other
> waiting backends, you need to set their latch - but you don't have an
> easy way to know exactly which processes are waiting, so how do you
> call SetLatch?

That's what I ended up doing with parallel CREATE INDEX. It worked,
but it would be nice to have a general-purpose facility for waiting
for conditions to change.

> https://en.wikipedia.org/wiki/Monitor_(synchronization)#Condition_variables_2
>
> Basically, a condition variable has three operations: you can wait for
> the condition variable; you can signal the condition variable to wake
> up one waiter; or you can broadcast on the condition variable to wake
> up all waiters.  Atomically with entering the wait, you must be able
> to check whether the condition is satisfied.  So, in my
> implementation, a condition variable wait loop looks like this:
>
> for (;;)
> {
> ConditionVariablePrepareToSleep(cv);
> if (condition for which we are waiting is satisfied)
> break;
> ConditionVariableSleep();
> }
> ConditionVariableCancelSleep();
>
> To wake up one waiter, another backend can call
> ConditionVariableSignal(cv); to wake up all waiters,
> ConditionVariableBroadcast(cv).

This seems convenient.

I notice that you acquire a spinlock within the implementation of
condition variables. Is it worth any effort to consolidate the number
of spinlock acquisitions? In other words, maybe the most common idioms
should be baked into the ConditionVariable interface, which could save
callers from having to use their own mutex variable.

-- 
Peter Geoghegan




Re: [HACKERS] condition variables

2016-08-11 Thread Tom Lane
Robert Haas  writes:
> I had what I think is a better idea, which is to introduce a notion of
> condition variables.

Interesting proposal.

> ... Using condition variables here seems to
> have a couple of advantages.  First, it means that a backend waiting
> for buffer I/O to complete is interruptible.  Second, it fixes a
> long-running bit of nastiness in AbortBufferIO: right now, if a
> backend that is doing buffer I/O aborts, the abort causes it to
> release all of its LWLocks, including the buffer I/O lock. Everyone
> waiting for that buffer busy-loops until the aborting process gets
> around to reacquiring the lock and updating the buffer state in
> AbortBufferIO.  But if we replace the io_in_progress locks with
> condition variables, then that doesn't happen any more.  Nobody is
> "holding" the condition variable, so it doesn't get "released" when
> the process doing I/O aborts.  Instead, they just keep sleeping until
> the aborting process reaches AbortBufferIO, and then it broadcasts on
> the condition variable and wakes everybody up, which seems a good deal
> nicer.

Hmm.  I fear the only reason you see an advantage there is that you don't
(yet) have any general-purpose mechanism for an aborting transaction to
satisfy its responsibilities vis-a-vis waiters on condition variables.
Instead, this wins specifically because you stuck some bespoke logic into
AbortBufferIO.  OK ... but that sounds like we're going to end up with
every single condition variable that ever exists in the system needing to
be catered for separately and explicitly during transaction abort cleanup.
Which does not sound promising from a reliability standpoint.  On the
other hand, I don't know what the equivalent rule to "release all LWLocks
during abort" might look like for condition variables, so I don't know
if it's even possible to avoid that.

I encourage you to pursue this, because indeed LWLocks aren't always
an ideal solution, but I think it requires some careful thought about
what transaction aborts will do with condition variables.

regards, tom lane




Re: [HACKERS] Add hint for function named "is"

2016-08-11 Thread Tom Lane
"David E. Wheeler"  writes:
> On Aug 11, 2016, at 2:11 PM, Jim Nasby  wrote:
>> SELECT 'x'||is();
>> ERROR:  syntax error at or near "("

> Why does it need quotation marks in this case?

It doesn't, if you do something like

regression=# select is();
ERROR:  function is() does not exist
LINE 1: select is();
               ^

which probably contributes to Jim's confusion.  I think what is happening
in the trouble case is that since IS has lower precedence than Op, the
grammar decides it ought to resolve || as a postfix operator, and then
it effectively has
('x' ||) IS ...
which leaves noplace to go except IS NULL and other IS-something syntaxes.
You'd likely have similar problems with any other keyword that has lower
precedence than Op; but a large fraction of those are fully-reserved words
and so no one would have had any expectation of being able to leave them
unquoted anyway.
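
(The workaround is of course to quote the name, so that the lexer
emits an identifier rather than the IS keyword:

regression=# select 'x' || "is"();

A quoted "is" can never be taken as the start of an IS NULL
construct.)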

I'm not sure there's much we can do about this.  Even if we had control of
what Bison prints for syntax errors, which we don't really, it's hard to
see what condition we could trigger the hint on that wouldn't result in
false positives at least as often as something helpful.  (Note that the
grammar's behavior can't really depend on whether a function named is()
actually exists in the catalogs.)

regards, tom lane




[HACKERS] condition variables

2016-08-11 Thread Robert Haas
Hi,

Some of my EnterpriseDB colleagues and I have been working on various
parallel query projects, all of which have been previously disclosed
here:

https://wiki.postgresql.org/wiki/EnterpriseDB_database_server_roadmap

One issue we've encountered is that it's not very easy for one process
in a group of cooperating parallel processes to wait for another
process in that same group.  One idea is to have one process grab an
LWLock and other processes try to acquire it, but that actually
doesn't work very well.  A pretty obvious problem is that it holds off
interrupts for the entire time that you are holding the lock, which is
pretty undesirable.  A more subtle problem is that it's easy to
conceive of situations where the LWLock paradigm is just a very poor
fit for what you actually want to do.  For example, suppose you have a
computation which proceeds in two phases: each backend that finishes
phase 1 must wait until all backends finish phase 1, and once all have
finished, all can begin phase 2.  You could handle this case by having
an LWLock which everyone holds during phase 1 in shared mode, and then
everyone must briefly acquire it in exclusive mode before starting
phase 2, but that's an awful hack.  It also has race conditions: what
if someone finishes phase 1 before everyone has started phase 1?  And
what if there are 10 phases instead of 2?

Another approach to the problem is to use a latch wait loop.  That
almost works.  Interrupts can be serviced, and you can recheck shared
memory to see whether the condition for proceeding is satisfied after
each iteration of the loop.  There's only one problem: when you do
something that might cause the condition to be satisfied for other
waiting backends, you need to set their latch - but you don't have an
easy way to know exactly which processes are waiting, so how do you
call SetLatch?  I originally thought of adding a function like
SetAllLatches(ParallelContext *) and maybe that can work, but then I
had what I think is a better idea, which is to introduce a notion of
condition variables.  Condition variables, of course, are a standard
synchronization primitive:

https://en.wikipedia.org/wiki/Monitor_(synchronization)#Condition_variables_2

Basically, a condition variable has three operations: you can wait for
the condition variable; you can signal the condition variable to wake
up one waiter; or you can broadcast on the condition variable to wake
up all waiters.  Atomically with entering the wait, you must be able
to check whether the condition is satisfied.  So, in my
implementation, a condition variable wait loop looks like this:

for (;;)
{
ConditionVariablePrepareToSleep(cv);
if (condition for which we are waiting is satisfied)
break;
ConditionVariableSleep();
}
ConditionVariableCancelSleep();

To wake up one waiter, another backend can call
ConditionVariableSignal(cv); to wake up all waiters,
ConditionVariableBroadcast(cv).
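
For instance, the two-phase example from above could be coded with
this API more or less as follows; the shared structure with its
spinlock, counter, and worker count is illustrative:

	bool		last;

	/* each worker, on finishing phase 1 */
	SpinLockAcquire(&shared->mutex);
	last = (++shared->nfinished == shared->nworkers);
	SpinLockRelease(&shared->mutex);
	if (last)
		ConditionVariableBroadcast(&shared->cv);

	/* every worker then waits until the last one has finished phase 1 */
	for (;;)
	{
		bool		all_done;

		ConditionVariablePrepareToSleep(&shared->cv);
		SpinLockAcquire(&shared->mutex);
		all_done = (shared->nfinished == shared->nworkers);
		SpinLockRelease(&shared->mutex);
		if (all_done)
			break;
		ConditionVariableSleep();
	}
	ConditionVariableCancelSleep();
	/* ...all workers may now begin phase 2 */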

I am cautiously optimistic that this design will serve a wide variety
of needs for parallel query development - basically anything that
needs to wait for another process to reach a certain point in the
computation that can be detected through changes in shared memory
state.  The attached patch condition-variable-v1.patch implements this
API.  I originally open-coded the wait queue for this, but I've just
finished rebasing it on top of Thomas Munro's proclist stuff, so
before applying this patch you need the one from here:

https://www.postgresql.org/message-id/CAEepm=0vvr9zgwht67rwutfwmeby1gigptbk3xfpdbbgetz...@mail.gmail.com

At some point while hacking on this I realized that we could actually
replace the io_in_progress locks with condition variables; the
attached patch buffer-io-cv-v1.patch does this (it must be applied on
top of the proclist patch from the above email and also on top of
condition-variable-v1.patch).  Using condition variables here seems to
have a couple of advantages.  First, it means that a backend waiting
for buffer I/O to complete is interruptible.  Second, it fixes a
long-running bit of nastiness in AbortBufferIO: right now, if a
backend that is doing buffer I/O aborts, the abort causes it to
release all of its LWLocks, including the buffer I/O lock. Everyone
waiting for that buffer busy-loops until the aborting process gets
around to reacquiring the lock and updating the buffer state in
AbortBufferIO.  But if we replace the io_in_progress locks with
condition variables, then that doesn't happen any more.  Nobody is
"holding" the condition variable, so it doesn't get "released" when
the process doing I/O aborts.  Instead, they just keep sleeping until
the aborting process reaches AbortBufferIO, and then it broadcasts on
the condition variable and wakes everybody up, which seems a good deal
nicer.

I'm very curious to know whether other people like this abstraction
and whether they think it will be useful for things they want to do
with parallel query (or otherwise).  Comments welcome.  

Re: [HACKERS] Add hint for function named "is"

2016-08-11 Thread David E. Wheeler
On Aug 11, 2016, at 2:11 PM, Jim Nasby  wrote:

> CREATE FUNCTION pg_temp.is() RETURNS text LANGUAGE sql AS $$SELECT 
> 'x'::text$$;
> SELECT 'x'||is();
> ERROR:  syntax error at or near "("
> LINE 1: SELECT 'x'||is();
> 
> I was finally able to figure out this was because "is" needs to be quoted; is 
> there a way this could be hinted?
> 
> FWIW, the real-world case here comes from using pgTap, which has an is() 
> function. I've used that countless times by itself without quoting, so it 
> never occurred to me that the syntax error was due to lack of quotes.

Why does it need quotation marks in this case?

D



Re: [HACKERS] Allowing GIN array_ops to work on anyarray

2016-08-11 Thread M Enrique
This is awesome. I will build it to start using and testing it in my
development environment. Thank you so much for making this change.

On Thu, Aug 11, 2016 at 11:33 AM Tom Lane  wrote:

> In
> https://www.postgresql.org/message-id/15293.1466536...@sss.pgh.pa.us
> I speculated that it might not take too much to replace all the variants
> of GIN array_ops with a single polymorphic opclass over anyarray.
> Attached is a proposed patch that does that.
>
> There are two bits of added functionality needed to make this work:
>
> 1. We need to abstract the storage type.  The patch does this by teaching
> catalog/index.c to recognize an opckeytype specified as ANYELEMENT with an
> opcintype of ANYARRAY, and doing the array element type lookup at index
> creation time.
>
> 2. We need to abstract the key comparator.  The patch does this by
> teaching gin/ginutil.c that if the opclass omits a GIN_COMPARE_PROC,
> it should look up the default btree comparator for the index key type.
>
> Both of these seem to me to be reasonable general-purpose behaviors with
> potential application to other opclasses.
>
> In the aforementioned message I worried that a core opclass defined this
> way might conflict with user-built opclasses for specific array types,
> but it seems to work out fine without any additional tweaks: CREATE INDEX
> already prefers an exact match if it finds one, and only falls back to
> matching anyarray when it doesn't.  Also, all the replaced opclasses are
> presently default for their types, which means that pg_dump won't print
> them explicitly in CREATE INDEX commands, so we don't have a dump/reload
> or pg_upgrade hazard from them disappearing.
>
> A potential downside is that for an opclass defined this way, we add a
> lookup_type_cache() call to each initGinState() call.  That's basically
> just a single dynahash lookup once the caches are populated, so it's not
> much added cost, but conceivably it could be measurable in bulk insert
> operations.  If it does prove objectionable my inclination would be to
> look into ways to avoid the repetitive function lookups of initGinState,
> perhaps by letting it cache that stuff in the index's relcache entry.
>
> I'll put this on the September commitfest docket.
>
> regards, tom lane
>
>


Re: [HACKERS] Small issues in syncrep.c

2016-08-11 Thread Alvaro Herrera
Simon Riggs wrote:

> Good catch.
> 
> I've updated Julien's patch to reflect Michael's suggestion.
> 
> Looks good to apply immediately.
> 
> 14e8803f1 was only a partial patch for the syncrep code, so I don't
> see any reason to keep the code as it currently is in 9.5/9.6.
> 
> Any objections to backpatching this to 9.5 and 9.6?

No objection to backpatching; the current state looks like a very
strange coding pattern only.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Add hint for function named "is"

2016-08-11 Thread Jim Nasby
CREATE FUNCTION pg_temp.is() RETURNS text LANGUAGE sql AS $$SELECT 
'x'::text$$;

SELECT 'x'||is();
ERROR:  syntax error at or near "("
LINE 1: SELECT 'x'||is();

I was finally able to figure out this was because "is" needs to be 
quoted; is there a way this could be hinted?


FWIW, the real-world case here comes from using pgTap, which has an is() 
function. I've used that countless times by itself without quoting, so 
it never occurred to me that the syntax error was due to lack of quotes.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] money type overflow checks

2016-08-11 Thread Peter Eisentraut
On 8/5/16 1:14 PM, Tom Lane wrote:
> No, I don't think it's sufficient after a multiplication by 10.  That
> would be enough to shift some bits clear out of the word, but there's
> no certainty that the new sign bit would be 1.
> 
> The scheme used in scanint8 is safe.  But I think it was written that way
> mainly to avoid hard-wired assumptions about how wide int64 is, a
> consideration that's a mite obsolete now.

OK, I did it like int8, and added more tests.  My original patch didn't
get the most negative integer right.
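
For reference, the idiom in isolation: accumulate the magnitude as a
negative number, since the negative range of int64 is one wider, and
detect overflow by checking that the multiplication can be undone.
Roughly (this helper is just an illustration, not code from the
patch):

#include <stdbool.h>
#include <stdint.h>

/*
 * Append one decimal digit to a value accumulated as a negative number.
 * Returns false on overflow.
 */
static bool
accumulate_digit(int64_t *value, char digit)
{
	int64_t		newvalue = (*value * 10) - (digit - '0');

	/* if the multiplication wrapped around, dividing cannot undo it */
	if (newvalue / 10 != *value)
		return false;
	*value = newvalue;
	return true;
}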

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 7c902e5f9a9051923df7d343cb00c93e6774a16e Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 11 Aug 2016 12:00:00 -0400
Subject: [PATCH v2] Add overflow checks to money type input function

The money type input function did not have any overflow checks at all.
There were some regression tests that purported to check for overflow,
but they actually checked for the overflow behavior of the int8 type
before casting to money.  Remove those unnecessary checks and add some
that actually check the money input function.
---
 src/backend/utils/adt/cash.c| 51 --
 src/test/regress/expected/money.out | 85 +
 src/test/regress/sql/money.sql  | 26 ++--
 3 files changed, 146 insertions(+), 16 deletions(-)

diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c
index b336185..b2676e8 100644
--- a/src/backend/utils/adt/cash.c
+++ b/src/backend/utils/adt/cash.c
@@ -189,13 +189,30 @@ cash_in(PG_FUNCTION_ARGS)
 	printf("cashin- string is '%s'\n", s);
 #endif
 
+	/*
+	 * We accumulate the absolute amount in "value" and then apply the sign at
+	 * the end.  (The sign can appear before or after the digits, so it would
+	 * be more complicated to do otherwise.)  Because of the larger range of
+	 * negative signed integers, we build "value" in the negative and then
+	 * flip the sign at the end, catching most-negative-number overflow if
+	 * necessary.
+	 */
+
 	for (; *s; s++)
 	{
 		/* we look for digits as long as we have found less */
 		/* than the required number of decimal places */
 		if (isdigit((unsigned char) *s) && (!seen_dot || dec < fpoint))
 		{
-			value = (value * 10) + (*s - '0');
+			Cash		newvalue = (value * 10) - (*s - '0');
+
+			if (newvalue / 10 != value)
+				ereport(ERROR,
+						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+						 errmsg("value \"%s\" is out of range for type money",
+								str)));
+
+			value = newvalue;
 
 			if (seen_dot)
 				dec++;
@@ -214,11 +231,27 @@ cash_in(PG_FUNCTION_ARGS)
 
 	/* round off if there's another digit */
 	if (isdigit((unsigned char) *s) && *s >= '5')
-		value++;
+		value--;
+
+	if (value > 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("value \"%s\" is out of range for type money",
+						str)));
 
 	/* adjust for less than required decimal places */
 	for (; dec < fpoint; dec++)
-		value *= 10;
+	{
+		Cash		newvalue = value * 10;
+
+		if (newvalue / 10 != value)
+			ereport(ERROR,
+					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+					 errmsg("value \"%s\" is out of range for type money",
+							str)));
+
+		value = newvalue;
+	}
 
 	/*
 	 * should only be trailing digits followed by whitespace, right paren,
@@ -247,7 +280,17 @@ cash_in(PG_FUNCTION_ARGS)
 			str)));
 	}
 
-	result = value * sgn;
+	if (sgn > 0)
+	{
+		result = -value;
+		if (result < 0)
+			ereport(ERROR,
+					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+					 errmsg("value \"%s\" is out of range for type money",
+							str)));
+	}
+	else
+		result = value;
 
 #ifdef CASHDEBUG
 	printf("cashin- result is " INT64_FORMAT "\n", result);
diff --git a/src/test/regress/expected/money.out b/src/test/regress/expected/money.out
index 538235c..3be209d 100644
--- a/src/test/regress/expected/money.out
+++ b/src/test/regress/expected/money.out
@@ -185,6 +185,83 @@ SELECT * FROM money_data;
  $123.46
 (1 row)
 
+-- input checks
+SELECT '1234567890'::money;
+       money       
+-------------------
+ $1,234,567,890.00
+(1 row)
+
+SELECT '12345678901234567'::money;
+           money            
+----------------------------
+ $12,345,678,901,234,567.00
+(1 row)
+
+SELECT '123456789012345678'::money;
+ERROR:  value "123456789012345678" is out of range for type money
+LINE 1: SELECT '123456789012345678'::money;
+               ^
+SELECT '9223372036854775807'::money;
+ERROR:  value "9223372036854775807" is out of range for type money
+LINE 1: SELECT '9223372036854775807'::money;
+               ^
+SELECT '-12345'::money;
+    money    
+-------------
+ -$12,345.00
+(1 row)
+
+SELECT '-1234567890'::money;
+       money        
+--------------------
+ -$1,234,567,890.00
+(1 row)
+
+SELECT '-12345678901234567'::money;
+            money            
+-----------------------------
+ 

Re: [HACKERS] No longer possible to query catalogs for index capabilities?

2016-08-11 Thread Tom Lane
Robert Haas  writes:
> On Thu, Aug 11, 2016 at 11:35 AM, Tom Lane  wrote:
>> Well, if it's unqueryable they won't be able to query it no matter how
>> hard they try ;-).

> Sure, but as this thread already demonstrates, they may also complain
> forcefully about having lost that capability.  You argued when
> removing the pg_am columns that nobody cared about the ability to
> query any of them;

Sir, that's just historical revisionism.  I/we asked at the time whether
people needed any of that info, and heard nothing but crickets.  It was
mentioned multiple times during the development thread, see for example
https://www.postgresql.org/message-id/17342.1439225415%40sss.pgh.pa.us
or
https://www.postgresql.org/message-id/19218.1440604259%40sss.pgh.pa.us
or even the commit message in question (65c5fcd35):

A disadvantage is that SQL-level code can no longer see attributes
of index AMs; in particular, some of the crosschecks in the opr_sanity
regression test are no longer possible from SQL.  We've addressed that
by adding a facility for the index AM to perform such checks instead.
(Much more could be done in that line, but for now we're content if the
amvalidate functions more or less replace what opr_sanity used to do.)
We might also want to expose some sort of reporting functionality, but
this patch doesn't do that.

I will admit that I'd rather minimize than maximize the amount of
information we expose here, but I think that's an entirely defensible
position.

> ... You argue against
> these things on the grounds that they might change later, but the
> overwhelming evidence from posts on this list is that people would
> prefer to have access to APIs that might not be stable rather than
> have no access at all.

That doesn't stop them from bitching when we do change things they
were depending on.  I'm fine with exposing things there is a clear
use-case for, but I do not see that there is a reasonable use-case
for exposing ampredlocks.

regards, tom lane




Re: [HACKERS] No longer possible to query catalogs for index capabilities?

2016-08-11 Thread Tom Lane
Kevin Grittner  writes:
> On Wed, Aug 10, 2016 at 5:14 PM, Tom Lane  wrote:
>> In short, I do not see a good reason to expose ampredlocks at the SQL
>> level, and I think there needs to be a darn good reason to expose any of
>> this stuff, not just "maybe some DBA will think he needs to query this".

> As I said before, there's probably not a lot of benefit exposing it
> *now*, when only btree indexes support fine-grained predicate
> locks; but if we were to add support for one or two more AMs per
> release for the next several releases, it could become a
> significant benefit to a DBA who's trying to figure out a problem
> -- especially if that DBA is not a skilled C programmer.

By then, we might have a better idea of whether a per-AM boolean flag is a
sensible long-term representation or not.  Right now, I say that this does
little except lock us into something we may well wish to get out of.

Also, I'm very skeptical of the implied position that pg_am properties
should explain everything a DBA needs to know about the different AMs.
That's never been even remotely true, and if that's the sort of standard
we wish to strive for, an API that can return a few boolean properties
ain't gonna cut it.  I think we should limit our ambitions here to
exposing properties that *applications* demonstrably have uses for.

regards, tom lane




[HACKERS] Allowing GIN array_ops to work on anyarray

2016-08-11 Thread Tom Lane
In
https://www.postgresql.org/message-id/15293.1466536...@sss.pgh.pa.us
I speculated that it might not take too much to replace all the variants
of GIN array_ops with a single polymorphic opclass over anyarray.
Attached is a proposed patch that does that.

There are two bits of added functionality needed to make this work:

1. We need to abstract the storage type.  The patch does this by teaching
catalog/index.c to recognize an opckeytype specified as ANYELEMENT with an
opcintype of ANYARRAY, and doing the array element type lookup at index
creation time.

2. We need to abstract the key comparator.  The patch does this by
teaching gin/ginutil.c that if the opclass omits a GIN_COMPARE_PROC,
it should look up the default btree comparator for the index key type.

Both of these seem to me to be reasonable general-purpose behaviors with
potential application to other opclasses.
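
To illustrate how the two pieces fit together, the resulting
polymorphic opclass is roughly equivalent to the following SQL
definition: the STORAGE clause relies on point 1, and the absence of a
FUNCTION 1 (compare) entry relies on point 2.  (The patch defines this
directly in the system catalogs; the SQL form below is shown only for
clarity and is not part of the patch.)

CREATE OPERATOR CLASS array_ops
    DEFAULT FOR TYPE anyarray USING gin AS
        OPERATOR 1  && (anyarray, anyarray),
        OPERATOR 2  @> (anyarray, anyarray),
        OPERATOR 3  <@ (anyarray, anyarray),
        OPERATOR 4  =  (anyarray, anyarray),
        FUNCTION 2  ginarrayextract(anyarray, internal, internal),
        FUNCTION 3  ginqueryarrayextract(anyarray, internal, int2, internal, internal, internal, internal),
        FUNCTION 4  ginarrayconsistent(internal, int2, anyarray, int4, internal, internal, internal, internal),
        STORAGE anyelement;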

In the aforementioned message I worried that a core opclass defined this
way might conflict with user-built opclasses for specific array types,
but it seems to work out fine without any additional tweaks: CREATE INDEX
already prefers an exact match if it finds one, and only falls back to
matching anyarray when it doesn't.  Also, all the replaced opclasses are
presently default for their types, which means that pg_dump won't print
them explicitly in CREATE INDEX commands, so we don't have a dump/reload
or pg_upgrade hazard from them disappearing.

A potential downside is that for an opclass defined this way, we add a
lookup_type_cache() call to each initGinState() call.  That's basically
just a single dynahash lookup once the caches are populated, so it's not
much added cost, but conceivably it could be measurable in bulk insert
operations.  If it does prove objectionable my inclination would be to
look into ways to avoid the repetitive function lookups of initGinState,
perhaps by letting it cache that stuff in the index's relcache entry.

I'll put this on the September commitfest docket.

regards, tom lane

diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
index 05d92eb..d0d2ba8 100644
*** a/doc/src/sgml/gin.sgml
--- b/doc/src/sgml/gin.sgml
***
*** 441,462 
   
  
   
!There are three methods that an operator class for
 GIN must provide:
  
!  
! 
!  int compare(Datum a, Datum b)
!  
!   
!Compares two keys (not indexed items!) and returns an integer less than
!zero, zero, or greater than zero, indicating whether the first key is
!less than, equal to, or greater than the second.  Null keys are never
!passed to this function.
!   
!  
! 
! 
  
   Datum *extractValue(Datum itemValue, int32 *nkeys,
  bool **nullFlags)
--- 441,450 
   
  
   
!There are two methods that an operator class for
 GIN must provide:
  
!   
  
   Datum *extractValue(Datum itemValue, int32 *nkeys,
  bool **nullFlags)
***
*** 645,651 
--- 633,670 
   
  

+  
+ 
+  
+   In addition, GIN must have a way to sort the key values stored in the index.
+   The operator class can define the sort ordering by specifying a comparison
+   method:
  
+   
+ 
+  int compare(Datum a, Datum b)
+  
+   
+Compares two keys (not indexed items!) and returns an integer less than
+zero, zero, or greater than zero, indicating whether the first key is
+less than, equal to, or greater than the second.  Null keys are never
+passed to this function.
+   
+  
+ 
+   
+ 
+   Alternatively, if the operator class does not provide a compare
+   method, GIN will look up the default btree operator class for the index
+   key data type, and use its comparison function.  It is recommended to
+   specify the comparison function in a GIN operator class that is meant for
+   just one data type, as looking up the btree operator class costs a few
+   cycles.  However, polymorphic GIN operator classes (such
+   as array_ops) typically cannot specify a single comparison
+   function.
+  
+ 
+  
Optionally, an operator class for GIN can supply the
following method:
  
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index a2450f4..8bfa924 100644
*** a/src/backend/access/gin/ginutil.c
--- b/src/backend/access/gin/ginutil.c
***
*** 22,28 
--- 22,30 
  #include "miscadmin.h"
  #include "storage/indexfsm.h"
  #include "storage/lmgr.h"
+ #include "utils/builtins.h"
  #include "utils/index_selfuncs.h"
+ #include "utils/typcache.h"
  
  
  /*
*** initGinState(GinState *state, Relation i
*** 104,112 
  		origTupdesc->attrs[i]->attcollation);
  		}
  
! 		fmgr_info_copy(&(state->compareFn[i]),
! 	   index_getprocinfo(index, i + 1, GIN_COMPARE_PROC),
! 	   CurrentMemoryContext);
  		fmgr_info_copy(&(state->extractValueFn[i]),
  	   index_getprocinfo(index, i + 1, 

Re: [HACKERS] No longer possible to query catalogs for index capabilities?

2016-08-11 Thread Joshua D. Drake

On 08/11/2016 10:46 AM, Kevin Grittner wrote:

On Wed, Aug 10, 2016 at 5:14 PM, Tom Lane  wrote:

Kevin Grittner  writes:



But a DBA who has a problem doesn't care what the truth will be in
a year or two -- the interest is in *right now* on one particular
cluster.


If you are a DBA wanting to know how fine-grained the locking is
in a particular index type, you really need to read the source code
or ask a hacker.




This has to be a joke. With the greatest respect, this shows a world of 
disconnect from the people that actually use this software.


JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.




Re: [HACKERS] LWLocks in DSM memory

2016-08-11 Thread Robert Haas
On Thu, Jul 28, 2016 at 6:51 PM, Thomas Munro
 wrote:
> That version didn't actually make LWLock any smaller, because of the
> additional offset stored in proclist_head when initialising the list.
> Here is a new version that requires callers to provide it when
> accessing the list, bringing sizeof(LWLock) down to 16 on LP64/LLP64
> systems.  In the spirit of the dlist_container macro, I added macros
> so you can name the link member rather than providing its offset:
> proclist_push_tail(&list, procno, lwWaitLink).

I have reviewed this patch.  I like the design and overall I would say
that the patch looks quite elegant.  The only problem I found is that
proclist_types.h doesn't seem to need to #include <stddef.h>.  I tried
taking that out, and, at least on my machine, it still compiles just
fine.

Of course, performance is a concern, as with any change that touches
deep internals.  Andres and Thomas seem to be in agreement that this
patch should cause a performance loss but that it should be
undetectable in practice, since only the slow path where we're
entering the kernel anyway is affected.  I agree with that conclusion.
I spent a little bit of time trying to think of a case where this
would have a measurable impact, and I'm not coming up with anything.
We've done a lot of work over the last few releases to try to
eliminate LWLock contention, and the only way you could even
theoretically see an impact from this is if you can come up with a
workload with furious LWLock contention so that there is lots and lots
of linked-list manipulation going on inside lwlock.c.  But I don't
really know of a workload that behaves that way, and even if I did, I
think the number of cycles we're talking about here is too small to be
measurable.  Having to put processes to sleep is going to be multiple
orders of magnitude more costly than any possible details of how we
handle the linked list manipulation.

I also agree with the goal of the patch: the reason why I developed
the idea of LWLock tranches was with the goal of being able to put
LWLocks inside DSM segments, and that worked fine up until Andres
changed lwlock.c to use dlists.  This change will get us back to being
able to use LWLocks inside of DSM segments.  Of course, we have no
code like that today; if we did, it would be broken.  But Thomas will
be submitting code that does exactly that in the near future, and I
suspect other people will want to do it, too. As a fringe benefit, I
have another patch I'm working on that can make use of this proclist
infrastructure.

Therefore, I plan to commit this patch, removing the #include
<stddef.h> unless someone convinces me we need it, shortly after
development for v10 opens, unless there are objections before then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No longer possible to query catalogs for index capabilities?

2016-08-11 Thread Kevin Grittner
On Wed, Aug 10, 2016 at 5:14 PM, Tom Lane  wrote:
> Kevin Grittner  writes:

>> Where [whether the AM supports predicate locking at a
>> granularity finer than the provided default of index relation]
>> would be interesting to know is [ ... ] (when there is more than
>> just btree supported, so it is not so easy to remember) if you
>> are a DBA investigating a high rate of serialization failures
>> and want to know whether indexes of a certain type have
>> index-relation predicate locking granularity or something more
>> fine-grained.  [This] use case seems plausible once there is
>> more of a mix of support among the AMs.
>
> TBH, that line of thought impresses me not at all, because I do not see
> a reason for SQL queries to need to see internal behaviors of AMs, and
> especially not at levels as crude as boolean properties of entire AMs,
> because that's just about guaranteed to become a lie (or at least not
> enough of the truth) in a year or two.

But a DBA who has a problem doesn't care what the truth will be in
a year or two -- the interest is in *right now* on one particular
cluster.

> If you are a DBA wanting to know how fine-grained the locking is
> in a particular index type, you really need to read the source code
> or ask a hacker.

Really?

> In short, I do not see a good reason to expose ampredlocks at the SQL
> level, and I think there needs to be a darn good reason to expose any of
> this stuff, not just "maybe some DBA will think he needs to query this".

As I said before, there's probably not a lot of benefit exposing it
*now*, when only btree indexes support fine-grained predicate
locks; but if we were to add support for one or two more AMs per
release for the next several releases, it could become a
significant benefit to a DBA who's trying to figure out a problem
-- especially if that DBA is not a skilled C programmer.  So, the
next question is, if it is somewhat likely to become useful as an
exposed property in some future release, are we better off exposing
it now, even though it might not be referenced much, or wait until
we see demand popping up on the lists by people needing the
information to solve problems they are having?
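
For reference, the sort of query under discussion still works on 9.5 and
earlier, where pg_am exposed these booleans directly (a minimal sketch;
on 9.6 the columns are gone from pg_am):

    SELECT amname, ampredlocks, amoptionalkey
    FROM pg_am
    ORDER BY amname;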

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Heap WARM Tuples - Design Draft

2016-08-11 Thread Claudio Freire
On Thu, Aug 11, 2016 at 11:07 AM, Jim Nasby  wrote:
> On 8/10/16 12:48 PM, Claudio Freire wrote:
>>
>> On Tue, Aug 9, 2016 at 11:39 PM, Jim Nasby 
>> wrote:
>>>
>>> On 8/9/16 6:44 PM, Claudio Freire wrote:


 Since we can lookup all occurrences of k1=a index=0 and k2=a index=0,
 and in fact we probably did so already as part of the update logic
>>>
>>>
>>>
>>> That's a change from what currently happens, right?
>>>
>>> The reason I think that's important is that dropping the assumption that
>>> we
>>> can't safely re-find index entries from the heap opens up other
>>> optimizations, ones that should be significantly simpler to implement.
>>> The
>>> most obvious example being getting rid of full index scans in vacuum.
>>> While
>>> that won't help with write amplification, it would reduce the cost of
>>> vacuum
>>> enormously. Orders of magnitude wouldn't surprise me in the least.
>>>
>>> If that's indeed a prerequisite to WARM it would be great to get that
>>> groundwork laid early so others could work on other optimizations it
>>> would
>>> enable.
>>
>>
>> I can do that. I've been prospecting the code to see what changes it
>> would entail already.
>>
>> But it's still specific to btree, I'm not sure the same optimizations
>> can be applied to GIN (maybe, if the posting list is sorted) or GIST
>> (probably, since it's like a btree, but I don't know the code well
>> enough).
>>
>> Certainly hash indexes won't support it.
>
>
> Why not? If this is all predicated on re-finding index keys based on heap
> data then this is just another index lookup, no?

A lookup on a hash index cannot be made to work for both key-only
lookups and key-ctid lookups; it's a limitation of the data structure.

A key-only lookup can potentially return too many results that don't
match the ctid, so a walk of all equal-key item pointers is out of the
question.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Improved ICU patch - WAS: [HACKERS] Implementing full UTF-8 support (aka supporting 0x00)

2016-08-11 Thread Peter Geoghegan
On Thu, Aug 11, 2016 at 4:22 AM, Palle Girgensohn  wrote:
> But in your strxfrm code in PostgreSQL, the keys are cached, and represented
> as int64:s if I remember correctly, so perhaps there is still a benefit
> using the abbreviated keys? More testing is required, I guess...

ICU's ucol_nextSortKeyPart() interface is faster for this, I believe,
and works because you only need the first sizeof(Datum) bytes (4 or 8
bytes). So, you have the right idea about using it (at least for the
abbreviated key stuff), but the second last argument needs to be
sizeof(Datum).

The whole point of the abbreviated key optimization is that you can
avoid pointer chasing during each and every comparison. Your test here
is invalid because it doesn't involved the reuse of the keys. Often,
during a sort, each item has to be compared about 20 times.

I've experimented with ICU, and know it works well with this. You
really need to create a scenario with a real sort, and all the
conditions I describe.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.6 phrase search distance specification

2016-08-11 Thread Ryan Pedela
On Thu, Aug 11, 2016 at 10:42 AM, Ryan Pedela 
wrote:

> On Thu, Aug 11, 2016 at 9:27 AM, Oleg Bartunov 
> wrote:
>
>> On Tue, Aug 9, 2016 at 9:59 PM, Ryan Pedela 
>> wrote:
>> >
>> >
>>
>> >  I would say that it is worth it to have a "phrase slop" operator
>> (Apache
>> > Lucene terminology). Proximity search is extremely useful for improving
>> > relevance and phrase slop is one of the tools to achieve that.
>> >
>>
>> It'd be great if you could explain what "phrase slop" is. I assume it's
>> not about search, but about relevance.
>>
>
> Sure. An exact phrase query has slop = 0 which means find all terms in the
> exact positions relative to each other. Phrase query with slop > 0 means
> find all terms within <slop> positions relative to each other. If slop =
> 10, find all terms within 10 positions of each other. Here is a concrete
> example from my current work searching SEC filings.
>
> Bill Gates' full legal name is William H. Gates, III. In the SEC database
> [1], his name is GATES WILLIAM H III. If you are searching the records of
> people within the SEC database and you want to find Bill Gates, most users
> will type "bill gates". Since there are many people with the first name
> Bill (William) and the last name Gates, Bill Gates most likely won't be the
> first result with a standard keyword query. Likewise an exact phrase query
> (slop = 0) will not find him either because the first and last names are
> transposed. What you need is a phrase query with a slop = 2 which will
> match "William Gates", "William H Gates", "Gates William", etc. There is
> still the issue of Bill vs William, but that can be solved with synonyms
> and is a different topic.
>
> 1. https://www.sec.gov/cgi-bin/browse-edgar?CIK=902012
> =exclude=getcompany=Search
>


One more thing. In that trivial example, an AND query would probably do a
great job too. However if you are searching for Bill Gates in large text
documents rather than a list of names, an AND query will not give you very
good results because the words "bill" and "gates" are so common.
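
For comparison, 9.6's new phrase operators express exact, ordered
distances rather than Lucene-style slop (a minimal sketch; the sample
strings are illustrative):

    -- <-> means "immediately followed by"; <N> means exactly N positions apart
    SELECT to_tsvector('english', 'william h gates')
           @@ to_tsquery('english', 'william <2> gates');   -- true
    SELECT to_tsvector('english', 'gates william')
           @@ to_tsquery('english', 'william <-> gates');   -- false: order and
                                                            -- distance are exact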


Re: [HACKERS] 9.6 phrase search distance specification

2016-08-11 Thread Ryan Pedela
On Thu, Aug 11, 2016 at 9:27 AM, Oleg Bartunov  wrote:

> On Tue, Aug 9, 2016 at 9:59 PM, Ryan Pedela 
> wrote:
> >
> >
>
> >  I would say that it is worth it to have a "phrase slop" operator (Apache
> > Lucene terminology). Proximity search is extremely useful for improving
> > relevance and phrase slop is one of the tools to achieve that.
> >
>
> It'd be great if you could explain what "phrase slop" is. I assume it's
> not about search, but about relevance.
>

Sure. An exact phrase query has slop = 0 which means find all terms in the
exact positions relative to each other. Phrase query with slop > 0 means
find all terms within <slop> positions relative to each other. If slop =
10, find all terms within 10 positions of each other. Here is a concrete
example from my current work searching SEC filings.

Bill Gates' full legal name is William H. Gates, III. In the SEC database
[1], his name is GATES WILLIAM H III. If you are searching the records of
people within the SEC database and you want to find Bill Gates, most users
will type "bill gates". Since there are many people with the first name
Bill (William) and the last name Gates, Bill Gates most likely won't be the
first result with a standard keyword query. Likewise an exact phrase query
(slop = 0) will not find him either because the first and last names are
transposed. What you need is a phrase query with a slop = 2 which will
match "William Gates", "William H Gates", "Gates William", etc. There is
still the issue of Bill vs William, but that can be solved with synonyms
and is a different topic.

1. https://www.sec.gov/cgi-bin/browse-edgar?CIK=902012=exclude=
getcompany=Search

Thanks,
Ryan


Re: [HACKERS] No longer possible to query catalogs for index capabilities?

2016-08-11 Thread Greg Stark
On Thu, Aug 11, 2016 at 5:24 PM, Robert Haas  wrote:
>   You argue against
> these things on the grounds that they might change later, but the
> overwhelming evidence from posts on this list is that people would
> prefer to have access to APIs that might not be stable rather than
> have no access at all.

I don't think it's useful to take this ultimatum approach. I would say
abstraction boundaries are a fairly well-proven C.S. tool at this
point -- and indeed by sometime last century. The real question is
where do the benefits outweigh the costs and that's going to be a
question of balancing conflicting priorities. Not one where an
ultimatum is justified.

So the real question is, are index access methods a place where we
want to take short cuts and just expose internals or is this a place
where we should spend the effort to design good abstractions? At face
value it certainly seems like a line worth defending but historically
it's been a kind of half-hearted abstraction since it was never clear
what new access methods might need and how to abstractly define every
possible attribute they might have. And the push away from SQL-defined
attributes seems to be conceding defeat on that front.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No longer possible to query catalogs for index capabilities?

2016-08-11 Thread Robert Haas
On Thu, Aug 11, 2016 at 11:35 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Aug 10, 2016 at 6:14 PM, Tom Lane  wrote:
>>> In short, I do not see a good reason to expose ampredlocks at the SQL
>>> level, and I think there needs to be a darn good reason to expose any of
>>> this stuff, not just "maybe some DBA will think he needs to query this".
>
>> I don't think you're being unreasonable, but I don't agree with your
>> approach.  I think that we should expose everything we reasonably can,
>> and if we have to change it later then it will be a backward
>> compatibility break.  Making it unqueryable in the hopes that people
>> won't try to query it is futile.
>
> Well, if it's unqueryable they won't be able to query it no matter how
> hard they try ;-).

Sure, but as this thread already demonstrates, they may also complain
forcefully about having lost that capability.  You argued when
removing the pg_am columns that nobody cared about the ability to
query any of them; when that turned out to be false, you argued that
they could hard-code the index types instead of querying for
capabilities; when there was opposition to that, you fell back to your
present position of arguing that only the smallest possible subset of
it should be made queryable.  This is basically the same argument we
have every time somebody wants to remove a "static" from a function
prototype or stick a PGDLLIMPORT on a variable.  You argue against
these things on the grounds that they might change later, but the
overwhelming evidence from posts on this list is that people would
prefer to have access to APIs that might not be stable rather than
have no access at all.  I don't expect this post or any other to
convince you that such a view is in fact sensible, but I could hope
that at some point you might be willing to admit that it's
widely-held.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] new autovacuum criterion for visible pages

2016-08-11 Thread Jeff Janes
On Thu, Aug 11, 2016 at 8:32 AM, Amit Kapila  wrote:
> On Thu, Aug 11, 2016 at 2:09 AM, Jeff Janes  wrote:
>> I wanted to create a new relopt named something like
>> autovacuum_vacuum_pagevisible_factor which would cause autovacuum to
>> vacuum a table once less than a certain fraction of the relation's
>> pages are marked allvisible.
>>
>
> Why would it be more convenient for a user to set such a parameter as
> compared to existing parameters (autovacuum_vacuum_threshold +
> autovacuum_vacuum_scale_factor)?

Insertions and HOT-updates clear vm bits but don't increment the
counters that those existing parameters are compared to.

Also, the relationship between the number of updated/deleted rows and the
number of visibility-map bits cleared can be hard to predict due to
possible clustering of the updates into the same blocks.  So it would be
hard to know what to set the values to.
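
As a point of reference, the quantity such a relopt would key off of can
already be eyeballed from the estimates VACUUM and ANALYZE leave in
pg_class (a rough sketch only, since these counters are approximate):

    SELECT relname, relallvisible, relpages,
           round(relallvisible::numeric / greatest(relpages, 1), 3)
               AS allvisible_frac
    FROM pg_class
    WHERE relkind = 'r'
    ORDER BY allvisible_frac;

    -- The proposed relopt (hypothetical, not implemented) would then be
    -- set per table, e.g.:
    -- ALTER TABLE foo SET (autovacuum_vacuum_pagevisible_factor = 0.8);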

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Patch] New psql prompt substitution %r (m = master, r = replica)

2016-08-11 Thread Tom Lane
Peter Eisentraut  writes:
> On 8/11/16 6:04 AM, Aleksander Alekseev wrote:
>> The suggested patch introduces a %r substitution in psql's prompt. This
>> substitution allows displaying whether the user is connected to a master
>> or a replica right in the prompt.

> In the near future, there will (probably) be a lot more variants of
> what it means to be a master or a replica.  There will be logical
> replication, where you could be a publisher of something and a consumer
> of something else.  You could even be a logical consumer but a physical
> master.  So a global binary facility is probably not very forward
> looking and will lead to confusion.

Also, the patch as given is broken since it fails to account for the
server being promoted while a psql session is open.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Patch] New psql prompt substitution %r (m = master, r = replica)

2016-08-11 Thread Peter Eisentraut
On 8/11/16 6:04 AM, Aleksander Alekseev wrote:
> The suggested patch introduces a %r substitution in psql's prompt. This
> substitution allows displaying whether the user is connected to a master
> or a replica right in the prompt.

In the near future, there will (probably) be a lot more variants of
what it means to be a master or a replica.  There will be logical
replication, where you could be a publisher of something and a consumer
of something else.  You could even be a logical consumer but a physical
master.  So a global binary facility is probably not very forward
looking and will lead to confusion.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] No longer possible to query catalogs for index capabilities?

2016-08-11 Thread Tom Lane
Robert Haas  writes:
> On Wed, Aug 10, 2016 at 6:14 PM, Tom Lane  wrote:
>> In short, I do not see a good reason to expose ampredlocks at the SQL
>> level, and I think there needs to be a darn good reason to expose any of
>> this stuff, not just "maybe some DBA will think he needs to query this".

> I don't think you're being unreasonable, but I don't agree with your
> approach.  I think that we should expose everything we reasonably can,
> and if we have to change it later then it will be a backward
> compatibility break.  Making it unqueryable in the hopes that people
> won't try to query it is futile.

Well, if it's unqueryable they won't be able to query it no matter how
hard they try ;-).  But my point here is that up to now, we never had the
opportunity to draw a line between user-visible and non-user-visible AM
properties; if it needed to be in pg_am, that's where it went.  Now that
we do have an opportunity, we should draw the line in an intelligent
fashion, not blindly assume that everything that was in pg_am should
remain exposed.  I think that neither amoptionalkey nor ampredlocks is
of real use to applications, and there are easily foreseeable reasons
why they would disappear or change behavior substantially.  So I feel
we should leave them out of the API.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] improved DefElem list processing

2016-08-11 Thread Peter Eisentraut
On 8/5/16 11:25 AM, Peter Eisentraut wrote:
> On 8/4/16 2:21 PM, Tom Lane wrote:
>> Forgot to mention: seems like you should have added a location
>> argument to makeDefElem.
> 
> I was hesitating to do that lest it break extensions or something, but I
> guess we break bigger things than that all the time.  I'll change it.

In order not to work on two patches that directly conflict with each
other, I have proceeded with the location patch and postponed the
duplicate checking patch.

Attached is a biggish patch to review.  It adds location information to
all places DefElems are created in the parser and then adds errposition
information in a lot of places, but surely not all of them.  That can be
improved over time.

I'm not happy that utils/acl.h has prototypes for aclchk.c, because
acl.h is included all over the place.  Perhaps I should make a
src/include/catalog/aclchk.h to clean that up.

Here are some example commands to try for getting suitable error messages:

create collation foo (foo = bar, bar = foo);
copy test from stdin (null 'x', null 'x');
create function foo (a int, b int) returns int as $$ select a+b $$
language sql language sql;
create function foo (a int, b int) returns int as $$ select a+b $$
language sql volatile stable;
create function foo (a int, b int) returns int as $$ select a+b $$
language sql with (foo = bar);
create sequence foo minvalue 1 minvalue 2;
create type foo (foo = bar);
create user foo createdb nocreatedb;
explain (foo, bar) select 1;

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 8751bc7fe1ac057e0b66bf879d5e1988d132db38 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 11 Aug 2016 12:00:00 -0400
Subject: [PATCH v2] Add location field to DefElem

Add a location field to the DefElem struct, used to parse many utility
commands.  Update various error messages to supply error position
information.

To propagate the error position information in a more systematic way,
create a ParseState in standard_ProcessUtility() and pass that to
interested functions implementing the utility commands.  This seems
better than passing the query string and then reassembling a parse state
ad hoc, which violates the encapsulation of the ParseState type.
---
 contrib/file_fdw/file_fdw.c|  16 +-
 src/backend/access/common/reloptions.c |   2 +-
 src/backend/catalog/aclchk.c   |   8 +-
 src/backend/commands/aggregatecmds.c   |   7 +-
 src/backend/commands/collationcmds.c   |   5 +-
 src/backend/commands/copy.c|  93 ++
 src/backend/commands/dbcommands.c  |  61 +++---
 src/backend/commands/define.c  |   9 -
 src/backend/commands/explain.c |   8 +-
 src/backend/commands/extension.c   |  25 ++-
 src/backend/commands/functioncmds.c|  57 +++---
 src/backend/commands/sequence.c|  36 ++--
 src/backend/commands/tsearchcmds.c |   8 +-
 src/backend/commands/typecmds.c|   8 +-
 src/backend/commands/user.c|  41 ++--
 src/backend/commands/view.c|   4 +-
 src/backend/nodes/copyfuncs.c  |   1 +
 src/backend/nodes/equalfuncs.c |   2 +
 src/backend/nodes/makefuncs.c  |   6 +-
 src/backend/nodes/outfuncs.c   |   1 +
 src/backend/nodes/readfuncs.c  |   1 +
 src/backend/parser/gram.y  | 248 -
 src/backend/parser/parse_utilcmd.c |   5 +-
 src/backend/replication/logical/logicalfuncs.c |   2 +-
 src/backend/replication/repl_gram.y|  16 +-
 src/backend/tcop/utility.c |  64 ---
 src/include/commands/collationcmds.h   |   2 +-
 src/include/commands/copy.h|   7 +-
 src/include/commands/dbcommands.h  |   4 +-
 src/include/commands/defrem.h  |  13 +-
 src/include/commands/explain.h |   3 +-
 src/include/commands/extension.h   |   4 +-
 src/include/commands/sequence.h|   5 +-
 src/include/commands/typecmds.h|   2 +-
 src/include/commands/user.h|   3 +-
 src/include/nodes/makefuncs.h  |   4 +-
 src/include/nodes/parsenodes.h |   1 +
 src/include/utils/acl.h|   4 +-
 38 files changed, 438 insertions(+), 348 deletions(-)

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index c049131..1d0049f 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -293,7 +293,7 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 	/*
 	 * Now apply the core COPY code's validation logic for more checks.
 	 */
-	ProcessCopyOptions(NULL, true, other_options);
+	ProcessCopyOptions(NULL, NULL, true, 

Re: [HACKERS] new autovacuum criterion for visible pages

2016-08-11 Thread Amit Kapila
On Thu, Aug 11, 2016 at 2:09 AM, Jeff Janes  wrote:
> I wanted to create a new relopt named something like
> autovacuum_vacuum_pagevisible_factor which would cause autovacuum to
> vacuum a table once less than a certain fraction of the relation's
> pages are marked allvisible.
>

Why would it be more convenient for a user to set such a parameter as
compared to existing parameters (autovacuum_vacuum_threshold +
autovacuum_vacuum_scale_factor)?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.6 phrase search distance specification

2016-08-11 Thread Oleg Bartunov
On Tue, Aug 9, 2016 at 9:59 PM, Ryan Pedela  wrote:
>
>

>  I would say that it is worth it to have a "phrase slop" operator (Apache
> Lucene terminology). Proximity search is extremely useful for improving
> relevance and phrase slop is one of the tools to achieve that.
>

It'd be great if you could explain what "phrase slop" is. I assume it's
not about search, but about relevance.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extract text from XML

2016-08-11 Thread Tobias Bussmann
> I have found a basic use case which is supported by the xml2 module,
> but is unsupported by the new XML API. 
> It is not possible to correctly extract text

Indeed. I came across this shortcoming some months ago myself but still had 
an item on my ToDo list to report it here, as the deprecation notice at 
https://www.postgresql.org/docs/devel/static/xml2.html#AEN180625 asks for. 
Done, thanks ;)

I did some archive-browsing on that topic. The issue (if you want to call it 
that) was introduced by a patch to ensure xpath() always returns xml, 
applied for 9.2 after some discussion: 
https://www.postgresql.org/message-id/201106291934.23089.rsmogura%40softperience.eu
 and has been known since then: 
https://www.postgresql.org/message-id/1409795403248-5817667.post%40n5.nabble.com
 The new behaviour was later reported as a bug and discussed again: 
https://www.postgresql.org/message-id/CAAY5AM1L83y79rtOZAUJioREO6n4%3DXAFKcGu6qO3hCZE1yJytg%40mail.gmail.com

Anyhow - (un)escaping functions to support the text<->xml conversion are 
often talked about but still seem only to be found in the xml2 module. Given 
the xmltable-implementing patch posted here recently, such functions would 
be another step toward finally making the contrib module obsolete.

> Perhaps a function xpath_value(text, xml) -> text[] would close the gap?

such a design, resembling the xml2 behaviour, would certainly fit the need, 
imho.
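
To make the gap concrete, a minimal sketch of the current behaviour (the
document literal is illustrative): because xpath() must return xml, text
nodes come back with predefined entities still escaped:

    SELECT (xpath('/doc/text()', '<doc>&lt;tag&gt;</doc>'::xml))[1];
    -- yields the xml value &lt;tag&gt;, not the text "<tag>"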

regards
Tobias

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical Replication WIP

2016-08-11 Thread Petr Jelinek

Hi,

On 11/08/16 13:34, Stas Kelvich wrote:


* max_logical_replication_workers is mentioned everywhere in the docs, but
guc.c defines a variable called max_logical_replication_processes for
postgresql.conf



Ah changed it in code but not in docs, will fix.


* Since pg_subscription is already shared across the cluster, it could also
be handy to share pg_publication too and allow publication of tables from
different databases. Those are rare scenarios but quite important for the
virtual-hosting use case — tons of small databases in a single postgres
cluster.


You can't decode changes from multiple databases in one slot, so I don't 
see the usefulness there. pg_subscription is currently shared out of 
technical necessity (as in, I don't see how else to solve the need to 
access the catalog from the launcher), not because I think it's great 
design :)




* There is no way to see the tables/schemas attached to a publication
through \drp



That's mostly intentional, as the publications for a table are visible in 
\d, but I am not against adding it to \drp.



* As far as I understand there is no way to add a table/tablespace right in
CREATE PUBLICATION, and one needs to explicitly do ALTER PUBLICATION right
after creation. Maybe add something like WITH TABLE/TABLESPACE to CREATE?



Yes, as I said to Masahiko Sawada, it's just not there yet but I plan to 
have that.



* So the binary protocol goes into core. Is it still possible to use it as
a decoding plugin for a manually created walsender? Maybe also include JSON
as it was in pglogical? While I'm not arguing that it should be done, I'm
interested in your opinion on that.



Well, the plugin is a bit more integrated into the publication infra, so 
if somebody wanted to use it directly they'd have to use that part as 
well. OTOH the protocol itself is provided as an API, so it's reusable by 
other plugins if needed.


JSON plugin is something that would be nice to have in core as well, but 
I don't think it's part of this patch.



* Also I've noted that you got rid of the reserved byte (flags) in the
protocol compared to pglogical_native. It was very handy for two-phase tx
decoding (0 — usual commit, 1 — prepare, 2 — commit prepared), because both
prepare and commit prepared generate a commit record in xlog.


Hmm, maybe the commit message could get it back. PGLogical has them 
sprinkled all around the protocol, which I don't really like, so I want to 
limit them to the places where they are actually useful.





On 05 Aug 2016, at 18:00, Petr Jelinek  wrote:

- DDL, I see several approaches we could do here for 10.0. a) don't
  deal with DDL at all yet, b) provide function which pushes the DDL
  into replication queue and then executes on downstream (like
  londiste, slony, pglogical do), c) capture the DDL query as text
  and allow user defined function to be called with that DDL text on
  the subscriber


* Since here DDL is mostly ALTER / CREATE / DROP TABLE (or am I wrong?)
maybe we can add something like WITH SUBSCRIBERS to statements?



Not sure I follow. How does that help?


* Talking about the exact mechanism of DDL replication, I like your variant
b), but since we have transactional DDL, we can do two-phase commit here.
That will require two-phase decoding and some logic for catching prepare
responses through logical messages. If that approach sounds interesting I
can describe the proposal in more detail and create a patch.



I'd think that such an approach is somewhat more interesting with c), 
honestly. The difference between b) and c) is mostly about explicit vs 
implicit. I definitely would like to see the 2PC patch updated to work 
with this. But maybe it's wise to wait a while until the core of the 
patch stabilizes during the discussion.



* Also I wasn't actually able to run replication itself =) While regression
tests pass, TAP tests and a manual run get stuck.
pg_subscription_rel.substate never becomes 'r'. I'll investigate that more
and write again.


Interesting, please keep me posted. It's possible for tables to stay in 
's' state for some time if there is nothing happening on the server, but 
that should not mean anything is stuck.




* As far as I understand, sync starts automatically on enabling a
publication. Maybe split that logic into a different command with some
options? Like don't sync at all, for example.



I think SYNC should be an option of subscription creation, just like 
INITIALLY ENABLED/DISABLED is. And then there should be an interface to 
resync a table manually (like pglogical has). Not yet sure what that 
interface should look like in terms of DDL, though.



* When I try to create a subscription to a non-existent publication, CREATE
SUBSCRIPTION creates a replication slot and does not destroy it:

# create subscription sub connection 'host=127.0.0.1 dbname=postgres' 
publication mypub;
NOTICE:  created replication slot "sub" on provider
ERROR:  could not receive list of replicated tables from the provider:
ERROR:  cache lookup failed for publication 0

Re: [HACKERS] Slowness of extended protocol

2016-08-11 Thread Shay Rojansky
On Thu, Aug 11, 2016 at 1:22 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> 2) The driver can use savepoints and autorollback to the good "right before
> failure" state in case of a known failure. Here's the implementation:
> https://github.com/pgjdbc/pgjdbc/pull/477
>
> As far as I can remember, performance overheads are close to zero (no extra
> roundtrips to create a savepoint)
>

What? Do you mean you do implicit savepoints and autorollback too? How does
the driver decide when to do a savepoint? Is it on every single command? If
not, commands can get lost when an error is raised and you automatically
roll back? If you do a savepoint on every single command, that surely would
impact performance even without extra roundtrips...?

You seem to have a very "unique" idea of what a database driver should do
under-the-hood for its users. At the very least I can say that your concept
is very far from almost any database driver I've seen up to now (PostgreSQL
JDBC, psycopg, Npgsql, libpq...). I'm not aware of other drivers that
implicitly prepare statements, and definitely of no drivers that internally
create savepoints and roll the back without explicit user APIs. At the very
least you should be aware (and also clearly admit!) that you're doing
something very different - not necessarily wrong - and not attempt to
impose your ideas on everyone as if it's the only true way to write a db
driver.

3) Backend could somehow invalidate prepared statements, and notify clients
> accordingly. Note: the problem is hard in a generic case, however it might
> be not that hard if we fix well-known often-used cases like "a column is
> added". That however, would add memory overheads to store additional maps
> like "table_oid -> statement_names[]"
>

Assuming your driver supports batching/pipelining (and I hope it does),
that doesn't make sense. Say I send a 2-statement batch, with the first one
a DDL and with the second one some prepared query invalidated by the first.
When your DDL is executed by PostgreSQL your hypothetical notification is
sent. But the second query, which should be invalidated, has already been
sent to the server (because of batching), and boom.

4) Other. For instance, new message flow so frontend and backend could
> re-negotiate "binary vs text formats for the new resulting type". Or
> "ValidatePreparedStatement" message that would just validate the statement
> and avoid killing the transaction if the statement is invalid. Or whatever
> else there can be invented.
>

When would you send this ValidatePreparedStatement? Before each execution
as a separate roundtrip? That would kill performance. Would you send it in
the same packet before Execute? In that case you still get the error when
Execute is evaluated...

There really is no solution to this problem within the current PostgreSQL
way of doing things - although of course we could reinvent the entire
PostgreSQL protocol here to accommodate your special driver...

The basic truth is this... In every db driver I'm familiar with programmers
are expected to manage query preparation on their own. They're supposed to
do it based on knowledge only they have, weighing pros and cons. They have
responsibility over their own code and they don't outsource major decisions
like this to their driver. When they get an error from PostgreSQL, it's
triggered by something they did in a very explicit and clear way - and they
therefore have a good chance to understand what's going on. They generally
don't get errors triggered by some under-the-hood magic their driver
decided to do for them, and which are hard to diagnose and understand.
Their drivers are simple, predictable and lean, and they don't have to
include complex healing logic to deal with errors they themselves triggered
with under-the-hood logic (e.g. implicit savepoints).


> Shay>So the general point is that the existence of pools is problematic
> for the argument "always prepare for recurring statements".
>
> So what?
> Don't use pools that issue "discard all" or configure them accordingly.
> That's it.
> In Java world, no wildly used pool defaults to "discard everything"
> strategy.
>

I don't know much about the Java world, but both pgbouncer and pgpool (the
major pools?) send DISCARD ALL by default. That is a fact, and it has
nothing to do with any bugs or issues pgbouncer may have. I'm tempted to go
look at other pools in other languages but frankly I don't think that would
have any effect in this conversation...
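
To make that concrete, a minimal sketch of what happens when a pool runs
DISCARD ALL between checkouts (the table and statement names are
illustrative):

    PREPARE fetch_user (int) AS SELECT * FROM users WHERE id = $1;
    EXECUTE fetch_user(1);   -- fast path, reuses the prepared plan
    DISCARD ALL;             -- what the pool issues on connection release
    EXECUTE fetch_user(1);   -- ERROR:  prepared statement "fetch_user"
                             -- does not exist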


Re: [HACKERS] Heap WARM Tuples - Design Draft

2016-08-11 Thread Jim Nasby

On 8/10/16 12:48 PM, Claudio Freire wrote:

On Tue, Aug 9, 2016 at 11:39 PM, Jim Nasby  wrote:

On 8/9/16 6:44 PM, Claudio Freire wrote:


Since we can lookup all occurrences of k1=a index=0 and k2=a index=0,
and in fact we probably did so already as part of the update logic



That's a change from what currently happens, right?

The reason I think that's important is that dropping the assumption that we
can't safely re-find index entries from the heap opens up other
optimizations, ones that should be significantly simpler to implement. The
most obvious example being getting rid of full index scans in vacuum. While
that won't help with write amplification, it would reduce the cost of vacuum
enormously. Orders of magnitude wouldn't surprise me in the least.

If that's indeed a prerequisite to WARM it would be great to get that
groundwork laid early so others could work on other optimizations it would
enable.


I can do that. I've been prospecting the code to see what changes it
would entail already.

But it's still specific to btree, I'm not sure the same optimizations
can be applied to GIN (maybe, if the posting list is sorted) or GIST
(probably, since it's like a btree, but I don't know the code well
enough).

Certainly hash indexes won't support it.


Why not? If this is all predicated on re-finding index keys based on 
heap data then this is just another index lookup, no?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_ctl promote wait

2016-08-11 Thread Michael Paquier
On Thu, Aug 11, 2016 at 3:24 AM, Peter Eisentraut
 wrote:
> On 8/7/16 9:44 PM, Michael Paquier wrote:
 This is not a good
 >> idea, and the idea of putting a wait argument in get_controlfile does
 >> not seem a good interface to me. I'd rather see get_controlfile be
 >> extended with a flag saying no_error_on_failure and keep the wait
 >> logic within pg_ctl.
>>> >
>>> > I guess we could write a wrapper function in pg_ctl that encapsulated
>>> > the wait logic.
>> That's what I would do.
>
> New patches, incorporating your suggestions.

Thanks for the new set!

> I moved some of the error handling out of get_controlfile() and back
> into the callers, because it was getting too weird that that function
> knew so much about the callers' intentions.  That way we don't actually
> have to change the signature.

I have looked at them and the changes are looking fine for me. So I
have switched the patch as ready for committer, aka you.

Just a nit:
+   if (wait_seconds > 0)
+   {
+   sleep(1);
+   wait_seconds--;
+   continue;
+   }
This may be better with pg_usleep() instead of sleep().
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slowness of extended protocol

2016-08-11 Thread Vladimir Sitnikov
Shay>As I said, an error is going to kill the ongoing transaction, how can
this be solved without application logic?

1) At least, some well-defined error code should be created for that kind
of matter.

2) The driver can use savepoints and autorollback to the good "right before
failure" state in case of a known failure. Here's the implementation:
https://github.com/pgjdbc/pgjdbc/pull/477
As far as I can remember, performance overheads are close to zero (no extra
roundtrips to create a savepoint)
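
For readers unfamiliar with the pattern, a minimal SQL sketch of what the
driver effectively replays (names are illustrative; per the above, the
SAVEPOINT is piggybacked on the next network batch, hence no extra
roundtrip):

    BEGIN;
    SAVEPOINT driver_sp;
    SELECT * FROM t;                  -- suppose this statement fails
    ROLLBACK TO SAVEPOINT driver_sp;  -- the transaction survives and the
                                      -- statement can be retried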

3) Backend could somehow invalidate prepared statements, and notify clients
accordingly. Note: the problem is hard in a generic case, however it might
be not that hard if we fix well-known often-used cases like "a column is
added". That however, would add memory overheads to store additional maps
like "table_oid -> statement_names[]"

4) Other. For instance, new message flow so frontend and backend could
re-negotiate "binary vs text formats for the new resulting type". Or
"ValidatePreparedStatement" message that would just validate the statement
and avoid killing the transaction if the statement is invalid. Or whatever
else there can be invented.


Shay>So the general point is that the existence of pools is problematic for
the argument "always prepare for recurring statements".

So what?
Don't use pools that issue "discard all" or configure them accordingly.
That's it.
In the Java world, no widely used pool defaults to a "discard everything"
strategy.

Please stop saying "pgbouncer" as its issue is confirmed, and pgbouncer
developers did acknowledge they would prefer to solve "prepared statement
issue" right inside pgbouncer without any cooperation from driver
developers.

Do you mean that in the C# world major connection pools default to a
"discard all" setup? That sounds strange to me.

Vladimir

>


Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-08-11 Thread David G. Johnston
On Thu, Aug 11, 2016 at 2:11 AM, Venkata Balaji N  wrote:

>
> [...]
>  committing all the previously open transactions
> [...]
>

"All"?  ​There can only ever be at most one open transaction at any given
time...

I don't have a fundamental issue with saying "when turning auto-commit on
you are also requesting that the open transaction, if there is one, is
committed immediately."  I'm more inclined to think an error is the correct
solution - or to respond in a way conditional to the present usage
(interactive vs. script).  I disagree with Robert's unsubstantiated belief
regarding ON_ERROR_STOP and think that it captures the relevant user-intent
for this behavior as well.

David J.


Re: [HACKERS] Logical Replication WIP

2016-08-11 Thread Stas Kelvich
> On 05 Aug 2016, at 18:00, Petr Jelinek  wrote:
> 
> Hi,
> 
> as promised here is WIP version of logical replication patch.

Great!

Proposed DDL about publication/subscriptions looks very nice to me.

Some notes and thoughts about patch:


* Clang grumbles at following pieces of code:

apply.c:1316:6: warning: variable 'origin_startpos' is used uninitialized 
whenever 'if' condition is false [-Wsometimes-uninitialized]

tablesync.c:436:45: warning: if statement has empty body [-Wempty-body]
if (wait_for_sync_status_change(tstate));


* max_logical_replication_workers is mentioned everywhere in the docs, but
guc.c defines a variable called max_logical_replication_processes for
postgresql.conf

* Since pg_subscription is already shared across the cluster, it could also
be handy to share pg_publication too and allow publication of tables from
different databases. Those are rare scenarios but quite important for the
virtual-hosting use case — tons of small databases in a single postgres
cluster.

* There is no way to see the tables/schemas attached to a publication
through \drp

* As far as I understand there is no way to add a table/tablespace right in
CREATE PUBLICATION, and one needs to explicitly do ALTER PUBLICATION right
after creation. Maybe add something like WITH TABLE/TABLESPACE to CREATE?

* So the binary protocol goes into core. Is it still possible to use it as
a decoding plugin for a manually created walsender? Maybe also include JSON
as it was in pglogical? While I'm not arguing that it should be done, I'm
interested in your opinion on that.

* Also I've noted that you got rid of the reserved byte (flags) in the
protocol compared to pglogical_native. It was very handy for two-phase tx
decoding (0 — usual commit, 1 — prepare, 2 — commit prepared), because both
prepare and commit prepared generate a commit record in xlog.

> On 05 Aug 2016, at 18:00, Petr Jelinek  wrote:
> 
> - DDL, I see several approaches we could do here for 10.0. a) don't
>   deal with DDL at all yet, b) provide function which pushes the DDL
>   into replication queue and then executes on downstream (like
>   londiste, slony, pglogical do), c) capture the DDL query as text
>   and allow user defined function to be called with that DDL text on
>   the subscriber

* Since here DDL is mostly ALTER / CREATE / DROP TABLE (or am I wrong?)
maybe we can add something like WITH SUBSCRIBERS to statements?

* Talking about the exact mechanism of DDL replication, I like your variant
b), but since we have transactional DDL, we can do two-phase commit here.
That will require two-phase decoding and some logic for catching prepare
responses through logical messages. If that approach sounds interesting I
can describe the proposal in more detail and create a patch.

* Also I wasn't actually able to run replication itself =) While regression
tests pass, TAP tests and a manual run get stuck.
pg_subscription_rel.substate never becomes 'r'. I'll investigate that more
and write again.

* As far as I understand, sync starts automatically on enabling a
publication. Maybe split that logic into a different command with some
options? Like don't sync at all, for example.

* When I try to create a subscription to a non-existent publication, CREATE
SUBSCRIPTION creates a replication slot and does not destroy it:

# create subscription sub connection 'host=127.0.0.1 dbname=postgres' 
publication mypub;
NOTICE:  created replication slot "sub" on provider
ERROR:  could not receive list of replicated tables from the provider: ERROR:  
cache lookup failed for publication 0
CONTEXT:  slot "sub", output plugin "pgoutput", in the list_tables callback

after that:

postgres=# drop subscription sub;
ERROR:  subscription "sub" does not exist
postgres=# create subscription sub connection 'host=127.0.0.1 dbname=postgres' 
publication pub;
ERROR:  could not crate replication slot "sub": ERROR:  replication slot "sub" 
already exists

* Also can’t drop subscription:

postgres=# \drs
  List of subscriptions
 Name | Database | Enabled | Publication |Conninfo
--+--+-+-+
 sub  | postgres | t   | {mypub} | host=127.0.0.1 dbname=postgres
(1 row)

postgres=# drop subscription sub;
ERROR:  unrecognized object class: 6102

* Several times I've run into a situation where the provider's postmaster
ignores Ctrl-C until the subscriber node is switched off.

* Patch with small typos fixed attached.

I'll do more testing; I just wanted to share what I have so far.




typos.diff
Description: Binary data


-- 
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Improved ICU patch - WAS: [HACKERS] Implementing full UTF-8 support (aka supporting 0x00)

2016-08-11 Thread Palle Girgensohn

> 11 aug. 2016 kl. 11:15 skrev Palle Girgensohn :
> 
>> 
>> 11 aug. 2016 kl. 03:05 skrev Peter Geoghegan :
>> 
>> On Wed, Aug 10, 2016 at 1:42 PM, Palle Girgensohn  
>> wrote:
>>> They've been used for the FreeBSD ports since 2005, and have served us 
>>> well. I have of course updated them regularly. In this latest version, I've 
>>> removed support for other encodings beside UTF-8, mostly since I don't know 
>>> how to test them, but also, I see little point in supporting anything else 
>>> using ICU.
>> 
>> Looks like you're not using the ICU equivalent of strxfrm(). While 9.5
>> is not the release that introduced its use, it did expand it
>> significantly. I think you need to fix this, even though it isn't
>> actually used to sort text at present, since presumably FreeBSD builds
>> of 9.5 don't TRUST_STRXFRM. Since you're using ICU, though, you could
>> reasonably trust the ICU equivalent of strxfrm(), so that's a missed
>> opportunity. (See the wiki page on the abbreviated keys issue [1] if
>> you don't know what I'm talking about.)
> 
> My plan was to get it working without TRUST_STRXFRM first, and then add that 
> functionality. I've made some preliminary tests using ICU's ucol_getSortKey 
> but I will have to test it a bit more. For now, I just expect not to trust 
> strxfrm. It is the first iteration wrt strxfrm; the plan is to use that code 
> base.

Here are some preliminary results from comparing the same two strings in a 
tight loop.

  ucol_strcollUTF8: -1  0.002448
   strcoll: 1   0.060711
  ucol_strcollIter: -1  0.009221
 direct memcmp: 1   0.000457
 memcpy memcmp: 1   0.001706
memcpy strcoll: 1   0.068425
   nextSortKeyPart: -1  0.041011
ucnv_toUChars + getSortKey: -1  0.050379


The correct answer is -1, but since we compare åasdf and äasdf with a Swedish 
locale, memcmp and strcoll fail, of course, as expected. Direct memcmp is 5 
times faster than ucol_strcollUTF8 (used in my patch), but sadly the best 
implementation using sort keys with ICU, nextSortKeyPart, is way slower.



startTime = getRealTime();
for ( int i = 0; i < loop; i++) {
result = ucol_strcollUTF8(collator, arg1, len1, arg2, len2, &status);
}
endTime = getRealTime();
printf("%30s: %d\t%lf\n", "ucol_strcollUTF8", result, endTime - 
startTime);



vs


int sortkeysize=8;

startTime = getRealTime();
uint8_t key1[sortkeysize], key2[sortkeysize];
uint32_t sState[2], tState[2];
UCharIterator sIter, tIter;

for ( int i = 0; i < loop; i++) {
uiter_setUTF8(&sIter, arg1, len1);
uiter_setUTF8(&tIter, arg2, len2);
sState[0] = 0; sState[1] = 0;
tState[0] = 0; tState[1] = 0;
ucol_nextSortKeyPart(collator, &sIter, sState, key1, 
sortkeysize, &status);
ucol_nextSortKeyPart(collator, &tIter, tState, key2, 
sortkeysize, &status);
result = memcmp (key1, key2, sortkeysize);
}
endTime = getRealTime();
printf("%30s: %d\t%lf\n", "nextSortKeyPart", result, endTime - 
startTime);



But in your strxfrm code in PostgreSQL, the keys are cached, and represented 
as int64s if I remember correctly, so perhaps there is still a benefit in 
using the abbreviated keys? More testing is required, I guess...

Palle





signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [HACKERS] Slowness of extended protocol

2016-08-11 Thread Shay Rojansky
On Thu, Aug 11, 2016 at 8:39 AM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

>  Shay:
>
>> Prepared statements can have very visible effects apart from the speedup
>> they provide (e.g. failure because of schema changes) It's not that
>> these effects can't be worked around - they can be - but programmers can be
>> surprised by these effects, which can cause difficult-to-diagnose issues.
>>
>
> The specific effect of "cached plan cannot change return type" can be
> solved by cooperation of backend and frontend (~driver) developers.
>

I really don't see how. As I said, an error is going to kill the ongoing
transaction; how can this be solved without application logic? Unless you
propose keeping track of all statements executed in the transaction and
resending them?
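
(For context, the failure mode under discussion is easy to reproduce at
the SQL level; a minimal sketch with an illustrative table name:

    CREATE TABLE t (id int);
    PREPARE q AS SELECT * FROM t;
    EXECUTE q;                             -- ok
    ALTER TABLE t ADD COLUMN extra text;
    EXECUTE q;   -- ERROR:  cached plan must not change result type

Inside an open transaction, that error also aborts the transaction.)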

I'm asking this out of genuine interest (I might be missing something),
since I'm seriously considering implementing implicit query preparation in
Npgsql.

> I find that solving that kind of issue is more important than investing
> into "ParseBindExecDeallocateInOneGo" message.
>

I don't think it's solvable. But regardless, it's totally possible to
invest in two things; I think it's totally legitimate to invest in implicit
query preparation.


> I hope you would forgive me if I just stop the discussion here.
>

Of course, no problem.


> I find I'd better spent that time on just fixing pgbouncer issue rather
> than discussing if it is pgbouncer's or postgresql's issue.
>

The point is that it's not just a pgbouncer "bug". Connection pools like
pgbouncer (there are others) usually send DISCARD ALL on connection close,
which makes prepared statements useless for short-lived connection
scenarios (since you have to reprepare). I don't know if that's changeable
across pool implementations in general, and even if so, requires
significant user understanding for tweaking the pool. So the general point
is that the existence of pools is problematic for the argument "always
prepare for recurring statements".

I'm sorry for being impolite if I was ever.
>

I don't think you were impolite, I personally learned a few things from the
conversation and will probably implement opt-in implicit query preparation
thanks to it.


[HACKERS] [Patch] New psql prompt substitution %r (m = master, r = replica)

2016-08-11 Thread Aleksander Alekseev
Hello.

The suggested patch introduces a %r substitution in psql's prompt. This
substitution allows displaying whether the user is connected to a master or
a replica right in the prompt.

Usage example:

```
$ cat ~/.psqlrc 
\set PROMPT1 '%p (%r) =# '

$ psql -p 5432
psql (9.6beta4)
Type "help" for help.

20638 (m) =# \q

$ psql -p 5433
psql (9.6beta4)
Type "help" for help.

20647 (r) =# \q
```

Currently I'm working on some functionality involving master-slave
replication. Sometimes I accidentally connect to a master instead of a
replica and vice versa. The functionality described above would make my
work much more convenient.
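
(Worth noting: the same information is available today by querying the
server, which can serve as a stopgap:

    SELECT pg_is_in_recovery();  -- true on a replica, false on a master

The patch's point is to surface it in the prompt without issuing a
query.)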

Hopefully there are other people in PostgreSQL community who feel that
way.

-- 
Best regards,
Aleksander Alekseev
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index d9bce25..ce15a66 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -3628,6 +3628,13 @@ testdb= INSERT INTO my_table VALUES (:'content');
   
 
   
+%r
+
+ Equals m if current instance is a master and r if it's a replica.
+
+  
+
+  
 %R
 
 
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index d4625a6..4436561 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -31,6 +31,7 @@
 #include 
 #endif
 
+#include "access/xlog.h"
 #include "access/htup_details.h"
 #include "catalog/pg_authid.h"
 #include "libpq/libpq.h"
@@ -565,7 +566,9 @@ InitializeSessionUserId(const char *rolename, Oid roleid)
 	SetConfigOption("is_superuser",
 	AuthenticatedUserIsSuperuser ? "on" : "off",
 	PGC_INTERNAL, PGC_S_OVERRIDE);
-
+	SetConfigOption("is_master",
+	RecoveryInProgress() ? "off" : "on",
+	PGC_INTERNAL, PGC_S_OVERRIDE);
 	ReleaseSysCache(roleTup);
 }
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9c93df0..d3eb49a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -471,6 +471,7 @@ int			huge_pages;
  */
 static char *syslog_ident_str;
 static bool session_auth_is_superuser;
+static bool session_is_master;
 static double phony_random_seed;
 static char *client_encoding_string;
 static char *datestyle_string;
@@ -900,6 +901,16 @@ static struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 	{
+		{"is_master", PGC_INTERNAL, UNGROUPED,
+			gettext_noop("Shows whether the current instance is master or replica."),
+			NULL,
+			GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&session_is_master,
+		false,
+		NULL, NULL, NULL
+	},
+	{
 		{"bonjour", PGC_POSTMASTER, CONN_AUTH_SETTINGS,
 			gettext_noop("Enables advertising the server via Bonjour."),
 			NULL
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 2450b9c..da667af 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1919,22 +1919,18 @@ is_select_command(const char *query)
 	return false;
 }
 
-
 /*
- * Test if the current user is a database superuser.
- *
- * Note: this will correctly detect superuserness only with a protocol-3.0
- * or newer backend; otherwise it will always say "false".
+ * Return true if the given parameter's status is "on", otherwise return false.
  */
-bool
-is_superuser(void)
+static bool
+internal_get_bool_parameter_status(const char* param_name)
 {
 	const char *val;
 
 	if (!pset.db)
 		return false;
 
-	val = PQparameterStatus(pset.db, "is_superuser");
+	val = PQparameterStatus(pset.db, param_name);
 
 	if (val && strcmp(val, "on") == 0)
 		return true;
@@ -1944,6 +1940,19 @@ is_superuser(void)
 
 
 /*
+ * Test if the current user is a database superuser.
+ *
+ * Note: this will correctly detect superuserness only with a protocol-3.0
+ * or newer backend; otherwise it will always say "false".
+ */
+bool
+is_superuser(void)
+{
+	return internal_get_bool_parameter_status("is_superuser");
+}
+
+
+/*
  * Test if the current session uses standard string literals.
  *
  * Note: With a pre-protocol-3.0 connection this will always say "false",
@@ -1952,19 +1961,17 @@ is_superuser(void)
 bool
 standard_strings(void)
 {
-	const char *val;
-
-	if (!pset.db)
-		return false;
-
-	val = PQparameterStatus(pset.db, "standard_conforming_strings");
-
-	if (val && strcmp(val, "on") == 0)
-		return true;
-
-	return false;
+	return internal_get_bool_parameter_status("standard_conforming_strings");
 }
 
+/*
+ * Test if the current instance is a master.
+ */
+bool
+is_master(void)
+{
+	return internal_get_bool_parameter_status("is_master");
+}
 
 /*
  * Return the session user of the current connection.
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index bdcb58f..5a6635f 100644
--- a/src/bin/psql/common.h
+++ b/src/bin/psql/common.h
@@ -39,6 +39,7 @@ extern int	PSQLexecWatch(const char *query, const printQueryOpt *opt);
 extern bool SendQuery(const char *query);
 
 extern bool is_superuser(void);
+extern bool is_master(void);

Re: [HACKERS] Bug in to_timestamp().

2016-08-11 Thread Artur Zakirov

Hello,

On 14.07.2016 12:16, Pavel Stehule wrote:


last point was discussed in thread related to to_date_valid function.

Regards

Pavel


Thank you.

Here is my patch. It is a proof of concept.

Date/Time Formatting


There are changes in date/time formatting rules:

- now to_timestamp() and to_date() skip spaces in the input string and
in the formatting string unless the FX option is used, as Amul Sul wrote
in the first message of this thread. But Ex.2 gives an error now with
this patch (should we fix this too?).


- in the code, space characters and separator characters now have
different types of FormatNode. Separator characters are ',', '-', '.',
'/' and ':'. This is done to have different formatting rules for space
and separator characters.
If the FX option isn't used, then PostgreSQL does not insist that a
separator in the formatting string match the separator in the input
string. But the count of separators must be equal, with or without the
FX option (see the SQL sketch below).


- now PostgreSQL checks whether there is a closing quote; otherwise an
error is raised.


Still, PostgreSQL does not insist that a text character in the formatting
string match the text character in the input string. It is not obvious
whether this should be fixed, because we may have different character
case, or characters with or without accent marks.
But I suppose that it is not right to just check the text character
count. For example, there is a Unicode version of the space character,
U+00A0.
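
To make the intended semantics concrete, here is a rough SQL sketch (my
reading of the rules above; exact behavior depends on the final patch):

```
-- without FX, spaces in the input and formatting strings are skipped
SELECT to_timestamp('2000    JUN', 'YYYY MON');
-- without FX, a '-' in the format may match a '/' in the input,
-- as long as the number of separators is the same
SELECT to_timestamp('2000/JUN', 'YYYY-MON');
-- with FX, the separator must match exactly, so this should fail
SELECT to_timestamp('2000/JUN', 'FXYYYY-MON');  -- error expected
```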


Code changes


- new defines:

#define NODE_TYPE_SEPARATOR 4
#define NODE_TYPE_SPACE 5

- now DCH_cache_getnew() is called after parse_format(), because
parse_format() can now raise an error, and on the next attempt
DCH_cache_search() could return a broken cache entry.



This patch does not handle all the issues noticed in this thread, since
there is still no consensus about them. So this patch is in
proof-of-concept status and can be changed.


Of course this patch can be completely wrong. But it tries to introduce
more formal rules for formatting.


I will be grateful for notes and remarks.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7830334..5656245 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6083,9 +6083,10 @@ SELECT regexp_matches('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
  
   
to_timestamp and to_date
-   skip multiple blank spaces in the input string unless the
-   FX option is used. For example,
-   to_timestamp('2000    JUN', 'YYYY MON') works, but
+   skip multiple blank spaces in the input string and in the formatting
+   string unless the FX option is used. For example,
+   to_timestamp('2000    JUN', 'YYYY MON')
+   and to_timestamp('2000JUN', 'YYYY MON') work, but
    to_timestamp('2000    JUN', 'FXYYYY MON') returns an error
    because to_timestamp expects one space only.
FX must be specified as the first item in
@@ -6121,6 +6122,19 @@ SELECT regexp_matches('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
skips two input characters.
   
  
+ 
+ 
+  
+   In to_date and to_timestamp separator
+   characters ',', '-',
+   '.', '/' and ':'
+   outside of double-quoted strings skip the number of input characters
+   contained in the string unless the FX option is used.
+   If the FX option is specified then a consumed separator
+   character in the formatting string must match the separator character
+   in the input string.
+  
+ 
 
  
   
diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index bbd97dc..ef39df9 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -169,6 +169,8 @@ struct FormatNode
 #define NODE_TYPE_END		1
 #define NODE_TYPE_ACTION	2
 #define NODE_TYPE_CHAR		3
+#define NODE_TYPE_SEPARATOR	4
+#define NODE_TYPE_SPACE		5
 
 #define SUFFTYPE_PREFIX		1
 #define SUFFTYPE_POSTFIX	2
@@ -947,6 +949,7 @@ typedef struct NUMProc
 static const KeyWord *index_seq_search(char *str, const KeyWord *kw,
  const int *index);
 static const KeySuffix *suff_search(char *str, const KeySuffix *suf, int type);
+static bool is_char_separator(char *str);
 static void NUMDesc_prepare(NUMDesc *num, FormatNode *n);
 static void parse_format(FormatNode *node, char *str, const KeyWord *kw,
 			 const KeySuffix *suf, const int *index, int ver, NUMDesc *Num);
@@ -963,7 +966,6 @@ static void dump_node(FormatNode *node, int max);
 static const char *get_th(char *num, int type);
 static char *str_numth(char *dest, char *num, int type);
 static int	adjust_partial_year_to_2020(int year);
-static int	strspace_len(char *str);
 static void from_char_set_mode(TmFromChar *tmfc, const FromCharDateMode mode);
 static void from_char_set_int(int *dest, const int value, const FormatNode *node);
 static int	

Re: Improved ICU patch - WAS: [HACKERS] Implementing full UTF-8 support (aka supporting 0x00)

2016-08-11 Thread Palle Girgensohn

> On 11 Aug 2016, at 03:05, Peter Geoghegan wrote:
> 
> On Wed, Aug 10, 2016 at 1:42 PM, Palle Girgensohn  wrote:
>> They've been used for the FreeBSD ports since 2005, and have served us well. 
>> I have of course updated them regularly. In this latest version, I've 
>> removed support for other encodings beside UTF-8, mostly since I don't know 
>> how to test them, but also, I see little point in supporting anything else 
>> using ICU.
> 
> Looks like you're not using the ICU equivalent of strxfrm(). While 9.5
> is not the release that introduced its use, it did expand it
> significantly. I think you need to fix this, even though it isn't
> actually used to sort text at present, since presumably FreeBSD builds
> of 9.5 don't TRUST_STRXFRM. Since you're using ICU, though, you could
> reasonably trust the ICU equivalent of strxfrm(), so that's a missed
> opportunity. (See the wiki page on the abbreviated keys issue [1] if
> you don't know what I'm talking about.)

My plan was to get it working without TRUST_STRXFRM first, and then add that
functionality. I've made some preliminary tests using ICU's ucol_getSortKey,
but I will have to test it a bit more. For now, I just expect not to trust
strxfrm. This is the first iteration wrt strxfrm; the plan is to use that
code base.

> 
> Shouldn't you really have a strxfrm() wrapper, used across the board,
> including for callers outside of varlena.c? convert_string_datum() has
> been calling strxfrm() for many releases now. These calls are still
> used in FreeBSD builds, I would think, which seems like a bug that is
> not dodged by simply not defining TRUST_STRXFRM. Isn't it the case that
> its assumption of matching the ordering used elsewhere doesn't really
> hold on FreeBSD builds?

I was not aware of convert_string_datum, I will check that, thanks! Using a 
wrapper across the board seems like a good idea for refactoring.

> 
> [1] https://wiki.postgresql.org/wiki/Abbreviated_keys_glibc_issue
> --
> Peter Geoghegan



signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [HACKERS] Assertion failure in REL9_5_STABLE

2016-08-11 Thread Marko Tiikkaja

On 11/08/16 8:48 AM, Michael Paquier wrote:

On Thu, Aug 11, 2016 at 8:09 AM, Marko Tiikkaja  wrote:

On 2016-08-11 12:09 AM, Alvaro Herrera wrote:


BTW this is not a regression, right?  It's been broken all along.  Or am
I mistaken?


You're probably right.  I just hadn't realized I could run our app against
9.5 with --enable-cassert until last week.


Just wondering... If you revert 1f9534b4 and/or b33e81cb do you still
see a problem?


Yeah, no effect.


.m




Re: [HACKERS] Slowness of extended protocol

2016-08-11 Thread Vladimir Sitnikov
 Shay:

> Prepared statements can have very visible effects apart from the speedup
> they provide (e.g. failure because of schema changes) It's not that these
> effects can't be worked around - they can be - but programmers can be
> surprised by these effects, which can cause difficult-to-diagnose issues.
>

The specific effect of "cached plan cannot change return type" can be
solved by cooperation between backend and frontend (~driver) developers.
I find that solving that kind of issue is more important than investing
in a "ParseBindExecDeallocateInOneGo" message.


I hope you would forgive me if I just stop the discussion here.
I find I'd better spend that time on just fixing the pgbouncer issue
rather than discussing whether it is pgbouncer's or postgresql's issue.

I'm sorry for being impolite if I was ever.

Vladimir

>


Re: [HACKERS] Slowness of extended protocol

2016-08-11 Thread Shay Rojansky
Vladimir wrote:

Shay> As Tom said, if an application can benefit from preparing, the
> developer has the responsibility (and also the best knowledge) to manage
> preparation, not the driver. Magical behavior under the hood causes
> surprises, hard-to-diagnose bugs etc.
>
> Why do you do C# then?
> Aren't you supposed to love machine codes as the least magical thing?
> Even x86 does not describe "the exact way the code should be executed".
> All the CPUs shuffle the instructions to make it faster.
>

These are valid questions. I have several answers to that.

First, it depends on the extent to which the "under the hood" activity may
have visible side-effects. CPU instruction reordering is 100% invisible,
having no effect whatsoever on the application apart from speeding it up.
Prepared statements can have very visible effects apart from the speedup
they provide (e.g. failure because of schema changes). It's not that these
effects can't be worked around - they can be - but programmers can be
surprised by these effects, which can cause difficult-to-diagnose issues.
Nobody needs to be really aware of CPU instruction reordering because it
will never cause an application issue (barring a CPU bug).

Another important visible effect of preparing a statement is server-side
cost (e.g. memory). I'm not sure what the overhead is, but it is there and
an irresponsible "prepare everything everywhere" can create a significant
resource drain on your server. I know pgjdbc has knobs for controlling how
many statements are prepared, but in a large-scale performance-sensitive
app a programmer may want to manually decide, on a per-query basis, whether
they want to prepare or not.
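
As an aside, that per-session footprint is at least observable: the
statements a session has prepared can be listed from SQL:

```
-- lists statements prepared in the current session, whether created via
-- SQL PREPARE or the extended protocol (from_sql distinguishes the two)
SELECT name, statement, prepare_time, from_sql
FROM pg_prepared_statements;
```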

But the most important question is that this is a choice that should be
left to the developer, rather than imposed. In some cases developers *do*
drop down to assembly, or prefer to call system calls directly to be in
control of exactly what's happening. They should be able to do this. Now,
of course you can provide an option which disables implicit preparing, but
if you refuse to optimize non-prepared paths you're effectively forcing the
programmer to go down the prepared statement path.

Shay>As Tom said, if an application can benefit from preparing, the
> developer has the responsibility
>
> Does developer have the responsibility to choose between "index scan" and
> "table seq scan"? So your "developer has the responsibility" is like
> building on sand.
>

I'm very glad you raised that point. The programmer indeed outsources that
decision to the database, because the database has the most *knowledge* of
optimal execution (e.g. which indices are defined). Implicit preparing, on
the other hand, has nothing to do with knowledge: as Tom said, the driver
knows absolutely nothing about the application, and doesn't have any
advantage over the programmer in making that decision.

But note that the DBA *does* have to decide which indices exist and which
don't. Would you accept a database that automatically creates indices based
on use, i.e. "x queries caused full table scans, so I'll silently create an
index here"? I wouldn't, it would be utter nonsense and create lots of
unpredictability in terms of performance and resource drain.

> My experience shows that people are very bad at predicting where the
> performance problem would be.
> For 80% (or even 99%) of the cases, they just do not care thinking if a
> particular statement should be server-prepared or not.
> They have absolutely no idea how much resources it would take and so on.
>

And for those people it's great to have an implicit preparation feature,
most probably opt-in. But that shouldn't mean we shouldn't optimize things
for the rest.


> ORMs have no that notion of "this query must be server-prepared, while
> that one must not be".
> And so on.
>
> It is somewhat insane to assume people would use naked SQL. Of course they
> would use ORMs and alike, so they just would not be able to pass that
> additional parameter telling if a particular query out of thousands should
> be server-prepared or not.
>

Uh, I really wouldn't make statements like that if you want to be taken
seriously. Tons of applications use naked SQL and avoid ORMs for many
reasons (e.g. performance benefits of hand-crafted SQL), such general
statements on what applications do in the world are, well, silly. Again,
ORMs are an argument for why implicit preparation should exist (I made that
argument myself above), but they're not an argument for why optimizing
other paths should be excluded.


> Vladimir> "cached plan cannot change result type" -- PostgreSQL just fails
> to execute the server-prepared statement if a table was altered.
>
> Shay>How exactly do users cope with this in pgjdbc? Do they have some
> special API to flush (i.e. deallocate) prepared statements which they're
> supposed to use after a DDL?
>
> First of all, pgjdbc does report those problems to hackers.
> Unfortunately, it is still "not implemented".
> Then, 

Re: [HACKERS] new autovacuum criterion for visible pages

2016-08-11 Thread Michael Paquier
On Thu, Aug 11, 2016 at 3:29 PM, Michael Paquier
 wrote:
> On Thu, Aug 11, 2016 at 5:39 AM, Jeff Janes  wrote:
>> I wanted to create a new relopt named something like
>> autovacuum_vacuum_pagevisible_factor which would cause autovacuum to
>> vacuum a table once less than a certain fraction of the relation's
>> pages are marked allvisible.
>
> Interesting idea.
>
>> 1) One issue is that pg_class.relpages and pg_class.relallvisible are
>> themselves only updated by vacuum/analyze.  In the absence of manual
>> vacuum or analyze, this means that if the new criterion uses those
>> field, it could only kick in after an autoanalyze has already been
>> done, which means that autovacuum_vacuum_pagevisible_factor could not
>> meaningfully be set lower than autovacuum_analyze_scale_factor.
>>
>> Should relallvisible be moved/copied from pg_class to
>> pg_stat_all_tables, so that it is maintained by the stats collector?
>> Or should the autovacuum worker just walk the vm of every table with a
>> defined autovacuum_vacuum_pagevisible_factor each time it is launched
>> to get an up-to-date count that way?
>
> relation_needs_vacanalyze has access to Form_pg_class, so it is not a
> problem to use the value of relallvisible there to decide if a
> vacuum/analyze should be run.

Doh. I missed your point. One idea perhaps would be to have an
additional field that tracks the number of pages having their VM bits
cleared, or to just decrement relallvisible when that happens, and use
that in relation_needs_vacanalyze to do the decision-making. But that
would require updating stats each time a VM bit is cleared during heap
operations, which would be really costly...

The optimizer does not depend directly on pgstat when fetching the
estimation information it needs, so it may be wiser not to add this
dependency. Also, one can disable pgstat_track_counts, so moving this
information out of pg_class is not a good idea.
-- 
Michael




Re: [HACKERS] Assertion failure in REL9_5_STABLE

2016-08-11 Thread Michael Paquier
On Thu, Aug 11, 2016 at 8:09 AM, Marko Tiikkaja  wrote:
> On 2016-08-11 12:09 AM, Alvaro Herrera wrote:
>>
>> BTW this is not a regression, right?  It's been broken all along.  Or am
>> I mistaken?
>
>
> You're probably right.  I just hadn't realized I could run our app against
> 9.5 with --enable-cassert until last week.

Just wondering... If you revert 1f9534b4 and/or b33e81cb do you still
see a problem?
-- 
Michael




Re: [HACKERS] new autovacuum criterion for visible pages

2016-08-11 Thread Michael Paquier
On Thu, Aug 11, 2016 at 5:39 AM, Jeff Janes  wrote:
> I wanted to create a new relopt named something like
> autovacuum_vacuum_pagevisible_factor which would cause autovacuum to
> vacuum a table once less than a certain fraction of the relation's
> pages are marked allvisible.

Interesting idea.

> 1) One issue is that pg_class.relpages and pg_class.relallvisible are
> themselves only updated by vacuum/analyze.  In the absence of manual
> vacuum or analyze, this means that if the new criterion uses those
> field, it could only kick in after an autoanalyze has already been
> done, which means that autovacuum_vacuum_pagevisible_factor could not
> meaningfully be set lower than autovacuum_analyze_scale_factor.
>
> Should relallvisible be moved/copied from pg_class to
> pg_stat_all_tables, so that it is maintained by the stats collector?
> Or should the autovacuum worker just walk the vm of every table with a
> defined autovacuum_vacuum_pagevisible_factor each time it is launched
> to get an up-to-date count that way?

relation_needs_vacanalyze has access to Form_pg_class, so it is not a
problem to use the value of relallvisible there to decide if a
vacuum/analyze should be run.
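
For reference, that fraction can already be computed from those fields
with plain SQL (keeping in mind Jeff's caveat that both counters are only
refreshed by vacuum/analyze):

```
SELECT relname, relpages, relallvisible,
       round(relallvisible::numeric / NULLIF(relpages, 0), 3) AS visible_frac
FROM pg_class
WHERE relkind = 'r'
ORDER BY visible_frac NULLS FIRST;
```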

> 2) Should there be a guc in addition to the relopt?  I can't think of
> a reason why I would want to set this globally, so I'm happy with just
> a relopt.  If it were set globally, it would sure increase the cost
> for scanning the vm for each table once each naptime.

Having a GUC is useful for setting the default behavior for tables that
do not have this parameter set directly with ALTER TABLE.

> 3) Should there be a autovacuum_vacuum_pagevisible_threshold?  The
> other settings have both a factor and a threshold.  I've never
> understand what the point of the threshold settings is, but presumably
> there is a point to them.  Does that reason also apply to keeping vm
> tuned up?

Having both a threshold and a scale factor would make the most sense to
me. It may be difficult for the average user to tune those parameters
using a number of relation pages. An alternative would be to define those
values in kB, for example 32MB worth of pages being marked all visible.
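
Converting such a size into pages would then be trivial; for example,
with the default 8kB block size:

```
-- 32MB expressed as a page count: 32 * 1024 * 1024 / 8192 = 4096
SELECT (32 * 1024 * 1024) / current_setting('block_size')::int AS pages;
```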
-- 
Michael




Re: [HACKERS] regression test for extended query protocol

2016-08-11 Thread Michael Paquier
On Thu, Aug 11, 2016 at 5:33 AM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>> On Fri, Aug 5, 2016 at 12:21 AM, Alvaro Herrera
>>  wrote:
>> > If somebody had some spare time to devote to this, I would suggest to
>> > implement something in core that can be used to specify a list of
>> > commands to run, and a list of files-to-be-saved-in-bf-log emitted by
>> > each command.  We could have one such base file in the core repo that
>> > specifies some tests to run (such as "make -C src/test/recovery check"),
>> > and an additional file can be given by buildfarm animals to run custom
>> > tests, without having to write BF modules for each thing.  With that,
>> > pgsql committers could simply add a new test to be run by all buildfarm
>> > animals by adding it to the list in core.
>>
>> Do you think about using a special makefile target to run those
>> commands, say in src/test? At the end we are going to need to patch
>> the buildfarm client code at least once, at least that would be worth
>> it in the long term..
>
> Sure.  Some time ago I proposed something like a JSON file, something
> like
>
> test_foo => {
>   "command" : "make -C src/test/blah check",
>   "save_output" : [ "src/test/blah/server.log", "src/test/blah/regress*.log" ]
> }
>
> as I recall, Andrew said that he didn't like JSON very much but that the
> idea made sense to him.

As long as the buildfarm client has native support to parse that, that
sounds good. I think we had better add a TODO item now, as I won't be
able to tackle this in the short term.
-- 
Michael




Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-08-11 Thread Venkata Balaji N
On Thu, Aug 11, 2016 at 2:58 PM, Venkata Balaji N  wrote:

>
> On Tue, Aug 9, 2016 at 1:02 AM, Robert Haas  wrote:
>
>> On Mon, Aug 8, 2016 at 10:10 AM, Rahila Syed 
>> wrote:
>> > Thank you for inputs everyone.
>> >
>> > The opinions on this thread can be classified into the following
>> > 1. Commit
>>
> This makes more sense as the user who is doing it would realise that the
> transaction has been left open.
>
>
>> Alternatively, I also think it would be sensible to issue an immediate
>> COMMIT when the autocommit setting is changed from off to on.  That
>> was my first reaction.
>>
>
> Issuing a commit would mean that open transactions will be committed,
> which is not a good idea in my opinion. If the user issues AUTOCOMMIT =
> ON, it means all the transactions initiated after issuing this must be
> committed, whereas it would commit the previously pending transactions as
> well.
>

My apologies for the confusing statement. Correction - I meant that, by
setting autocommit=on, committing all the previously open transactions
(transactions opened when autocommit=off) may not be a good idea. What the
user means by autocommit=on is that all the subsequent transactions must
be committed.
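
To make the scenario concrete, a psql session sketch (table t is
hypothetical):

```
-- with AUTOCOMMIT off, psql implicitly issues BEGIN before the first statement
\set AUTOCOMMIT off
INSERT INTO t VALUES (1);   -- opens a transaction that stays uncommitted
\set AUTOCOMMIT on
-- today the transaction opened above is silently left open; the question
-- is whether this switch should instead COMMIT it, or raise an error
INSERT INTO t VALUES (2);   -- still runs inside that same open transaction
COMMIT;                     -- commits both inserts
```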

Regards,
Venkata B N

Fujitsu Australia