Re: [HACKERS] pg_config --version

2016-11-26 Thread Michael Paquier
On Sun, Nov 27, 2016 at 9:16 AM, David Fetter  wrote:
> While updating some extensions, I noticed that pg_config --version
> produces output that's...maybe not quite as useful as it might be, at
> least to a machine, so I'd like to throw out some proposals to fix the
> situation.
>
> Add a --version-numeric option to pg_config
>
> or
> Replace the current --version option with its bare numeric version
>
> or
> Add another line of output to the current --version output, which
> would be the numeric version by itself
>
> What say?

You may want to look at this thread that treats more or less the same topic:
https://www.postgresql.org/message-id/cab7npqtadajpx8ik4v3uyjbo2kmo8rhzqjkdsladdrannrg...@mail.gmail.com
And this has resulted in commit a5d489cc:
commit: a5d489ccb7e613c7ca3be6141092b8c1d2c13fa7
author: Tom Lane 
date: Thu, 2 Jul 2015 17:24:36 -0400
Make numeric form of PG version number readily available in Makefiles.

I would imagine that the common position has not changed much since
then, and as Makefile.global.in already provides this data, doing more
work in pg_config is not really necessary for extensions.
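For extension authors, a minimal PGXS Makefile sketch of using that data (the version cutoff and the CPPFLAGS define below are invented purely for illustration):

```makefile
# PGXS pulls in Makefile.global, which (since commit a5d489cc) exposes
# the server version numerically as VERSION_NUM (e.g. 90601) alongside
# MAJORVERSION, so no parsing of pg_config --version output is needed.
MODULES = my_ext                       # hypothetical extension module
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

# hypothetical version-dependent build knob
ifeq ($(shell test $(VERSION_NUM) -ge 90500 && echo yes),yes)
override CPPFLAGS += -DHAVE_NEWER_SERVER_API
endif
```

Note that `CPPFLAGS` is appended after the `include` (rather than setting `PG_CPPFLAGS` before it) so that `VERSION_NUM` is already defined when the test runs.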
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UNDO and in-place update

2016-11-26 Thread Amit Kapila
On Fri, Nov 25, 2016 at 11:23 PM, Bruce Momjian  wrote:
> On Thu, Nov 24, 2016 at 12:23:28PM -0500, Robert Haas wrote:
>> I agree up to a point.  I think we need to design our own system as
>> well as we can, not just copy what others have done.  For example, the
>> design I sketched will work with all of PostgreSQL's existing index
>> types.  You need to modify each AM in order to support in-place
>> updates when a column indexed by that AM has been modified, and that's
>> probably highly desirable, but it's not a hard requirement.
>
> I feel you are going to get into the problem of finding the index entry
> for the old row --- the same thing that is holding back more aggressive
> WARM updates.
>

I think I see a problem in that regard when the index column is
updated twice with the same value.  For example:

table -  test_undo(c1 int, c2 char(10));
index -  idx_tu on test_undo(c1);

Step-1
Insert (2, 'abc')
Heap - 2, abc
Index - 2
Commit;

Step-2
Update (3,def) where c1 = 2;
Heap - 3, def
Index - 3 (the other value will be 2)
Undo - 2, abc
Commit;

At this point, a new query that scans the index for value 2 will reach
the tuple (3,def).  As the scan started after all the commits, tuple
(3,def) appears okay, but the scan can compare the key value in the
tuple to detect that it is not the right tuple.  So all is well up to here.

Step-3
Update (2) where c1 = 3;
Heap - 2, def
Index - 2 (the other values will be 3, 2)
Undo - 3;
Commit

At this point, an index scan for value 2 will find the index tuple from
step 1 (2) and will conclude that (2,def) is the right tuple, but
actually that is wrong, as the step-1 (2) index tuple should not be
visible to the user.  Do you also see this as a problem, or am I
missing something?  If this is a problem, then I think we need some
form of delete-marking system for the index, and probably transaction
visibility information as well.
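The failure mode in step 3 can be sketched with a toy model (pure illustration, not PostgreSQL code): heap rows are updated in place, the index only ever gains entries, and each entry is rechecked against the current heap key as in step 2.

```python
heap = {}      # tid -> (c1, c2); updated in place, old versions go to undo
index = []     # append-only list of (key, tid)

def insert(tid, row):
    heap[tid] = row
    index.append((row[0], tid))

def update(tid, row):
    heap[tid] = row
    index.append((row[0], tid))   # a changed key gets a new index entry

def index_scan(key):
    # key recheck against the heap, as in step 2 of the example
    return [heap[tid] for k, tid in index if k == key and heap[tid][0] == key]

insert(1, (2, 'abc'))             # step 1
update(1, (3, 'def'))             # step 2
assert index_scan(2) == []        # the recheck rejects the stale entry: fine
update(1, (2, 'def'))             # step 3: the key becomes 2 again
# Both the step-1 entry and the step-3 entry for key 2 now pass the
# recheck, so the recheck alone cannot tell a live entry from a dead
# one; this is where delete-marking (and visibility data) would help.
assert index_scan(2) == [(2, 'def'), (2, 'def')]
```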

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Parallel bitmap heap scan

2016-11-26 Thread Dilip Kumar
On Sun, Nov 27, 2016 at 3:15 AM, Robert Haas  wrote:
> I think the Barrier stuff has a process for choosing one worker to
> conduct a particular phase.  So it seems like if the Barrier API is
> well-designed, you should be able to use it to decide who will conduct
> the index scan, and then when that's done everyone can proceed to
> scanning the heap.  If that can't work for some reason, Thomas should
> probably adjust his API so it does.  He's presenting that as a
> generally-useful primitive...

If I understand the barrier API correctly, it has two parts:
1. BarrierInit  2. BarrierWait.

1. In BarrierInit we define how many workers (let's say nworkers)
must reach the barrier before anyone is allowed past BarrierWait.

2. BarrierWait makes the calling process wait until BarrierWait has
been called nworkers times.

So I am not very clear on this: if we call BarrierInit with nworkers=1,
the first question is when we should call BarrierWait, because as soon
as we call BarrierWait the count reaches 1 and everyone is allowed to
proceed.  So obviously it should be called once the bitmap is ready.

The second question is: if it's called only after the bitmap is ready,
then what about the other processes?  How are they supposed to wait
until the bitmap is ready?  If they wait using BarrierWait, that again
makes the count reach 1 and everyone is allowed to proceed, which
doesn't seem correct.

Correct me if I am missing something.
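For what it's worth, Python's threading.Barrier (a reusable barrier, only loosely analogous to the proposed API) can express the "one elected worker" pattern Robert describes: wait() returns a unique index, so exactly one arriving thread can be chosen to run the serial phase while the rest block at the next wait. This is a sketch of the idea, not of PostgreSQL's Barrier API:

```python
import threading

NWORKERS = 4
barrier = threading.Barrier(NWORKERS)   # roughly BarrierInit(nworkers)
bitmap = []                             # shared state built by one worker
results = []
lock = threading.Lock()

def worker(i):
    # wait() returns a distinct index in range(NWORKERS); the thread
    # that draws 0 is "elected" to run the serial phase while the rest
    # block at the next wait() until it arrives there too.
    if barrier.wait() == 0:
        bitmap.extend(range(10))        # serial phase: build the bitmap
    barrier.wait()                      # everyone proceeds once it is ready
    with lock:
        results.append((i, len(bitmap)))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NWORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# every worker saw the fully built bitmap
assert all(n == 10 for _, n in results)
```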

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




[HACKERS] pg_config --version

2016-11-26 Thread David Fetter
Folks,

While updating some extensions, I noticed that pg_config --version
produces output that's...maybe not quite as useful as it might be, at
least to a machine, so I'd like to throw out some proposals to fix the
situation.

Add a --version-numeric option to pg_config

or
Replace the current --version option with its bare numeric version

or
Add another line of output to the current --version output, which
would be the numeric version by itself

I'm partial to the first because it's clear what's being asked for,
the second because the product name does nothing for comprehension,
and the third because it would be less strain on things that already
parse the output.
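As an illustration of what callers have to do today (and what a numeric option would obviate), here is a hedged sketch that converts the assumed "PostgreSQL X.Y[.Z]" form of the current output into the PG_VERSION_NUM-style integer; pre-release strings like "9.7devel" and other variants would need extra care:

```python
import re

def numeric_version(version_line):
    # assumes the pg_config --version output looks like "PostgreSQL X.Y[.Z]"
    m = re.match(r'PostgreSQL (\d+)\.(\d+)(?:\.(\d+))?', version_line)
    if not m:
        raise ValueError('unparseable version: ' + version_line)
    major, minor, rev = (int(g or 0) for g in m.groups())
    return major * 10000 + minor * 100 + rev

assert numeric_version('PostgreSQL 9.6.1') == 90601
```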

A somewhat larger project, not sure whether it's worth doing, would be
to enable pg_config to print arbitrary combinations of the GUCs it
could know about.

What say?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] [sqlsmith] Failed assertion in TS_phrase_execute

2016-11-26 Thread Tom Lane
I wrote:
> but I notice that some normalization seems to be getting done by
> tsqueryin:

> regression=# select $$( 'sanct' & 'peter' ) <-> ( 'sanct' & 'peter' )$$::tsquery;
>                                        tsquery
> ---------------------------------------------------------------------------------------
>  'sanct' <-> 'sanct' & 'peter' <-> 'sanct' & 'sanct' <-> 'peter' & 'peter' <-> 'peter'
> (1 row)

BTW, it seems like that normalization is wrong.  The transformed query
should (and does) match the string "sanct sanct peter sanct sanct peter
peter peter", since each of the <-> pairs has a match somewhere in there.
But I would expect the original query to be specifying that a match occurs
at exactly one place, which of course is unsatisfiable since 'sanct' and
'peter' can't match the same word.
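The expansion tsqueryin appears to perform can be mimicked as a cross product of the two conjunctions. This is a sketch of the observed normalization, not of the actual tsquery code, and the operand order may differ from what PostgreSQL prints:

```python
from itertools import product

left = ['sanct', 'peter']    # operands of the left ( A & B )
right = ['sanct', 'peter']   # operands of the right ( C & D )

# ( A & B ) <-> ( C & D ) is distributed into the conjunction of all
# pairwise phrase operators, which is what loses the requirement that
# both conjunctions match at the same position.
expanded = ' & '.join(f"'{a}' <-> '{b}'" for a, b in product(left, right))
print(expanded)
# 'sanct' <-> 'sanct' & 'sanct' <-> 'peter' & 'peter' <-> 'sanct' & 'peter' <-> 'peter'
```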

regards, tom lane




Re: [HACKERS] Alternative MATERIALIZED VIEW design and implementation with history table and other features

2016-11-26 Thread Jim Nasby

On 11/22/16 9:11 PM, Nico Williams wrote:

> But we needed a method for recording deltas from REFRESHes, and that's
> not supported.  So I coded up my own version of materialized views, in
> PlPgSQL, that does provide a history feature.


Getting history tracking included in core is going to take a LOT of 
effort; not from a code standpoint, but to get everyone to agree on the 
use cases, interface, etc. In short: I wouldn't go there.


What *would* be useful is work on generating delta information from a 
set of changes to a table: see below.



> Besides a history feature, this includes the ability to record changes
> made to a materialized view's materialization table, which means I can
> have triggers that update the materialized view.


Just looking at the comment block in the file, it's not clear what you 
mean by this. Does this mean that if you ALTER the "materialization 
table" (which I assume is the materialized output?), the view somehow 
gets modified?



> Of particular interest may be the fact that the FULL OUTER JOIN that PG
> does for REFRESH CONCURRENTLY, and which I copied here, doesn't deal
> well with views that have NULLs in any columns used in the join.  It
> would be nice to have an equijoin that uses IS NOT DISTINCT FROM rather
> than just =, and then refreshing could use such a join in order to deal
> properly with NULLs.


Well, you could potentially just cast the composite types to text and 
join on that, but...
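A toy three-valued-logic sketch of why a join on "=" drops NULL rows while IS NOT DISTINCT FROM keeps them (illustrative Python, with None standing in for NULL):

```python
def sql_eq(a, b):
    # SQL "=": yields NULL (here: None) if either side is NULL
    if a is None or b is None:
        return None
    return a == b

def is_not_distinct(a, b):
    # IS NOT DISTINCT FROM: two NULLs compare as equal
    if a is None and b is None:
        return True
    if a is None or b is None:
        return False
    return a == b

old_rows = [(1, None), (2, 'x')]
new_rows = [(1, None), (2, 'y')]

# the join key is the whole row; "=" never matches the (1, NULL) rows
eq_join  = [(o, n) for o in old_rows for n in new_rows
            if all(sql_eq(a, b) is True for a, b in zip(o, n))]
ind_join = [(o, n) for o in old_rows for n in new_rows
            if all(is_not_distinct(a, b) for a, b in zip(o, n))]

assert eq_join == []                           # (1, NULL) falls out entirely
assert ind_join == [((1, None), (1, None))]    # NULL-safe join keeps it
```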


Kevin Grittner (author of the original matviews patch) has mentioned 
detecting changes to underlying relations previously[1]. Getting even a 
simple form of that working is going to be critical for any kind of 
incremental matview updating. It sounds like you're doing something 
different from incremental update, but the core problem is still the 
same: how to efficiently identify changes to underlying matview data and 
apply those changes to the view.


I suggest taking a look at what Kevin's written about that, as well as 
the paper he mentioned. Some of this does assume that you have the 
equivalent to NEW and OLD, but IIRC those are not actually mandatory 
(ie: you could use the current matview contents as OLD and the dynamic 
view as NEW). Even a pure SQL/plpgsql implementation of what's in the 
paper that Kevin mentioned would probably be valuable to ongoing work, 
as well as being applicable to what you've done.


1: 
https://www.postgresql.org/message-id/1371480075.55528.yahoomail...@web162901.mail.bf1.yahoo.com

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] Parallel bitmap heap scan

2016-11-26 Thread Robert Haas
On Sat, Nov 26, 2016 at 7:40 AM, Dilip Kumar  wrote:
> IMHO, a barrier is used when multiple workers are doing some work
> together in phase 1, and before moving to the next phase all have to
> complete phase 1, so we put a barrier so that all workers cross it
> before starting the next phase.
>
> But here the case is different: only one worker needs to finish phase
> 1, and as soon as it's over all can start phase 2. We don't require
> that all workers cross a certain barrier; in this case, even if some
> worker has not started, the other workers can do their work.

I think the Barrier stuff has a process for choosing one worker to
conduct a particular phase.  So it seems like if the Barrier API is
well-designed, you should be able to use it to decide who will conduct
the index scan, and then when that's done everyone can proceed to
scanning the heap.  If that can't work for some reason, Thomas should
probably adjust his API so it does.  He's presenting that as a
generally-useful primitive...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [sqlsmith] Failed assertion in TS_phrase_execute

2016-11-26 Thread Tom Lane
Andreas Seltenreich  writes:
> the query below triggers an assertion in TS_phrase_execute.  Testing was
> done on master at dbdfd11.

> -- TRAP: FailedAssertion("!(curitem->qoperator.oper == 4)", File: "tsvector_op.c", Line: 1432)

> select 'moscow' @@
>ts_rewrite('moscow', 'moscow',
>   ts_rewrite(
>tsquery_phrase('moscow','moscow'),
>'moscow',
>$$ 'sanct' & 'peter'$$));

Hmm.  If you run the ts_rewrite alone, it prints

regression=# select  ts_rewrite('moscow', 'moscow',
  ts_rewrite(
 tsquery_phrase('moscow','moscow'),
 'moscow',
 $$ 'sanct' & 'peter'$$));
                   ts_rewrite
-------------------------------------------------
 ( 'sanct' & 'peter' ) <-> ( 'sanct' & 'peter' )
(1 row)

and if you put that in explicitly, all's well:

regression=# select 'moscow' @@ $$( 'sanct' & 'peter' ) <-> ( 'sanct' & 'peter' )$$::tsquery;
 ?column? 
----------
 f
(1 row)

but I notice that some normalization seems to be getting done by
tsqueryin:

regression=# select $$( 'sanct' & 'peter' ) <-> ( 'sanct' & 'peter' )$$::tsquery;
                                       tsquery
---------------------------------------------------------------------------------------
 'sanct' <-> 'sanct' & 'peter' <-> 'sanct' & 'sanct' <-> 'peter' & 'peter' <-> 'peter'
(1 row)

so this seems to boil down to ts_rewrite failing to apply required
normalization.  Or maybe the normalization shouldn't be required.

regards, tom lane




Re: [HACKERS] Skipping PgStat_FunctionCallUsage for many expressions

2016-11-26 Thread Andres Freund


On November 26, 2016 8:06:26 AM PST, Tom Lane  wrote:
>Robert Haas  writes:
>> On Fri, Nov 25, 2016 at 11:12 PM, Andres Freund 
>wrote:
>>> while working on my faster expression evaluation stuff I noticed
>that a
>>> lot of expression types that call functions don't call the necessary
>>> functions to make track_functions work.
>>> 
>>> ExecEvalFunc/ExecEvalOper (via ExecMakeFunctionResultNoSets) call
>>> pgstat_init_function_usage/pgstat_end_function_usage, but others
>like
>>> ExecEvalRowCompare, ExecEvalMinMax, ExecEvalNullIf,
>ExecEvalDistinct,
>>> ExecEvalScalarArrayOp (and indirectly ExecEvalArrayCoerceExpr)
>don't.
>>> 
>>> Similarly InvokeFunctionExecuteHook isn't used very thoroughly.
>>> 
>>> Are these worth fixing? I suspect yes. If so, do we want to
>backpatch?
>
>Those don't call functions, they call operators.  Yes, I know that an
>operator has a function underlying it, but the user-level expectation
>for
>track_functions is that what it counts are things that look
>syntactically
>like function calls.  I'm not eager to add tracking overhead for cases
>that there's been exactly zero field demand for.

But we do track for OpExprs? Otherwise I'd agree.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




[HACKERS] Wrong order of tests in findDependentObjects()

2016-11-26 Thread Tom Lane
It suddenly struck me that the problem being complained of in bug #14434
is that dependency.c's findDependentObjects() is handling extension
dependencies incorrectly.  It has this logic:

/*
 * This object is part of the internal implementation of
 * another object, or is part of the extension that is the
 * other object.  We have three cases:
 *
 * 1. At the outermost recursion level, we normally disallow
 * the DROP.  (We just ereport here, rather than proceeding,
 * since no other dependencies are likely to be interesting.)
 * However, there are exceptions.
 */
if (stack == NULL)
{
/*
 * Exception 1a: if the owning object is listed in
 * pendingObjects, just release the caller's lock and
 * return.  We'll eventually complete the DROP when we
 * reach that entry in the pending list.
 */
...

/*
 * Exception 1b: if the owning object is the extension
 * currently being created/altered, it's okay to continue
 * with the deletion.  This allows dropping of an
 * extension's objects within the extension's scripts, as
 * well as corner cases such as dropping a transient
 * object created within such a script.
 */
...

/* No exception applies, so throw the error */
...

The bug report occurs because the sequence's extension membership
pg_depend record is hit one recursion level down, so that stack!=NULL.
Instead, we should rearrange this so that "exception 1b" is allowed
whether or not we are at the outermost recursion level.  The assumption
is that the sequence creation/upgrade script knows what it's doing and
should not be forced to issue manual ALTER EXTENSION DROP commands
before it can do it.

While looking at that, I wonder if "exception 1a" isn't wrongly placed
as well.  Why shouldn't we let that case occur below top level?

regards, tom lane




Re: [HACKERS] make default TABLESPACE belong to target table.

2016-11-26 Thread Tom Lane
Amit Kapila  writes:
> What will such a storage parameter (default_tablespace) mean at table
> level and how it will different from existing default_tablespace?  I
> think the usage asked by Amos is quite genuine, but not sure if
> introducing default_tablespace as a storage level parameter is the
> best way to address it.  Another way could be that we allow the user
> to specify something like tablespace_name / table> or something like that.

That seems overcomplicated, and it will also pose quite some hazard
for pg_dump for example.  It feels like "action at a distance", in
that creation of an index will now depend on properties that aren't
immediately obvious.

I was thinking about introducing a new GUC, named something like
default_index_tablespace, which would need to have at least these
behaviors:

1. index tablespace is same as default_tablespace (the backwards
compatible, and therefore the default, behavior).

2. index tablespace is same as table's tablespace.

3. default_index_tablespace names a specific tablespace.

Point 3 isn't in the current request but I'm pretty sure I've heard
it requested in the past, so that people can conveniently put all
tables in tablespace X and all indexes in tablespace Y.

If we just did points 1 and 2 then a bool GUC would suffice.  I'm
not sure how to handle all three cases cleanly.  We could define
default_index_tablespace as empty to get point 1 or a tablespace
name to get point 3, but that leaves us having to use some magic
string for point 2, which would be messy --- what if it conflicts
with someone's choice of a tablespace name?
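To make the three behaviors concrete, hypothetical postgresql.conf settings under this proposal might look as follows; none of these spellings exists today, and this is only an illustration of Tom's sketch:

```
# none of these GUC values exist today; illustration of the proposal only
default_index_tablespace = ''            # 1: fall back to default_tablespace
default_index_tablespace = 'idx_space'   # 3: all new indexes go to idx_space
# 2 (same tablespace as the table) is the case with no clean spelling,
# since any magic string here could collide with a real tablespace name
```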

regards, tom lane




Re: [HACKERS] Skipping PgStat_FunctionCallUsage for many expressions

2016-11-26 Thread Tom Lane
Robert Haas  writes:
> On Fri, Nov 25, 2016 at 11:12 PM, Andres Freund  wrote:
>> while working on my faster expression evaluation stuff I noticed that a
>> lot of expression types that call functions don't call the necessary
>> functions to make track_functions work.
>> 
>> ExecEvalFunc/ExecEvalOper (via ExecMakeFunctionResultNoSets) call
>> pgstat_init_function_usage/pgstat_end_function_usage, but others like
>> ExecEvalRowCompare, ExecEvalMinMax, ExecEvalNullIf, ExecEvalDistinct,
>> ExecEvalScalarArrayOp (and indirectly ExecEvalArrayCoerceExpr) don't.
>> 
>> Similarly InvokeFunctionExecuteHook isn't used very thoroughly.
>> 
>> Are these worth fixing? I suspect yes. If so, do we want to backpatch?

Those don't call functions, they call operators.  Yes, I know that an
operator has a function underlying it, but the user-level expectation for
track_functions is that what it counts are things that look syntactically
like function calls.  I'm not eager to add tracking overhead for cases
that there's been exactly zero field demand for.

> If it doesn't torpedo performance, I assume we should fix and back-patch.

It's certainly not going to help.

regards, tom lane




Re: [HACKERS] Skipping PgStat_FunctionCallUsage for many expressions

2016-11-26 Thread Robert Haas
On Fri, Nov 25, 2016 at 11:12 PM, Andres Freund  wrote:
> while working on my faster expression evaluation stuff I noticed that a
> lot of expression types that call functions don't call the necessary
> functions to make track_functions work.
>
> ExecEvalFunc/ExecEvalOper (via ExecMakeFunctionResultNoSets) call
> pgstat_init_function_usage/pgstat_end_function_usage, but others like
> ExecEvalRowCompare, ExecEvalMinMax, ExecEvalNullIf, ExecEvalDistinct,
> ExecEvalScalarArrayOp (and indirectly ExecEvalArrayCoerceExpr) don't.
>
> Similarly InvokeFunctionExecuteHook isn't used very thoroughly.
>
> Are these worth fixing? I suspect yes. If so, do we want to backpatch?

If it doesn't torpedo performance, I assume we should fix and back-patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [RFC] Should we fix postmaster to avoid slow shutdown?

2016-11-26 Thread Robert Haas
On Thu, Nov 24, 2016 at 12:41 AM, Tsunakawa, Takayuki
 wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
>> Robert Haas  writes:
>> > I agree.  However, in many cases, the major cost of a fast shutdown is
>> > getting the dirty data already in the operating system buffers down to
>> > disk, not in writing out shared_buffers itself.  The latter is
>> > probably a single-digit number of gigabytes, or maybe double-digit.
>> > The former might be a lot more, and the write of the pgstat file may
>> > back up behind it.  I've seen cases where an 8kB buffered write from
>> > Postgres takes tens of seconds to complete because the OS buffer cache
>> > is already saturated with dirty data, and the stats files could easily
>> > be a lot more than that.
>>
>> I think this is mostly FUD, because we don't fsync the stats files.  Maybe
>> we should, but we don't today.  So even if we have managed to get the system
>> into a state where physical writes are heavily backlogged, that's not a
>> reason to assume that the stats collector will be unable to do its thing
>> promptly.  All it has to do is push a relatively small amount of data into
>> kernel buffers.
>
> I'm sorry for my late reply, yesterday was a national holiday in Japan.
>
> It's not FUD.  I understand you hit the slow stats file write problem during 
> some regression test.  You said it took 57 seconds to write the stats file 
> during the postmaster shutdown.  That caused pg_ctl stop to fail due to its 
> 60 second timeout.  Even the regression test environment suffered from the 
> trouble.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Logical decoding on standby

2016-11-26 Thread Robert Haas
On Wed, Nov 23, 2016 at 8:37 AM, Craig Ringer  wrote:
>>> The last checkpoint's oldestXid, and ShmemVariableCache's oldestXid,
>>> are already held down by ProcArray's catalog_xmin. But that doesn't
>>> mean we haven't removed newer tuples from specific relations and
>>> logged that in xl_heap_clean, etc, including catalogs or user
>>> catalogs, it only means the clog still exists for those XIDs.
>>
>> Really?
>
> (Note the double negative above).
>
> Yes, necessarily so. You can't look up xids older than the clog
> truncation threshold at oldestXid, per our discussion on txid_status()
> and traceable commit. But the tuples from that xact aren't guaranteed
> to exist in any given relation; vacuum uses vacuum_set_xid_limits(...)
> which calls GetOldestXmin(...); that in turn scans ProcArray to find
> the oldest xid any running xact cares about. It might bump it down
> further if there's a replication slot requirement or based on
> vacuum_defer_cleanup_age, but it doesn't care in the slightest about
> oldestXmin.
>
> oldestXmin cannot advance until vacuum has removed all tuples for that
> xid and advanced the database's datfrozenxid. But a given oldestXmin
> says nothing about which tuples, catalog or otherwise, still exist and
> are acessible.

Right.  Sorry, my mistake.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] [sqlsmith] Failed assertion in TS_phrase_execute

2016-11-26 Thread Andreas Seltenreich
Hi,

the query below triggers an assertion in TS_phrase_execute.  Testing was
done on master at dbdfd11.

regards,
Andreas

-- TRAP: FailedAssertion("!(curitem->qoperator.oper == 4)", File: "tsvector_op.c", Line: 1432)

select 'moscow' @@
   ts_rewrite('moscow', 'moscow',
  ts_rewrite(
 tsquery_phrase('moscow','moscow'),
 'moscow',
 $$ 'sanct' & 'peter'$$));




Re: [HACKERS] Random PGDLLIMPORTing

2016-11-26 Thread Magnus Hagander
On Fri, Nov 25, 2016 at 9:54 AM, Craig Ringer  wrote:

> On 25 November 2016 at 14:16, Tom Lane  wrote:
> > Craig Ringer  writes:
> >> PGDLLIMPORT is free, so the question should be "is there a reason not
> >> to add it here?".
> >
> > TBH, my basic complaint about it is that I do not like Microsoft's tool
> > chain assuming that it's entitled to demand that people sprinkle
> > Microsoft-specific droppings throughout what would otherwise be platform
> > independent source code.
>
> Yeah, I know how you feel. I'm not a fan myself, I just tolerate it as
> one of those annoying things we must put up with, like Perl 5.8 for
> ancient systems, not using C99, and !Linux/FBSD support. Which,
> actually, we _do_ actively work for - take for example the atomics
> work, which went to considerable lengths not to break weird niche
> platforms.
>
> > However, Victor Wagner's observation upthread is quite troubling:
> >
> >>> It worth checking actual variable definitions, not just declarations.
> >>> I've found recently, that at least in MSVC build system, only
> >>> initialized variables are included into postgres.def file, and so are
> >>> actually exported from the backend binary.
> >
> > If this is correct (don't ask me, I don't do Windows) then the issue is
> > not whether "PGDLLIMPORT is free".  This puts two separate source-code
> > demands on variables that we want to make available to extensions,
> neither
> > of which is practically checkable on non-Windows platforms.
>
> I agree that, if true, that's a real concern and something that needs
> addressing to avoid a growing maintenance burden from Windows.
>

It would probably not be very hard to create a test that just scans the
source code for PGDLLIMPORT:ed variables and generates a C file that links
to them all, then tries to build that and sees if it blows up. If I
understand the issue correctly, that build should fail in these cases. So
we could make it a platform-specific test and have the buildfarm run it, so
we'd at least detect such problems early. Or am I missing something?
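A rough sketch of the scanner half of such a test (illustrative only; real headers have arrays, function pointers, and conditional compilation that this toy regex ignores, and the sample declarations below are just examples):

```python
import re

# Scan header text for PGDLLIMPORT-marked variables and emit a C file
# that takes the address of each one, so that linking it against the
# backend would fail for any symbol declared but not actually exported.
header_text = """
extern PGDLLIMPORT int NBuffers;
extern PGDLLIMPORT bool IsUnderPostmaster;
extern int not_exported;
"""

decl = re.compile(r'extern\s+PGDLLIMPORT\s+\w+\s+\**(\w+)\s*;')
names = decl.findall(header_text)

lines = ['#include "postgres.h"', 'const void *probe[] = {']
lines += [f'    (const void *) &{n},' for n in names]
lines += ['};']
c_file = '\n'.join(lines)
print(c_file)

assert names == ['NBuffers', 'IsUnderPostmaster']
```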



> > I think that basically it's going to be on the heads of people who
> > want to work on Windows to make sure that things work on that platform.
> > That is the contract that every other platform under the sun understands,
> > but it seems like Windows people think it's on the rest of us to make
> > their platform work.  I'm done with that.
>
> Well, I'm trying to be one of those "Windows people" to the degree of
> spotting and addressing issues. Like this one. But it's not worth
> arguing this one if it's more than a trivial "meh, done" fix, since
> it's likely to only upset code that's testing assertions (like BDR or
> pglogical).
>
>
So when we *do* identify these places, through projects like that or just
general complaints, do we see any actual risk in backpatching such
additions? As long as the linking is done by name and not by number, it
should be fully safe, right?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Quorum commit for multiple synchronous replication.

2016-11-26 Thread Michael Paquier
On Tue, Nov 15, 2016 at 7:08 PM, Masahiko Sawada  wrote:
> Attached latest version patch incorporated review comments. After more
> thought, I agree and changed the value of standby priority in quorum
> method so that it's not set 1 forcibly. The all standby priorities are
> 1 If s_s_names = 'ANY(*)'.
> Please review this patch.

Sorry for my late reply. Here is my final review.

 
-num_sync ( standby_name [, ...] )
+[ANY] num_sync (
standby_name [, ...] )
+FIRST num_sync (
standby_name [, ...] )
 standby_name [, ...
This can just be replaced with [ ANY | FIRST ]. There is no need for
braces as the keyword is not mandatory.

+is the name of a standby server.
+FIRST and ANY specify the method used by
+the master to control the standby servres.
 
s/servres/servers/.

if (priority == 0)
values[7] = CStringGetTextDatum("async");
else if (list_member_int(sync_standbys, i))
-   values[7] = CStringGetTextDatum("sync");
+   values[7] = SyncRepConfig->sync_method == SYNC_REP_PRIORITY ?
+   CStringGetTextDatum("sync") : CStringGetTextDatum("quorum");
else
values[7] = CStringGetTextDatum("potential");
This can be simplified a bit as "quorum" is the state value for all
standbys with a non-zero priority when the method is set to
SYNC_REP_QUORUM:
if (priority == 0)
values[7] = CStringGetTextDatum("async");
+   else if (SyncRepConfig->sync_method == SYNC_REP_QUORUM)
+   values[7] = CStringGetTextDatum("quorum");
else if (list_member_int(sync_standbys, i))
values[7] = CStringGetTextDatum("sync");
else

SyncRepConfig data is made external to syncrep.c with this patch, as
walsender.c needs to look at the sync method in place. No complaint
about that, after considering whether there could be a more elegant
way to do things without this change.

While reviewing the patch, I found a couple of awkwardly worded
sentences, both in the docs and in some comments. Attached is a new
version with this word-smithing. The patch is now marked as ready
for committer.
-- 
Michael
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index dcd0663..bff932b 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3029,42 +3029,76 @@ include_dir 'conf.d'
 transactions waiting for commit will be allowed to proceed after
 these standby servers confirm receipt of their data.
 The synchronous standbys will be those whose names appear
-earlier in this list, and
+in this list, and
 that are both currently connected and streaming data in real-time
 (as shown by a state of streaming in the
 
-pg_stat_replication view).
-Other standby servers appearing later in this list represent potential
-synchronous standbys. If any of the current synchronous
-standbys disconnects for whatever reason,
-it will be replaced immediately with the next-highest-priority standby.
-Specifying more than one standby name can allow very high availability.
+pg_stat_replication view). If the keyword
+FIRST is specified, other standby servers appearing
+later in this list represent potential synchronous standbys.
+If any of the current synchronous standbys disconnects for
+whatever reason, it will be replaced immediately with the
+next-highest-priority standby. Specifying more than one standby
+name can allow very high availability.


 This parameter specifies a list of standby servers using
 either of the following syntaxes:
 
-num_sync ( standby_name [, ...] )
+[ ANY | FIRST ] num_sync ( standby_name [, ...] )
 standby_name [, ...]
 
 where num_sync is
 the number of synchronous standbys that transactions need to
 wait for replies from,
 and standby_name
-is the name of a standby server. For example, a setting of
-3 (s1, s2, s3, s4) makes transaction commits wait
-until their WAL records are received by three higher-priority standbys
-chosen from standby servers s1, s2,
-s3 and s4.
+is the name of a standby server.
+FIRST and ANY specify the method used by
+the master to control the standby servers.
 
 
-The second syntax was used before PostgreSQL
+The keyword FIRST, coupled with num_sync,
+makes transaction commits wait until WAL records are received
+from the num_sync standbys with highest priority number.
+For example, a setting of FIRST 3 (s1, s2, s3, s4)
+makes transaction commits wait until their WAL records are received
+by the three higher-priority standbys chosen from standby servers
+s1, s2, s3 and s4.
+
+
+  
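As a usage illustration (my example, not part of the patch), the two selection methods might be configured like this in postgresql.conf, assuming standbys named s1 through s4:

```
# FIRST: priority-based; wait for the two highest-priority connected standbys
synchronous_standby_names = 'FIRST 2 (s1, s2, s3, s4)'

# ANY: quorum-based; wait for any two of the listed standbys
synchronous_standby_names = 'ANY 2 (s1, s2, s3, s4)'
```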

Re: [HACKERS] Parallel bitmap heap scan

2016-11-26 Thread Dilip Kumar
On Tue, Nov 22, 2016 at 9:05 AM, Dilip Kumar  wrote:
> On Fri, Nov 18, 2016 at 9:59 AM, Amit Khandekar  
> wrote:
>
> Thanks for the review..

I have worked on these comments..
>
>> In pbms_is_leader() , I didn't clearly understand the significance of
>> the for-loop. If it is a worker, it can call
>> ConditionVariablePrepareToSleep() followed by
>> ConditionVariableSleep(). Once it comes out of
>> ConditionVariableSleep(), isn't it guaranteed that the leader has
>> finished the bitmap ? If yes, then it looks like it is not necessary
>> to again iterate and go back through the pbminfo->state checking.
>> Also, with this, variable queuedSelf also might not be needed. But I
>> might be missing something here.
>
> I have taken this logic from the example posted on the condition
> variable thread [1]:
>
> for (;;)
> {
> ConditionVariablePrepareToSleep(cv);
> if (condition for which we are waiting is satisfied)
> break;
> ConditionVariableSleep();
> }
> ConditionVariableCancelSleep();
>
> [1] 
> https://www.postgresql.org/message-id/CA%2BTgmoaj2aPti0yho7FeEf2qt-JgQPRWb0gci_o1Hfr%3DC56Xng%40mail.gmail.com

>
> So it appears to me that even after we come out of ConditionVariableSleep(),
> we need to verify our condition and only then break.
>
> Not sure what happens if worker calls
>> ConditionVariable[Prepare]Sleep() but leader has already called
>> ConditionVariableBroadcast(). Does the for loop have something to do
>> with this ? But this can happen even with the current for-loop, it
>> seems.
> If the leader has already called ConditionVariableBroadcast, then after
> ConditionVariablePrepareToSleep we will check the condition again
> before calling ConditionVariableSleep. And the condition check is under
> SpinLockAcquire(&pbminfo->state_mutex);
>
> However, I think there is one problem in my code (I think you might be
> pointing at the same thing): after ConditionVariablePrepareToSleep, if
> pbminfo->state is already PBM_FINISHED, I am not resetting needWait to
> false, which can lead to the problem.

I have fixed the defect I mentioned above.

>
>>
>>
>>> #3. Bitmap processing (Iterate and process the pages).
>>> In this phase each worker will iterate over the page and chunk arrays and
>>> select heap pages one by one. If prefetch is enabled then there will be
>>> two iterators. Since multiple workers are iterating over the same page and
>>> chunk arrays we need a shared iterator, so we grab a spinlock and
>>> iterate within the lock, so that each worker gets a different page
>>> to process.
>>
>> tbm_iterate() call under SpinLock :
>> For parallel tbm iteration, tbm_iterate() is called while SpinLock is
>> held. Generally we try to keep code inside Spinlock call limited to a
>> few lines, and that too without occurrence of a function call.
>> Although tbm_iterate() code itself looks safe under a spinlock, I was
>> checking if we can squeeze SpinlockAcquire() and SpinLockRelease()
>> closer to each other. One thought is :  in tbm_iterate(), acquire the
>> SpinLock before the while loop that iterates over lossy chunks. Then,
>> if both chunk and per-page data remain, release spinlock just before
>> returning (the first return stmt). And then just before scanning
>> bitmap of an exact page, i.e. just after "if (iterator->spageptr <
>> tbm->npages)", save the page handle, increment iterator->spageptr,
>> release Spinlock, and then use the saved page handle to iterate over
>> the page bitmap.
>
> The main reason to keep the spinlock out of this function is to avoid
> changes inside it; also, this function takes a local iterator as input,
> which doesn't have a spinlock reference in it. But that can be changed;
> we can pass the shared iterator to it.
>
> I will think about this logic and try to update in next version.
This issue is still not addressed.
The logic inside tbm_iterate uses the same variables, like spageptr,
in multiple places. IMHO this complete logic needs to be done under one
spinlock.

>
>>
>> prefetch_pages() call under Spinlock :
>> Here again, prefetch_pages() is called while pbminfo->prefetch_mutex
>> Spinlock is held. Effectively, heavy functions like PrefetchBuffer()
>> would get called while under the Spinlock. These can even ereport().
>> One option is to use mutex lock for this purpose. But I think that
>> would slow things down. Moreover, the complete set of prefetch pages
>> would be scanned by a single worker, and others might wait for this
>> one. Instead, what I am thinking is: grab the pbminfo->prefetch_mutex
>> Spinlock only while incrementing pbminfo->prefetch_pages. The rest
>> part viz : iterating over the prefetch pages, and doing the
>> PrefetchBuffer() need not be synchronised using this
>> pgbinfo->prefetch_mutex Spinlock. pbms_parallel_iterate() already has
>> its own iterator spinlock. Only thing is, workers may not do the
>> actual PrefetchBuffer() sequentially. One of them might shoot ahead
>> and prefetch 3-4 pages while the other is lagging with 

Re: [HACKERS] Parallel bitmap heap scan

2016-11-26 Thread Dilip Kumar
On Wed, Nov 23, 2016 at 12:31 PM, Dilip Kumar  wrote:

I tried to address these comments in my new version. All comments are
fixed except the ones below.

>> + *
>> + *#2. Bitmap processing (Iterate and process the pages).
>> + *. In this phase each worker will iterate over page and
>> chunk array and
>> + *select heap pages one by one. If prefetch is enable then there 
>> will
>> + *be two iterator.
>> + *. Since multiple worker are iterating over same page and chunk 
>> array
>> + *we need to have a shared iterator, so we grab a spin lock and 
>> iterate
>> + *within a lock.
>>
>> The formatting of this comment is completely haphazard.
>
> I will fix this..
I have changed the formatting, and also moved the algorithm
description inside the function body.
I am not sure whether this meets your expectation or whether we should change it further.

>
>> +/* reset parallel bitmap scan info, if present */
>> +if (node->parallel_bitmap)
>> +{
>> +ParallelBitmapInfo *pbminfo = node->parallel_bitmap;
>> +
>> +pbminfo->state = PBM_INITIAL;
>> +pbminfo->tbmiterator.schunkbit = 0;
>> +pbminfo->tbmiterator.spageptr = 0;
>> +pbminfo->tbmiterator.schunkptr = 0;
>> +pbminfo->prefetch_iterator.schunkbit = 0;
>> +pbminfo->prefetch_iterator.spageptr = 0;
>> +pbminfo->prefetch_iterator.schunkptr = 0;
>> +pbminfo->prefetch_pages = 0;
>> +pbminfo->prefetch_target = -1;
>> +}
>>
>> This is obviously not going to work in the face of concurrent
>> activity.  When we did Parallel Seq Scan, we didn't make any changes
>> to the rescan code at all.  I think we are assuming that there's no
>> way to cause a rescan of the driving table of a parallel query; if
>> that's wrong, we'll need some fix, but this isn't it.  I would just
>> leave this out.
>
> In the heap_rescan function we have a similar change:
>
> if (scan->rs_parallel != NULL)
> {
>   parallel_scan = scan->rs_parallel;
>   SpinLockAcquire(&parallel_scan->phs_mutex);
>   parallel_scan->phs_cblock = parallel_scan->phs_startblock;
>   SpinLockRelease(&parallel_scan->phs_mutex);
> }
>
> And this is not for the driving table; it's required for the case where
> Gather is the inner node of a join.
> Consider the example below. I know it's a bad plan, but we can produce this plan :)
>
> Outer Node          Inner Node
> SeqScan(t1)   NLJ   (Gather -> Parallel SeqScan (t2)).
>
> The reason for doing the same is that during ExecReScanGather we don't
> recreate a new DSM; instead we just reinitialize the existing DSM
> (ExecParallelReinitialize).

This is not fixed; the reason is explained above.

>> +static bool
>> +pbms_is_leader(ParallelBitmapInfo *pbminfo)
>>
>> I think you should see if you can use Thomas Munro's barrier stuff for
>> this instead.
>
> Okay, I will think over it.

IMHO, a barrier is used when multiple workers are doing some work
together in phase 1, and all of them have to complete phase 1 before
moving to the next phase; we put up a barrier so that every worker
crosses it before starting the next phase.

But this case is different: only one worker needs to finish phase 1,
and as soon as it is done, all of them can start phase 2. We have no
requirement that all workers cross a certain barrier. In this case,
even if some workers have not started yet, the others can do their work.

>>
>> In general, the amount of change in nodeBitmapHeapScan.c seems larger
>> than I would have expected.  My copy of that file has 655 lines; this
>> patch adds 544 additional lines.  I think/hope that some of that can
>> be simplified away.
>
> I will work on this.

I have removed some functions that were not actually required, as their
code could be merged into the main function. That reduced the patch by
almost 100 lines.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make default TABLESPACE belong to target table.

2016-11-26 Thread Amit Kapila
On Sat, Nov 26, 2016 at 8:57 AM, Michael Paquier
 wrote:
> On Sat, Nov 26, 2016 at 11:25 AM, Amos Bird  wrote:
>> How about making a storage parameter "default_tablespace" that also
>> covers CREATE INDEX and other stuff?
>
> That's exactly the idea, the one at relation-level gets priority on
> the global one defined in postgresql.conf.
>

What will such a storage parameter (default_tablespace) mean at the
table level, and how will it differ from the existing
default_tablespace?  I think the usage Amos asks for is quite genuine,
but I am not sure that introducing default_tablespace as a
storage-level parameter is the best way to address it.  Another way
could be to allow the user to specify something like tablespace_name /
or something like that.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Parallel Index Scans

2016-11-26 Thread Amit Kapila
On Sat, Oct 22, 2016 at 9:07 AM, Amit Kapila  wrote:
> On Fri, Oct 21, 2016 at 10:55 PM, Robert Haas  wrote:

I have rebased the patch (parallel_index_scan_v2) on top of the latest
commit e8ac886c (condition variables).  I have removed the usage of
ConditionVariablePrepareToSleep as it is no longer mandatory.  I have
also updated the docs for the wait event introduced by this patch
(thanks to Dilip for noticing it).  There is no change in the
parallel_index_opt_exec_support patch, but I am attaching it here for
easier reference.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


parallel_index_scan_v2.patch
Description: Binary data


parallel_index_opt_exec_support_v2.patch
Description: Binary data
