Re: [HACKERS] Restricting maximum keep segments by repslots

2020-05-17 Thread Alvaro Herrera
On 2020-May-16, Andres Freund wrote:

> Hi,
> 
> On 2020-05-16 22:51:50 -0400, Alvaro Herrera wrote:
> > On 2020-May-16, Andres Freund wrote:
> > 
> > > I, independent of this patch, added a few additional paths in which
> > > checkpointer's latch is reset, and I found a few shutdowns in regression
> > > tests to be extremely slow / timing out.  The reason for that is that
> > > the only check for interrupts is at the top of the loop. So if
> > > checkpointer gets SIGUSR2 we don't see ShutdownRequestPending until we
> > > decide to do a checkpoint for other reasons.
> > 
> > Ah, yeah, this seems a genuine bug.
> > 
> > > I also suspect that it could have harmful consequences to not do an
> > > AbsorbSyncRequests() if something "ate" the set latch.
> > 
> > I traced through this when looking over the previous fix, and given that
> > checkpoint execution itself calls AbsorbSyncRequests frequently, I
> > don't think this one qualifies as a bug.
> 
> There's no AbsorbSyncRequests() after CheckPointBuffers(), I think. And
> e.g. CheckPointTwoPhase() could take a while. Which then would mean that
> we'd potentially not AbsorbSyncRequests() until checkpoint_timeout
> causes us to wake up. Am I missing something?

True.  There's no delay like CheckpointWriteDelay in that code though,
so the "a while" is much smaller.  My understanding of these sync
requests is that they're not for immediate processing anyway -- I mean
it's okay for checkpointer to take a bit of time before syncing ... or
am I mistaken?  (If another sync request is queued and the queue hasn't
been emptied, that would set the latch again, so it's not like this
could fill the queue arbitrarily.)

> > > One way to do that would be to move the WaitLatch() call to much earlier,
> > > and only do a WaitLatch() if do_checkpoint is false.  Roughly like in the
> > > attached.
> > 
> > Hm.  I'd do "WaitLatch() / continue" in the "!do_checkpoint" block, and
> > put the checkpoint code not in the else block; seems easier to read to
> > me.
> 
> Yea, that'd probably be better. I was also pondering if we shouldn't
> just move the checkpoint code into, gasp, its own function ;)

That might work :-)

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Add A Glossary

2020-05-17 Thread Jürgen Purtz

On 17.05.20 08:51, Alvaro Herrera wrote:

Any object that
exists in a database is local, regardless of whether it exists in a
schema or not.
This implies that the term "local" is unnecessary; just call them "SQL 
objects".

"Extensions" is one type of object that does not belong
in a schema.  "Foreign data wrapper" is another type of object that does
not belong in a schema.  ...  They are *not*
global objects.

postgres_fdw is a module among many others. It's only an example of an 
"extension" and is no different in nature. Yes, they are not global SQL 
objects because they don't belong to the cluster.


In summary we have 3 types of objects: belonging to a schema, to a 
database, or to the cluster (global). Maybe we can avoid the use of the 
different names 'local SQL object' and 'global SQL object' altogether and 
just call them 'SQL object'. 'global SQL object' is used only once. We 
could rephrase "A set of databases and accompanying global SQL objects 
... " as "A set of databases and accompanying SQL objects, which exist 
at the cluster level, ... "



TBH I'm not sure of this term at all.  I think we sometimes use the
term "bloat" to talk about the dead rows only, ignoring the free space.


That's a good example of the necessity of the glossary. Currently we 
don't have a common understanding of all the terms we use. The 
glossary shall fix that and give an authoritative definition - after a 
clarifying discussion.


--

Jürgen Purtz






Re: Add A Glossary

2020-05-17 Thread Jürgen Purtz

On 17.05.20 08:51, Alvaro Herrera wrote:

On 15.05.20 02:00, Alvaro Herrera wrote:

Thanks everybody.  I have compiled together all the suggestions and the
result is in the attached patch.  Some of it is of my own devising.

* I changed "instance", and made "cluster" be mostly a synonym of that.

In my understanding, "instance" and "cluster" should be different things,
not only synonyms. "instance" can be the term for permanently fluctuating
objects (processes and RAM) and "cluster" can denote the more static objects
(directories and files). What do you think? If you agree, I would create a
patch.

I don't think that's the general understanding of those terms.  For all
I know, they *are* synonyms, and there's no specific term for "the
fluctuating objects" as you call them.  The instance is either running
(in which case there are processes and RAM) or it isn't.

We have the basic tools "initdb — create a new PostgreSQL database 
cluster" which affects nothing but files, and we have "pg_ctl — 
initialize, start, stop, or control a PostgreSQL server" which - 
directly - affects nothing but processes and RAM. (Here the term 
"server" collides with new definitions in the glossary. But that's 
another story.)


--

Jürgen Purtz




Re: Add A Glossary

2020-05-17 Thread Erik Rijkers

On 2020-05-17 08:51, Alvaro Herrera wrote:

On 2020-May-17, Jürgen Purtz wrote:


On 15.05.20 02:00, Alvaro Herrera wrote:
> Thanks everybody.  I have compiled together all the suggestions and the
>
> * I changed "instance", and made "cluster" be mostly a synonym of that.
In my understanding, "instance" and "cluster" should be different 
things,


I don't think that's the general understanding of those terms.  For all
I know, they *are* synonyms, and there's no specific term for "the
fluctuating objects" as you call them.  The instance is either running
(in which case there are processes and RAM) or it isn't.


For what it's worth, I've also always understood 'instance' as 'a 
running database'.  I admit it might be a left-over from my oracle 
years:


  
https://docs.oracle.com/cd/E11882_01/server.112/e40540/startup.htm#CNCPT601


There, 'instance' clearly refers to a running database.  When that 
database is stopped, it ceases to be an instance.  I've always 
understood this to be the same for the PostgreSQL 'instance'.  Once 
stopped, it is no longer an instance, but it is, of course, still a 
cluster.


I know, we don't have to do the same as Oracle, but clearly it's going 
to be an ongoing source of misunderstanding if we define such a 
high-level term differently.


Erik Rijkers




Re: pgsql: Show opclass and opfamily related information in psql

2020-05-17 Thread Alexander Korotkov
On Thu, May 14, 2020 at 1:34 PM Alexander Korotkov
 wrote:
> On Thu, May 14, 2020 at 1:30 PM Nikita Glukhov  
> wrote:
> > I agree that this patch is an improvement.
>
> OK, I'm going to push this patch if no objections.
> (Sergey doesn't seem to continue involvement in PostgreSQL
> development, so it doesn't look like we should wait for him)

Pushed.  I also applied the same ordering modification to \dAp.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: pgbench: option delaying queries till connections establishment?

2020-05-17 Thread Fabien COELHO


Hello,


I've merged all time-related stuff (time_t, instr_time, int64) to use a
single type (pg_time_usec_t) and set of functions/macros, which simplifies
the code somewhat.


Hm.  I'm not convinced it's a good idea for pgbench to do its own thing
here.


I really think that the refactoring part is a good thing because cloc and 
cost are reduced (time arithmetic is an ugly pain with instr_time).


I have split the patch.

* First patch reworks time measurements in pgbench.

It creates a convenient pg_time_usec_t and uses it everywhere, getting rid 
of "instr_time_t". The code is somewhat simplified wrt what times are taken 
and what they mean.

Instead of displaying 2 tps at the end, which is basically insane, it 
shows one tps for --connect, which includes reconnection times, and one 
tps for the usual single-connection-at-startup case, which simply ignores 
the initial connection time.


This (mostly) refactoring reduces the cloc.

* Second patch adds a barrier before starting the bench

It applies on top of the previous one. The initial imbalance due to thread 
creation times is smoothed.


I may add a --start-on option afterwards so that several pgbench instances 
(running on distinct hosts) can be synchronized, which would be implemented 
as a delay inserted by thread 0 before the barrier.


The Windows implementation is more or less blind; if someone can confirm 
that it works, it would be nice.


--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index f5a51d3732..26b4c4f61c 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -58,8 +58,10 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
-tps = 85.184871 (including connections establishing)
-tps = 85.296346 (excluding connections establishing)
+latency average = 11.013 ms
+latency stddev = 7.351 ms
+initial connection time = 45.758 ms
+tps = 896.967014 (without initial connection establishing)
 
 
   The first six lines report some of the most important parameter
@@ -2228,7 +2230,6 @@ END;
   
For the default script, the output will look similar to this:
 
-starting vacuum...end.
 transaction type: 
 scaling factor: 1
 query mode: simple
@@ -2236,22 +2237,22 @@ number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
-latency average = 15.844 ms
-latency stddev = 2.715 ms
-tps = 618.764555 (including connections establishing)
-tps = 622.977698 (excluding connections establishing)
+latency average = 10.870 ms
+latency stddev = 7.341 ms
+initial connection time = 30.954 ms
+tps = 907.949122 (without initial connection establishing)
 statement latencies in milliseconds:
-0.002  \set aid random(1, 10 * :scale)
-0.005  \set bid random(1, 1 * :scale)
-0.002  \set tid random(1, 10 * :scale)
-0.001  \set delta random(-5000, 5000)
-0.326  BEGIN;
-0.603  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
-0.454  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
-5.528  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
-7.335  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
-0.371  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
-1.212  END;
+0.001  \set aid random(1, 10 * :scale)
+0.001  \set bid random(1, 1 * :scale)
+0.001  \set tid random(1, 10 * :scale)
+0.000  \set delta random(-5000, 5000)
+0.046  BEGIN;
+0.151  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
+0.107  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
+4.241  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
+5.245  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
+0.102  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
+0.974  END;
 
   
 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 08a5947a9e..46be67adaf 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -291,9 +291,9 @@ typedef struct SimpleStats
  */
 typedef struct StatsData
 {
-	time_t		start_time;		/* interval start time, for aggregates */
-	int64		cnt;			/* number of transactions, including skipped */
-	int64		skipped;		/* number of transactions skipped under --rate
+	pg_time_usec_t	start_time;	/* interval start time, for aggregates */
+	int64			cnt;		/* number of transactions, including skipped */
+	int64			skipped;	/* number of transactions skipped under --rate
  * and --latency-limit */
 	SimpleStats latency;
 	SimpleStats lag;
@@ -416,11 +416,11 @@ typedef struct
 	int			nvariables;		/* number of va

Re: Postgres default FILLFACTOR value

2020-05-17 Thread Virender Singla
In Postgres, index FILLFACTOR only works for monotonically increasing
column values; for random values it will do a 50:50 block split. However,
it's really unlikely that monotonically increasing columns get updated,
so why do we need to waste that 10% of space and also make index range
scans on such tables less performant?

postgres=> create table test(id bigint);
CREATE TABLE
postgres=> CREATE INDEX idx1_test ON test (id)  with (fillfactor = 100);
CREATE INDEX
postgres=> CREATE INDEX idx2_test ON test (id); --default to 90.
CREATE INDEX

postgres=> insert into test SELECT ceil(random() * 1000) from
generate_series(1, 1000) AS temp (id) ;
INSERT 0 1000

postgres=> \di+ idx1_test
                        List of relations
 Schema |   Name    | Type  |  Owner   | Table |  Size  | Description
--------+-----------+-------+----------+-------+--------+-------------
 public | idx1_test | index | postgres | test  | 278 MB |

postgres=> \di+ idx2_test
                        List of relations
 Schema |   Name    | Type  |  Owner   | Table |  Size  | Description
--------+-----------+-------+----------+-------+--------+-------------
 public | idx2_test | index | postgres | test  | 280 MB |

postgres=> update test set id = id+1 where id%100=0;
UPDATE 99671
postgres=> \di+ idx1_test
                        List of relations
 Schema |   Name    | Type  |  Owner   | Table |  Size  | Description
--------+-----------+-------+----------+-------+--------+-------------
 public | idx1_test | index | postgres | test  | 281 MB |

postgres=> \di+ idx2_test
                        List of relations
 Schema |   Name    | Type  |  Owner   | Table |  Size  | Description
--------+-----------+-------+----------+-------+--------+-------------
 public | idx2_test | index | postgres | test  | 282 MB |


On Fri, May 8, 2020 at 1:50 PM Virender Singla 
wrote:

> Why is the Postgres default FILLFACTOR 100 for tables and 90 for indexes?
>
> Although Oracle has a completely different MVCC architecture, it uses a
> default of 90 for tables and 100 for indexes (the exact reverse of Postgres).
>
> Postgres blocks need more space for row updates compared to Oracle
> (because Oracle keeps buffer space only for row expansion, whereas Postgres
> needs to create a new row version). As I see it, Postgres is well suited to
> OLTP workloads, so keeping the table FILLFACTOR at 90 is more appropriate
> than straining to save storage space. A lower FILLFACTOR value will also be
> useful to make UPDATEs HOT-applicable, which is going to benefit new
> Postgres users who are initially not aware of this setting and only realize
> it later, when VACUUMs are running really long and indexes have become
> bloated.
>
> On the other side, index FILLFACTOR makes sense only for existing populated
> tables; any new row (a fresh INSERT, or an insert coming through an UPDATE)
> can fill the block above the FILLFACTOR value. I think a default of 100
> makes more sense here.
>
>
>
>


Re: Add A Glossary

2020-05-17 Thread Alvaro Herrera
On 2020-May-17, Erik Rijkers wrote:

> On 2020-05-17 08:51, Alvaro Herrera wrote:

> > I don't think that's the general understanding of those terms.  For all
> > I know, they *are* synonyms, and there's no specific term for "the
> > fluctuating objects" as you call them.  The instance is either running
> > (in which case there are processes and RAM) or it isn't.
> 
> For what it's worth, I've also always understood 'instance' as 'a running
> database'.  I admit it might be a left-over from my oracle years:
> 
> https://docs.oracle.com/cd/E11882_01/server.112/e40540/startup.htm#CNCPT601
> 
> There, 'instance' clearly refers to a running database.  When that database
> is stopped, it ceases to be an instance.

I've never understood it that way, but I'm open to having my opinion on
it changed.  So let's discuss it and maybe gather opinions from others.

I think the terms under discussion are just

* cluster
* instance
* server

We don't have "host" (I just made it a synonym for server), but perhaps
we can add that too, if it's useful.  It would be good to be consistent
with historical Postgres usage, such as the initdb usage of "cluster"
etc.

Perhaps we should not only define what our use of each term is, but also
explain how each term is used outside PostgreSQL and highlight the
differences.  (This would be particularly useful for "cluster" ISTM.)

It seems difficult to get this sorted out before beta1, but there's
still time before the glossary is released.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PATCH] fix GIN index search sometimes losing results

2020-05-17 Thread Pavel Borisov
Hi, all!
Below is my variant of how to patch the GIN/GiST weights issue:
1. First of all, I propose to shift away from GIN's previous own TS_execute
variant and leave only two: TS_execute with a bool result and a bool-type
callback, and a ternary TS_execute_recurse with a ternary callback. I suppose
all legacy consistent callers can still use bool via the provided wrapper.
2. I integrated the logic for indexes which do not support weights and
positions (which gives MAYBE in certain cases on negation) into the
previous TS_execute_recurse function, called with an additional flag for this
class of indexes.
3. The check function for GiST and GIN now gives a ternary result and is
called with a ternary-type callback. I think nothing prevents smoothly
shifting the callback functions, check functions, and even the TS_execute
result to ternary in the future.

So I also send my variant patch for review and discussion.

Regards,
Pavel Borisov

Sun, 17 May 2020 at 03:14, Tom Lane :

> I wrote:
> > I think the root of the problem is that if we have a query using
> > weights, and we are testing tsvector data that lacks positions/weights,
> > we can never say there's definitely a match.  I don't see any decently
> > clean way to fix this without redefining the TSExecuteCallback API
> > to return a tri-state YES/NO/MAYBE result, because really we need to
> > decide that it's MAYBE at the level of processing the QI_VAL node,
> > not later on.  I'd tried to avoid that in e81e5741a, but maybe we
> > should just bite that bullet, and not worry about whether there's
> > any third-party code providing its own TSExecuteCallback routine.
> > codesearch.debian.net suggests that there are no external callers
> > of TS_execute, so maybe we can get away with that.
>
> 0001 attached is a proposed patch that does it that way.  Given the
> API break involved, it's not quite clear what to do with this.
> ISTM we have three options:
>
> 1. Ignore the API issue and back-patch.  Given the apparent lack of
> external callers of TS_execute, maybe we can get away with that;
> but I wonder if we'd get pushback from distros that have automatic
> ABI-break detectors in place.
>
> 2. Assume we can't backpatch, but it's still OK to slip this into
> v13.  (This option clearly has a limited shelf life, but I think
> we could get away with it until late beta.)
>
> 3. Assume we'd better hold this till v14.
>
> I find #3 unduly conservative, seeing that this is clearly a bug
> fix, but on the other hand #1 is a bit scary.  Aside from the API
> issue, it's not impossible that this has introduced some corner
> case behavioral changes that we'd consider to be new bugs rather
> than bug fixes.
>
> Anyway, some notes for reviewers:
>
> * The core idea of the patch is to make the TS_execute callbacks
> have ternary results and to insist they return TS_MAYBE in any
> case where the correct result is uncertain.
>
> * That fixes the bug at hand, and it also allows getting rid of
> some kluges at higher levels.  The GIN code no longer needs its
> own TS_execute_ternary implementation, and the GIST code no longer
> needs to suppose that it can't trust NOT results.
>
> * I put some effort into not leaking memory within tsvector_op.c's
> checkclass_str and checkcondition_str.  (The final output array
> can still get leaked, I believe.  Fixing that seems like material
> for a different patch, and it might not be worth any trouble.)
>
> * The new test cases in tstypes.sql are to verify that we didn't
> change behavior of the basic tsvector @@ tsquery code.  There wasn't
> any coverage of these cases before, and the logic for checkclass_str
> without position info had to be tweaked to preserve this behavior.
>
> * The new cases in tsearch verify that the GIN and GIST code gives
> the same results as the basic operator.
>
> Now, as for the 0002 patch attached: after 0001, the only TS_execute()
> callers that are not specifying TS_EXEC_CALC_NOT are hlCover(),
> which I'd already complained is probably a bug, and the first of
> the two calls in tsrank.c's Cover().  It seems difficult to me to
> argue that it's not a bug for Cover() to process NOT in one call
> but not the other --- moreover, if there was any argument for that
> once upon a time, it probably falls to the ground now that (a) we
> have a less buggy implementation of NOT and (b) the presence of
> phrase queries significantly raises the importance of not taking
> short-cuts.  Therefore, 0002 attached rips out the TS_EXEC_CALC_NOT
> flag and has TS_execute compute NOT expressions accurately all the
> time.
>
> As it stands, 0002 changes no regression test results, which I'm
> afraid speaks more to our crummy test coverage than anything else;
> tests that exercise those two functions with NOT-using queries
> would easily show that there is a difference.
>
> Even if we decide to back-patch 0001, I would not suggest
> back-patching 0002, as it's more nearly a definitional change
> than a bug fix.  But I think it's a good idea anyway.
>
> I'll stick this in the

[PATCH] Add support to psql for edit-and-execute-command

2020-05-17 Thread Joe Wildish

Hi hackers,

Attached is a small patch for adding "edit-and-execute-command" readline 
support to psql. Bash has this concept and I miss it when using psql. It 
allows you to amend the current line in an editor by pressing "v" (when 
in vi mode) or "C-x C-e" (when in emacs mode). Those are the default 
bindings from bash although of course they can be amended in inputrc.


Most of the patch is actually shifting "do_edit" from "command.c" to 
"common.c". There is a small amendment to that function to allow vi to 
launch at the correct column offset.


I noticed that there is some logic in configure for detecting certain 
readline functions. I assume this is for compatibility's sake with 
libedit/editline? Rather than testing for each rl_* function, I hid the 
functionality behind HAVE_READLINE_READLINE_H ... I don't know if this is 
acceptable?


-Joe

From a314fa15f6bdf5329d3045d736e02b6835107591 Mon Sep 17 00:00:00 2001
From: Joe Wildish 
Date: Sun, 17 May 2020 21:57:10 +0100
Subject: [PATCH] Add support to psql for edit-and-execute-command

Bash has an edit-and-execute-command Readline function that brings the
current line into an editor to be amended rather than having to edit it
in-place at the prompt. This commit adds the same functionality to psql.
We use the same default bindings ("v" in vi mode, "C-x C-e" in emacs
mode) as bash.
---
 doc/src/sgml/ref/psql-ref.sgml |  38 --
 src/bin/psql/command.c | 234 +---
 src/bin/psql/common.c  | 236 +
 src/bin/psql/common.h  |   3 +
 src/bin/psql/tab-complete.c|  67 ++
 5 files changed, 337 insertions(+), 241 deletions(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 07bf272a20..0cbc6809d4 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -4458,12 +4458,14 @@ testdb=> \set PROMPT1 
'%[%033[1;33;40m%]%n@%/%R%[%033[0m%]%# '
 psql supports the 
Readline
 library for convenient line editing and retrieval. The command
 history is automatically saved when psql
-exits and is reloaded when
-psql starts up. Tab-completion is also
-supported, although the completion logic makes no claim to be an
-SQL parser.  The queries generated by tab-completion
-can also interfere with other SQL commands, e.g. SET
-TRANSACTION ISOLATION LEVEL.
+exits and is reloaded when psql starts up.
+
+
+
+Tab-completion is also supported, although the completion logic makes no
+claim to be an SQL parser.  The queries generated by
+tab-completion can also interfere with other SQL commands, e.g.
+SET TRANSACTION ISOLATION LEVEL.
 If for some reason you do not like the tab completion, you
 can turn it off by putting this in a file named
 .inputrc in your home directory:
@@ -4472,9 +4474,27 @@ $if psql
 set disable-completion on
 $endif
 
-(This is not a psql but a
-Readline feature. Read its documentation
-for further details.)
+
+
+
+In addition to the usual line editing and retrieval commands,
+psql also supports invoking an editor on the 
current
+line to amend the query text, rather than having to edit it in-place at the
+prompt. This is known as "edit-and-execute-command". The default Readline
+binding for this is "v" when in vi mode and "C-x C-e" when in emacs mode. 
However,
+the binding can be configured in the usual way from the
+.inputrc file:
+
+$if psql
+"Z": edit-and-execute-command
+$endif
+
+
+
+
+(The .inputrc configuration file is not a
+psql but a Readline
+feature. Read its documentation for further details.)
 

   
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index a5160f91de..ecd2605ca9 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -148,8 +148,6 @@ static void discard_query_text(PsqlScanState scan_state, 
ConditionalStack cstack
 static void copy_previous_query(PQExpBuffer query_buf, PQExpBuffer 
previous_buf);
 static bool do_connect(enum trivalue reuse_previous_specification,
   char *dbname, char *user, char 
*host, char *port);
-static bool do_edit(const char *filename_arg, PQExpBuffer query_buf,
-   int lineno, bool *edited);
 static bool do_shell(const char *command);
 static bool do_watch(PQExpBuffer query_buf, double sleep);
 static bool lookup_object_oid(EditableObjectType obj_type, const char *desc,
@@ -1008,7 +1006,7 @@ exec_command_edit(PsqlScanState scan_state, bool 
active_branch,
/* If query_buf is empty, recall previous query 
for editing */
copy_previous_query(query_buf, previous_buf);
 
-   if (do_edit(fname, query_buf, lineno, NULL))
+   if (do_edit(fname, query_buf, lineno, 0, NULL))
  

Re: Parallel copy

2020-05-17 Thread Amit Kapila
On Fri, May 15, 2020 at 6:49 PM Robert Haas  wrote:
>
> On Fri, May 15, 2020 at 12:19 AM Amit Kapila  wrote:
> > > My sense is that it would be a lot more sensible to do it at the
> > > *beginning* of the parallel operation. Once we do it once, we
> > > shouldn't ever do it again; that's how it works now. Deferring it
> > > until later seems much more likely to break things.
> >
> > AFAIU, we always increment the command counter after executing the
> > command.  Why do we want to do it differently here?
>
> Hmm, now I'm starting to think that I'm confused about what is under
> discussion here. Which CommandCounterIncrement() are we talking about
> here?
>

The one we do after executing a non-readonly command.  Let me try to
explain by example:

CREATE TABLE tab_fk_referenced_chk(refindex INTEGER PRIMARY KEY,
height real, weight real);
insert into tab_fk_referenced_chk values( 1, 1.1, 100);
CREATE TABLE tab_fk_referencing_chk(index INTEGER REFERENCES
tab_fk_referenced_chk(refindex), height real, weight real);

COPY tab_fk_referencing_chk(index,height,weight) FROM stdin WITH(
DELIMITER ',');
1,1.1,100
1,2.1,200
1,3.1,300
\.

In the above case, even though we are executing a single command from
the user's perspective, the currentCommandId will be four after the
command.  One increment will be for the copy command and the other
three increments are for locking tuples in the PK table
(tab_fk_referenced_chk) for the three tuples in the FK table
(tab_fk_referencing_chk).  Now, for parallel workers, it is
(theoretically) possible that the three tuples are processed by three
different workers which don't get synced as of now.  The question was:
do we see any kind of problem with this, and if so, can we just sync it
up at the end of parallelism?

> > First, let me clarify the CTID I have used in my email are for the
> > table in which insertion is happening which means FK table.  So, in
> > such a case, we can't have the same CTIDs queued for different
> > workers.  Basically, we use CTID to fetch the row from FK table later
> > and form a query to lock (in KEY SHARE mode) the corresponding tuple
> > in PK table.  Now, it is possible that two different workers try to
> > lock the same row of PK table.  I am not clear what problem group
> > locking can have in this case because these are non-conflicting locks.
> > Can you please elaborate a bit more?
>
> I'm concerned about two workers trying to take the same lock at the
> same time. If that's prevented by the buffer locking then I think it's
> OK, but if it's prevented by a heavyweight lock then it's not going to
> work in this case.
>

We do take the buffer lock in exclusive mode before trying to acquire a KEY
SHARE lock on the tuple, so the two workers shouldn't try to acquire it
at the same time.  I think you are trying to see whether, in any case, two
workers try to acquire a heavyweight lock (like a tuple lock or something
similar) to perform the operation, because then it would create a problem:
due to group locking, an operation would be allowed where it
should not have been.  But I don't think anything of that sort is
feasible in the COPY operation, and if it is, then we probably need to
carefully block it or find some solution for it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Fix a typo in slot.c

2020-05-17 Thread Amit Kapila
On Fri, May 15, 2020 at 10:08 AM Masahiko Sawada
 wrote:
>
> On Fri, 15 May 2020 at 13:26, Amit Kapila  wrote:
> >
> >
> >  /*
> > - * Allocate and initialize walsender-related shared memory.
> > + * Allocate and initialize replication slots' shared memory.
> >   */
> >
> > How about changing it to "Allocate and initialize shared memory for
> > replication slots"?
> >
>
> Agreed.
>
> Attached the updated patch.
>

Pushed.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] Add support to psql for edit-and-execute-command

2020-05-17 Thread Oleksandr Shulgin
On Mon, May 18, 2020 at 1:30 AM Joe Wildish  wrote:

>
> Attached is a small patch for adding "edit-and-execute-command" readline
> support to psql. Bash has this concept and I miss it when using psql. It
> allows you to amend the current line in an editor by pressing "v" (when
> in vi mode) or "C-x C-e" (when in emacs mode). Those are the default
> bindings from bash although of course they can be amended in inputrc.
>

The only difference from \e is that you don't need to jump to the end of
input first, I guess?

--
Alex