Re: Default to TIMESTAMP WITH TIME ZONE?

2021-08-12 Thread Greg Stark
I think having a GUC to change to a different set of semantics is not workable.

However that doesn't mean we can't do anything. We could have a GUC
that just disables allowing creating columns of type timestamp without
tz. That could print an error with a hint suggesting you can change
the variable if you really want to allow them.

I still would hesitate to make it the default but admins could set it
to help their developers.

You could imagine a set of parameters like this to disable various
features that sites consider undesirable and may not want their
developers accidentally introducing.

That said, even disabling a feature is a semantic change. Consider for
example an extension that did make use of some feature. To be portable
and reliable it would have to start with a block of temporary GUC
settings to ensure it worked properly on user databases where various
settings may be in place. But that's a lot easier to manage than
subtle behavioural changes.




Re: Bug in huge simplehash

2021-08-12 Thread David Rowley
On Wed, 11 Aug 2021 at 00:10, Yura Sokolov  wrote:
> Attached v2.

Eyeballing this, it looks fine, but I was a little nervous backpatching
without testing it properly, so I ended up provisioning a machine with
256GB and doing a round of testing.

I just created the simplest table I could:

create table a (a bigserial, b int);
and inserted 2^31 rows.

insert into a (b) values(1);
insert into a (b) select b from a; -- repeated until I got 2^31 rows.

set work_mem = '256GB';
set max_parallel_workers_per_gather = 0;
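
(As a sanity check on the setup above: each repeated INSERT ... SELECT
doubles the row count, so starting from one row it takes 31 repetitions
to reach 2^31. A quick sketch of that arithmetic:)

```python
# Each "insert into a (b) select b from a" doubles the table,
# so the row count after n repetitions is 2**n.
rows, repeats = 1, 0
while rows < 2**31:
    rows *= 2
    repeats += 1
print(repeats)  # -> 31
```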

I could recreate the issue described with the following query:

explain (analyze , timing off) select a from a group by a;

After watching perf top for a while, the profile switched to:

  98.90%  postgres  [.] tuplehash_grow
   0.36%  [kernel]  [k] change_p4d_range
   0.24%  postgres  [.] LookupTupleHashEntry
   0.09%  postgres  [.] tts_minimal_store_tuple
   0.07%  [kernel]  [k] vm_normal_page
   0.02%  [kernel]  [k] __softirqentry_text_start
   0.02%  postgres  [.] heap_fill_tuple
   0.02%  postgres  [.] AllocSetAlloc

After patching I got:

explain (analyze , timing off) select a from a group by a;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=35149810.71..53983243.28 rows=1883343257 width=8) (actual rows=2147483648 loops=1)
   Group Key: a
   Batches: 1  Memory Usage: 201334801kB
   ->  Seq Scan on a  (cost=0.00..30441452.57 rows=1883343257 width=8) (actual rows=2147483648 loops=1)
 Planning Time: 0.105 ms
 Execution Time: 2173480.905 ms
(6 rows)
Time: 2173482.166 ms (36:13.482)

And, since I only had 256GB of memory on this machine and couldn't
really do 2^32 groups, I dropped SH_FILLFACTOR to 0.4 and
SH_MAX_FILLFACTOR to 0.48 and tried again to ensure I got the hash
table full message:

postgres=# explain (analyze on , timing off) select a from a group by a;
ERROR:  hash table size exceeded
Time: 1148554.672 ms (19:08.555)
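
(Back-of-envelope arithmetic shows why dropping the fill factors forces
the error on a 256GB machine. This sketch assumes simplehash grows the
bucket array by doubling whenever the member count reaches
size * fillfactor, and caps the array at 2^32 entries; the names are
illustrative, not PostgreSQL's actual code:)

```python
SH_MAX_SIZE = 2**32  # assumed cap on the simplehash bucket array

def min_table_size(members, fillfactor, initial=256):
    """Smallest power-of-two bucket array that keeps the load factor
    below fillfactor, growing by doubling as simplehash does."""
    size = initial
    while members >= size * fillfactor:
        size *= 2
    return size

# 2^31 groups still fit under the default SH_FILLFACTOR of 0.9...
print(min_table_size(2**31, 0.9) <= SH_MAX_SIZE)  # -> True
# ...but at fill factor 0.4 the table would need to outgrow the cap,
# triggering the "hash table size exceeded" error
print(min_table_size(2**31, 0.4) > SH_MAX_SIZE)   # -> True
```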

After doing that, I felt a bit better about back-patching it, so I did.

Thanks for the patch.

David




Re: Skipping logical replication transactions on subscriber side

2021-08-12 Thread Amit Kapila
On Thu, Aug 12, 2021 at 5:41 PM Greg Nancarrow  wrote:
>
> On Thu, Aug 12, 2021 at 9:18 PM Amit Kapila  wrote:
> >
> > > A minor comment on the 0001 patch: In the message I think that using
> > > "ID" would look better than lowercase "id" and AFAICS it's more
> > > consistent with existing messages.
> > >
> > > + appendStringInfo(, _(" in transaction id %u with commit timestamp %s"),
> > >
> >
> > You have a point but I think in this case it might look a bit odd as
> > we have another field 'commit timestamp' after that which is
> > lowercase.
> >
>
> I did a quick search and I couldn't find any other messages in the
> Postgres code that use "transaction id", but I could find some that
> use "transaction ID" and "transaction identifier".
>

Okay, but that doesn't mean using it here is bad. I am personally fine
with a message containing something like "... in transaction
id 740 with commit timestamp 2021-08-10 14:44:38.058174+05:30", but I
won't mind if you and/or others find some other wording more
convenient. Any opinions from others?

-- 
With Regards,
Amit Kapila.




Re: Next Steps with Hash Indexes

2021-08-12 Thread Amit Kapila
On Thu, Aug 12, 2021 at 8:30 PM Robert Haas  wrote:
>
> On Thu, Aug 12, 2021 at 12:22 AM Amit Kapila  wrote:
> > The design of the patch has changed since the initial proposal. It
> > tries to perform unique inserts by holding a write lock on the bucket
> > page to avoid duplicate inserts.
>
> Do you mean that you're holding a buffer lwlock while you search the
> whole bucket? If so, that's surely not OK.
>

I think here you are worried that after holding the lwlock we might
perform reads of overflow pages, which is not a good idea. I think
there is less chance of having overflow pages for unique indexes, so
we don't expect such cases to be common, and as far as I can
understand this can happen in btree as well during a uniqueness check.
Now, I think the other possibility could be to do some sort of lock
chaining where we grab the lock of the next bucket before releasing
the lock of the current bucket, as we do during bucket cleanup, but I
am not sure about that.
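
(The lock chaining mentioned above — grab the next lock before
releasing the current one, often called hand-over-hand locking or lock
coupling — can be sketched generically like this; it is an
illustration of the technique, not the hash-index code:)

```python
import threading

class Bucket:
    def __init__(self, value):
        self.lock = threading.Lock()
        self.value = value

def scan_chain(buckets):
    """Visit a chain of buckets using lock coupling: acquire the next
    bucket's lock before releasing the current one, so no concurrent
    writer can slip in between two adjacent buckets mid-scan."""
    visited = []
    if not buckets:
        return visited
    cur = buckets[0]
    cur.lock.acquire()
    for nxt in buckets[1:]:
        nxt.lock.acquire()      # grab the next bucket's lock first...
        visited.append(cur.value)
        cur.lock.release()      # ...then let go of the current one
        cur = nxt
    visited.append(cur.value)
    cur.lock.release()
    return visited

print(scan_chain([Bucket(i) for i in range(5)]))  # -> [0, 1, 2, 3, 4]
```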

I haven't studied the patch in detail so it is better for Simon to
pitch in here to avoid any incorrect information or if he has a
different understanding/idea.

-- 
With Regards,
Amit Kapila.




Re: Added schema level support for publication.

2021-08-12 Thread Amit Kapila
On Thu, Aug 12, 2021 at 5:54 PM Masahiko Sawada  wrote:
>
> On Fri, Aug 6, 2021 at 5:33 PM vignesh C  wrote:
> >
>
> ---
> Even if we drop all tables added to the publication from it, 'pubkind'
> doesn't go back to 'empty'. Is that intentional behavior? If we do
> that, we can save the lookup of pg_publication_rel and
> pg_publication_schema in some cases, and we can switch the publication
> that was created as FOR SCHEMA to FOR TABLE and vice versa.
>

Do we really want to allow users to change a publication that is FOR
SCHEMA to FOR TABLE? I see that we don't allow doing that for FOR ALL
TABLES publications:
postgres=# Alter Publication pub add table tbl1;
ERROR:  publication "pub" is defined as FOR ALL TABLES
DETAIL:  Tables cannot be added to or dropped from FOR ALL TABLES publications.

-- 
With Regards,
Amit Kapila.




Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2021-08-12 Thread Tom Lane
Thomas Munro  writes:
> Obviously the address may have to be adjusted if Apple moves stuff
> around, or if the initial layout and ASLR slide range turn out to be
> less constrained than I inferred by nosing around the source code and
> testing on a couple of systems.

Sure.  But we're no worse off than before; the workaround of "set
PG_SHMEM_ADDR by hand" is just as applicable as ever.

regards, tom lane




Re: Default to TIMESTAMP WITH TIME ZONE?

2021-08-12 Thread Isaac Morland
An idea just occurred to me: a systematic way to provide this feature might
be to support aliases for objects. So I could declare an alternate name for
an object, something like:

CREATE ALIAS timestamp FOR TYPE timestamptz;

Which would mean that [current schema].timestamp would now be an alternate
name for the built-in timestamptz object. There are other situations in
which being able to define aliases would be handy, including schema
migrations and probably other compatibility scenarios.

Of course I'm aware that this idea itself would need a lot of discussion
and I'm not volunteering to implement it right now, but it might be a
workable approach if aliases ever become a feature.


Re: [PATCH] test/ssl: rework the sslfiles Makefile target

2021-08-12 Thread Jacob Champion
On Tue, 2021-08-10 at 11:26 -0400, Tom Lane wrote:
> Yeah, I don't like that change either.  It would be one thing if
> we had several places in which suppressing .SECONDARY was useful,
> but if there's only one then it seems better to design around it.

Maybe. The current Makefile already tried to design around it, with
`rm`s inserted in various places. That strategy won't work for the CA
state, and my personal interest in trying to manually replicate
built-in Make features is... low.

> As a concrete example of why this might be a bad idea, how sure
> are you that noplace in Makefile.global depends on that behavior?

I was hoping that, by scoping the change to only a single Makefile with
the clean_intermediates flag, I could simplify that question to "does
any place in that one Makefile rely on an affected rule from
Makefile.global?" And the answer to that appears to be "no" at the
moment, because that Makefile doesn't really need the globals for
anything but the prove_ macros.

(Things would get hairier if someone included the SSL Makefile
somewhere else, but I don't see anyone doing that now and I don't know
why someone would.)

But -- if I do spend the time to answer your broader question, does it
actually help my case? Someone could always add more stuff to
Makefile.global. It sounds like the actual fear is that we don't
understand what might be interacting with a very broad global target,
and that fear is too great to try a scoped change, in a niche Makefile,
early in a release cycle, to improve a development issue multiple
committers have now complained about.

If _that's_ the case, then our build system is holding all of us
hostage. Which is frustrating to me. Should I shift focus to help with
that, first?

--Jacob


Re: [PATCH] test/ssl: rework the sslfiles Makefile target

2021-08-12 Thread Jacob Champion
On Tue, 2021-08-10 at 16:22 +0900, Michael Paquier wrote:
> Regarding 0002, I am not sure.  Even if this reduces a lot of
> duplication, which is really nice, enforcing .SECONDARY to not trigger
> with a change impacting Makefile.global.in does not sound very
> appealing to me in the long-run, TBH.

De-duplication isn't the primary goal of the .SECONDARY change. It
definitely helps with that, but the major improvement is that Make can
maintain the CA state with less hair-pulling:

1. Developer updates an arbitrary number of certificates and runs
   `make sslfiles`.
2. Make sees that the CA state is missing, and creates it once at the
   start of the run.
3. Make runs multiple `openssl ca` commands, depending on the
   certificates being changed, which modify the CA state as they go.
4. After Make is finished, it removes all the CA files, putting your
   local state back the way it was before you ran Make.

Doing it this way has several advantages:

- The magic state files don't persist to influence a future Make run,
  so there's less chance of "I generated with local changes, then
  pulled in the changes you made, and now everything's busted in weird
  ways because my CA state disagrees with what's in the tree".

- Order-only intermediates do The Right Thing in this case -- create
  once when needed, accumulate state during the run, remove at the end
  -- whether you add a single certificate, or regenerate the entire
  tree, or even Ctrl-C halfway through. That's going to be very hard to
  imitate by sprinkling `rm`s like the current Makefile does, though
  I've been in the weeds long enough that maybe I'm missing an obvious
  workaround.

- If, after all that, something still goes wrong (your machine crashes
  so Make can't clean up), `git status` will now help you debug
  dependency problems because it's no longer "normal" to carry the
  intermediate litter in your source tree.

What it doesn't fix is the fact that we're still checking in generated
files that have interdependencies, so the timestamps Make is looking at
are still going to be wrong during initial checkout. That problem
existed before and will persist after this change.

On Wed, 2021-08-11 at 09:47 +0900, Michael Paquier wrote:
> The part I am mainly objecting to is the change in Makefile.global.in,
> but I have to admit after thinking about it that enforcing SECONDARY
> may not be a good idea if other parts of the system rely on that, so
> encouraging the use of clean_intermediates may be dangerous (Tom's
> point from upthread).

I don't really want to encourage the use of clean_intermediates. I just
want Make to have its default, useful behavior for this one Makefile.

> I have not tried so I am not sure, but perhaps we should just focus on
> reducing the number of openssl commands rather than making easier the
> integration of new files?  It could be possible to close the gap with
> the addition of new files with some more documentation for future
> hackers then?

I'd rather fix the dependency/state bugs than document how to work
around them. I know the workarounds; it doesn't make working with this
Makefile any less maddening.

--Jacob


Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2021-08-12 Thread Thomas Munro
On Fri, Aug 13, 2021 at 9:59 AM Tom Lane  wrote:
> Hmm, ok.  Small thought: it might be better to put the #if inside
> the "else {  }".  That way it scales easily to allow other
> platform-specific defaults if we find anything useful.  As-is,
> the obvious extension would end up with multiple else-blocks,
> which seems likely to confuse pgindent if nothing else.

True.  Thanks.  Pushed to all live branches.

Obviously the address may have to be adjusted if Apple moves stuff
around, or if the initial layout and ASLR slide range turn out to be
less constrained than I inferred by nosing around the source code and
testing on a couple of systems.




Re: Default to TIMESTAMP WITH TIME ZONE?

2021-08-12 Thread Bruce Momjian
On Fri, Aug 13, 2021 at 12:25:00AM +0100, Simon Riggs wrote:
> > > I thought we found that changing behavior via GUC usually ends badly.
> >
> > Yeah.  Changing from SQL-spec to not-SQL-spec behavior is going to be
> > one tough sell to begin with, even without the point that that's been
> > our behavior for over two decades.  But proposing to do it via a GUC
> > is just not-even-worth-discussing territory.  That would force every
> > wannabe-portable program to cope with both behaviors; which would
> > end up meaning that not only do you still have to take care to write
> > WITH TIME ZONE when you want that, but *also* you'd have to be sure
> > to write WITHOUT TIME ZONE when you want that.  In short, the worst
> > of both worlds.
> 
> All of which I agree with, but this wasn't a cute idea of mine, this
> was what our users have requested because of the extreme annoyance
> caused by the current behavior.

Understood, but the problem is that our users sometimes don't think
through the ramifications of their suggestions.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Default to TIMESTAMP WITH TIME ZONE?

2021-08-12 Thread Simon Riggs
On Thu, 12 Aug 2021 at 20:07, Tom Lane  wrote:
>
> Bruce Momjian  writes:
> > On Thu, Aug 12, 2021 at 07:24:58PM +0100, Simon Riggs wrote:
> >> I heard the moan about "Why doesn't TIMESTAMP mean TIMESTAMP WITH TIME
> >> ZONE" again today, so here is something concrete to address that.
> >>
> >> AFAIK, SQL Standard requires the default to be WITHOUT TIME ZONE, but
> >> nobody seems to think that is useful.
>
> Standards compliance is useful, no?  If it isn't, there are about 1001
> other things nobody much likes about SQL.
>
> >> We even added a specially
> >> optimized ALTER TABLE command to make switching from WITHOUT to WITH
> >> TIME ZONE easy, so it is clearly an important thing to solve.
> >>
> >> So add a parameter called
> >> default_timestamp_with_timezone = off (default) | on
>
> > I thought we found that changing behavior via GUC usually ends badly.
>
> Yeah.  Changing from SQL-spec to not-SQL-spec behavior is going to be
> one tough sell to begin with, even without the point that that's been
> our behavior for over two decades.  But proposing to do it via a GUC
> is just not-even-worth-discussing territory.  That would force every
> wannabe-portable program to cope with both behaviors; which would
> end up meaning that not only do you still have to take care to write
> WITH TIME ZONE when you want that, but *also* you'd have to be sure
> to write WITHOUT TIME ZONE when you want that.  In short, the worst
> of both worlds.

All of which I agree with, but this wasn't a cute idea of mine, this
was what our users have requested because of the extreme annoyance
caused by the current behavior.

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Re: Use extended statistics to estimate (Var op Var) clauses

2021-08-12 Thread Mark Dilger



> On Aug 11, 2021, at 3:48 PM, Mark Dilger  wrote:
> 
> I'm working on a correlated stats test as I write this.  I'll get back to you 
> when I have results.

Ok, the tests showed no statistically significant regressions. All tests
included the same sorts of WHERE-clause expressions as used in the tests
from yesterday's email.

The first test created loosely correlated data and found no significant row 
estimate improvements or regressions.

The second test, of more tightly correlated data, showed a row estimate
improvement overall, with no class of WHERE clause showing an estimate
regression. I think the apparent regressions from yesterday were just
statistical noise.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: badly calculated width of emoji in psql

2021-08-12 Thread Jacob Champion
On Thu, 2021-08-12 at 17:13 -0400, John Naylor wrote:
> The patch looks pretty good to me. I just have a stylistic suggestion
> which I've attached as a text file.

Getting rid of the "clever addition" looks much better to me, thanks. I
haven't verified the changes to the doc comment, but your description
seems reasonable.

> I'm a bit concerned about the build dependencies not working right,
> but it's not clear it's even due to the patch. I'll spend some time
> investigating next week.

If I vandalize src/common/wchar.c on HEAD, say by deleting the contents
of pg_wchar_table, and then run `make install`, then libpq doesn't get
rebuilt and there's no effect on the frontend. The postgres executable
does get rebuilt for the backend.

It looks like src/interfaces/libpq/Makefile doesn't have a dependency
on libpgcommon (or libpgport, for that matter). For comparison,
src/backend/Makefile has this:

OBJS = \
$(LOCALOBJS) \
$(SUBDIROBJS) \
$(top_builddir)/src/common/libpgcommon_srv.a \
$(top_builddir)/src/port/libpgport_srv.a

So I think that's a bug that needs to be fixed independently, whether
this patch goes in or not.

--Jacob


Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2021-08-12 Thread Tom Lane
Thomas Munro  writes:
> On Fri, Aug 13, 2021 at 3:13 AM Tom Lane  wrote:
>> I don't see why that approach couldn't be incorporated into pg_ctl,
>> or the postmaster itself.  Given Andres' point that Linux ASLR
>> disable probably has to happen in pg_ctl, it seems like doing it
>> in pg_ctl in all cases is the way to move forward.

> I think doing it in the postmaster is best, since otherwise you have
> to put code into pg_regress.c and pg_ctl.c.  Here's a patch like that.

Hmm, ok.  Small thought: it might be better to put the #if inside
the "else {  }".  That way it scales easily to allow other
platform-specific defaults if we find anything useful.  As-is,
the obvious extension would end up with multiple else-blocks,
which seems likely to confuse pgindent if nothing else.

regards, tom lane




Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2021-08-12 Thread Thomas Munro
On Fri, Aug 13, 2021 at 3:13 AM Tom Lane  wrote:
> Robert Haas  writes:
> > Ugh, OK. So, is there a way that we can get an "easy button" committed
> > to the tree?
>
> I don't see why that approach couldn't be incorporated into pg_ctl,
> or the postmaster itself.  Given Andres' point that Linux ASLR
> disable probably has to happen in pg_ctl, it seems like doing it
> in pg_ctl in all cases is the way to move forward.

I think doing it in the postmaster is best, since otherwise you have
to put code into pg_regress.c and pg_ctl.c.  Here's a patch like that.


0001-Make-EXEC_BACKEND-more-convenient-on-macOS.patch
Description: Binary data


Re: badly calculated width of emoji in psql

2021-08-12 Thread John Naylor
The patch looks pretty good to me. I just have a stylistic suggestion which
I've attached as a text file. There are also some outdated comments that
are not the responsibility of this patch, but I kind of want to fix them
now:

 *  - Hangul Jamo medial vowels and final consonants (U+1160-U+11FF)
 * have a column width of 0.

We got rid of this range in d8594d123c1, which is correct.

 *  - Other format characters (general category code Cf in the Unicode
 * database) and ZERO WIDTH SPACE (U+200B) have a column width of 0.

We don't treat Cf the same as Me or Mn, and I believe that's deliberate. We
also no longer have the exception for zero-width space.

It seems the consensus so far is that performance is not an issue, and I'm
inclined to agree.

I'm a bit concerned about the build dependencies not working right, but
it's not clear it's even due to the patch. I'll spend some time
investigating next week.

-- 
John Naylor
EDB: http://www.enterprisedb.com
diff --git a/src/common/wchar.c b/src/common/wchar.c
index 43f1078ae6..467cb8921a 100644
--- a/src/common/wchar.c
+++ b/src/common/wchar.c
@@ -623,12 +623,6 @@ mbbisearch(pg_wchar ucs, const struct mbinterval *table, int max)
  * category code Mn or Me in the Unicode database) have a
  * column width of 0.
  *
- *   - Other format characters (general category code Cf in the Unicode
- *     database) and ZERO WIDTH SPACE (U+200B) have a column width of 0.
- *
- *   - Hangul Jamo medial vowels and final consonants (U+1160-U+11FF)
- *     have a column width of 0.
- *
  *   - Spacing characters in the East Asian Wide (W) or East Asian
  *     FullWidth (F) category as defined in Unicode Technical
  *     Report #11 have a column width of 2.
@@ -659,13 +653,12 @@ ucs_wcwidth(pg_wchar ucs)
                   sizeof(combining) / sizeof(struct mbinterval) - 1))
        return 0;

-   /*
-    * if we arrive here, ucs is not a combining or C0/C1 control character
-    */
+   /* binary search in table of wide characters */
+   if (mbbisearch(ucs, east_asian_fw,
+                  sizeof(east_asian_fw) / sizeof(struct mbinterval) - 1))
+       return 2;

-   return 1 +
-       mbbisearch(ucs, east_asian_fw,
-                  sizeof(east_asian_fw) / sizeof(struct mbinterval) - 1);
+   return 1;
 }

 /*
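
(The patched width logic can be sketched in Python: a binary search
over sorted, non-overlapping codepoint intervals, with combining marks
at width 0, wide/fullwidth characters at width 2, and everything else
at width 1. The interval tables here are tiny illustrative stand-ins,
not the real Unicode-derived tables in src/common/wchar.c:)

```python
from bisect import bisect_right

# Toy interval tables (illustrative subsets only)
COMBINING = [(0x0300, 0x036F), (0x20D0, 0x20FF)]
EAST_ASIAN_FW = [(0x1100, 0x115F), (0x4E00, 0x9FFF), (0x1F300, 0x1FAFF)]

def in_table(ucs, table):
    """Binary search over sorted (first, last) intervals, mirroring
    mbbisearch()."""
    i = bisect_right(table, (ucs, 0x10FFFF)) - 1
    return i >= 0 and table[i][0] <= ucs <= table[i][1]

def ucs_wcwidth(ucs):
    if ucs == 0:
        return 0
    if ucs < 0x20 or (0x7F <= ucs < 0xA0):  # C0/C1 control characters
        return -1
    if in_table(ucs, COMBINING):            # combining marks take no column
        return 0
    if in_table(ucs, EAST_ASIAN_FW):        # wide/fullwidth, incl. most emoji
        return 2
    return 1

print(ucs_wcwidth(0x1F603))  # SMILING FACE WITH OPEN MOUTH -> 2
```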


Re: Default to TIMESTAMP WITH TIME ZONE?

2021-08-12 Thread Tom Lane
Bruce Momjian  writes:
> On Thu, Aug 12, 2021 at 07:24:58PM +0100, Simon Riggs wrote:
>> I heard the moan about "Why doesn't TIMESTAMP mean TIMESTAMP WITH TIME
>> ZONE" again today, so here is something concrete to address that.
>> 
>> AFAIK, SQL Standard requires the default to be WITHOUT TIME ZONE, but
>> nobody seems to think that is useful.

Standards compliance is useful, no?  If it isn't, there are about 1001
other things nobody much likes about SQL.

>> We even added a specially
>> optimized ALTER TABLE command to make switching from WITHOUT to WITH
>> TIME ZONE easy, so it is clearly an important thing to solve.
>> 
>> So add a parameter called
>> default_timestamp_with_timezone = off (default) | on

> I thought we found that changing behavior via GUC usually ends badly.

Yeah.  Changing from SQL-spec to not-SQL-spec behavior is going to be
one tough sell to begin with, even without the point that that's been
our behavior for over two decades.  But proposing to do it via a GUC
is just not-even-worth-discussing territory.  That would force every
wannabe-portable program to cope with both behaviors; which would
end up meaning that not only do you still have to take care to write
WITH TIME ZONE when you want that, but *also* you'd have to be sure
to write WITHOUT TIME ZONE when you want that.  In short, the worst
of both worlds.

regards, tom lane




Re: Default to TIMESTAMP WITH TIME ZONE?

2021-08-12 Thread Bruce Momjian
On Thu, Aug 12, 2021 at 07:24:58PM +0100, Simon Riggs wrote:
> I heard the moan about "Why doesn't TIMESTAMP mean TIMESTAMP WITH TIME
> ZONE" again today, so here is something concrete to address that.
> 
> AFAIK, SQL Standard requires the default to be WITHOUT TIME ZONE, but
> nobody seems to think that is useful. We even added a specially
> optimized ALTER TABLE command to make switching from WITHOUT to WITH
> TIME ZONE easy, so it is clearly an important thing to solve.
> 
> So add a parameter called
>default_timestamp_with_timezone = off (default) | on

I thought we found that changing behavior via GUC usually ends badly.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Default to TIMESTAMP WITH TIME ZONE?

2021-08-12 Thread Simon Riggs
I heard the moan about "Why doesn't TIMESTAMP mean TIMESTAMP WITH TIME
ZONE" again today, so here is something concrete to address that.

AFAIK, SQL Standard requires the default to be WITHOUT TIME ZONE, but
nobody seems to think that is useful. We even added a specially
optimized ALTER TABLE command to make switching from WITHOUT to WITH
TIME ZONE easy, so it is clearly an important thing to solve.

So add a parameter called
   default_timestamp_with_timezone = off (default) | on

Thoughts?

-- 
Simon Riggs    http://www.EnterpriseDB.com/


default_timestamp_with_timezone.v1.patch
Description: Binary data


Re: badly calculated width of emoji in psql

2021-08-12 Thread John Naylor
On Thu, Aug 12, 2021 at 12:46 PM Pavel Stehule 
wrote:

> did you run make clean?
>
> when I executed just patch & make, it didn't work

I did not, but I always have --enable-depend on. I tried again with make
clean, and ccache -C just in case, and it works now.

On Thu, Aug 12, 2021 at 12:54 PM Jacob Champion 
wrote:

> Was this a clean build? Perhaps I've introduced (or exacerbated) a
> dependency bug in the Makefile? The patch doing nothing is a surprising
> result given the code change.

Yeah, given that Pavel had the same issue, that's a possibility. I don't
recall that happening with other unicode patches I've tested.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: call popcount32/64 directly on non-x86 platforms

2021-08-12 Thread John Naylor
On Thu, Aug 12, 2021 at 1:26 AM David Rowley  wrote:
> Closer, but I don't see why there's any need to make the fast and slow
> functions external.  It should be perfectly fine to keep them static.
>
> I didn't test the performance, but the attached works for me.

Thanks for that! I still get a big improvement on Power8 / gcc 4.8, but
it's not quite as fast as earlier versions, which were around 200ms:

master: 646ms
v3: 312ms

This machine does seem to be pickier about code layout than any other I've
tried running benchmarks on, but that's still a bit much. In any case, your
version is clearer and has the intended effect, so I plan to commit that,
barring other comments.
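
(For context, the portable fallback that a direct popcount call
replaces can be sketched with the classic SWAR bit-twiddling trick.
This is an illustration of the general technique, not necessarily
PostgreSQL's exact slow path, which uses a lookup table or compiler
builtins depending on the platform:)

```python
def popcount64_slow(x):
    """Count set bits in a 64-bit value using SWAR: sum adjacent bit
    counts in progressively wider fields, then gather the per-byte
    counts into the top byte with a multiply."""
    x = x - ((x >> 1) & 0x5555555555555555)
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F
    return ((x * 0x0101010101010101) & 0xFFFFFFFFFFFFFFFF) >> 56

print(popcount64_slow(0b10110))  # -> 3
```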

I think I'll leave my v2-0002 aside for now, since it has wider
implications, and I have bigger things to work on.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: ECPG bug fix: DECLARE STATEMENT and DEALLOCATE, DESCRIBE

2021-08-12 Thread Michael Meskes
> No, you're right, although I think it's implied. Maybe we need a
> statement along these lines:
> 
> 
> Committers are responsible for the resolution of open items that
> relate
> to commits they have made. Action needs to be taken in a timely
> fashion,
> and if there is any substantial delay in dealing with an item the
> committer should provide a date by which they expect action to be
> completed. The RMT will follow up where these requirements are not
> being
> complied with.

I think that would be helpful, yes.

> If they are fine by you then I accept that. After all, the reason we
> want you to deal with this is not only that you made the original
> commit
> but because you're the expert in this area.

I will commit the patch(es). Thanks.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De
Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org





Re: badly calculated width of emoji in psql

2021-08-12 Thread Jacob Champion
On Thu, 2021-08-12 at 12:36 -0400, John Naylor wrote:
> I tried this patch on MacOS 11/iTerm2 and RHEL 7 (ssh'd from the Mac, in
> case that matters) and the example shown at the top of the thread shows no 
> difference:
> 
> john.naylor=# \pset border 2
> Border style is 2.
> john.naylor=# SELECT U&'\+01F603';
> +--+
> | ?column? |
> +--+
> | |
> +--+
> (1 row)
> 
> (In case it doesn't render locally, the right bar in the result cell is still 
> shifted to the right.)
> 
> What is the expected context to show a behavior change?

There shouldn't be anything special. (If your terminal was set up to
display emoji in single columns, that would cause alignment issues, but
in the opposite direction to the one you're seeing.)

> Does one need some specific terminal or setting?

In your case, an incorrect number of spaces are being printed, so it
shouldn't have anything to do with your terminal settings.

Was this a clean build? Perhaps I've introduced (or exacerbated) a
dependency bug in the Makefile? The patch doing nothing is a surprising
result given the code change.

--Jacob


Re: badly calculated width of emoji in psql

2021-08-12 Thread Pavel Stehule
On Thu, 12 Aug 2021 at 18:36, John Naylor  wrote:

> I tried this patch on MacOS 11/iTerm2 and RHEL 7 (ssh'd from the Mac,
> in case that matters) and the example shown at the top of the thread shows
> no difference:
>
> john.naylor=# \pset border 2
> Border style is 2.
> john.naylor=# SELECT U&'\+01F603';
> +--+
> | ?column? |
> +--+
> | |
> +--+
> (1 row)
>

did you run make clean?

when I executed just patch & make, it didn't work


> (In case it doesn't render locally, the right bar in the result cell is
> still shifted to the right.)
>
> What is the expected context to show a behavior change? Does one need some
> specific terminal or setting?
>

I assigned screenshots


>
> --
> John Naylor
> EDB: http://www.enterprisedb.com
>


Re: badly calculated width of emoji in psql

2021-08-12 Thread John Naylor
I tried this patch on MacOS 11/iTerm2 and RHEL 7 (ssh'd from the Mac, in
case that matters) and the example shown at the top of the thread shows no
difference:

john.naylor=# \pset border 2
Border style is 2.
john.naylor=# SELECT U&'\+01F603';
+--+
| ?column? |
+--+
| |
+--+
(1 row)

(In case it doesn't render locally, the right bar in the result cell is
still shifted to the right.)

What is the expected context to show a behavior change? Does one need some
specific terminal or setting?

-- 
John Naylor
EDB: http://www.enterprisedb.com


Re: Avoid stuck of pbgench due to skipped transactions

2021-08-12 Thread Yugo NAGATA
On Tue, 10 Aug 2021 10:50:20 -0400
Greg Sabino Mullane  wrote:

> Apologies, just saw this. I found no problems, those "failures" were just
> me missing checkboxes on the commitfest interface. +1 on the patch.

Thank you!


-- 
Yugo NAGATA 




Re: Next Steps with Hash Indexes

2021-08-12 Thread Peter Geoghegan
On Wed, Aug 11, 2021 at 8:51 AM John Naylor
 wrote:
> (Standard disclaimer that I'm not qualified to design index AMs) I've seen 
> one mention in the literature about the possibility of simply having a btree 
> index over the hash values. That would require faster search within pages, in 
> particular using abbreviated keys in the ItemId array of internal pages [1] 
> and interpolated search rather than pure binary search (which should work 
> reliably with high-entropy keys like hash values), but doing that would speed 
> up all btree indexes, so that much is worth doing regardless of how hash 
> indexes are implemented. In that scheme, the hash index AM would just be 
> around for backward compatibility.

I think that it's possible (though hard) to do that without involving
hashing, even for datatypes like text. Having some kind of prefix
compression that makes the final abbreviated keys have high entropy
would be essential, though. I agree that it would probably be
significantly easier when you knew you were dealing with hash values,
but even there you need some kind of prefix compression.

In any case I suspect that it would make sense to reimplement hash
indexes as a translation layer between hash index opclasses and
nbtree. Robert said "Likewise, the fact that keys are stored in hash
value order within pages but that the bucket as a whole is not kept in
order seems like it's bad for search performance". Obviously we've
already done a lot of work on an index AM that deals with a fully
ordered keyspace very well. This includes dealing with large groups of
duplicates gracefully, since in a certain sense there are no duplicate
B-Tree index tuples -- the heap TID tiebreaker ensures that. And it
ensures that you have heap-wise locality within these large groups,
which is a key enabler of things like opportunistic index deletion.

When hash indexes have been used in database systems, it tends to be
in-memory database systems where the recovery path doesn't recover
indexes -- they're just rebuilt from scratch instead. If that's
already a baked-in assumption then hash indexes make more sense. To me
it seems like the problem with true hash indexes is that they're
constructed in a top-down fashion, which is approximately the opposite
of the bottom-up, incremental approach used by B-Tree indexing. This
seems to be where all the skew problems arise from. This approach
cannot be robust to changes in the keyspace over time, really.

-- 
Peter Geoghegan




Re: Add option --drop-cascade for pg_dump/restore

2021-08-12 Thread Greg Sabino Mullane
On Wed, Aug 11, 2021 at 10:53 PM Wu Haotian  wrote:

> Maybe we can add checks like "option --clean requires plain text format"?
> If so, should I start a new mail thread for this?
>

Shrug. To me, that seems related enough it could go into the existing
patch/thread.

Cheers,
Greg


Re: make MaxBackends available in _PG_init

2021-08-12 Thread Greg Sabino Mullane
On Wed, Aug 11, 2021 at 10:08 AM Tom Lane  wrote:

> You must not have enabled EXEC_BACKEND properly.  It's a compile-time
> #define that affects multiple modules, so it's easy to get wrong.
> The way I usually turn it on is
>

Thank you. I was able to get it all working, and withdraw any objections to
that bit of the patch. :)

Cheers,
Greg


Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-12 Thread Andres Freund
Hi,

On 2021-08-12 05:48:19 -0700, Andres Freund wrote:
> I think SharedFileSetInit() needs a comment explaining that it needs to be
> called in a process-lifetime memory context if used without dsm
> segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
> already freed memory (both for filesetlist and the SharedFileSet itself).

Oh. And I think it's not ok that SharedFileSetDeleteAll() unconditionally does
SharedFileSetUnregister(). SharedFileSetUnregister() asserts out if there's no
match, but DSM based sets are never entered into filesetlist. So one cannot
have a non-DSM and DSM set coexisting. Which seems surprising.

Greetings,

Andres Freund




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-12 Thread Andres Freund
Hi,

On 2021-08-12 15:06:23 +0530, Amit Kapila wrote:
> On Thu, Aug 12, 2021 at 1:52 PM Andres Freund  wrote:
> > I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> > first place? ISTM that this should be dealt with using resowners, rather
> > than
> > a sharedfileset specific mechanism?
> >

> The underlying temporary files need to be closed at xact end but need
> to survive across transactions.

Why do they need to be closed at xact end? To avoid wasting memory due to too
many buffered files?


> These are registered with the resource owner via
> PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and then closed
> at xact end. So, we need a way to remove the files used by the process
> (apply worker in this particular case) before process exit and used
> this proc_exit hook (possibly on the lines of AtProcExit_Files).

What I'm wondering is why it is a good idea to have a SharedFileSet specific
cleanup mechanism. One that only operates on process lifetime level, rather
than something more granular. I get that the lifetime of the files here needs to be
longer than a transaction, but that can easily be addressed by having a longer
lived resource owner.

Process lifetime may work well for the current worker.c, but even there it
doesn't seem optimal. One e.g. could easily imagine that we'd want to handle
connection errors or configuration changes without restarting the worker, in
which case process lifetime obviously isn't a good idea anymore.


I think SharedFileSetInit() needs a comment explaining that it needs to be
called in a process-lifetime memory context if used without dsm
segments. Because otherwise SharedFileSetDeleteOnProcExit() will access
already freed memory (both for filesetlist and the SharedFileSet itself).

Greetings,

Andres Freund




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-12 Thread Amit Kapila
On Thu, Aug 12, 2021 at 1:52 PM Andres Freund  wrote:
>
> On 2021-08-12 11:46:09 +0530, Amit Kapila wrote:
> > On Thu, Aug 12, 2021 at 11:38 AM Dilip Kumar  wrote:
> > > On Thu, Aug 12, 2021 at 7:39 AM Masahiko Sawada  
> > > wrote:
> > > > It seems to me that moving the shared fileset cleanup to
> > > > before_shmem_exit() is the right approach to fix this problem. The
> > > > issue is fixed by the attached patch.
> > >
> > > +1, the fix makes sense to me.
>
> I'm not so sure. Why does sharedfileset have its own proc exit hook in the
> first place? ISTM that this should be dealt with using resowners, rather than
> a sharedfileset specific mechanism?
>

The underlying temporary files need to be closed at xact end but need
to survive across transactions. These are registered with the resource
owner via PathNameOpenTemporaryFile/PathNameCreateTemporaryFile and
then closed at xact end. So, we need a way to remove the files used by
the process (apply worker in this particular case) before process exit
and used this proc_exit hook (possibly on the lines of
AtProcExit_Files).

> That said, I think it's fine to go for the ordering change in the short term.
>
>
> > I have also tested and fix works for me. The fix works because
> > pgstat_initialize() is called before we register clean up in
> > SharedFileSetInit(). I am not sure if we need an Assert to ensure that
> > and if so how we can do that? Any suggestions?
>
> I don't think we need to assert that - we'd see failures soon enough if
> that rule were violated...
>

Fair enough.

-- 
With Regards,
Amit Kapila.




Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o

2021-08-12 Thread Andres Freund
Hi,

On 2021-08-12 11:46:09 +0530, Amit Kapila wrote:
> On Thu, Aug 12, 2021 at 11:38 AM Dilip Kumar  wrote:
> > On Thu, Aug 12, 2021 at 7:39 AM Masahiko Sawada  
> > wrote:
> > > It seems to me that moving the shared fileset cleanup to
> > > before_shmem_exit() is the right approach to fix this problem. The
> > > issue is fixed by the attached patch.
> >
> > +1, the fix makes sense to me.

I'm not so sure. Why does sharedfileset have its own proc exit hook in the
first place? ISTM that this should be dealt with using resowners, rather than
a sharedfileset specific mechanism?

That said, I think it's fine to go for the ordering change in the short term.


> I have also tested and fix works for me. The fix works because
> pgstat_initialize() is called before we register clean up in
> SharedFileSetInit(). I am not sure if we need an Assert to ensure that
> and if so how we can do that? Any suggestions?

I don't think we need to assert that - we'd see failures soon enough if
that rule were violated...

Greetings,

Andres Freund




Re: Expanding regexp_matches flags

2021-08-12 Thread Tom Lane
Jordan Gigov  writes:
> A recent thread gave me the idea that it would be convenient to have
> another flag for `regexp_matches` to make it return a singular
> two-dimensional array of matches when performing a global match.

> Why? Well, basically you avoid having to aggregate the rows afterwards
> by wrapping it in a subquery.

> Is there some interest in this?

I'm not really convinced that has any value.  The first question you
ought to be answering is whether the recently-pushed regexp function
additions don't already serve whatever use-case you had in mind.

If we do do it, I think it ought to be a different function.  "flag"
values that utterly change the meaning of the output sound like a
pretty bad idea.  Also, "returns setof text[]" is very different from
"returns text[]".  The primary reason we invented regexp_match() a few
years ago was to get away from the ugliness involved in trying to
pretend the former is the latter.

regards, tom lane




Expanding regexp_matches flags

2021-08-12 Thread Jordan Gigov
A recent thread gave me the idea that it would be convenient to have
another flag for `regexp_matches` to make it return a singular
two-dimensional array of matches when performing a global match.

Why? Well, basically you avoid having to aggregate the rows afterwards
by wrapping it in a subquery.

Is there some interest in this?

The idea is to add a new flag `a` that would imply `g` internally when
performing the match, but then return an array, instead of a set. Or
more accurately it will return a set that will always have exactly one
array. The array would be empty if no matches are found, or would
contain arrays of match results otherwise.

I have not looked into implementing this yet, but I may have time in
8-9 days or so.

For now, I'm just looking if there's support or opposition to the idea.




Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2021-08-12 Thread Tom Lane
Robert Haas  writes:
> On Wed, Aug 11, 2021 at 6:24 PM Thomas Munro  wrote:
>> ... I think just doing
>> something like (/me rolls dice) export PG_SHMEM_ADDR=0x800 is
>> a good candidate for something that works on both architectures, being
>> many TB away from everything else (above everything on ARM, between
>> heap etc and libs on Intel but with 8TB of space below it and 120TB
>> above).  That gets the tests passing consistently with unpatched
>> master, -DEXEC_BACKEND, on both flavours of silicon.

> Ugh, OK. So, is there a way that we can get an "easy button" committed
> to the tree?

I don't see why that approach couldn't be incorporated into pg_ctl,
or the postmaster itself.  Given Andres' point that Linux ASLR
disable probably has to happen in pg_ctl, it seems like doing it
in pg_ctl in all cases is the way to move forward.

regards, tom lane




Re: Next Steps with Hash Indexes

2021-08-12 Thread Robert Haas
On Thu, Aug 12, 2021 at 12:22 AM Amit Kapila  wrote:
> The design of the patch has changed since the initial proposal. It
> tries to perform unique inserts by holding a write lock on the bucket
> page to avoid duplicate inserts.

Do you mean that you're holding a buffer lwlock while you search the
whole bucket? If so, that's surely not OK.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2021-08-12 Thread Robert Haas
On Wed, Aug 11, 2021 at 6:24 PM Thomas Munro  wrote:
> On Thu, Aug 12, 2021 at 1:45 AM Robert Haas  wrote:
> > I think that worked for me on older macOS releases, and then it
> > stopped working at some point after some update or reinstall or
> > something. Unfortunately I don't know any more precisely than that,
> > but it does seem like we have to find some other approach to work on
> > modern systems.
>
> I gave up on trying to make that work once I realised that
> /usr/lib/dyld doesn't seem to obey the flag, so although other
> segments become deterministic and the success rate is fairly high,
> there's still a 600kb wrecking ball swinging around.  I wondered what
> the "slide" range could be... it appears to be fairly small
> (vm_map_get_max_aslr_slide_pages() seems to be the place that's
> determined and it's a 16MB or 256MB window, depending on architecture,
> if I read that right).  Given that, the death of 32 bit processes
> since Catalina, and the typical layout we see, I think just doing
> something like (/me rolls dice) export PG_SHMEM_ADDR=0x800 is
> a good candidate for something that works on both architectures, being
> many TB away from everything else (above everything on ARM, between
> heap etc and libs on Intel but with 8TB of space below it and 120TB
> above).  That gets the tests passing consistently with unpatched
> master, -DEXEC_BACKEND, on both flavours of silicon.

Ugh, OK. So, is there a way that we can get an "easy button" committed
to the tree?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-08-12 Thread Robert Haas
On Thu, Aug 12, 2021 at 7:40 AM Nitin Jadhav
 wrote:
> > I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but
> > expressing the default in postgresl.conf.sample as 10s rather than
> > 1ms. I tried values measured in milliseconds just for testing
> > purposes and didn't initially understand why it wasn't working. I
> > don't think there's any reason it can't work.
>
> As suggested, I have changed it to GUC_UNIT_MS and kept the default
> value to 10s. I would like to know the reason why it can't be
> GUC_UNIT_S as we are expressing the values in terms of seconds.

I mean, it *could* be. There's just no advantage. Values in seconds
will work correctly either way. But values in milliseconds will only
work if it's GUC_UNIT_MS. It seems to me that it's better to make more
things work rather than fewer.

> > There's no precedent in the tree for the use of ##__VA_ARGS__. On my
> > system it seems to work fine if I just leave out the ##. Any reason
> > not to do that?
>
> I had added this to support if no variable argument are passed to the
> macro. For example, in the previous patches we used to log the
> progress at the end of the operation like
> "ereport_startup_progress(true, "data directory sync (syncfs) complete
> after %ld.%02d s");". Since these calls are removed now, ## is not
> required. Hence removed in the attached patch.

Hmm, OK. That's actually a pretty good reason for using ## there. It
just made me nervous because we have no similar uses of ## in the
backend code. We rely on it elsewhere for concatenation, but not for
comma removal. Let's try leaving it out for now unless somebody else
shows up with a different opinion.

> I had not added extern since those function were not used in the other
> files. Now added to match the project style.

Anything that's not used in other files should be declared static in
the file itself, and not present in the header. Once you fix this for
reset_startup_progress_timeout, the header won't need to include
datatype/timestamp.h any more, which is good, because we don't want
header files to depend on more other header files than necessary.

Looking over this version, I realized something I (or you) should have
noticed sooner: you've added the RegisterTimeout call to
InitPostgres(), but that's for things that are used by all backends,
and this is only used by the startup process. So it seems to me that
the right place is StartupProcessMain. That would have the further
advantage of allowing startup_progress_timeout_handler to be made
static. begin_startup_progress_phase() and
startup_progress_timeout_has_expired() are the actual API functions
though so they will need to remain extern.

@@ -679,7 +680,6 @@ static char *recovery_target_lsn_string;
 /* should be static, but commands/variable.c needs to get at this */
 char  *role_string;

-
 /*
  * Displayable names for context types (enum GucContext)
  *

This hunk should be removed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: call popcount32/64 directly on non-x86 platforms

2021-08-12 Thread Alvaro Herrera
On 2021-Aug-13, David Rowley wrote:

> Maybe you saw that there's no such equivalent test when we set
> HAVE_X86_64_POPCNTQ for MSVC on x86_64.  The reason for that is that
> we do the run-time test using cpuid.

Yeah, that and also I mistook the two independent "ifdef" blocks for one
block with an "#else".  Re-reading the ifdef maze, it looks reasonable.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/




Re: Shared memory size computation oversight?

2021-08-12 Thread Julien Rouhaud
On Thu, Aug 12, 2021 at 9:34 PM Peter Eisentraut
 wrote:
>
> On 27.02.21 09:08, Julien Rouhaud wrote:
> > PFA a patch that fixes pg_prewarm and pg_stat_statements explicit alignment 
> > to
> > CACHELINEALIGN, and also update the alignment in hash_estimate_size() to 
> > what I
> > think ShmemInitHash will actually consume.
>
> For extensions, wouldn't it make things better if
> RequestAddinShmemSpace() applied CACHELINEALIGN() to its argument?
>
> If you consider the typical sequence of RequestAddinShmemSpace(mysize())
> and later ShmemInitStruct(..., mysize(), ...),

But changing RequestAddinShmemSpace() to apply CACHELINEALIGN() would
really work only for that specific usage?  If an extension does
multiple allocations you can't rely on correct results.

> then the size gets
> rounded up to cache-line size in the second case and not the first.  The
> premise in your patch is that the extension should figure that out
> before calling RequestAddinShmemSpace(), but that seems wrong.
>
> Btw., I think your patch was wrong to apply CACHELINEALIGN() to
> intermediate results rather than at the end.

I'm not sure why you think it's wrong.  It's ShmemInitStruct() that
will apply the padding, so if the extension calls it multiple times
(whether directly or indirectly), then the padding will be added
multiple times.  Which means that in theory the extension should
account for it multiple times in the amount of memory it's requesting.

But given the later discussion, ISTM that there's an agreement that
the CACHELINEALIGN() overhead for any number of allocations
can be ignored, as it should be safely absorbed by the 100kB extra
space.  I think that the real problem, as mentioned in my last email,
is that the shared htab size estimation doesn't really scale and can
easily eat the whole extra space if you use some not that unrealistic
parameters.  I still don't know if I made any mistake trying to
properly account for the htab allocation, but if I didn't, it can be
problematic.




Re: call popcount32/64 directly on non-x86 platforms

2021-08-12 Thread John Naylor
On Thu, Aug 12, 2021 at 9:33 AM David Rowley  wrote:

> Something there that might cause confusion is we do a configure check
> to see if popcntq works and define HAVE_X86_64_POPCNTQ if it does.
> I'm still a bit confused at why we bother doing that. Surely it just
> means that if the build machine does not have popcntq that we'll
> always use pg_popcount64_slow, regardless if the machine that's
> actually running the code has popcntq.

Yeah, it's a bit strange, a configure check makes more sense if we have a
way to specify we can build with a direct call (like x86-64-v2), but we
don't right now. Maybe short-term we could always do the runtime check on
x86-64.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Shared memory size computation oversight?

2021-08-12 Thread Peter Eisentraut

On 27.02.21 09:08, Julien Rouhaud wrote:

> PFA a patch that fixes pg_prewarm and pg_stat_statements explicit alignment to
> CACHELINEALIGN, and also update the alignment in hash_estimate_size() to what I
> think ShmemInitHash will actually consume.


For extensions, wouldn't it make things better if 
RequestAddinShmemSpace() applied CACHELINEALIGN() to its argument?


If you consider the typical sequence of RequestAddinShmemSpace(mysize()) 
and later ShmemInitStruct(..., mysize(), ...), then the size gets 
rounded up to cache-line size in the second case and not the first.  The 
premise in your patch is that the extension should figure that out 
before calling RequestAddinShmemSpace(), but that seems wrong.


Btw., I think your patch was wrong to apply CACHELINEALIGN() to 
intermediate results rather than at the end.





Re: call popcount32/64 directly on non-x86 platforms

2021-08-12 Thread David Rowley
On Fri, 13 Aug 2021 at 01:28, David Rowley  wrote:
>
> On Fri, 13 Aug 2021 at 01:11, Alvaro Herrera  wrote:
> > So when on MSVC, you don't have to check CPUID for support?
>
> That still needs to be checked in MSVC and as far as I can see it is
> being properly checked.

Something there that might cause confusion is we do a configure check
to see if popcntq works and define HAVE_X86_64_POPCNTQ if it does.
I'm still a bit confused at why we bother doing that. Surely it just
means that if the build machine does not have popcntq that we'll
always use pg_popcount64_slow, regardless if the machine that's
actually running the code has popcntq.

Maybe you saw that there's no such equivalent test when we set
HAVE_X86_64_POPCNTQ for MSVC on x86_64.  The reason for that is that
we do the run-time test using cpuid.

David




Re: [PATCH] OpenSSL: Mark underlying BIO with the appropriate type flags

2021-08-12 Thread Daniel Gustafsson
> On 12 Aug 2021, at 12:00, Itamar Gafni  wrote:

> Copying it means there are two BIO_METHOD objects with the same type (original
> socket and its copy) but it's unclear if that's an issue.


Reading code, there doesn't seem to be a consensus really; Apple is doing it in
the Swift TLS code but others aren't.  Considering there have been no complaints
on this and OpenSSL < 1.1.0 is EOL, I say we leave it for now.

Barring objections I will go ahead with your proposed patch on HEAD and
backpatch it all the way, once I've done more testing on it.

--
Daniel Gustafsson   https://vmware.com/





Re: call popcount32/64 directly on non-x86 platforms

2021-08-12 Thread David Rowley
On Fri, 13 Aug 2021 at 01:11, Alvaro Herrera  wrote:
> So when on MSVC, you don't have to check CPUID for support?

That still needs to be checked in MSVC and as far as I can see it is
being properly checked.

David




Re: call popcount32/64 directly on non-x86 platforms

2021-08-12 Thread Alvaro Herrera
So when on MSVC, you don't have to check CPUID for support?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La primera ley de las demostraciones en vivo es: no trate de usar el sistema.
Escriba un guión que no toque nada para no causar daños." (Jakob Nielsen)




Re: Grammar fix in hash index README

2021-08-12 Thread John Naylor
On Thu, Aug 12, 2021 at 4:40 AM Dilip Kumar  wrote:
>
> While reading through the README of the hash index I noticed a grammar
> error, the attached patch fixes the same.

LGTM, pushed.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Added schema level support for publication.

2021-08-12 Thread Masahiko Sawada
On Fri, Aug 6, 2021 at 5:33 PM vignesh C  wrote:
>
> Thanks for the comments, the attached v19 patch has the fixes for the 
> comments.

Thank you for updating the patch!

Here are some comments on v19 patch:

+case OCLASS_PUBLICATION_SCHEMA:
+RemovePublicationSchemaById(object->objectId);
+break;

+void
+RemovePublicationSchemaById(Oid psoid)
+{
+Relation   rel;
+HeapTuple  tup;
+
+rel = table_open(PublicationSchemaRelationId, RowExclusiveLock);
+
+tup = SearchSysCache1(PUBLICATIONSCHEMA, ObjectIdGetDatum(psoid));
+
+if (!HeapTupleIsValid(tup))
+elog(ERROR, "cache lookup failed for publication schema %u",
+ psoid);
+
+CatalogTupleDelete(rel, &tup->t_self);
+
+ReleaseSysCache(tup);
+
+table_close(rel, RowExclusiveLock);
+}

Since RemovePublicationSchemaById() does simple catalog tuple
deletion, it seems to me that we can DropObjectById() to delete the
row of pg_publication_schema.

 ---
 {
-ScanKeyInit(&key[0],
+ScanKeyData skey[1];
+
+ScanKeyInit(&skey[0],
 Anum_pg_class_relkind,
 BTEqualStrategyNumber, F_CHAREQ,

CharGetDatum(RELKIND_PARTITIONED_TABLE));

-scan = table_beginscan_catalog(classRel, 1, key);
+scan = table_beginscan_catalog(classRel, 1, skey);

Since we specify 1 as the number of keys in table_beginscan_catalog(),
can we reuse 'key' instead of using 'skey'?

---
Even if we drop all tables added to the publication from it, 'pubkind'
doesn't go back to 'empty'. Is that intentional behavior? If we do
that, we can save the lookup of pg_publication_rel and
pg_publication_schema in some cases, and we can switch the publication
that was created as FOR SCHEMA to FOR TABLE and vice versa.

---
+static void
+UpdatePublicationKindTupleValue(Relation rel, HeapTuple tup, int col,
+char pubkind)

Since all callers of this function specify Anum_pg_publication_pubkind
to 'col', it seems 'col' is not necessary.

---
+static void
+AlterPublicationSchemas(AlterPublicationStmt *stmt, Relation rel,
+HeapTuple tup,
Form_pg_publication pubform)

I think we don't need to pass 'pubform' to this function since we can
get it by GETSTRUCT(tup).

---
+Oid        schemaId = get_rel_namespace(relid);
 List  *pubids = GetRelationPublications(relid);
+List  *schemaPubids = GetSchemaPublications(schemaId);

Can we defer getting the list of schema publications (i.e.,
'schemaPubids') until we find the PUBKIND_SCHEMA publication? Perhaps
the same is true for building 'pubids'.

---
+   List of publications
+        Name        |          Owner           | All tables | Inserts | Updates | Deletes | Truncates | Via root | PubKind
+--------------------+--------------------------+------------+---------+---------+---------+-----------+----------+---------
+ testpib_ins_trunct | regress_publication_user | f          | t       | f       | f       | f         | f        | e
+ testpub_default    | regress_publication_user | f          | f       | t       | f       | f         | f        | e

I think it's more readable if \dRp shows 'all tables', 'table',
'schema', and 'empty' in PubKind instead of the single character.

I think 'Pub kind' is more consistent with other column names.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Skipping logical replication transactions on subscriber side

2021-08-12 Thread Greg Nancarrow
On Thu, Aug 12, 2021 at 9:18 PM Amit Kapila  wrote:
>
> > A minor comment on the 0001 patch: In the message I think that using
> > "ID" would look better than lowercase "id" and AFAICS it's more
> > consistent with existing messages.
> >
> > + appendStringInfo(, _(" in transaction id %u with commit timestamp 
> > %s"),
> >
>
> You have a point but I think in this case it might look a bit odd as
> we have another field 'commit timestamp' after that which is
> lowercase.
>

I did a quick search and I couldn't find any other messages in the
Postgres code that use "transaction id", but I could find some that
use "transaction ID" and "transaction identifier".


Regards,
Greg Nancarrow
Fujitsu Australia




RE: [PATCH] OpenSSL: Mark underlying BIO with the appropriate type flags

2021-08-12 Thread Itamar Gafni
Not sure what is the expected use with previous versions. It used to be that 
the BIO_s_socket() would return a non-const pointer, so the original socket 
methods object could be edited.
Copying it means there are two BIO_METHOD objects with the same type (original
socket and its copy) but it's unclear if that's an issue.

Itamar Gafni
Imperva

-Original Message-
From: Daniel Gustafsson 
Sent: Monday, 9 August 2021 18:36
To: Itamar Gafni 
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: [PATCH] OpenSSL: Mark underlying BIO with the appropriate type 
flags



> On 6 Aug 2021, at 12:16, Itamar Gafni  wrote:

> Previous to OpenSSL version 1.1.0, the BIO methods object would be copied 
> directly from the existing socket type and then its read\write functions 
> would be replaced.
> With 1.1.0 and up, the object is created from scratch and then all its 
> methods are initialized to be the ones of the socket type, except read/write 
> which are custom.
> In this newer way, a new type is given to it by calling “BIO_get_new_index”, 
> but the related type flags aren’t added.

According to the documentation (I haven't tested it yet but will) I think you 
are right that the type should be set with the appropriate BIO_TYPE_ flags.

For OpenSSL 1.0.1 and 1.0.2, wouldn't we need to set the .type with a randomly 
chosen index anded with BIO_TYPE_DESCRIPTOR and BIO_TYPE_SOURCE_SINK as well?

--
Daniel Gustafsson   https://vmware.com/



Re: .ready and .done files considered harmful

2021-08-12 Thread Dipesh Pandit
Hi,

Thanks for the feedback.

The possible paths that the archiver can take for each cycle are either a
fast path or a fall-back path. The fast path involves checking the
availability of the next anticipated log segment and deciding the next
target for archival; the fall-back path involves a full directory scan to
get the next log segment.
We need a mechanism that enables the archiver to select the desired path
for each cycle.

This can be achieved by maintaining a shared memory flag. If this flag is
set then the archiver should take the fall-back path, otherwise it should
continue with the fast path.

This flag can be set by a backend in case an action like a timeline switch,
.ready files created out of order, ... requires the archiver to perform a
full directory scan.

I have incorporated these changes and updated a new patch. PFA patch v6.

Thanks,
Dipesh
From c636dffab006c15cf3a8656982679fbe6a6c6440 Mon Sep 17 00:00:00 2001
From: Dipesh Pandit 
Date: Wed, 30 Jun 2021 14:05:58 +0530
Subject: [PATCH] mitigate directory scan for WAL archiver

WAL archiver scans the status directory to identify the next WAL file
that needs to be archived. This directory scan can be minimised by
maintaining the log segment number of current file which is being
archived and incrementing it by '1' to get the next WAL file.
Archiver can check the availability of next file and in case if the
file is not available then it should fall-back to directory scan to
get the oldest WAL file.

If there is a timeline switch then archiver performs a full directory
scan to make sure that archiving history file takes precedence over
archiving WAL files on older timeline.
---
 src/backend/access/transam/xlog.c|  15 +++
 src/backend/access/transam/xlogarchive.c |   9 ++
 src/backend/postmaster/pgarch.c  | 163 ---
 src/include/postmaster/pgarch.h  |   1 +
 4 files changed, 175 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f84c0bb..088ab43 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -50,6 +50,7 @@
 #include "port/atomics.h"
 #include "port/pg_iovec.h"
 #include "postmaster/bgwriter.h"
+#include "postmaster/pgarch.h"
 #include "postmaster/startup.h"
 #include "postmaster/walwriter.h"
 #include "replication/basebackup.h"
@@ -7555,6 +7556,13 @@ StartupXLOG(void)
 	 */
 	if (AllowCascadeReplication())
 		WalSndWakeup();
+
+	/*
+	 * Switched to a new timeline, notify archiver to enable
+	 * directory scan.
+	 */
+	if (XLogArchivingActive())
+		PgArchEnableDirScan();
 }
 
 /* Exit loop if we reached inclusive recovery target */
@@ -7797,6 +7805,13 @@ StartupXLOG(void)
 			 EndRecPtr, reason);
 
 		/*
+		 * Switched to a new timeline, notify archiver to enable directory
+		 * scan.
+		 */
+		if (XLogArchivingActive())
+			PgArchEnableDirScan();
+
+		/*
 		 * Since there might be a partial WAL segment named RECOVERYXLOG, get
 		 * rid of it.
 		 */
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 26b023e..bccbbb6 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -609,6 +609,15 @@ XLogArchiveCheckDone(const char *xlog)
 
 	/* Retry creation of the .ready file */
 	XLogArchiveNotify(xlog);
+
+	/*
+	 * This .ready file is created out of order, notify archiver to perform
+	 * a full directory scan to archive corresponding WAL file.
+	 */
+	StatusFilePath(archiveStatusPath, xlog, ".ready");
+	if (stat(archiveStatusPath, &stat_buf) == 0)
+		PgArchEnableDirScan();
+
 	return false;
 }
 
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 74a7d7c..e5ea7a6 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -76,8 +76,23 @@
 typedef struct PgArchData
 {
 	int			pgprocno;		/* pgprocno of archiver process */
+
+	/*
+	 * Flag to enable/disable directory scan. If this flag is set then it
+	 * forces archiver to perform a full directory scan to get the next log
+	 * segment.
+	 */
+	pg_atomic_flag dirScan;
 } PgArchData;
 
+/*
+ * Segment number and timeline ID to identify the next file in a WAL sequence
+ */
+typedef struct readyXLogState
+{
+	XLogSegNo	lastSegNo;
+	TimeLineID	lastTLI;
+} readyXLogState;
 
 /* --
  * Local data
@@ -97,12 +112,13 @@ static volatile sig_atomic_t ready_to_stop = false;
  */
 static void pgarch_waken_stop(SIGNAL_ARGS);
 static void pgarch_MainLoop(void);
-static void pgarch_ArchiverCopyLoop(void);
+static void pgarch_ArchiverCopyLoop(readyXLogState *xlogState);
 static bool pgarch_archiveXlog(char *xlog);
-static bool pgarch_readyXlog(char *xlog);
+static bool pgarch_readyXlog(char *xlog, readyXLogState *xlogState);
 static void pgarch_archiveDone(char *xlog);
 static void pgarch_die(int code, Datum arg);
 static void HandlePgArchInterrupts(void);
+static void 

Re: when the startup process doesn't (logging startup delays)

2021-08-12 Thread Nitin Jadhav
> startup_progress_timer_expired should be declared as sig_atomic_t like
> we do in other cases (see interrupt.c).

Fixed.

> The default value of the new GUC is 10s in postgresql.conf.sample, but
> -1 in guc.c. They should both be 10s, and the one in
> postgresql.conf.sample should be commented out.

Fixed.

> I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but
> expressing the default in postgresl.conf.sample as 10s rather than
> 1ms. I tried values measured in milliseconds just for testing
> purposes and didn't initially understand why it wasn't working. I
> don't think there's any reason it can't work.

As suggested, I have changed it to GUC_UNIT_MS and kept the default
value to 10s. I would like to know the reason why it can't be
GUC_UNIT_S as we are expressing the values in terms of seconds.

> I would prefer to see log_startup_progress_interval declared and
> defined in startup.c/startup.h rather than guc.c/guc.h.

Fixed.

> There's no precedent in the tree for the use of ##__VA_ARGS__. On my
> system it seems to work fine if I just leave out the ##. Any reason
> not to do that?

I had added this to support calls where no variable arguments are passed
to the macro. For example, in the previous patches we used to log the
progress at the end of the operation like
"ereport_startup_progress(true, "data directory sync (syncfs) complete
after %ld.%02d s");". Since those calls are removed now, the ## is not
required. Hence I removed it in the attached patch.
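
For readers unfamiliar with the ## extension, a tiny standalone sketch shows
the case being discussed. report_progress() here is a hypothetical macro
invented for illustration, not the patch's ereport_startup_progress().

```c
#include <stdio.h>

/*
 * With standard C's __VA_ARGS__ the variadic part must be non-empty;
 * the GNU ##__VA_ARGS__ extension is only needed to additionally permit
 * calls that pass a format string and nothing else, e.g.
 * report_progress(buf, sizeof(buf), "done").
 */
#define report_progress(buf, size, fmt, ...) \
	snprintf((buf), (size), (fmt), __VA_ARGS__)
```

A call with at least one variadic argument, such as
report_progress(buf, sizeof(buf), "elapsed time: %ld.%02d s", secs, hundredths),
expands fine without the ##, which is why it could be dropped once the
zero-argument call sites were removed.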

> Two of the declarations in startup.h forgot the leading "extern",
> while the other two that are right next to them have it, matching
> project style.

I had not added extern since those functions were not used in other
files. Now added to match the project style.

> I'm reasonably happy with most of the identifier names now, but I
> think init_startup_progress() is confusing. The reason I think that is
> that we call it more than once, which is not really what people think
> about when they think of an "init" function, I think. It's not
> initializing the startup progress facility in general; it's preparing
> for the next phase of startup progress reporting. How about renaming
> it to begin_startup_progress_phase()? And then
> startup_process_op_start_time could be
> startup_process_phase_start_time to match.

Yes begin_startup_progress_phase() looks more appropriate. Instead of
startup_process_phase_start_time, startup_progress_phase_start_time
looks more appropriate. Changed these in the attached patch.

> SyncDataDirectory() potentially walks over the data directory three
> times: once to call do_syncfs(), once to call pre_sync_fname(), and
> once to call datadir_fsync_fname(). With this patch, the first and
> third will emit progress messages if the operation runs long, but the
> second will not. I think they should all be treated the same. Also,
> the locations where you've inserted calls to init_startup_progress()
> don't really make it clear with what code that's associated. I'd put
> them *immediately* before the call to do_syncfs() or walkdir().

Fixed.

> Remember that PostgreSQL comments are typically written "telegraph
> style," so function comments should say "Does whatever" not "It does
> whatever". Most of them are correct, but there's one sentence you need
> to fix.

Fixed in the function comment of
startup_progress_timeout_has_expired(). Please let me know if this is
not the one you wanted me to correct.

> You mentioned previously that you would add documentation, but I do
> not see it here.

Sorry. I missed this. I have added the documentation in the attached patch.

On Tue, Aug 10, 2021 at 8:56 PM Robert Haas  wrote:
>
> On Tue, Aug 10, 2021 at 9:28 AM Nitin Jadhav
>  wrote:
> > Please find the updated patch attached.
>
> I think this is getting close. The output looks nice. However, I still
> see a bunch of issues.
>
> You mentioned previously that you would add documentation, but I do
> not see it here.
>
> startup_progress_timer_expired should be declared as sig_atomic_t like
> we do in other cases (see interrupt.c).
>
> The default value of the new GUC is 10s in postgresql.conf.sample, but
> -1 in guc.c. They should both be 10s, and the one in
> postgresql.conf.sample should be commented out.
>
> I suggest making the GUC GUC_UNIT_MS rather than GUC_UNIT_S, but
> expressing the default in postgresl.conf.sample as 10s rather than
> 1ms. I tried values measured in milliseconds just for testing
> purposes and didn't initially understand why it wasn't working. I
> don't think there's any reason it can't work.
>
> I would prefer to see log_startup_progress_interval declared and
> defined in startup.c/startup.h rather than guc.c/guc.h.
>
> There's no precedent in the tree for the use of ##__VA_ARGS__. On my
> system it seems to work fine if I just leave out the ##. Any reason
> not to do that?
>
> Two of the declarations in startup.h forgot the leading "extern",
> while the other two that are right next to them have it, 

Re: [bug] Logical Decoding of relation rewrite with toast does not reset toast_hash

2021-08-12 Thread Amit Kapila
On Thu, Aug 12, 2021 at 4:30 PM Amit Kapila  wrote:
>
> On Tue, Aug 10, 2021 at 5:30 PM Drouvot, Bertrand  wrote:
> >
> >
> > Please find attached the new version that:
> >
> > - sets "relrewrite" for the toast.
> >
>
> --- a/src/backend/commands/tablecmds.c
> +++ b/src/backend/commands/tablecmds.c
> @@ -3861,6 +3861,10 @@ RenameRelationInternal(Oid myrelid, const char
> *newrelname, bool is_internal, bo
>   */
>   namestrcpy(&(relform->relname), newrelname);
>
> + /* reset relrewrite for toast */
> + if (relform->relkind == RELKIND_TOASTVALUE)
> + relform->relrewrite = InvalidOid;
> +
>
> I find this change quite ad hoc. I think this API is too generic to
> embed such a specific change.
> (reset_toast_rewrite) in this API and then make this change, (b) in
> the specific place where we need this, change relrewrite separately
> via a new API.
>
> I would prefer (b) in the ideal case, but I understand it would be an
> additional cost, so maybe (a) is also okay. What do you people think?
>

One minor comment:
+/*
+ * Test decoding relation rewrite with toast.
+ * The insert into tbl2 within the same transaction
+ * is there to check there is no remaining toast_hash
+ * not being reset.
+ */

You can extend each line of comment up to 80 chars. The current one
looks a bit odd.


-- 
With Regards,
Amit Kapila.




Re: Skipping logical replication transactions on subscriber side

2021-08-12 Thread Amit Kapila
On Thu, Aug 12, 2021 at 1:21 PM Greg Nancarrow  wrote:
>
> On Thu, Aug 12, 2021 at 3:54 PM Masahiko Sawada  wrote:
> >
> > I've attached the updated patches. FYI I've included the patch
> > (v8-0005) that fixes the assertion failure during shared fileset
> > cleanup to make cfbot tests happy.
> >
>
> A minor comment on the 0001 patch: In the message I think that using
> "ID" would look better than lowercase "id" and AFAICS it's more
> consistent with existing messages.
>
> + appendStringInfo(, _(" in transaction id %u with commit timestamp %s"),
>

You have a point, but I think in this case it might look a bit odd as
we have another field 'commit timestamp' after that which is
lowercase.


-- 
With Regards,
Amit Kapila.




Re: [bug] Logical Decoding of relation rewrite with toast does not reset toast_hash

2021-08-12 Thread Amit Kapila
On Tue, Aug 10, 2021 at 5:30 PM Drouvot, Bertrand  wrote:
>
> Hi Amit,
>
> On 8/9/21 1:12 PM, Amit Kapila wrote:
> > On Mon, Aug 9, 2021 at 3:37 PM Drouvot, Bertrand  
> > wrote:
> >> Hi Amit,
> >>
> >> On 8/9/21 10:37 AM, Amit Kapila wrote:
> >>> On Fri, Jul 9, 2021 at 12:22 PM Drouvot, Bertrand  
> >>> wrote:
>  Please find enclosed a patch proposal to:
> 
>  * Avoid the failed assertion on current master and generate the error 
>  message instead (should the code reach that stage).
>  * Reset the toast_hash in case of relation rewrite with toast (so that 
>  the logical decoding in the above repro is working).
> 
> >>> I think instead of resetting toast_hash for this case why don't we set
> >>> 'relrewrite' for toast tables as well during rewrite? If we do that
> >>> then we will simply skip assembling toast chunks for the toast table.
> >> Thanks for looking at it!
> >>
> >> I do agree, that would be even better than the current patch approach:
> >> I'll work on it.
> >>
> >>> In make_new_heap(), we are calling NewHeapCreateToastTable() to create
> >>> toast table where we can pass additional information (probably
> >>> 'toastid'), if required to set 'relrewrite'. Additionally, let's add a
> >>> test case if possible for this.
> >> + 1 for the test case, it will be added in the next version of the patch.
> >>
> > Thanks, please see, if you can prepare patches for the back-branches as 
> > well.
>
> Please find attached the new version that:
>
> - sets "relrewrite" for the toast.
>

--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3861,6 +3861,10 @@ RenameRelationInternal(Oid myrelid, const char
*newrelname, bool is_internal, bo
  */
  namestrcpy(&(relform->relname), newrelname);

+ /* reset relrewrite for toast */
+ if (relform->relkind == RELKIND_TOASTVALUE)
+ relform->relrewrite = InvalidOid;
+

I find this change quite ad hoc. I think this API is too generic to
embed such a specific change. I see two ways for this (a) pass a new bool flag
(reset_toast_rewrite) in this API and then make this change, (b) in
the specific place where we need this, change relrewrite separately
via a new API.

I would prefer (b) in the ideal case, but I understand it would be an
additional cost, so maybe (a) is also okay. What do you people think?
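
The two options can be contrasted in a toy sketch. The types and names
below are simplified stand-ins invented for illustration, not the real
RenameRelationInternal() signature or catalog structures.

```c
#include <stdbool.h>

/* Toy model of the relevant catalog fields */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)
#define RELKIND_TOASTVALUE 't'

typedef struct FormData_pg_class
{
	char		relkind;
	Oid			relrewrite;
} FormData_pg_class;

/* Option (a): the generic rename API grows a flag */
void
rename_relation_internal(FormData_pg_class *relform,
						 bool reset_toast_rewrite)
{
	/* ... existing rename work would happen here ... */
	if (reset_toast_rewrite && relform->relkind == RELKIND_TOASTVALUE)
		relform->relrewrite = InvalidOid;
}

/* Option (b): a separate, targeted API called only where needed */
void
reset_rel_rewrite(FormData_pg_class *relform)
{
	relform->relrewrite = InvalidOid;
}
```

Option (a) keeps a single entry point at the cost of a flag that most
callers would pass as false; option (b) leaves the rename API untouched
but, as noted above, incurs an additional cost at the call site.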

-- 
With Regards,
Amit Kapila.




Re: [bug] Logical Decoding of relation rewrite with toast does not reset toast_hash

2021-08-12 Thread Amit Kapila
On Thu, Aug 12, 2021 at 12:15 PM Drouvot, Bertrand  wrote:
>
> On 8/10/21 1:59 PM, Drouvot, Bertrand wrote:
> > Hi Amit,
> >
>
> The first version of the patch contained a change in
> ReorderBufferToastReplace() (to put the call to
> RelationIsValid(toast_rel) and display the error message when it is not
> valid before the call to ReorderBufferChangeMemoryUpdate()).
>
> That way we also avoid the failed assertion described in the first
> message of this thread (but would report the error message instead).
>
> Forgot to mention that I did not add this change in the new patch
> version.
>

I think that is the right decision.

-- 
With Regards,
Amit Kapila.




Re: Added schema level support for publication.

2021-08-12 Thread Masahiko Sawada
On Mon, Aug 9, 2021 at 1:53 PM Amit Kapila  wrote:
>
> On Sun, Aug 8, 2021 at 2:52 PM vignesh C  wrote:
> >
> > On Fri, Aug 6, 2021 at 4:39 PM Amit Kapila  wrote:
> > >
> > > On Fri, Aug 6, 2021 at 2:02 PM vignesh C  wrote:
> > > >
> > > > On Wed, Aug 4, 2021 at 4:10 PM Amit Kapila  
> > > > wrote:
> > > > >
> > > > > On Tue, Aug 3, 2021 at 8:38 PM vignesh C  wrote:
> > > >
> > > > > 6.
> > > > > + {PublicationSchemaRelationId, /* PUBLICATIONSCHEMAMAP */
> > > > > + PublicationSchemaPsnspcidPspubidIndexId,
> > > > > + 2,
> > > > > + {
> > > > > + Anum_pg_publication_schema_psnspcid,
> > > > > + Anum_pg_publication_schema_pspubid,
> > > > > + 0,
> > > > > + 0
> > > > > + },
> > > > >
> > > > > Why don't we keep pubid as the first column in this index?
> > > >
> > > > I wanted to keep it similar to PUBLICATIONRELMAP. Should we keep it as
> > > > it is? Thoughts?
> > > >
> > >
> > > Okay, I see your point. I think for PUBLICATIONRELMAP, we need it
> > > because it is searched using the only relid in
> > > GetRelationPublications, so, similarly, in the patch, you are using
> > > schema_oid in GetSchemaPublications, so probably that will help. I was
> > > wondering why you haven't directly used the cache in
> > > GetSchemaPublications similar to GetRelationPublications?
> >
> > Both of the approaches work; I was not sure which one is better. If
> > the approach in GetRelationPublications is better, I will change it to
> > something similar to GetRelationPublications. Thoughts?
> >
>
> I think it is better to use the cache as if we don't find the entry in
> the cache, then we will anyway search the required entry via sys table
> scan, see SearchCatCacheList.

+1

I had the same comment while reading the v19 patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Grammar fix in hash index README

2021-08-12 Thread Dilip Kumar
While reading through the README of the hash index, I noticed a grammar
error; the attached patch fixes it.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 2760cd42e7f6354399d30ac56741268130f6e017 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Thu, 12 Aug 2021 14:05:46 +0530
Subject: [PATCH] Fix grammar mistake in hash index README

---
 src/backend/access/hash/README | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/access/hash/README b/src/backend/access/hash/README
index 2227ebf..a2b8697 100644
--- a/src/backend/access/hash/README
+++ b/src/backend/access/hash/README
@@ -215,7 +215,7 @@ Pseudocode Algorithms
 Various flags that are used in hash index operations are described as below:
 
 The bucket-being-split and bucket-being-populated flags indicate that split
-the operation is in progress for a bucket.  During split operation, a
+operation is in progress for a bucket.  During split operation, a
 bucket-being-split flag is set on the old bucket and bucket-being-populated
 flag is set on new bucket.  These flags are cleared once the split operation
 is finished.
-- 
1.8.3.1



Re: Column Filtering in Logical Replication

2021-08-12 Thread Amit Kapila
On Thu, Aug 12, 2021 at 8:40 AM Rahila Syed  wrote:
>
>>
>> Can you please explain why you have the restriction for including
>> replica identity columns and do we want to put a similar restriction
>> for the primary key? As far as I understand, if we allow default
>> values on subscribers for replica identity, then probably updates,
>> deletes won't work as they need to use replica identity (or PK) to
>> search the required tuple. If so, shouldn't we add this restriction
>> only when a publication has been defined for one of these (Update,
>> Delete) actions?
>
>
> Yes, like you mentioned they are needed for Updates and Deletes to work.
> The restriction for including replica identity columns in column filters 
> exists because
> In case the replica identity column values did not change, the old row 
> replica identity columns
> are not sent to the subscriber, thus we would need new replica identity 
> columns
> to be sent to identify the row that is to be Updated or Deleted.
> I haven't tested if it would break Insert as well  though. I will update the 
> patch accordingly.
>

Okay, but then we also need to ensure that the user shouldn't be
allowed to enable the 'update' or 'delete' for a publication that
contains some filter that doesn't have replica identity columns.
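
The check being described reduces to a subset test: every replica identity
column must appear in the column filter before 'update' or 'delete' can be
enabled. As a sketch, using a hypothetical helper with bitmasks for column
sets (not the patch's actual representation):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Each bit of a mask stands for one column.  Enabling 'update'/'delete'
 * on a publication is only safe if every replica identity column is
 * included in the column filter.
 */
static bool
filter_covers_replica_identity(uint64_t filter_cols, uint64_t ri_cols)
{
	/* RI columns must be a subset of the filtered columns */
	return (ri_cols & ~filter_cols) == 0;
}
```

The same test would have to be applied both when the filter is defined and
when the publication's action list is later altered.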

>>
>> Another point is what if someone drops the column used in one of the
>> publications? Do we want to drop the entire relation from publication
>> or just remove the column filter or something else?
>>
>
> Thanks for pointing this out. Currently, this is not handled in the patch.
> I think dropping the column from the filter would make sense on the lines
> of the table being dropped from publication, in case of drop table.
>

I think it would be tricky if you want to remove the column from the
filter because you need to recompute the entire filter and update it
again. Also, you might need to do this for all the publications that
have a particular column in their filter clause. It might be easier to
drop the entire filter, but you can check whether another way would be
easier; if so, that is good too.

>>
>> Do we want to consider that the columns specified in the filter must
>> not have NOT NULL constraint? Because, otherwise, the subscriber will
>> error out inserting such rows?
>>
> I think you mean columns *not* specified in the filter must not have a
> NOT NULL constraint on the subscriber, as inserts will otherwise break
> when NULL is inserted for columns not sent by the publisher.
>

Right.

-- 
With Regards,
Amit Kapila.




Re: Skipping logical replication transactions on subscriber side

2021-08-12 Thread Greg Nancarrow
On Thu, Aug 12, 2021 at 3:54 PM Masahiko Sawada  wrote:
>
> I've attached the updated patches. FYI I've included the patch
> (v8-0005) that fixes the assertion failure during shared fileset
> cleanup to make cfbot tests happy.
>

A minor comment on the 0001 patch: In the message I think that using
"ID" would look better than lowercase "id" and AFAICS it's more
consistent with existing messages.

+ appendStringInfo(, _(" in transaction id %u with commit timestamp %s"),


Regards,
Greg Nancarrow
Fujitsu Australia




Re: badly calculated width of emoji in psql

2021-08-12 Thread Pavel Stehule
On Thu, 22 Jul 2021 at 0:12, Jacob Champion 
wrote:

> On Wed, 2021-07-21 at 00:08 +, Jacob Champion wrote:
> > I note that the doc comment for ucs_wcwidth()...
> >
> > >  *- Spacing characters in the East Asian Wide (W) or East Asian
> > >  *  FullWidth (F) category as defined in Unicode Technical
> > >  *  Report #11 have a column width of 2.
> >
> > ...doesn't match reality anymore. The East Asian width handling was
> > last updated in 2006, it looks like? So I wonder whether fixing the
> > code to match the comment would not only fix the emoji problem but also
> > a bunch of other non-emoji characters.
>
> Attached is my attempt at that. This adds a second interval table,
> handling not only the emoji range in the original patch but also
> correcting several non-emoji character ranges which are included in the
> 13.0 East Asian Wide/Fullwidth sets. Try for example
>
> - U+2329 LEFT POINTING ANGLE BRACKET
> - U+16FE0 TANGUT ITERATION MARK
> - U+18000 KATAKANA LETTER ARCHAIC E
>
> This should work reasonably well for terminals that depend on modern
> versions of Unicode's EastAsianWidth.txt to figure out character width.
> I don't know how it behaves on BSD libc or Windows.
>
> The new binary search isn't free, but my naive attempt at measuring the
> performance hit made it look worse than it actually is. Since the
> measurement function was previously returning an incorrect (too short)
> width, we used to get a free performance boost by not printing the
> correct number of alignment/border characters. I'm still trying to
> figure out how best to isolate the performance changes due to this
> patch.
>
> Pavel, I'd be interested to see what your benchmarks find with this
> code. Does this fix the original issue for you?
>

This patch fixed badly formatted tables with emoji.

I checked this patch, and it is correct and a step forward, because it
dynamically sets the intervals of double-wide characters and the code is
more readable.

I also checked performance, and although there is a measurable slowdown,
it is negligible in absolute terms. The previous code was a little faster -
it checked fewer ranges - but was not fully correct or up to date.
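
The interval-table approach is a binary search over sorted [first, last]
codepoint ranges. A minimal sketch, with only a tiny illustrative excerpt
of the East Asian Wide/Fullwidth ranges rather than the generated table,
and ignoring the combining and control characters that the real
ucs_wcwidth() also handles:

```c
#include <stddef.h>

typedef struct
{
	unsigned int first;
	unsigned int last;
} mbinterval;

/* Illustrative excerpt only; the real table is generated from Unicode data */
static const mbinterval wide[] = {
	{0x2329, 0x232A},			/* pointing angle brackets */
	{0x16FE0, 0x16FE4},			/* ideographic symbols and punctuation */
	{0x17000, 0x187F7}			/* Tangut ideographs */
};

/* Return 1 if ucs falls in one of the sorted, disjoint ranges */
static int
in_wide_table(unsigned int ucs, const mbinterval *table, size_t n)
{
	size_t		lo = 0;
	size_t		hi = n;

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (ucs < table[mid].first)
			hi = mid;
		else if (ucs > table[mid].last)
			lo = mid + 1;
		else
			return 1;
	}
	return 0;
}

/* Simplified width: 2 for table hits, else 1 */
static int
char_width(unsigned int ucs)
{
	return in_wide_table(ucs, wide, sizeof(wide) / sizeof(wide[0])) ? 2 : 1;
}
```

With this, U+2329 comes back as width 2 while ASCII characters fall through
to width 1; the cost per character is O(log n) in the number of ranges,
which matches the negligible slowdown measured above.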

The patch applied without problems.
There are no regression tests, but I am not sure they are necessary for
this case.
make check-world passed without problems

I'll mark this patch as ready for committer

Regards

Pavel





>
> --Jacob
>