Re: Generating partitioning tuple conversion maps faster

2018-06-24 Thread amul sul
On Mon, Jun 25, 2018 at 10:57 AM Amit Langote
 wrote:
>
> On 2018/06/25 14:12, amul sul wrote:
> > On Mon, Jun 25, 2018 at 10:30 AM David Rowley
> >  wrote:
> >>
> > [...]
> >> Given the small size of this patch, I think it's a good candidate for
> >> the July 'fest.
> >>
> >
> > Just a different thought: how about having a flag array, something like
> > needed_tuple_conv?
> >
> > While loading the partdesc, we could determine whether the columns of the
> > partition table are aligned with the parent or not.
>
> Yeah, maybe, but ISTM that could be implemented *in addition to* the
> tweaks to convert_tuples_by_name_map that David's proposing, which seem
> helpful in their own right.
>
Yes, I agree.

Regards,
Amul



Re: Generating partitioning tuple conversion maps faster

2018-06-24 Thread Amit Langote
On 2018/06/25 14:12, amul sul wrote:
> On Mon, Jun 25, 2018 at 10:30 AM David Rowley
>  wrote:
>>
> [...]
>> Given the small size of this patch, I think it's a good candidate for
>> the July 'fest.
>>
> 
> Just a different thought: how about having a flag array, something like
> needed_tuple_conv?
> 
> While loading the partdesc, we could determine whether the columns of the
> partition table are aligned with the parent or not.

Yeah, maybe, but ISTM that could be implemented *in addition to* the
tweaks to convert_tuples_by_name_map that David's proposing, which seem
helpful in their own right.

Thanks,
Amit




Re: Generating partitioning tuple conversion maps faster

2018-06-24 Thread amul sul
On Mon, Jun 25, 2018 at 10:30 AM David Rowley
 wrote:
>
[...]
> Given the small size of this patch, I think it's a good candidate for
> the July 'fest.
>

Just a different thought: how about having a flag array, something like
needed_tuple_conv?

While loading the partdesc, we could determine whether the columns of the
partition table are aligned with the parent or not.
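
To sketch the idea (purely hypothetical, not a patch; it assumes the
backend's TupleDesc and pg_attribute fields, and the function name is
made up):

/*
 * Hypothetical sketch only: while building the partition descriptor, note
 * whether a partition's columns line up one-to-one with the parent's, so
 * that callers could skip building a conversion map when they do.
 */
#include "postgres.h"
#include "access/tupdesc.h"

static bool
partition_needs_tuple_conv(TupleDesc parentdesc, TupleDesc partdesc)
{
    int         i;

    if (parentdesc->natts != partdesc->natts)
        return true;

    for (i = 0; i < parentdesc->natts; i++)
    {
        Form_pg_attribute pa = TupleDescAttr(parentdesc, i);
        Form_pg_attribute ca = TupleDescAttr(partdesc, i);

        /* a column dropped on only one side needs a conversion map */
        if (pa->attisdropped != ca->attisdropped)
            return true;

        /* differing names or types at the same position need a map */
        if (!pa->attisdropped &&
            (strcmp(NameStr(pa->attname), NameStr(ca->attname)) != 0 ||
             pa->atttypid != ca->atttypid))
            return true;
    }

    return false;
}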

Regards,
Amul



Generating partitioning tuple conversion maps faster

2018-06-24 Thread David Rowley
Hi,

The code in convert_tuples_by_name_map() could be made a bit smarter
to speed up the search for the column with the same name by first
looking at the same attnum in the other relation rather than starting
the search from the first attnum each time.

I imagine most people are going to create partitions with columns in
the same order as they are in the partitioned table.  This will happen
automatically if the CREATE TABLE .. PARTITION OF syntax is used, so
it makes sense to exploit that knowledge.

I've attached a patch which modifies convert_tuples_by_name_map() to
have it just start the inner loop search in the same position as we
are currently in the outer loop.  Best case, this takes the function
from being O(N^2) to O(N).  It's quite possible that the partition or,
more likely, the partitioned table (which likely lives longer than any
one partition) has had some columns dropped, in which case we don't
want to miss the target column by 1, so I've coded the patch to count
the non-dropped columns and start the search after having ignored
dropped columns.
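
To illustrate the general idea (this is a simplified sketch, not the
attached patch; find_attnum_by_name is a made-up helper using the
backend's TupleDesc API):

/*
 * Simplified sketch of the idea, not the actual patch: start probing at
 * the caller-supplied position and wrap around, so perfectly aligned
 * columns are found on the first probe and the common case stays O(N).
 */
#include "postgres.h"
#include "access/tupdesc.h"

static AttrNumber
find_attnum_by_name(TupleDesc indesc, const char *attname, int start)
{
    int         natts = indesc->natts;
    int         probe;

    for (probe = 0; probe < natts; probe++)
    {
        int         j = (start + probe) % natts;
        Form_pg_attribute att = TupleDescAttr(indesc, j);

        if (att->attisdropped)
            continue;
        if (strcmp(NameStr(att->attname), attname) == 0)
            return (AttrNumber) (j + 1);    /* attnums are 1-based */
    }

    return InvalidAttrNumber;   /* no match */
}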

This patch is just a few lines of code.  I do think there's much more
room to improve this whole area.  For example, we could build the
child->parent map from the parent->child map instead of searching
again later with the inputs reversed, although that's more than a
handful of lines of code.  The change I'm proposing here seems
worthwhile to me.  FWIW, something similar is done in
make_inh_translation_list(), although I think my way might have a
slightly better chance of not hitting the worst-case search.
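
As a rough sketch of that further idea (hypothetical helper; it assumes a
parent->child map where entry i holds the child attnum matching parent
attnum i + 1, or 0 when there is no match):

/*
 * Hypothetical sketch: derive the child->parent attribute map by inverting
 * an existing parent->child map instead of repeating the name search with
 * the inputs swapped.  Entries left at 0 mean the child column has no
 * counterpart in the parent.
 */
#include "postgres.h"
#include "access/attnum.h"

static AttrNumber *
invert_attr_map(const AttrNumber *parent_to_child,
                int parent_natts, int child_natts)
{
    AttrNumber *child_to_parent = palloc0(child_natts * sizeof(AttrNumber));
    int         i;

    for (i = 0; i < parent_natts; i++)
    {
        AttrNumber  childatt = parent_to_child[i];

        if (childatt > 0)
            child_to_parent[childatt - 1] = (AttrNumber) (i + 1);
    }

    return child_to_parent;
}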

Benchmark: (Uses tables with 1000 columns, each with a name 11 chars in length.)


Setup:
select 'create table listp (' || string_agg('c' ||
left(md5(x::Text),10) || ' int',',') || ') partition by list (c' ||
left(md5('1'),10) || ');' from generate_Series(1,1000) x;
\gexec
create table listp0 partition of listp for values in(0);
create table listp1 partition of listp for values in(1);

select 'create table listpd (' || string_agg('c' ||
left(md5(x::Text),10) || ' int',',') || ') partition by list (c' ||
left(md5('1'),10) || ');' from generate_Series(0,1000) x;
\gexec
select 'alter table listpd drop column c' || left(md5(0::text),10) || ';';
\gexec
create table listpd0 partition of listpd for values in(0);
create table listpd1 partition of listpd for values in(1);

select 'create table list (' || string_agg('c' ||
left(md5(x::Text),10) || ' int',',') || ');' from
generate_Series(1,1000) x;
\gexec

\q
psql -t -c "select 'insert into listp values(' || string_Agg('1',',')
|| ');' from generate_Series(1,1000) x;" postgres > insertp.sql
psql -t -c "select 'insert into listpd values(' || string_Agg('1',',')
|| ');' from generate_Series(1,1000) x;" postgres > insertpd.sql
psql -t -c "select 'insert into list values(' || string_Agg('1',',')
|| ');' from generate_Series(1,1000) x;" postgres > insert.sql
psql -t -c "select 'update list set c' || left(md5('1'),10) || ' = (c'
|| left(md5('1'),10) || ' + 1) & 1;'" postgres > update.sql
psql -t -c "select 'update listp set c' || left(md5('1'),10) || ' =
(c' || left(md5('1'),10) || ' + 1) & 1;'" postgres > updatep.sql
psql -t -c "select 'update listpd set c' || left(md5('1'),10) || ' =
(c' || left(md5('1'),10) || ' + 1) & 1;'" postgres > updatepd.sql


Tests:

# Test 1: INSERT control test for non-partitioned table.
pgbench -n -T 60 -f insert.sql postgres

# Test 2: INSERT perfect match
pgbench -n -T 60 -f insertp.sql postgres

# Test 3: INSERT dropped first column from parent
pgbench -n -T 60 -f insertpd.sql postgres


# Test 4: UPDATE control test for non-partitioned table.
psql -c "truncate list;" postgres
psql -f insert.sql postgres
pgbench -n -T 60 -f update.sql postgres

# Test 5: UPDATE perfect match
psql -c "truncate listp;" postgres
psql -f insertp.sql postgres
pgbench -n -T 60 -f updatep.sql postgres

# Test 6: UPDATE dropped first column from parent
psql -c "truncate listpd;" postgres
psql -f insertpd.sql postgres
pgbench -n -T 60 -f updatepd.sql postgres


Results:

AWS m5d (fsync off)

Unpatched:

Test 1:
tps = 1055.527253 (excluding connections establishing)

Test 2:
tps = 355.308869 (excluding connections establishing)

Test 3:
tps = 361.465022 (excluding connections establishing)

Test 4:
tps = 1107.483236 (excluding connections establishing)

Test 5:
tps = 132.805775 (excluding connections establishing)

Test 6:
tps = 86.303093 (excluding connections establishing)


Patched:

Test 1:
tps = 1055.831144 (excluding connections establishing)

Test 2:
tps = 1014.282634 (excluding connections establishing)

Test 3:
tps = 982.258274 (excluding connections establishing)

Test 4:
tps = 625.429778 (excluding connections establishing)

Test 5:
tps = 525.667826 (excluding connections establishing)

Test 6:
tps = 173.529296 (excluding connections establishing)

I'd have expected Test 6 to do about 480-500 tps.  Adding some debug code
to check why it's not doing so revealed that the search is doing what's
expected.  I'm unsure why it has not improved

Re: effect of JIT tuple deform?

2018-06-24 Thread Pavel Stehule
2018-06-25 5:41 GMT+02:00 Andres Freund :

> On 2018-06-25 05:32:37 +0200, Pavel Stehule wrote:
> > 2018-06-24 22:32 GMT+02:00 Dmitry Dolgov <9erthali...@gmail.com>:
> >
> > > > On 23 June 2018 at 08:47, Pavel Stehule 
> wrote:
> > > >
> > > >
> > > > 2018-06-23 8:35 GMT+02:00 Pavel Stehule :
> > > >>
> > > >> Hi
> > > >>
> > > >> I try to measure effect of JIT tuple deform and I don't see any
> possible
> > > >> effect.
> > > >>
> > > >> Is it this feature active in master branch?
> > > >>
> > > >> Is possible to see this feature in EXPLAIN ANALYZE?
> > > >
> > > >
> > > > Unfortunately I got slowdown
> > > >
> > > > 0. shared buffers = 1GB
> > > > 1. create table with 50 int columns
> > > > 2. insert into this table 4M rows
> > >
> > > Hi,
> > >
> > > Looks like I can reproduce the situation you're talking about (with
> some
> > > minor
> > > differences)
> > >
> > > =# explain analyze select sum(data45) from test_deform;
> > > QUERY PLAN
> > > 
> > > ---
> > >  Finalize Aggregate
> > >  (cost=127097.71..127097.72 rows=1 width=8)
> > >  (actual time=813.957..813.957 rows=1 loops=1)
> > >->  Gather
> > >(cost=127097.50..127097.71 rows=2 width=8)
> > >(actual time=813.946..813.950 rows=3 loops=1)
> > >  Workers Planned: 2
> > >  Workers Launched: 2
> > >  ->  Partial Aggregate
> > >  (cost=126097.50..126097.51 rows=1 width=8)
> > >  (actual time=802.585..802.585 rows=1 loops=3)
> > >->  Parallel Seq Scan on test_deform
> > >(cost=0.00..121930.80 rows=180 width=4)
> > >(actual time=0.040..207.326 rows=133 loops=3)
> > >  Planning Time: 0.212 ms
> > >  JIT:
> > >Functions: 6
> > >Generation Time: 3.076 ms
> > >Inlining: false
> > >Inlining Time: 0.000 ms
> > >Optimization: false
> > >Optimization Time: 1.328 ms
> > >Emission Time: 8.601 ms
> > >  Execution Time: 820.127 ms
> > > (16 rows)
> > >
> > > =# set jit_tuple_deforming to off;
> > >
> > > =# explain analyze select sum(data45) from test_deform;
> > > QUERY PLAN
> > > 
> > > ---
> > >  Finalize Aggregate
> > >  (cost=127097.71..127097.72 rows=1 width=8)
> > >  (actual time=742.578..742.578 rows=1 loops=1)
> > >->  Gather
> > >(cost=127097.50..127097.71 rows=2 width=8)
> > >(actual time=742.529..742.569 rows=3 loops=1)
> > >  Workers Planned: 2
> > >  Workers Launched: 2
> > >  ->  Partial Aggregate
> > >  (cost=126097.50..126097.51 rows=1 width=8)
> > >  (actual time=727.876..727.876 rows=1 loops=3)
> > >->  Parallel Seq Scan on test_deform
> > >(cost=0.00..121930.80 rows=180 width=4)
> > >(actual time=0.044..204.097 rows=133 loops=3)
> > >  Planning Time: 0.361 ms
> > >  JIT:
> > >Functions: 4
> > >Generation Time: 2.840 ms
> > >Inlining: false
> > >Inlining Time: 0.000 ms
> > >Optimization: false
> > >Optimization Time: 0.751 ms
> > >Emission Time: 6.436 ms
> > >  Execution Time: 749.827 ms
> > > (16 rows)
> > >
> > > But at the same time on the bigger dataset (20M rows instead of 4M) the
> > > very
> > > same query gets better:
> > >
> > > =# explain analyze select sum(data45) from test_deform;
> > > QUERY PLAN
> > > 
> > > ---
> > >  Finalize Aggregate
> > >  (cost=631482.92..631482.93 rows=1 width=8)
> > >  (actual time=2350.288..2350.288 rows=1 loops=1)
> > >->  Gather
> > >(cost=631482.71..631482.92 rows=2 width=8)
> > >(actual time=2350.276..2350.279 rows=3 loops=1)
> > >  Workers Planned: 2
> > >  Workers Launched: 2
> > >  ->  Partial Aggregate
> > >  (cost=630482.71..630482.72 rows=1 width=8)
> > >  (actual time=2328.378..2328.379 rows=1 loops=3)
> > >->  Parallel Seq Scan on test_deform
> > >(cost=0.00..609649.37 rows=837 width=4)
> > >(actual time=0.027..1175.960 rows=667 loops=3)
> > >  Planning Time: 0.600 ms
> > >  JIT:
> > >Functions: 6
> > >Generation Time: 3.661 ms
> > >Inlining: true
> > >Inlining Time: 65.373 ms
> > >Optimization: true
> > >Optimization Time: 120.885 ms
> > >Emission Time: 58.836 ms
> > >  Execution Time: 2543.280 ms
> > > (16 rows)
> > >
> > > =# set jit_tuple_deforming to off;
> > >
> > > =# explain analyze select sum(data45) from test_deform;
> > > QUERY PLAN
> > > 
> > > ---
> > >  Finalize Aggregate
> > >  

Re: effect of JIT tuple deform?

2018-06-24 Thread Andres Freund
On 2018-06-25 05:32:37 +0200, Pavel Stehule wrote:
> 2018-06-24 22:32 GMT+02:00 Dmitry Dolgov <9erthali...@gmail.com>:
> 
> > > On 23 June 2018 at 08:47, Pavel Stehule  wrote:
> > >
> > >
> > > 2018-06-23 8:35 GMT+02:00 Pavel Stehule :
> > >>
> > >> Hi
> > >>
> > >> I try to measure effect of JIT tuple deform and I don't see any possible
> > >> effect.
> > >>
> > >> Is it this feature active in master branch?
> > >>
> > >> Is possible to see this feature in EXPLAIN ANALYZE?
> > >
> > >
> > > Unfortunately I got slowdown
> > >
> > > 0. shared buffers = 1GB
> > > 1. create table with 50 int columns
> > > 2. insert into this table 4M rows
> >
> > Hi,
> >
> > Looks like I can reproduce the situation you're talking about (with some
> > minor
> > differences)
> >
> > =# explain analyze select sum(data45) from test_deform;
> > QUERY PLAN
> > 
> > ---
> >  Finalize Aggregate
> >  (cost=127097.71..127097.72 rows=1 width=8)
> >  (actual time=813.957..813.957 rows=1 loops=1)
> >->  Gather
> >(cost=127097.50..127097.71 rows=2 width=8)
> >(actual time=813.946..813.950 rows=3 loops=1)
> >  Workers Planned: 2
> >  Workers Launched: 2
> >  ->  Partial Aggregate
> >  (cost=126097.50..126097.51 rows=1 width=8)
> >  (actual time=802.585..802.585 rows=1 loops=3)
> >->  Parallel Seq Scan on test_deform
> >(cost=0.00..121930.80 rows=180 width=4)
> >(actual time=0.040..207.326 rows=133 loops=3)
> >  Planning Time: 0.212 ms
> >  JIT:
> >Functions: 6
> >Generation Time: 3.076 ms
> >Inlining: false
> >Inlining Time: 0.000 ms
> >Optimization: false
> >Optimization Time: 1.328 ms
> >Emission Time: 8.601 ms
> >  Execution Time: 820.127 ms
> > (16 rows)
> >
> > =# set jit_tuple_deforming to off;
> >
> > =# explain analyze select sum(data45) from test_deform;
> > QUERY PLAN
> > 
> > ---
> >  Finalize Aggregate
> >  (cost=127097.71..127097.72 rows=1 width=8)
> >  (actual time=742.578..742.578 rows=1 loops=1)
> >->  Gather
> >(cost=127097.50..127097.71 rows=2 width=8)
> >(actual time=742.529..742.569 rows=3 loops=1)
> >  Workers Planned: 2
> >  Workers Launched: 2
> >  ->  Partial Aggregate
> >  (cost=126097.50..126097.51 rows=1 width=8)
> >  (actual time=727.876..727.876 rows=1 loops=3)
> >->  Parallel Seq Scan on test_deform
> >(cost=0.00..121930.80 rows=180 width=4)
> >(actual time=0.044..204.097 rows=133 loops=3)
> >  Planning Time: 0.361 ms
> >  JIT:
> >Functions: 4
> >Generation Time: 2.840 ms
> >Inlining: false
> >Inlining Time: 0.000 ms
> >Optimization: false
> >Optimization Time: 0.751 ms
> >Emission Time: 6.436 ms
> >  Execution Time: 749.827 ms
> > (16 rows)
> >
> > But at the same time on the bigger dataset (20M rows instead of 4M) the
> > very
> > same query gets better:
> >
> > =# explain analyze select sum(data45) from test_deform;
> > QUERY PLAN
> > 
> > ---
> >  Finalize Aggregate
> >  (cost=631482.92..631482.93 rows=1 width=8)
> >  (actual time=2350.288..2350.288 rows=1 loops=1)
> >->  Gather
> >(cost=631482.71..631482.92 rows=2 width=8)
> >(actual time=2350.276..2350.279 rows=3 loops=1)
> >  Workers Planned: 2
> >  Workers Launched: 2
> >  ->  Partial Aggregate
> >  (cost=630482.71..630482.72 rows=1 width=8)
> >  (actual time=2328.378..2328.379 rows=1 loops=3)
> >->  Parallel Seq Scan on test_deform
> >(cost=0.00..609649.37 rows=837 width=4)
> >(actual time=0.027..1175.960 rows=667 loops=3)
> >  Planning Time: 0.600 ms
> >  JIT:
> >Functions: 6
> >Generation Time: 3.661 ms
> >Inlining: true
> >Inlining Time: 65.373 ms
> >Optimization: true
> >Optimization Time: 120.885 ms
> >Emission Time: 58.836 ms
> >  Execution Time: 2543.280 ms
> > (16 rows)
> >
> > =# set jit_tuple_deforming to off;
> >
> > =# explain analyze select sum(data45) from test_deform;
> > QUERY PLAN
> > 
> > ---
> >  Finalize Aggregate
> >  (cost=631482.92..631482.93 rows=1 width=8)
> >  (actual time=3616.977..3616.977 rows=1 loops=1)
> >->  Gather
> >(cost=631482.71..631482.92 rows=2 width=8)
> >(actual time=3616.959..3616.971 rows=3 loops=1)
> >  Workers Planned: 2
> >  Workers Launched: 2
> >  ->  Partial 

Re: Is PG built on any C compilers where int division floors?

2018-06-24 Thread Tom Lane
Chapman Flack  writes:
> C99 finally pinned down what / does on signed ints, truncating toward zero.
> Before that, it could truncate toward zero, or floor toward -inf.
> Is PostgreSQL built on any compilers/platforms that have the floor
> behavior?

I'm not sure if we still have any buildfarm animals that act that way[1],
but the project policy is that we target C90 not C99.  So wiring in any
assumptions about this would seem to be contrary to policy.

regards, tom lane

[1] gaur/pademelon might, but it's turned off and 300 miles away, so
I can't check right now.



Re: bug with expression index on partition

2018-06-24 Thread Amit Langote
On 2018/06/23 5:54, Alvaro Herrera wrote:
> On 2018-Jun-21, Amit Langote wrote:
> 
>> On 2018/06/21 15:35, Amit Langote wrote:
>>> So, CompareIndexInfo and generateClonedIndexStmt are both doing the right
>>> thing, but DefineIndex is not.  Attached is a patch to fix DefineIndex so
>>> that it converts indexParams before recursing to create the index on a
>>> partition.
>>
>> I noticed that while CompareIndexInfo and generateClonedIndexStmt would
>> reject the case where index expressions contain a whole-row Var, my patch
>> didn't teach to do the same to DefineIndex, causing asymmetric behavior.
>> So, whereas ATTACH PARTITION would error out when trying to clone a
>> parent's index that contains a whole-row Var, recursively creating an
>> index on partition won't.
>>
>> I updated the patch so that even DefineIndex will check if any whole-row
>> Vars were encountered during conversion and error out if so.
> 
> Thanks.  I pushed this version, although I tweaked the test so that this
> condition is verified in the test that is supposed to do so, rather than
> creating a whole new set of tables for this purpose.  The way it would
> fail with unpatched code is perhaps less noisy that what you proposed,
> but I think it's quite enough.

Thanks.  The test changes you committed seem fine.

Regards,
Amit




Incorrect fsync handling in pg_basebackup's tar_finish

2018-06-24 Thread Michael Paquier
Hi all,

I was just looking at the code of pg_basebackup, and noticed that we
don't actually check if the two last empty blocks of any tar file
produced are correctly fsync'd or not:
@@ -957,7 +957,10 @@ tar_finish(void)

 /* sync the empty blocks as well, since they're after the last file */
 if (tar_data->sync)
-   fsync(tar_data->fd);
+   {
+   if (fsync(tar_data->fd) != 0)
+   return false;
+   }

That looks incorrect to me, hence shouldn't something like the attached
be done?  Magnus and others, any opinions?

Thanks,
--
Michael
diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
index 331d0e7275..7867a56ee1 100644
--- a/src/bin/pg_basebackup/walmethods.c
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -896,7 +896,7 @@ tar_finish(void)
 			return false;
 	}
 
-	/* A tarfile always ends with two empty  blocks */
+	/* A tarfile always ends with two empty blocks */
 	MemSet(zerobuf, 0, sizeof(zerobuf));
 	if (!tar_data->compression)
 	{
@@ -957,7 +957,10 @@ tar_finish(void)
 
 	/* sync the empty blocks as well, since they're after the last file */
 	if (tar_data->sync)
-		fsync(tar_data->fd);
+	{
+		if (fsync(tar_data->fd) != 0)
+			return false;
+	}
 
 	if (close(tar_data->fd) != 0)
 		return false;




Is PG built on any C compilers where int division floors?

2018-06-24 Thread Chapman Flack
C99 finally pinned down what / does on signed ints, truncating toward zero.

Before that, it could truncate toward zero, or floor toward -inf.
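
For concreteness, a tiny standalone C example showing what each convention
produces for a negative dividend (illustration only):

#include <math.h>
#include <stdio.h>

int
main(void)
{
    int     a = -7;
    int     b = 2;

    /* C99 semantics: truncate toward zero */
    printf("truncating: -7 / 2 = %d, -7 %% 2 = %d\n", a / b, a % b);    /* -3, -1 */

    /* flooring semantics, as some pre-C99 compilers implemented them */
    printf("flooring:   -7 / 2 = %d, -7 %% 2 = %d\n",
           (int) floor((double) a / b),
           a - b * (int) floor((double) a / b));    /* -4, 1 */

    return 0;
}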

Is PostgreSQL built on any compilers/platforms that have the floor
behavior?

-Chap



Re: Supporting tls-server-end-point as SCRAM channel binding for OpenSSL 1.0.0 and 1.0.1

2018-06-24 Thread Michael Paquier
On Tue, Jun 19, 2018 at 09:19:40AM +0900, Michael Paquier wrote:
> As Peter and Heikki have worked as well on all those features with me,
> are there any objection to discard this open item?  I looked again at
> the patch this morning and it is true that OpenSSL's history makes
> things harder, so keeping code consistent and simple with their last LTS
> version is appealing.

Okay, in consequence, I have moved this open item to the non-bugs
section.
--
Michael




Re: postgresql_fdw doesn't handle defaults correctly

2018-06-24 Thread Amit Langote
Hi.

On 2018/06/24 2:23, Pavel Stehule wrote:
> Hi
> 
> I have a table boo
> 
> create table boo(id serial primary key, inserted date default current_date,
> v varchar);
> 
> I imported this table via simple
> 
> IMPORT FOREIGN SCHEMA public FROM SERVER foreign_server INTO public;

It seems you missed using OPTIONS (import_default 'true') here.

create schema foo;
create table foo.foo (a serial primary key, b date default current_date
not null, c int);

import foreign schema foo from server loopback into public options
(import_default 'true');

insert into public.foo (c) values (1);
select * from public.foo;
 a |     b      | c
---+------------+---
 1 | 2018-06-25 | 1
(1 row)

insert into foo.foo (c) values (2);
select * from public.foo;
 a |     b      | c
---+------------+---
 1 | 2018-06-25 | 1
 2 | 2018-06-25 | 2
(2 rows)


Thanks,
Amit




Re: Incorrect errno used with %m for backend code

2018-06-24 Thread Michael Paquier
On Sun, Jun 24, 2018 at 09:23:50PM +0900, Michael Paquier wrote:
> On Sun, Jun 24, 2018 at 09:22:01AM +0530, Ashutosh Sharma wrote:
>> Okay, I too had a quick look into the source code to see if there are
>> still some places where we could have missed to set an errno to ENOSPC
>> in case of write system call failure but, couldn't find any such place
>> in the code. The v2 version of patch looks good to me.
> 
> Thanks for the review.  I'll try to wrap that tomorrow with proper
> patches for back-branches as things diverge a bit here and there.

Pushed down to 9.3.  All branches conflicted, and particularly in
pg_basebackup some write() calls have been moved around.
--
Michael




Re: PATCH: backtraces for error messages

2018-06-24 Thread Craig Ringer
On 21 June 2018 at 19:09, Kyotaro HORIGUCHI  wrote:

I think this for assertion failure is no problem but I'm not sure
> for other cases.


I think it's pretty strongly desirable for PANIC.


> We could set proper context description or other
> additional information in error messages before just dumping a
> trace for known cases.
>

Yeah. The trouble there is that there are a _lot_ of places to touch for
such things, and inevitably the one you really want to see will be
something that didn't get suitably annotated.


> Since PG9.5 RPMs are complied with --enable-dtrace so we could
> use system tap to insert the stack-trace stuff. Additional
> built-in probe in error reporting system or assertions would make
> this pluggable.
>

I don't bother with SystemTAP anymore; perf does the job for most purposes.

You can use --enable-dtrace SDTs with Perf fine. I wrote it up for Pg at
https://wiki.postgresql.org/wiki/Profiling_with_perf#PostgreSQL_pre-defined_tracepoint_events
.

Dynamic tracepoints are also very useful.

Both require debuginfo, but so does outputting of symbolic stacks per my
patch.

(That reminds me, I need to chat with Devrim about creating a longer lived
debuginfo + old versions rpms repo for Pg its self, if not the accessory
bits and pieces. I'm so constantly frustrated by not being able to get
needed debuginfo packages to investigate some core or running system
problem because they've been purged from the PGDG yum repo as soon as a new
version comes out.)

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: WIP: BRIN multi-range indexes

2018-06-24 Thread Tomas Vondra


On 06/25/2018 12:31 AM, Tomas Vondra wrote:
> On 06/24/2018 11:39 PM, Thomas Munro wrote:
>> On Sun, Jun 24, 2018 at 2:01 PM, Tomas Vondra
>>  wrote:
>>> Attached is rebased version of this BRIN patch series, fixing mostly the
>>> breakage due to 372728b0 (aka initial-catalog-data format changes). As
>>> 2018-07 CF is meant for almost-ready patches, this is more a 2018-09
>>> material. But perhaps someone would like to take a look - and I'd have
>>> to fix it anyway ...
>>
>> Hi Tomas,
>>
>> FYI Windows doesn't like this:
>>
>>   src/backend/access/brin/brin_bloom.c(146): warning C4013: 'round'
>> undefined; assuming extern returning int
>> [C:\projects\postgresql\postgres.vcxproj]
>>
>>   brin_bloom.obj : error LNK2019: unresolved external symbol round
>> referenced in function bloom_init
>> [C:\projects\postgresql\postgres.vcxproj]
>>
> 
> Thanks, I've noticed the failure before, but was not sure what's the
> exact cause. It seems there's still no 'round' on Windows, so I'll
> probably fix that by using rint() instead, or something like that.
> 

OK, here is a version tweaked to use floor()/ceil() instead of round().
Let's see if the Windows machine likes that more.
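
For reference, one common way to get round()-like behaviour from
floor()/ceil() is the following (illustration only, not necessarily what
the attached patches do):

#include <math.h>

/*
 * Round to the nearest integer, halves away from zero, built only from
 * floor()/ceil().  Illustration only.
 */
double
portable_round(double x)
{
    return (x >= 0.0) ? floor(x + 0.5) : ceil(x - 0.5);
}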

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


0001-Pass-all-keys-to-BRIN-consistent-function-a-20180625.patch.gz
Description: application/gzip


0002-Move-IS-NOT-NULL-checks-to-bringetbitmap-20180625.patch.gz
Description: application/gzip


0003-BRIN-bloom-indexes-20180625.patch.gz
Description: application/gzip


0004-BRIN-multi-range-minmax-indexes-20180625.patch.gz
Description: application/gzip


Re: do I correctly understand these date/time data types?

2018-06-24 Thread Bruce Momjian
On Sun, Jun 24, 2018 at 05:13:38PM -0400, Chapman Flack wrote:
> Checking my understanding as in $subject:
> 
> A TIMESTAMP WITH TIME ZONE has had its specified time zone used at the point
> of entry to convert it to Z time, and then discarded. If I have to map one
> of these to a date/time-with-time-zone datatype in another language, I may
> as well unconditionally map it to the indicated date/time values and
> explicit time zone Z, because there is no information available to
> reconstruct what the original explicit time zone was.

You might want to read my blog about this:

https://momjian.us/main/blogs/pgblog/2017.html#September_27_2017

> A TIME WITH TIME ZONE actually carries its original explicit time zone
> with it, but only in the form of an offset, no named time zone or DST
> indication. I can map one of those faithfully if the other language has
> a time-with-zone-in-offset-form data type.
> 
> Speaking of time zones, what PLs out there have a notion of time zone
> maintained by their language runtimes, and how many of those use the
> PostgreSQL session_timezone to initialize that?

That's a good question.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: WIP: BRIN multi-range indexes

2018-06-24 Thread Tomas Vondra
On 06/24/2018 11:39 PM, Thomas Munro wrote:
> On Sun, Jun 24, 2018 at 2:01 PM, Tomas Vondra
>  wrote:
>> Attached is rebased version of this BRIN patch series, fixing mostly the
>> breakage due to 372728b0 (aka initial-catalog-data format changes). As
>> 2018-07 CF is meant for almost-ready patches, this is more a 2018-09
>> material. But perhaps someone would like to take a look - and I'd have
>> to fix it anyway ...
> 
> Hi Tomas,
> 
> FYI Windows doesn't like this:
> 
>   src/backend/access/brin/brin_bloom.c(146): warning C4013: 'round'
> undefined; assuming extern returning int
> [C:\projects\postgresql\postgres.vcxproj]
> 
>   brin_bloom.obj : error LNK2019: unresolved external symbol round
> referenced in function bloom_init
> [C:\projects\postgresql\postgres.vcxproj]
> 

Thanks, I've noticed the failure before, but was not sure what's the
exact cause. It seems there's still no 'round' on Windows, so I'll
probably fix that by using rint() instead, or something like that.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Invisible Indexes

2018-06-24 Thread Bruce Momjian
On Sun, Jun 24, 2018 at 09:59:15AM -0400, Tom Lane wrote:
> Andrew Dunstan  writes:
> > A major downside to a GUC is that you have to be aware of the current 
> > setting, since we're not going to have one setting for each invisible
> > index. Doing it at the SQL level you can treat each index separately. A 
> > GUC will actually involve more code, I suspect.
> 
> I'd envision it being a list of index names.  We already have most
> if not all of the underpinnings for such a thing, I believe, lurking
> around the code for search_path, temp_tablespaces, etc.

I would love to see an API that allowed hypothetical indexes too.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: WIP: BRIN multi-range indexes

2018-06-24 Thread Thomas Munro
On Sun, Jun 24, 2018 at 2:01 PM, Tomas Vondra
 wrote:
> Attached is rebased version of this BRIN patch series, fixing mostly the
> breakage due to 372728b0 (aka initial-catalog-data format changes). As
> 2018-07 CF is meant for almost-ready patches, this is more a 2018-09
> material. But perhaps someone would like to take a look - and I'd have
> to fix it anyway ...

Hi Tomas,

FYI Windows doesn't like this:

  src/backend/access/brin/brin_bloom.c(146): warning C4013: 'round'
undefined; assuming extern returning int
[C:\projects\postgresql\postgres.vcxproj]

  brin_bloom.obj : error LNK2019: unresolved external symbol round
referenced in function bloom_init
[C:\projects\postgresql\postgres.vcxproj]

-- 
Thomas Munro
http://www.enterprisedb.com



Re: comma to delimit fractional seconds

2018-06-24 Thread Chris Howard

Hi!
Thanks for the reply.

Consensus:
        Would anyone be likely to use this?  I've never heard of it before.
        And I haven't found any other mention of it other than the TODO list.

        So it comes down to risk vs standards compliance.

Breakage:   I don't think there is likely to be much breakage. The context
        would be only in the case of fractional seconds.  My proposed solution
        involves only one file for actual code changes, then some additional
        regression test cases.  Some of the risk may depend on how thorough
        the current regression tests are(?)

Proposed change:
    I propose changes to file backend/utils/adt/datetime.c.
    Most of the changes are to allow a comma to pass as a legitimate
    character in time strings.  The function ParseFractionalSecond()
    is modified to replace the ',' with a '.' so that strtod() still
    works. That means the heavy lifting with leading zeros and such
    is still done by the library function.
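
Roughly along these lines, as a standalone sketch for illustration only
(parse_fractional_seconds is a hypothetical helper, not the actual
datetime.c change):

#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch of the proposal: accept a comma as the fractional
 * second delimiter by normalizing it to a period before handing the string
 * to strtod(), which still does the heavy lifting.
 */
double
parse_fractional_seconds(const char *cp)
{
    char    buf[64];
    char   *comma;

    strncpy(buf, cp, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    comma = strchr(buf, ',');
    if (comma != NULL)
        *comma = '.';

    return strtod(buf, NULL);
}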


Chris Howard

p.s.  I've noticed that the error msg for badly formed time strings always
says the same thing "at character 13"  no matter how long the time 
string is.






On 06/23/2018 10:17 PM, Tom Lane wrote:

Chris Howard  writes:

I looked at the TODO list and saw the issue of using comma to delimit
fractional seconds. ...
Is there anything more to it than allowing HH:MM:SS,### as well as
HH:MM:SS.# ?

Here's the thing: most of the easy-looking stuff on the TODO list is
there because there's not actually consensus about how or whether
to do it.

In the case at hand, what you'd need to convince people of is that
there's not a significant penalty in terms of robustness (error
detection) if we allow commas to substitute for periods.  There's
a bunch of existing use-cases that depend on commas being noise,
so it's not obvious that this is OK.

regards, tom lane








do I correctly understand these date/time data types?

2018-06-24 Thread Chapman Flack
Checking my understanding as in $subject:

A TIMESTAMP WITH TIME ZONE has had its specified time zone used at the point
of entry to convert it to Z time, and then discarded. If I have to map one
of these to a date/time-with-time-zone datatype in another language, I may
as well unconditionally map it to the indicated date/time values and
explicit time zone Z, because there is no information available to
reconstruct what the original explicit time zone was.

A TIME WITH TIME ZONE actually carries its original explicit time zone
with it, but only in the form of an offset, no named time zone or DST
indication. I can map one of those faithfully if the other language has
a time-with-zone-in-offset-form data type.

Speaking of time zones, what PLs out there have a notion of time zone
maintained by their language runtimes, and how many of those use the
PostgreSQL session_timezone to initialize that?

My understanding of what PL/Java currently does is that it makes
explicit use of session_timezone on a handful of internal code paths
where it tries to map transparently between SQL and Java values, but it
does not at present use it to set the JVM's idea of the time zone, so that
remains at whatever the OS thinks it is, and that's what'll be used in any
random Java code using Java APIs that involve the time zone.

I wonder if that is or isn't common among PLs.

-Chap



Re: effect of JIT tuple deform?

2018-06-24 Thread Andres Freund
Hi,

On 2018-06-24 22:32:07 +0200, Dmitry Dolgov wrote:
> `perf diff` indeed shows that in the first case (with the 4M rows dataset) the
> jitted version has some noticeable delta for one call, and unfortunately so far
> I couldn't figure out which one exactly because of JIT (btw, who can explain
> how to see a correct full `perf report` in this case? Somehow
> `perf inject --jit -o perf.data.jitted` and jit_profiling_support didn't help).

jit_profiling_support currently requires a patch (that I'm about to
merge into their trunk) in LLVM.  I've posted it a couple times to the
list.


> But since on the bigger dataset I've got expected results, maybe it's just a
> sign that JIT kicks in too early in this case and what's necessary is to adjust
> jit_above_cost/jit_optimize_above_cost/jit_inline_above_cost?

Yea, that's very likely the problem.  I wonder if we should multiply the
cost w/ the number of targeted workers to reduce problems with
parallelism...

Greetings,

Andres Freund



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-06-24 Thread Tomas Vondra
Hi all,

Attached is a rebased version of this patch series, mostly just fixing
the breakage caused by reworked format of initial catalog data.

Aside from that, the MCV building now adopts the logic introduced by
commit b5db1d93d2 for single-column MCV lists. The new algorithm seems
pretty good and I don't see why multi-column MCV lists should use
something special.

I'm sure there are plenty of open questions to discuss, particularly
stuff related to combining the various types of statistics to the final
estimate (a lot of that was already improved based on Dean's reviews).

One thing that occurred to me while comparing the single-column logic (as
implemented in selfuncs.c) with the new multi-column stuff is dealing
with partially-matching histogram buckets.

In the single-column case, we pretty much assume uniform distribution in
each bucket, and linearly interpolate the selectivity. So for a bucket
with boundaries [0, 10] and condition "x <= 5" we return 0.5, for "x <
7" we return 0.7 and so on. This is what convert_to_scalar() does.

In the multi-column case, we simply count each matching bucket as 0.5,
without any attempts to linearly interpolate. It would not be difficult
to call "convert_to_scalar" for each condition (essentially repeating
the linear interpolation for each column), but then what? We could
simply compute a product of those results, of course, but that only
works assuming independence. And that's exactly the wrong thing to
assume here, considering the extended statistics are meant for cases
where the columns are not independent.

So I'd argue the 0.5 estimate for partially-matching buckets is the
right thing to do here, as it's minimizing the average error.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


0001-multivariate-MCV-lists.patch.gz
Description: application/gzip


0002-multivariate-histograms.patch.gz
Description: application/gzip


Removing obsolete comment block at the top of nbtsort.c.

2018-06-24 Thread Peter Geoghegan
nbtsort.c has a comment block from the Berkeley days that reads:

 * This code is moderately slow (~10% slower) compared to the regular
 * btree (insertion) build code on sorted or well-clustered data. On
 * random data, however, the insertion build code is unusable -- the
 * difference on a 60MB heap is a factor of 15 because the random
 * probes into the btree thrash the buffer pool. (NOTE: the above
 * "10%" estimate is probably obsolete, since it refers to an old and
 * not very good external sort implementation that used to exist in
 * this module. tuplesort.c is almost certainly faster.)

I propose removing this whole comment block (patch attached), because:

* The "NOTE" sentence in parenthesis was actually written by Tom in
1999, as part of the original tuplesort commit. If tuplesort.c was
"almost certainly faster" in its first incarnation, what are the
chances of it actually still being slower than incremental insertions
with presorted input at this point? There were numerous large
improvements to tuplesort in the years since 1999.

* Even if the original claim was still true all these years later, the
considerations for nbtsort.c have changed so much that it couldn't
possibly matter. The fact that we're not using shared_buffers to build
indexes anymore is a significant advantage for nbtsort.c, independent
of sort performance. These days, CREATE INDEX spends most of the time
bottlenecked on WAL-logging when building an index against presorted
SERIAL-like inputs, especially when parallelism is used. Back in 1999,
there was no WAL-logging.

-- 
Peter Geoghegan
From 2a77b947e07dadef953019334062580cad560836 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Sun, 24 Jun 2018 11:41:16 -0700
Subject: [PATCH] Remove obsolete comment block in nbtsort.c.

Building a new index through incremental nbtree insertions would always
be significantly slower than our actual approach of sorting using
tuplesort, and then building and WAL-logging whole pages.  Remove a
comment block from the Berkeley days claiming that incremental
insertions might be slightly faster with presorted input.
---
 src/backend/access/nbtree/nbtsort.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index e012df596e..16f5755777 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -14,15 +14,6 @@
  * its parent level.  When we have only one page on a level, it must be
  * the root -- it can be attached to the btree metapage and we are done.
  *
- * This code is moderately slow (~10% slower) compared to the regular
- * btree (insertion) build code on sorted or well-clustered data.  On
- * random data, however, the insertion build code is unusable -- the
- * difference on a 60MB heap is a factor of 15 because the random
- * probes into the btree thrash the buffer pool.  (NOTE: the above
- * "10%" estimate is probably obsolete, since it refers to an old and
- * not very good external sort implementation that used to exist in
- * this module.  tuplesort.c is almost certainly faster.)
- *
  * It is not wise to pack the pages entirely full, since then *any*
  * insertion would cause a split (and not only of the leaf page; the need
  * for a split would cascade right up the tree).  The steady-state load
-- 
2.17.1



Re: Supporting tls-server-end-point as SCRAM channel binding for OpenSSL 1.0.0 and 1.0.1

2018-06-24 Thread Alvaro Hernandez



On 24/06/18 18:49, Dave Cramer wrote:



On 29 May 2018 at 22:48, Michael Paquier  wrote:


On Tue, May 29, 2018 at 10:33:03PM -0400, Heikki Linnakangas wrote:
> Hmm. I think Peter went through this in commits ac3ff8b1d8 and 054e8c6cdb.
> If you got that working now, I suppose we could do that, but I'm actually
> inclined to just stick to the current, more straightforward code, and
> require OpenSSL 1.0.2 for this feature. OpenSSL 1.0.2 has been around for
> several years now. It's not available on all the popular platforms and
> distributions yet, but I don't want to bend over backwards to support those.

I think that this mainly boils down to how much Postgres JDBC wants to
get support here as some vendors can maintain oldest versions of OpenSSL
for a long time.  The extra code is not that much complicated by the
way, still it is true that HEAD is cleaner with its simplicity.


I'm unclear what this has to do with JDBC ? JDBC doesn't use OpenSSL

Alvaro ?




    It's only indirectly related. It does affect which servers JDBC
would be able to connect to using SCRAM + channel binding. Only those
with tls-server-end-point support will be able to use CB with JDBC, and
that means, as of today, only OpenSSL 1.0.2 or higher, which is not
available on some older distributions.




    Álvaro

--

Alvaro Hernandez


---
OnGres



Re: New GUC to sample log queries

2018-06-24 Thread Vik Fearing
On 24/06/18 13:22, Adrien Nayrat wrote:
> Attached third version of the patch with documentation.

Hi.  I'm reviewing this.

>   exceeded = (log_min_duration_statement == 0 ||
>   (log_min_duration_statement > 0 &&
>(secs > log_min_duration_statement / 1000 ||
> -   secs * 1000 + msecs >= log_min_duration_statement)));
> +   secs * 1000 + msecs >= log_min_duration_statement))) &&
> +log_sample_rate != 0 && (log_sample_rate == 1 ||
> +random() <= log_sample_rate * MAX_RANDOM_VALUE);

A few notes on this part, which is the crux of the patch.

1) Is there an off-by-one error here?  drandom() has the formula

(double) random() / ((double) MAX_RANDOM_VALUE + 1)

but yours doesn't look like that.

2) I think the whole thing should be separated into its own expression
instead of tagging along on an already hard to read expression.

3) Is it intentional to only sample with log_min_duration_statement and
not also with log_duration?  It seems like it should affect both.  In
both cases, the name is too generic.  Something called "log_sample_rate"
should sample **everything**.

Otherwise I think it's good.  Attached is a revised patch that fixes 2)
as well as some wordsmithing throughout.  It does not deal with the
other issues I raised.

Marked "Waiting on Author".
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7bfbc87109..90bbe6d423 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5190,6 +5190,26 @@ local0.*/var/log/postgresql

   
 
+ 
+  log_sample_rate (real)
+  
+   log_sample_rate configuration parameter
+  
+  
+   
+
+ Causes logging only a fraction of the statements when  is used. The default is
+ 1, meaning log all statements longer than
+ log_min_duration_statement.  Setting this to zero
+ disables logging, same as setting
+ log_min_duration_statement to
+ -1.  Using log_sample_rate is
+ helpful when the traffic is too high to log all queries.
+
+   
+  
+
  
 
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f4133953be..5590f9a9d4 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2115,7 +2115,8 @@ check_log_statement(List *stmt_list)
 
 /*
  * check_log_duration
- *		Determine whether current command's duration should be logged
+ *		Determine whether current command's duration should be logged.
+ *		If log_sample_rate < 1.0, log only a sample.
  *
  * Returns:
  *		0 if no logging is needed
@@ -2137,6 +2138,7 @@ check_log_duration(char *msec_str, bool was_logged)
 		int			usecs;
 		int			msecs;
 		bool		exceeded;
+		bool		in_sample;
 
 		TimestampDifference(GetCurrentStatementStartTimestamp(),
 			GetCurrentTimestamp(),
@@ -2153,7 +2155,15 @@ check_log_duration(char *msec_str, bool was_logged)
 	 (secs > log_min_duration_statement / 1000 ||
 	  secs * 1000 + msecs >= log_min_duration_statement)));
 
-		if (exceeded || log_duration)
+		/*
+		 * Do not log if log_sample_rate = 0.
+		 * Log a sample if log_sample_rate <= 1 and avoid unecessary random()
+		 * call if log_sample_rate = 1.
+		 */
+		in_sample = log_sample_rate != 0 &&
+			(log_sample_rate == 1 || random() <= log_sample_rate * MAX_RANDOM_VALUE);
+
+		if (exceeded && in_sample || log_duration)
 		{
 			snprintf(msec_str, 32, "%ld.%03d",
 	 secs * 1000 + msecs, usecs % 1000);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 859ef931e7..e89944789e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -455,6 +455,7 @@ int			log_min_messages = WARNING;
 int			client_min_messages = NOTICE;
 int			log_min_duration_statement = -1;
 int			log_temp_files = -1;
+double 		log_sample_rate = 1.0;
 int			trace_recovery_messages = LOG;
 
 int			temp_file_limit = -1;
@@ -3257,6 +3258,17 @@ static struct config_real ConfigureNamesReal[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"log_sample_rate", PGC_SUSET, LOGGING_WHEN,
+			gettext_noop("Fraction of statements to log."),
+			gettext_noop("1.0 logs all statements."),
+			NULL
+		},
		&log_sample_rate,
+		1.0, 0.0, 1.0,
+		NULL, NULL, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, 0.0, 0.0, 0.0, NULL, NULL, NULL
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9e39baf466..3f9a016003 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -447,6 +447,8 @@
 	# statements running at 

Re: Supporting tls-server-end-point as SCRAM channel binding for OpenSSL 1.0.0 and 1.0.1

2018-06-24 Thread Dave Cramer
On 29 May 2018 at 22:48, Michael Paquier  wrote:

> On Tue, May 29, 2018 at 10:33:03PM -0400, Heikki Linnakangas wrote:
> > Hmm. I think Peter went through this in commits ac3ff8b1d8 and
> 054e8c6cdb.
> > If you got that working now, I suppose we could do that, but I'm actually
> > inclined to just stick to the current, more straightforward code, and
> > require OpenSSL 1.0.2 for this feature. OpenSSL 1.0.2 has been around for
> > several years now. It's not available on all the popular platforms and
> > distributions yet, but I don't want to bend over backwards to support
> those.
>
> I think that this mainly boils down to how much Postgres JDBC wants to
> get support here as some vendors can maintain oldest versions of OpenSSL
> for a long time.  The extra code is not that much complicated by the
> way, still it is true that HEAD is cleaner with its simplicity.
>
>
I'm unclear what this has to do with JDBC ? JDBC doesn't use OpenSSL

Alvaro ?



Dave Cramer

da...@postgresintl.com
www.postgresintl.com


Re: utilities to rebuild commit logs from wal

2018-06-24 Thread Chris Travers
On Sun, Jun 24, 2018 at 5:53 AM Bruce Momjian  wrote:

> On Fri, Jun 22, 2018 at 10:49:58AM +0200, Chris Travers wrote:
> > Before we reinvent the wheel here, does anyone know of utilities to build
> > commit logs from wal segments?  Or even to just fill with commits?
> >
> > I figure it is worth asking before I write one.
>
> Uh, what are commit log?  pg_waldump output?
>

I was missing pg_xact segments and wanting to rebuild them following a
major filesystem crash.

I ended up just writing the missing segments as "all commits", because
rollbacks were very rare on these dbs.  I am still curious whether there is
an easy way, if one is missing a late segment, to generate it from WAL,
provided enough WALs are stored and the WAL level is high enough.

It's not the end of the world, but this has helped me to get data about 10
hours more recent than the latest backup.  So far, so good.

>
> --
>   Bruce Momjian  http://momjian.us
>   EnterpriseDB http://enterprisedb.com
>
> + As you are, so once was I.  As I am, so you will be. +
> +  Ancient Roman grave inscription +
>


-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: Desirability of client-side expressions in psql?

2018-06-24 Thread Pavel Stehule
2018-06-24 16:13 GMT+02:00 Fabien COELHO :

>
> Hello Pavel,
>
>>> This is a discussion without actual patch intended for pg12, to be added
>>> to CF 2018-09. The expected end result is either "returned with feedback",
>>> meaning proceed to send some big patch(es), or "rejected", meaning the
>>> project does not want this, no point in submitting something.
>>>
>> please, can you port expression evaluation from pgbench to psql? I miss a
>> patch.
>>
>
> Indeed, I have not done it yet:-)
>
> This is a pretty large series of patches, which would take a significant
> time for developing (me), reviewing (not me) and committing (with overloaded
> committers). I would not like to end in the "we do not want this feature at
> all" state *after* the time has been spent, hence this pre-patch call for
> discussion about the feature itself, before diving into coding.
>
> I take note that you are interested in actually having this feature.
>
> I am sure so psql expression evaluation has strong benefit - mainly for
>> expressions like
>>
>> \if SERVER_VERSION_NUM >= 
>>  ...
>> \endif
>>
>
> Note that this can already be done by relying on server-side expressions:
>
>  SELECT :SERVER_VERSION_NUM >= 11 AS "version_11_plus" \gset
>  \if :version_11_plus
>...
>  \endif
>
> Not very elegant, but functional. I'm not sure there is a compelling
> reason to have this feature beyond elegance.
>

not elegant, not readable, and not user friendly.

the possibility of simple expression evaluation is just practical.

Regards

Pavel


> --
> Fabien.
>


Re: Desirability of client-side expressions in psql?

2018-06-24 Thread Fabien COELHO



Hello Pavel,


This is a discussion without actual patch intended for pg12, to be added
to CF 2018-09. The expected end result is either "returned with feedback",
meaning proceed to send some big patch(es), or "rejected", meaning the
project does not want this, no point in submitting something.


please, can you port expression evaluation from pgbench to psql? I miss a
patch.


Indeed, I have not done it yet:-)

This is a pretty large series of patches, which would take a significant 
time for developing (me), reviewing (not me) and committing (with overloaded 
committers). I would not like to end in the "we do not want this feature 
at all" state *after* the time has been spent, hence this pre-patch call 
for discussion about the feature itself, before diving into coding.


I take note that you are interested in actually having this feature.


I am sure so psql expression evaluation has strong benefit - mainly for
expressions like

\if SERVER_VERSION_NUM >= 
 ...
\endif


Note that this can already be done by relying on server-side expressions:

 SELECT :SERVER_VERSION_NUM >= 11 AS "version_11_plus" \gset
 \if :version_11_plus
   ...
 \endif

Not very elegant, but functional. I'm not sure there is a compelling 
reason to have this feature beyond elegance.


--
Fabien.



Re: Invisible Indexes

2018-06-24 Thread Tom Lane
Andrew Dunstan  writes:
> A major downside to a GUC is that you have to be aware of the current 
> setting, since we're not going to have one setting for each invisible 
> index. Doing it at the SQL level you can treat each index separately. A 
> GUC will actually involve more code, I suspect.

I'd envision it being a list of index names.  We already have most
if not all of the underpinnings for such a thing, I believe, lurking
around the code for search_path, temp_tablespaces, etc.

regards, tom lane



Re: Concurrency bug in UPDATE of partition-key

2018-06-24 Thread Amit Kapila
On Sat, Jun 23, 2018 at 8:25 PM, Alvaro Herrera
 wrote:
> Would you wait a little bit before pushing this?
>

Sure.

>  I'd like to give this
> a read too.
>

That would be great.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Invisible Indexes

2018-06-24 Thread Andrew Dunstan




On 06/19/2018 02:05 PM, Robert Haas wrote:

On Mon, Jun 18, 2018 at 6:12 PM, Tom Lane  wrote:

Peter Geoghegan  writes:

On Mon, Jun 18, 2018 at 2:57 PM, Tom Lane  wrote:

I think the actually desirable way to handle this sort of thing is through
an "index advisor" sort of plugin, which can hide a given index from the
planner without any globally visible side-effects.

The globally visible side-effects are the point, though. Some users
desire cheap insurance against dropping what turns out to be the wrong
index.

Perhaps there are use-cases where you want globally visible effects,
but the primary use-case Andrew cited (i.e. EXPLAIN experimentation)
would not want that.

Anyway, if we do it with a GUC, the user can control the scope of
the effects.

Yeah, I agree that a GUC seems more powerful and easier to roll out.
A downside is that there could be cached plans still using that old
index.  If we did DDL on the index we could be sure they all got
invalidated, but otherwise how do we know?

BTW, like you, I seem to remember somebody writing an extension that
did added a GUC that did exactly this, and demoing it at a conference.
Maybe Oleg or Teodor?





A major downside to a GUC is that you have to be aware of the current 
setting, since we're not going to have one setting for each invisible 
index. Doing it at the SQL level you can treat each index separately. A 
GUC will actually involve more code, I suspect.


cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services