Re: speeding up planning with partitions

2019-04-05 Thread Floris Van Nee
Thanks for the details! Indeed the versions with now()/current_date use the runtime pruning rather than planning time. I wasn't aware of the use of 'today' though - that could be useful in case we're sure statements won't be prepared. Previously (v10/ partly v11) it was necessary to make sure
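[A minimal sketch of the plan-time vs. run-time pruning distinction discussed above, with a hypothetical partitioned table; as the snippet notes, the 'today' trick only helps for statements that are not prepared:]

```sql
-- Hypothetical range-partitioned table to illustrate the point.
create table events (ts timestamptz, payload text) partition by range (ts);
create table events_2019_04 partition of events
    for values from ('2019-04-01') to ('2019-05-01');

-- 'today' is coerced to a timestamptz constant for an unprepared statement,
-- so unneeded partitions can be pruned at plan time:
explain select * from events where ts >= 'today';

-- now() / current_date are stable expressions, so pruning can only happen
-- at run time (shown as "Subplans Removed" in EXPLAIN ANALYZE):
explain analyze select * from events where ts >= current_date;
```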

Re: speeding up planning with partitions

2019-04-04 Thread Floris Van Nee
Hi all, First of all I would like to thank everyone involved in this patch for their hard work on this. This is a big step forward. I've done some performance and functionality testing with the patch that was committed to master and it looks very good. I had a question about the performance of

Re: partitioning performance tests after recent patches

2019-04-15 Thread Floris Van Nee
d partitions is still valid, but it's more difficult to determine if, on top of that, extra partitions must be included due to widening of the range. -Floris From: David Rowley Sent: Monday, April 15, 2019 1:25 AM To: Floris Van Nee Cc: Pg Hackers Subj

Re: Index Skip Scan

2019-06-05 Thread Floris Van Nee
> To address this, probably we can do something like in the attached patch. > Altogether with distinct_pathkeys uniq_distinct_pathkeys are stored, which is > the same, but without the constants elimination. It's being used then for > getting the real number of distinct keys, and to check the order

Re: Index Skip Scan

2019-06-01 Thread Floris Van Nee
Hi, Thanks for the helpful replies. > Yes, good catch, I'll investigate. Looks like in the current implementation > something is not quite right, when we have this order of columns in an index > and where clause (e.g. in the examples above everything seems fine if we > create > index over

Re: Index Skip Scan

2019-05-31 Thread Floris Van Nee
After some talks with Jesper at PGCon about the Index Skip Scan, I started testing this patch, because it seems to have great potential in speeding up many of our queries (great conference by the way, really enjoyed my first time being there!). I haven't looked in depth at the code itself, but

Re: Index Skip Scan

2019-05-31 Thread Floris Van Nee
Actually I'd like to add something to this. I think I've found a bug in the current implementation. Would someone be able to check? Given a table definition of (market text, feedcode text, updated_at timestamptz, value float8) and an index on (market, feedcode, updated_at desc) (note that this

Re: Index Skip Scan

2019-06-22 Thread Floris Van Nee
The following sql statement seems to have incorrect results - some logic in the backwards scan is currently not entirely right. -Floris drop table if exists a; create table a (a int, b int, c int); insert into a (select vs, ks, 10 from generate_series(1,5) vs, generate_series(1, 1) ks);

Re: Index Skip Scan

2019-06-22 Thread Floris Van Nee
> Thanks for testing! You're right, looks like in the current implementation in > case of backwards scan there is one unnecessary extra step forward. It seems > this mistake was made, since I was concentrating only on the backward scans > with a cursor, and used not exactly correct approach to

Re: Index Skip Scan

2019-06-23 Thread Floris Van Nee
> Try the attached patch -- it has DEBUG1 traces with the contents of > index tuples from key points during index scans, allowing you to see > what's going on both as a B-Tree is descended, and as a range scan is > performed. It also shows details of the insertion scankey that is set > up within

Re: Index Skip Scan

2019-07-10 Thread Floris Van Nee
> Here is finally a new version of the patch, where all the mentioned issues > seems to be fixed and the corresponding new tests should keep it like that > (I've skipped all the pubs at PostgresLondon for that). Thanks for the new patch! Really appreciate the work you're putting into it. I

Re: Index Skip Scan

2019-07-10 Thread Floris Van Nee
> Thanks for testing! Could you provide a test case to show what exactly is the > problem? Note that in the case of a regular non-skip scan, this cursor backwards works because the Unique node on top does not support backwards scanning at all. Therefore, when creating the cursor, the actual

Re: Index Skip Scan

2019-07-10 Thread Floris Van Nee
> Thanks for testing! Could you provide a test case to show what exactly is the > problem? create table a (a int, b int, c int); insert into a (select vs, ks, 10 from generate_series(1,5) vs, generate_series(1, 1) ks); create index on a (a,b); analyze a; set enable_indexskipscan=1; //

Re: Index Skip Scan

2019-07-11 Thread Floris Van Nee
> For the general forward direction but for a backwards cursor scroll, > we'd return the lowest value for each distinct prefix, but for the > general backwards direction (DESC case) we'd return the highest value > for each distinct prefix. Looking at IndexNext() the cursor direction > seems to be

Optimize single tuple fetch from nbtree index

2019-08-02 Thread Floris Van Nee
Hi hackers, While I was reviewing some code in another patch, I stumbled upon a possible optimization in the btree index code in nbtsearch.c for queries using 'LIMIT 1'. I have written a small patch that implements this optimization, but I'd like some feedback on the design of the patch,
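[The access pattern this optimization targets looks roughly like the following; the schema and names are illustrative, not taken from the patch:]

```sql
-- Fetch only the newest tuple for one key. The index scan needs just the
-- first matching item on the leaf page, so eagerly copying every matching
-- tuple from the page is wasted work for this query shape.
select *
from quotes
where feedcode = 'ABC'
order by feedcode, updated_at desc
limit 1;
```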

Re: Optimize single tuple fetch from nbtree index

2019-08-02 Thread Floris Van Nee
Hi Tom, Thanks for your quick reply! > Regardless of whether there's actually a LIMIT 1? That seems disastrous > for every other case than the narrow one where the optimization wins. > Because every other case is now going to have to touch the index page > twice. That's more CPU and about

Re: Optimize single tuple fetch from nbtree index

2019-08-26 Thread Floris Van Nee
> It seems that it contradicts the very idea of your patch, so probably we > should look for other ways to optimize this use-case. > Maybe this restriction can be relaxed for write only tables, that never > have to reread the page because of visibility, or something like that. > Also we probably

Re: Optimize single tuple fetch from nbtree index

2019-08-24 Thread Floris Van Nee
> Hello, > welcome to hackers with your first patch) Thank you. > Though, I got interested in the comment inconsistency you have found. > I added debug message into this code branch of the patch and was able to > see it in regression.diffs after 'make check': > Speaking of your patch, it seems

Re: Optimize single tuple fetch from nbtree index

2019-08-27 Thread Floris Van Nee
>> It seems that it contradicts the very idea of your patch, so probably we >> should look for other ways to optimize this use-case. >> Maybe this restriction can be relaxed for write only tables, that never >> have to reread the page because of visibility, or something like that. >> Also we

Re: Index Skip Scan

2019-08-28 Thread Floris Van Nee
> Sorry for long delay. Here is more or less what I had in mind. After changing > read_closest to read the whole page I couldn't resist to just merge it into > readpage itself, since it's basically the same. It could raise questions > about performance and how intrusive this patch is, but I

Re: Optimize single tuple fetch from nbtree index

2019-08-05 Thread Floris Van Nee
Hi Peter, > Actually, having looked at the test case in more detail, that now > seems less likely. The test case seems designed to reward making it > cheaper to access one specific tuple among a fairly large group of > related tuples -- reducing the number of inner scans is not going to > be

Re: Index Skip Scan

2019-08-05 Thread Floris Van Nee
Thanks for the new patch. I've reviewed the skip scan patch, but haven't taken a close look at the uniquekeys patch yet. In my previous review I mentioned that queries of the form: select distinct on(a) a,b from a where a=1; do not lead to a skip scan with the patch, even though the skip scan

Re: Index Skip Scan

2019-08-06 Thread Floris Van Nee
> Yes, the check should be for that. However, the query in question > doesn't have any query_pathkeys, and hence query_uniquekeys in > standard_qp_callback(), so therefore it isn't supported > Your scenario is covered by one of the test cases in case the > functionality is supported. But, I think

jsonb_set() strictness considered harmful to data

2019-10-20 Thread Floris Van Nee
FWIW I've been bitten by this 'feature' more than once as well, accidentally erasing a column. Now I usually write js = jsonb_set(js, coalesce(new_column, 'null'::jsonb)) to prevent erasing the whole column, and instead setting the value to a jsonb null value, but I also found the STRICT
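[Spelled out in full (the snippet above elides the path argument of jsonb_set), the defensive pattern looks like this; table, column, and key names are illustrative:]

```sql
-- jsonb_set() is STRICT: if the new value is SQL NULL, the whole result is
-- NULL, silently erasing the column on UPDATE. Coalescing to a jsonb null
-- keeps the rest of the document intact.
update t
set js = jsonb_set(js, '{some_key}', coalesce(to_jsonb(new_value), 'null'::jsonb))
where id = 1;
```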

RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

2020-03-01 Thread Floris Van Nee
> The cfbot thinks it doesn't even apply anymore --- conflict with the dedup > patch, no doubt? Minor conflict with that patch indeed. Attached is rebased version. I'm running some tests now - will post the results here when finished. -Floris v4-0001-Avoid-pipeline-stall-in-_bt_compare.patch

RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

2020-01-28 Thread Floris Van Nee
> > I could do some tests with the patch on some larger machines. What exact > tests do you propose? Are there some specific postgresql.conf settings and > pgbench initialization you recommend for this? And was the test above just > running 'pgbench -S' select-only with specific -T, -j and -c

RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

2020-01-30 Thread Floris Van Nee
> > Can you test this version, Floris? The second two patches are probably not > helping here, so it would be useful if you could just test 0001-*, and then > test > all three together. I can toss the latter two patches if there is no > additional > speedup. > Here's the results for runs

RE: Index Skip Scan

2020-02-04 Thread Floris Van Nee
> this point me and Jesper inclined to go with the second option. But maybe > I'm missing something, are there any other suggestions? Unfortunately I figured this would need a more invasive fix. I tend to agree that it'd be better to not skip in situations like this. I think it'd make most

RE: Index Skip Scan

2020-01-27 Thread Floris Van Nee
Hi Dmitry, Thanks for the new patch! I tested it and managed to find a case that causes some issues. Here's how to reproduce: drop table if exists t; create table t as select a,b,b%2 as c,10 as d from generate_series(1,5) a, generate_series(1,1000) b; create index on t (a,b,c,d); -- correct

RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

2020-01-27 Thread Floris Van Nee
> He reported that the problem went away with the patches applied. The > following pgbench SELECT-only result was sent to me privately: > > before: > single: tps = 30300.202363 (excluding connections establishing) > all cores: tps = 1026853.447047 (excluding connections

RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

2020-02-10 Thread Floris Van Nee
> > The interesting thing now is the role of the "negative infinity test" > patch (the 0003-* patch) in all of this. I suspect that it may not be helping > us > much here. I wonder, could you test the following configurations to settle > this question? > > * with 30 clients (i.e. repeat the

RE: Index Skip Scan

2020-01-15 Thread Floris Van Nee
Hi all, I reviewed the latest version of the patch. Overall some good improvements I think. Please find my feedback below. - I think I mentioned this before - it's not that big of a deal, but it just looks weird and inconsistent to me: create table t2 as (select a, b, c, 10 d from

RE: Index Skip Scan

2020-01-21 Thread Floris Van Nee
> > Could you please elaborate why does it sound like that? If I understand > correctly, to stop a scan only "between" pages one need to use only > _bt_readpage/_bt_steppage? Other than that there is no magic with scan > position in the patch, so I'm not sure if I'm missing something here.

Re: Index Skip Scan

2020-01-22 Thread Floris Van Nee
Hi Dmitry, > > On Wed, Jan 22, 2020 at 07:50:30AM +, Floris Van Nee wrote: > > > > Anyone please correct me if I'm wrong, but I think one case where the > > current patch relies on some data from the page it has locked before it in > > checking this hi

RE: Index Skip Scan

2020-01-22 Thread Floris Van Nee
> Note in particular that index scans cannot return the same index tuple twice - > - processing a page at a time ensures that that cannot happen. > > Can a loose index scan return the same tuple (i.e. a tuple with the same heap > TID) to the executor more than once? > The loose index scan

RE: Index Skip Scan

2020-04-07 Thread Floris Van Nee
> > * Suspicious performance difference between different type of workload, > mentioned by Tomas (unfortunately I had no chance yet to investigate). > His benchmark results indeed most likely point to multiple comparisons being done. Since the most likely place where these occur is

RE: Index Skip Scan

2020-04-06 Thread Floris Van Nee
> > Hm, I wasn't aware about this one, thanks for bringing this up. Btw, Floris, I > would appreciate if in the future you can make it more visible that changes > you > suggest contain some fixes. E.g. it wasn't clear for me from your previous > email > that that's the case, and it doesn't

RE: Index Skip Scan

2020-04-06 Thread Floris Van Nee
> > > On Sun, Apr 05, 2020 at 04:30:51PM +0530, Dilip Kumar wrote: > > > > > > I was just wondering how the distinct will work with the "skip scan" > > > if we have some filter? I mean every time we select the unique row > > > based on the prefix key and that might get rejected by an external > >

RE: Delaying/avoiding BTreeTupleGetNAtts() call within _bt_compare()

2020-03-08 Thread Floris Van Nee
> Attached is v5, which inlines in a targeted fashion, pretty much in the same > way as the earliest version. This is the same as v4 in every other way. > Perhaps you can test this. > Thank you for the new patch. With the new one I am indeed able to reproduce a performance increase. It is very

RE: Index Skip Scan (new UniqueKeys)

2020-07-14 Thread Floris Van Nee
> > > - the uniquekeys that is built, still contains some redundant keys, that are > normally eliminated from the path keys lists. > > I guess you're talking about: > > + if (EC_MUST_BE_REDUNDANT(ec)) > + continue; > > Can you add some test cases to your changes to show the effect of it? It

RE: Index Skip Scan (new UniqueKeys)

2020-07-10 Thread Floris Van Nee
Hi Dmitry, Also took another look at the patch now, and found a case of incorrect data. It looks related to the new way of creating the paths, as I can't recall seeing this in earlier versions. create table t1 as select a,b,b%5 as c, random() as d from generate_series(1, 10) a,

RE: Index Skip Scan (new UniqueKeys)

2020-07-23 Thread Floris Van Nee
> > One UniqueKey can have multiple corresponding expressions, which gives us > also possibility of having one unique key with (t1.a, t2.a) and it looks now > similar to EquivalenceClass. > I believe the current definition of a unique key with two expressions (t1.a, t2.a) means that it's

RE: [PATCH] Keeps tracking the uniqueness with UniqueKey

2020-07-22 Thread Floris Van Nee
Hi Andy, A small thing I found: +static List * +get_exprs_from_uniqueindex(IndexOptInfo *unique_index, + List *const_exprs, +

visibility map corruption

2021-07-04 Thread Floris Van Nee
Hi hackers, We recently ran into an issue where the visibility map of a relation was corrupt, running Postgres 12.4. The error we'd get when running a SELECT * from this table is: could not access status of transaction 3704450152 DETAIL: Could not open file "pg_xact/0DCC": No such file or

RE: visibility map corruption

2021-07-04 Thread Floris Van Nee
> On Sun, Jul 4, 2021 at 1:44 PM Floris Van Nee > wrote: > > We recently ran into an issue where the visibility map of a relation was > corrupt, running Postgres 12.4. The error we'd get when running a SELECT * > from this table is: > > > > could not access st

RE: visibility map corruption

2021-07-04 Thread Floris Van Nee
> > I wonder if it's related to this issue: > > https://www.postgresql.org/message-id/20210423234256.hwopuftipdmp3...@alap3.anarazel.de > > Have you increased autovacuum_freeze_max_age from its default? This > already sounds like the kind of database where that would make sense. >

RE: non-HOT update not looking at FSM for large tuple update

2021-03-27 Thread Floris Van Nee
Hi Noah, Thanks for taking a look at this patch. > > In evaluating whether this is a good choice of value, I think about the > expected page lifecycle. A tuple barely larger than fillfactor (roughly > len=1+BLCKSZ*fillfactor/100) will start on a roughly-empty page. As long as > the tuple

RE: non-HOT update not looking at FSM for large tuple update

2021-03-09 Thread Floris Van Nee
Hi, > > This patch fails to consider that len may be bigger than MaxHeapTupleSize * > 0.98, which in this case triggers a reproducable > PANIC: Good catch! I've adapted the patch with your suggested fix. > > One different question I have, though, is why we can't "just" teach vacuum > to clean

non-HOT update not looking at FSM for large tuple update

2021-02-24 Thread Floris Van Nee
Hi hackers, Recently we found a table that was slowly, but consistently increasing in size. The table has a low fill-factor set and was updated very frequently. As expected, almost all updates are HOT updates, but for some of the non-HOT updates it always wanted to use a new page, rather than
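[A sketch of the setup described above; the sizes assume the default 8 kB block size:]

```sql
-- With fillfactor = 10, updates aim to leave pages ~90% empty. A tuple wider
-- than BLCKSZ * fillfactor / 100 (~819 bytes here) can never fit within that
-- target, so each such non-HOT update asked the FSM for more free space than
-- any existing page advertised, and extended the relation instead of reusing
-- the nearly-empty pages.
create table t (id int primary key, payload text) with (fillfactor = 10);
```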

RE: non-HOT update not looking at FSM for large tuple update

2021-02-24 Thread Floris Van Nee
> That makes sense, although the exact number seems precisely tailored to your > use case. 2% gives 164 bytes of slack and doesn't seem too large. Updated > patch attached. Yeah, I tried picking it as conservative as I could, but 2% is obviously great too. :-) I can't think of any large

RE: non-HOT update not looking at FSM for large tuple update

2021-02-24 Thread Floris Van Nee
Hi John, > One idea is to take your -50 idea and make it more general and safe, by > scaling the fudge factor based on fillfactor, such that if fillfactor is less > than 100, the requested freespace is a bit smaller than the max. It's still a > bit of a hack, though. I've attached a draft of

RE: non-HOT update not looking at FSM for large tuple update

2021-02-24 Thread Floris Van Nee
> In this case, the vast majority has between 8050-8099 bytes free according to > the FSM. That means that, for this particular case, for a fillfactor of 10, > we’d need to subtract ~120 bytes or so in order to properly recycle the pages. Also, I think this "fudge" factor would need to be

RE: non-HOT update not looking at FSM for large tuple update

2021-03-08 Thread Floris Van Nee
> I've added this to the commitfest as a bug fix and added you as an author. Thanks. Patch looks good to me, but I guess there needs to be someone else reviewing too? Also, would this be a backpatchable bugfix? -Floris

RE: Why timestamptz_pl_interval and timestamptz_mi_interval are not immutable?

2021-08-16 Thread Floris Van Nee
> > What do I miss? > > -- > Best regards, > Alexander Pyhalov, > Postgres Professional > See for example around DST changes postgres=# begin; BEGIN postgres=# show timezone; TimeZone -- Europe/Amsterdam (1 row) postgres=# select '2021-03-27 15:00
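[The DST point started above can be completed as follows; Europe/Amsterdam switched to DST on 2021-03-28 at 02:00:]

```sql
set timezone = 'Europe/Amsterdam';
-- A 'day' is added in local (wall-clock) time, while '24 hours' is added as
-- absolute elapsed time, so the two results differ across the DST switch:
select '2021-03-27 15:00'::timestamptz + interval '1 day';    -- 2021-03-28 15:00:00+02
select '2021-03-27 15:00'::timestamptz + interval '24 hours'; -- 2021-03-28 16:00:00+02
-- Because the result depends on the session's TimeZone setting, these
-- operators can only be STABLE, not IMMUTABLE.
```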

RE: Commitfest 2021-11 Patch Triage - Part 2

2021-11-09 Thread Floris Van Nee
> Daniel Gustafsson writes: > > 2773: libpq compression > > === > > This patch intended to provide libpq connection compression to > > "replace SSL compression" which was doomed when the patch was > written, > > and have since been removed altogether. The initial approach

RE: pg_init_privs corruption.

2023-02-17 Thread Floris Van Nee
> Kirill Reshke writes: > > As you can see, after drop role there are invalid records in > > pg_init_privs system relation. After this, pg_dump generates sql > > statements, some of which are based on content of pg_init_privs, resulting > in an invalid dump. > This is as far as I can see the same