Re: [HACKERS] Custom compression methods

2017-11-27 Thread Tomas Vondra
CATED_MEMORY (I first tried without it, and it seemed working fine). If that's the case, I bet there is a palloc that should have been palloc0, or something like that. If you still can't reproduce that, I may give you access to this machine so that you can debug it there. regards -- Tomas Vondra

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2017-11-27 Thread Tomas Vondra
o call any serialization functions). I guess if we should do (5) and (6) for the MCV lists too, it seems more convenient than the current approach. And perhaps even for the statistics added to 9.6 (it does not change the storage format). regards -- Tomas Vondra http://www.2ndQu

simplehash: tb->sizemask = 0

2017-11-27 Thread Tomas Vondra
t I get from printf("%#x\n", (unsigned int)0); That would mean SH_INITIAL_BUCKET/SH_NEXT/SH_PREV can only ever return bucket 0, no? I don't think we're building hash tables with 2^32 buckets, though. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Dev

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2017-11-25 Thread Tomas Vondra
Hi, On 11/25/2017 09:23 PM, Mark Dilger wrote: > >> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.von...@2ndquadrant.com> >> wrote: >> >> Hi, >> >> Attached is an updated version of the patch, adopting the psql describe >>

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2017-11-25 Thread Tomas Vondra
On 11/26/2017 02:17 AM, Mark Dilger wrote: > >> On Nov 25, 2017, at 3:33 PM, Tomas Vondra <tomas.von...@2ndquadrant.com> >> wrote: >> >> >> >> On 11/25/2017 10:01 PM, Mark Dilger wrote: >>> >>>> On Nov 18, 2017, at 12:28 PM, Tom

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2017-11-25 Thread Tomas Vondra
Hi, On 11/25/2017 05:15 PM, Mark Dilger wrote: > >> On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.von...@2ndquadrant.com> >> wrote: >> >> Hi, >> >> Attached is an updated version of the patch, adopting the psql describe >>

Re: [HACKERS] Custom compression methods

2017-11-24 Thread Tomas Vondra
he right solution is. I just know that if we allow inserting data into arbitrary tables without recompression, we may end up with data that can't be decompressed. I agree that the behavior with extended storage is somewhat similar, but the important distinction is that while that is surpris

Re: [HACKERS] CUBE seems a bit confused about ORDER BY

2017-11-29 Thread Tomas Vondra
On 11/29/2017 06:13 AM, Michael Paquier wrote: > On Tue, Nov 21, 2017 at 7:07 AM, Alexander Korotkov > <a.korot...@postgrespro.ru> wrote: >> On Mon, Nov 20, 2017 at 1:59 PM, Alexander Korotkov >> <a.korot...@postgrespro.ru> wrote: >>> >>&g

Re: [HACKERS] Custom compression methods

2017-11-30 Thread Tomas Vondra
On 11/30/2017 04:20 PM, Ildus Kurbangaliev wrote: > On Thu, 30 Nov 2017 00:30:37 +0100 > Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: > > ... > >> I can imagine other interesting use cases - for example values in >> JSONB columns often use the same "s

Re: [HACKERS] Custom compression methods

2017-11-24 Thread Tomas Vondra
_plain)); If needed, the complete tool is here: https://bitbucket.org/tvondra/archie regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Custom compression methods

2017-11-23 Thread Tomas Vondra
Hi, On 11/23/2017 10:38 AM, Ildus Kurbangaliev wrote: > On Tue, 21 Nov 2017 18:47:49 +0100 > Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: > >>> >> >> Hmmm, it still doesn't work for me. See this: >> >> test=# create extension pg_lz

Re: [HACKERS] Custom compression methods

2017-12-01 Thread Tomas Vondra
lways fail instead until no columns > use it. Let's hear other's opinions on this bit though. > Why should this behave differently compared to data types? Seems quite against POLA, if you ask me ... If you want to remove the compression, you can do the SET NOT COMPRESSED (or whatever syn

Re: [HACKERS] Custom compression methods

2017-12-01 Thread Tomas Vondra
On 12/01/2017 08:20 PM, Robert Haas wrote: > On Fri, Dec 1, 2017 at 10:18 AM, Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: >> It has very little impact on this patch, as it has nothing to do with >> columnar storage. That is, each value is compressed independ

Re: [HACKERS] Custom compression methods

2017-12-13 Thread Tomas Vondra
On 12/13/2017 01:54 AM, Robert Haas wrote: > On Tue, Dec 12, 2017 at 5:07 PM, Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: >>> I definitely think there's a place for compression built right into >>> the data type. I'm still happy about commit >>> 1

Re: [HACKERS] Custom compression methods

2017-12-14 Thread Tomas Vondra
On 12/14/2017 04:21 PM, Robert Haas wrote: > On Wed, Dec 13, 2017 at 5:10 AM, Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: >>> 2. If several data types can benefit from a similar approach, it has >>> to be separately implemented for each one. >> &

Re: [HACKERS] Custom compression methods

2017-12-18 Thread Tomas Vondra
On 12/17/2017 04:32 AM, Robert Haas wrote: > On Thu, Dec 14, 2017 at 12:23 PM, Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: >> Can you give an example of such algorithm? Because I haven't seen such >> example, and I find arguments based on hypothetical compres

Re: [HACKERS] Custom compression methods

2017-12-13 Thread Tomas Vondra
On 12/13/2017 05:55 PM, Alvaro Herrera wrote: > Tomas Vondra wrote: > >> On 12/13/2017 01:54 AM, Robert Haas wrote: > >>> 3. Compression is only applied to large-ish values. If you are just >>> making the data type representation more compact, you p

Re: Tracking of page changes for backup purposes. PTRACK [POC]

2017-12-19 Thread Tomas Vondra
al corner cases and subtle problems. And any discussion of > change track algorithm must be aware of them. Feel free to share your > concerns and point out any shortcomings of the idea or the implementation. > Thanks for the proposal and patch! regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: WIP: BRIN multi-range indexes

2017-12-19 Thread Tomas Vondra
On 12/19/2017 08:38 PM, Mark Dilger wrote: > >> On Nov 18, 2017, at 12:45 PM, Tomas Vondra <tomas.von...@2ndquadrant.com> >> wrote: >> >> Hi, >> >> Apparently there was some minor breakage due to duplicate OIDs, so here >> is the pa

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2017-12-19 Thread Tomas Vondra
le-tap-tests --enable-depend && make -j4 && make check-world > Yeah, those steps sounds about right. Apparently this got broken by ecc27d55f4, although I don't quite understand why - but it works fine before. Can you try if it works fine on 9f4992e2a9 and fails with ecc27d55f4? re

Re: CUBE seems a bit confused about ORDER BY

2017-12-12 Thread Tomas Vondra
k them to reindex in the release notes seems OK for me. > Is there a good way to detect such cases? Either in pg_upgrade, so that we can print warnings, or at least manually (which would be suitable for release notes). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Custom compression methods

2017-12-12 Thread Tomas Vondra
On 12/12/2017 10:33 PM, Robert Haas wrote: > On Mon, Dec 11, 2017 at 2:53 PM, Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: >> But let me play the devil's advocate for a while and question the >> usefulness of this approach to compression. Some of the que

Re: [HACKERS] Custom compression methods

2017-11-20 Thread Tomas Vondra
On 11/20/2017 04:43 PM, Евгений Шишкин wrote: > > >> On Nov 20, 2017, at 18:29, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: >> >>> >>> What about instead of dropping column we leave data uncompressed? >>> >> >> That

Re: spgist rangetypes compiler warning (gcc 7.2.0)

2017-11-18 Thread Tomas Vondra
On 11/18/2017 10:40 PM, Tom Lane wrote: > Tomas Vondra <tomas.von...@2ndquadrant.com> writes: >> while compiling on gcc 7.2.0 (on ARM), I got this warning: > >> rangetypes_spgist.c: In function 'spg_range_quad_inner_consistent': >> rangetypes_spgist.c:559:2

Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

2017-11-18 Thread Tomas Vondra
ltuples > to the total number of heap entries it chose to index, because that > is what IndexBuildHeapScan counts. Maybe we should adjust that? > You mean by only counting live tuples in IndexBuildHeapRangeScan, following whatever definition we end up using in VACUUM/ANALYZE? Seems like a goo

Re: percentile value check can be slow

2017-11-18 Thread Tomas Vondra
. If the list is dynamic (computed in the query itself), you'll still get the error much later during query execution. So if you're getting many failures like this for the "delayed error reporting" to be an issue, perhaps there's something wrong in you stack and you should address that instead? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] new function for tsquery creartion

2017-11-18 Thread Tomas Vondra
0x04 #define TS_NOT_EXAC0x08 <-- the new one Perhaps that's OK, but it seems a bit inconsistent. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

spgist rangetypes compiler warning (gcc 7.2.0)

2017-11-18 Thread Tomas Vondra
dition to if (in->traversalValue) Patch attached. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >From 3b0528e9b56b9ff2dec4c3670d78ad9018224065 Mon Sep 17 00:00:00 2001 From: Tomas Vondra <to...@2ndquadrant

Re: [HACKERS] pg_basebackup --progress output for batch execution

2017-11-18 Thread Tomas Vondra
ipped so the other end of the pipe reads each line. Using this option with --progress will result in printing each progress output with a newline at the end, instead of a carrige return. like this: Runs pg_basebackup in batch mode, in which --progress terminates the lines with a regular newline instead of carriage return. This is useful if the output is redirected to a file or a pipe, instead of a plain console. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: WIP: BRIN multi-range indexes

2017-11-18 Thread Tomas Vondra
Hi, Apparently there was some minor breakage due to duplicate OIDs, so here is the patch series updated to current master. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 0001-Pass-all-keys-to-

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2017-11-18 Thread Tomas Vondra
Hi, Attached is an updated version of the patch, adopting the psql describe changes introduced by 471d55859c11b. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 0001-multivariate-MCV-lists.patc

Re: [HACKERS] Custom compression methods

2017-11-19 Thread Tomas Vondra
ssed varlena values easier. Or perhaps the VARSIZE_ANY / VARSIZE_ANY_EXHDR / VARDATA_ANY already support that? This part is not very clear to me. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services pg_lz4.tgz Description: application/compressed-tar

Re: percentile value check can be slow

2017-11-19 Thread Tomas Vondra
Hi, On 11/19/2017 03:10 AM, David Fetter wrote: > On Sat, Nov 18, 2017 at 11:05:47PM +0100, Tomas Vondra wrote: >> Hi, >> >> ... >> >> Is 'recognizer' an established definition I should know? Is it the same >> as 'validator' or is it something new/

Re: [HACKERS] CUBE seems a bit confused about ORDER BY

2017-11-19 Thread Tomas Vondra
in the proposed way and fixes KNN-GiST search for that.  I'm > going to register this patch to the nearest commitfest. > Seems fine to me, although perhaps it should be split into two parts. One with the cube_coord_llur fixes, and then g_cube_distance changes adding support for negative coo

Re: [HACKERS] GnuTLS support

2017-11-19 Thread Tomas Vondra
onfigure that only one TLS implementation is enabled? Either by some elaborate check, or by switching to something like --with-ssl=(openssl|gnutls) regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Custom compression methods

2017-11-19 Thread Tomas Vondra
st be no remaining data compressed with it. But that's what the patch is doing already - it enforces this using dependencies, as usual. Ildus, can you explain what you meant? How could the data still be decompressed after DROP COMPRESSION METHOD, and possibly after removing the .so library?

Re: [HACKERS] Custom compression methods

2017-11-20 Thread Tomas Vondra
On 11/20/2017 10:44 AM, Ildus Kurbangaliev wrote: > On Mon, 20 Nov 2017 00:23:23 +0100 > Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: > >> On 11/15/2017 02:13 PM, Robert Haas wrote: >>> On Wed, Nov 15, 2017 at 4:09 AM, Ildus Kurbangaliev >>>

Re: [HACKERS] Custom compression methods

2017-11-20 Thread Tomas Vondra
On 11/20/2017 04:21 PM, Евгений Шишкин wrote: > > >> On Nov 20, 2017, at 18:18, Tomas Vondra <tomas.von...@2ndquadrant.com >> <mailto:tomas.von...@2ndquadrant.com>> wrote: >> >> >> I don't think we need to do anything smart here - it should behav

Re: [HACKERS] Custom compression methods

2017-11-21 Thread Tomas Vondra
;. >> test=# insert into t_lz4 select repeat(md5(1::text),300);^C >> test=# select * from t_pglz ; >> ERROR: cache lookup failed for compression options 16419 >> >> That suggests no recompression happened. > > I will check that. Is your extension publ

Re: [HACKERS] Custom compression methods

2017-12-01 Thread Tomas Vondra
On 11/30/2017 09:51 PM, Alvaro Herrera wrote: > Tomas Vondra wrote: > >> On 11/30/2017 04:20 PM, Ildus Kurbangaliev wrote: > >>> CREATE COMPRESSION METHOD ts1 FOR tsvector HANDLER >>> tsvector_compression_handler; >> >> Understood. Good to know

Re: [HACKERS] Custom compression methods

2017-12-02 Thread Tomas Vondra
ondly, all the previous attempts ran into some legal issues, i.e. licensing and/or patents. Maybe the situation changed since then (no idea, haven't looked into that), but in the past the "pluggable" approach was proposed as a way to address this. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Custom compression methods

2017-12-01 Thread Tomas Vondra
On 12/01/2017 03:23 PM, Robert Haas wrote: > On Thu, Nov 30, 2017 at 2:47 PM, Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: >> OK. I think it's a nice use case (and nice gains on the compression >> ratio), demonstrating the datatype-aware compression. The ques

Re: [HACKERS] Custom compression methods

2017-12-02 Thread Tomas Vondra
On 12/02/2017 09:24 PM, konstantin knizhnik wrote: > > On Dec 2, 2017, at 6:04 PM, Tomas Vondra wrote: > >> On 12/01/2017 10:52 PM, Andres Freund wrote: >> ... >> >> Other algorithms (e.g. zstd) got significantly better compression (25%) >> compa

Re: [HACKERS] Custom compression methods

2017-12-02 Thread Tomas Vondra
On 12/02/2017 09:38 PM, Andres Freund wrote: > Hi, > > On 2017-12-02 16:04:52 +0100, Tomas Vondra wrote: >> Firstly, it's going to be quite hard (or perhaps impossible) to find an >> algorithm that is "universally better" than pglz. Some algorithms do >>

Re: Rethinking MemoryContext creation

2017-12-11 Thread Tomas Vondra
ve to set things up so that the default > behavior for AllocSetContextCreate is to not copy. This risks breaking > callers in extensions. Not entirely sure if it's worth that --- any > thoughts? > I don't think silently breaking extensions is particularly attractive option, s

Re: Rethinking MemoryContext creation

2017-12-11 Thread Tomas Vondra
On 12/11/2017 06:22 PM, Robert Haas wrote: > On Mon, Dec 11, 2017 at 11:59 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: >> Tomas Vondra <tomas.von...@2ndquadrant.com> writes: >>> On 12/11/2017 05:27 PM, Tom Lane wrote: >>>> However, unless we want t

Re: Rethinking MemoryContext creation

2017-12-11 Thread Tomas Vondra
35053 35762 So that's about 1.3% and 1.2% improvement. It seems fairly consistent, but it might easily be due to different in layout of the binaries. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] Custom compression methods

2017-12-11 Thread Tomas Vondra
if the patch should introduce some infrastructure for handling the column context (e.g. column dictionary). Right now, whoever implements the compression has to implement this bit too. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support,

Re: [HACKERS] Custom compression methods

2017-12-01 Thread Tomas Vondra
On 12/01/2017 08:38 PM, Alvaro Herrera wrote: > Tomas Vondra wrote: > >> On 11/30/2017 09:51 PM, Alvaro Herrera wrote: > >>> Just passing by, but wouldn't this fit in the ACCESS METHOD group of >>> commands? So this could be simplified down to >>>

Re: POC: GROUP BY optimization

2018-06-06 Thread Tomas Vondra
parts. One that does the index magic, and one for this reordering optimization. The first part (additional indexes) seems quite fairly safe, likely to get committable soon. The other part (ndistinct reordering) IMHO requires more thought regarding costing and interaction with other query parts. Thank you

Re: Spilling hashed SetOps and aggregates to disk

2018-06-06 Thread Tomas Vondra
and not all aggregates can do that ... (certainly not in universal way). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: POC: GROUP BY optimization

2018-06-09 Thread Tomas Vondra
On 06/09/2018 08:09 PM, Tomas Vondra wrote: > > /snip/ > > 4) when adding Sort for grouping, try producing the right output order >(if the ORDER BY was specified) > BTW I've just realized we already do something similar in master. If you run a query like this: SELECT a

Re: POC: GROUP BY optimization

2018-06-09 Thread Tomas Vondra
osting in general in that patch needs more work, and I do recall Tom pointing out that the current heuristics (estimating number of sort groups using ndistincts) seems rather weak. It may not be exactly the same problem, but it seems kinda similar to what this patch does. I have a hunch that those patches will end up inventing something fairly similar. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: POC: GROUP BY optimization

2018-06-07 Thread Tomas Vondra
/costing reliable enough to make this unnecessary. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: WAL prefetch

2018-06-16 Thread Tomas Vondra
On 06/16/2018 12:06 PM, Thomas Munro wrote: On Sat, Jun 16, 2018 at 9:38 PM, Tomas Vondra wrote: On 06/15/2018 08:01 PM, Andres Freund wrote: On 2018-06-14 10:13:44 +0300, Konstantin Knizhnik wrote: On 14.06.2018 09:52, Thomas Munro wrote: Why stop at the page cache... what about shared

Re: WAL prefetch

2018-06-16 Thread Tomas Vondra
to a prefetching implementation that doesn't read into s_b. Could you elaborate why prefetching into s_b is so much better (I'm sure it has advantages, but I suppose prefetching into page cache would be much easier to implement). regards -- Tomas Vondra http://www

Re: WAL prefetch

2018-06-16 Thread Tomas Vondra
On 06/16/2018 09:02 PM, Andres Freund wrote: > On 2018-06-16 11:38:59 +0200, Tomas Vondra wrote: >> >> >> On 06/15/2018 08:01 PM, Andres Freund wrote: >>> On 2018-06-14 10:13:44 +0300, Konstantin Knizhnik wrote: >>>> >>>> >>>> O

Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2018-06-18 Thread Tomas Vondra
umns. Even if the indexes are encrypted too, you can often deduce a lot of information from them. So what's the plan here? Disallow indexes on encrypted columns? Index encypted values directly? Something else? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Develop

Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2018-06-13 Thread Tomas Vondra
to modify the data files) is part of the threat model, of course. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2018-06-13 Thread Tomas Vondra
hink encrypting temporary data would be a big problem, assuming you know which key to use. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: POC: GROUP BY optimization

2018-06-16 Thread Tomas Vondra
that in one particular case. I still think we need to be careful when introducing new optimizations in this area - reordering the grouping keys by ndistinct, ORDER BY or whatever. In particular I don't think we should commit these patches that may quite easily cause regressions, and then hope some h

Re: Spilling hashed SetOps and aggregates to disk

2018-06-11 Thread Tomas Vondra
On 06/11/2018 07:14 PM, Andres Freund wrote: Hi, On 2018-06-11 17:29:52 +0200, Tomas Vondra wrote: It would be great to get something that performs better than just falling back to sort (and I was advocating for that), but I'm worried we might be moving the goalposts way too far. I'm

Re: Spilling hashed SetOps and aggregates to disk

2018-06-11 Thread Tomas Vondra
On 06/11/2018 08:13 PM, Jeff Davis wrote: > On Mon, 2018-06-11 at 19:33 +0200, Tomas Vondra wrote: >> For example we hit the work_mem limit after processing 10% tuples, >> switching to sort would mean spill+sort of 900GB of data. Or we >> might say - hmm, we're 10% through,

Re: WAL prefetch

2018-06-19 Thread Tomas Vondra
to prefetch directly to shared buffers, so that it also works with direct I/O in the future. But now you suggest to use posix_fadvise() to work around the synchronous buffer read limitation. I don't follow ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL

Re: WAL prefetch

2018-06-19 Thread Tomas Vondra
sure. This is why I will try to investigate it more. I'd say checkpoints already do act as such timeout (not only, but people are setting it high to get rid of FPIs). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Spilling hashed SetOps and aggregates to disk

2018-06-11 Thread Tomas Vondra
On 06/11/2018 05:16 PM, Robert Haas wrote: On Wed, Jun 6, 2018 at 8:16 PM, Tomas Vondra wrote: ... and this is pretty much what Jeff Davis suggested, I think. The trouble is, each of those cases behaves nicely/terribly in different corner cases. That's a really good point. If the number

Re: Spilling hashed SetOps and aggregates to disk

2018-06-11 Thread Tomas Vondra
more general and depends on which scheme we end up using (just hashagg, hash+sort, something else ...) regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: POC: GROUP BY optimization

2018-06-03 Thread Tomas Vondra
t from it). So to summarize this initial review - I do suggest splitting the patch into two parts. One that does the index magic, and one for this reordering optimization. The first part (additional indexes) seems quite fairly safe, likely to get committable soon. The other part (ndistinct re

Re: [PATCH] Improve geometric types

2018-06-03 Thread Tomas Vondra
On 06/03/2018 11:50 PM, Tom Lane wrote: Tomas Vondra writes: The main remaining question I have is what do do with back-branches. Shall we back-patch this or not? Given the behavioral changes involved, I'd say "no way". That's reinforced by the lack of field complaints; if there

Re: [PATCH] Improve geometric types

2018-06-03 Thread Tomas Vondra
and no one noticed/reported them, but it's still buggy and that's not fun. Opinions? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Spilling hashed SetOps and aggregates to disk

2018-06-05 Thread Tomas Vondra
On 06/05/2018 09:22 AM, David Rowley wrote: On 5 June 2018 at 17:04, Tomas Vondra wrote: On 06/05/2018 04:56 AM, David Rowley wrote: Isn't there still a problem determining when the memory exhaustion actually happens though? As far as I know, we've still little knowledge how much memory

Re: Spilling hashed SetOps and aggregates to disk

2018-06-04 Thread Tomas Vondra
was a discussion if this is actually an overhead or merely due to different binary layout. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Spilling hashed SetOps and aggregates to disk

2018-06-05 Thread Tomas Vondra
On 06/05/2018 07:46 AM, Jeff Davis wrote: On Tue, 2018-06-05 at 07:04 +0200, Tomas Vondra wrote: I expect the eviction strategy to be the primary design challenge of this patch. The other bits will be mostly determined by this one piece. Not sure I agree that this is the primary challenge

Re: Spilling hashed SetOps and aggregates to disk

2018-06-05 Thread Tomas Vondra
On 06/05/2018 02:49 PM, Andres Freund wrote: Hi, On 2018-06-05 10:05:35 +0200, Tomas Vondra wrote: My concern is more about what happens when the input tuple ordering is inherently incompatible with the eviction strategy, greatly increasing the amount of data written to disk during evictions

Re: Spilling hashed SetOps and aggregates to disk

2018-06-05 Thread Tomas Vondra
and the reasoning behind the design. Cheers Serge Salesforce cheers -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Spilling hashed SetOps and aggregates to disk

2018-06-05 Thread Tomas Vondra
their own aggregates ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Spilling hashed SetOps and aggregates to disk

2018-06-07 Thread Tomas Vondra
On 06/07/2018 02:18 AM, Andres Freund wrote: On 2018-06-06 17:17:52 -0700, Andres Freund wrote: On 2018-06-07 12:11:37 +1200, David Rowley wrote: On 7 June 2018 at 08:11, Tomas Vondra wrote: On 06/06/2018 04:11 PM, Andres Freund wrote: Consider e.g. a scheme where we'd switch from hashed

Re: Spilling hashed SetOps and aggregates to disk

2018-06-06 Thread Tomas Vondra
On 06/06/2018 04:11 PM, Andres Freund wrote: > On 2018-06-06 16:06:18 +0200, Tomas Vondra wrote: >> On 06/06/2018 04:01 PM, Andres Freund wrote: >>> Hi, >>> >>> On 2018-06-06 15:58:16 +0200, Tomas Vondra wrote: >>>> The other issue is that s

Re: POC: GROUP BY optimization

2018-06-06 Thread Tomas Vondra
t; >> >> If there's no extended statistic on the columns, you can do the >> current thing (assuming independence etc.). There's not much we can do >> here. > Agree > >> >> If there's an extended statistic, you can do either a greedy search >> (get the next column with the highest ndistinct coefficient) or >> exhaustive search (computing the estimated number of comparisons). >> >> Another challenge is that using only the ndistinct coefficient assumes >> uniform distribution of the values. But you can have a column with 1M >> distinct values, where a single value represents 99% of the rows. And >> another column with 100k distinct values, with actual uniform >> distribution. I'm pretty sure it'd be more efficient to place the 100k >> column first. > > Interesting.  Will think, thank you > You're welcome. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: POC: GROUP BY optimization

2018-06-06 Thread Tomas Vondra
On 06/06/2018 11:22 PM, Claudio Freire wrote: > On Wed, Jun 6, 2018 at 5:43 PM Tomas Vondra > wrote: >> >>>>>> For example, it seems to disregard that different data types have >>>>>> different comparison costs. For example comparing bytea will b

Re: Spilling hashed SetOps and aggregates to disk

2018-06-06 Thread Tomas Vondra
On 06/07/2018 02:11 AM, David Rowley wrote: > On 7 June 2018 at 08:11, Tomas Vondra wrote: >> On 06/06/2018 04:11 PM, Andres Freund wrote: >>> Consider e.g. a scheme where we'd switch from hashed aggregation to >>> sorted aggregation due to memory limits

Re: POC: GROUP BY optimization

2018-06-06 Thread Tomas Vondra
On 06/07/2018 12:18 AM, Claudio Freire wrote: > On Wed, Jun 6, 2018 at 6:57 PM Tomas Vondra > wrote: >> >> On 06/06/2018 11:22 PM, Claudio Freire wrote: >>> On Wed, Jun 6, 2018 at 5:43 PM Tomas Vondra >>> As such, estimating sort performance correctly in

Re: Spilling hashed SetOps and aggregates to disk

2018-06-06 Thread Tomas Vondra
On 06/06/2018 04:01 PM, Andres Freund wrote: Hi, On 2018-06-06 15:58:16 +0200, Tomas Vondra wrote: The other issue is that serialize/deserialize is only a part of a problem - you also need to know how to do "combine", and not all aggregates can do that ... (certainly not in uni

Re: ERROR: cannot start subtransactions during a parallel operation

2018-06-29 Thread Tomas Vondra
On 06/29/2018 07:22 PM, Andres Freund wrote: On 2018-06-29 16:50:47 +0200, Tomas Vondra wrote: On 06/29/2018 04:00 PM, Mai Peng wrote: Hello, On a pG10.4 instance, my query ( a simple select from a view) throw this error: ERROR:  cannot start subtransactions during a parallel operation

Re: effect of JIT tuple deform?

2018-06-27 Thread Tomas Vondra
by jit_optimization? Can you share the test case and some detail about the hardware and PostgreSQL configuration? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: WAL prefetch

2018-06-27 Thread Tomas Vondra
time until the end of next week (probably). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-06-24 Thread Tomas Vondra
ng thing to assume here, considering the extended statistics are meant for cases where the columns are not independent. So I'd argue the 0.5 estimate for partially-matching buckets is the right thing to do here, as it's minimizing the average error. regards -- Tomas Vondra h

Re: Using JSONB directly from application

2018-06-24 Thread Tomas Vondra
mat.  > But I could see a case for making the function raw_jsonb(int, jsonb) and > allowing the caller to specify what (maximum?) version of JSONB they want. > I doubt we'll introduce a new JSONB any time soon, so I wouldn't be particularly worried about this. If it eventually happens, you'll have to adapt your parser to that, I think. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: WIP: BRIN multi-range indexes

2018-06-24 Thread Tomas Vondra
On 06/24/2018 11:39 PM, Thomas Munro wrote: > On Sun, Jun 24, 2018 at 2:01 PM, Tomas Vondra > wrote: >> Attached is rebased version of this BRIN patch series, fixing mostly the >> breakage due to 372728b0 (aka initial-catalog-data format changes). As >> 2018-07 CF i

Re: WIP: BRIN multi-range indexes

2018-06-24 Thread Tomas Vondra
On 06/25/2018 12:31 AM, Tomas Vondra wrote: > On 06/24/2018 11:39 PM, Thomas Munro wrote: >> On Sun, Jun 24, 2018 at 2:01 PM, Tomas Vondra >> wrote: >>> Attached is rebased version of this BRIN patch series, fixing mostly the >>> breakage due to 372728b0 (aka in

Re: WIP: BRIN multi-range indexes

2018-06-23 Thread Tomas Vondra
bloom filter behavior). The idea is that for ranges with significantly fewer distinct values, we only store those to save space (instead of allocating the whole bloom filter with mostly 0 bits). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Supp

Re: Push down Aggregates below joins

2018-06-20 Thread Tomas Vondra
tch to allow that. > There was a patch [1] from Antonin Houska aiming to achieve something similar. IIRC it aimed to push the aggregate down in more cases, leveraging the partial aggregation stuff. I suppose your patch only aims to do the pushdown when the two-phase aggregation is not needed? reg

Re: Push down Aggregates below joins

2018-06-20 Thread Tomas Vondra
ded? [1] https://www.postgresql.org/message-id/flat/9666.1491295317@localhost regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

memory leak when serializing TRUNCATE in reorderbuffer

2018-06-20 Thread Tomas Vondra
other places in decode.c don't allocate memory directly but call ReorderBufferGetTupleBuf() instead - perhaps we should introduce a similar wrapper here too. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Ser

Re: WAL prefetch

2018-06-19 Thread Tomas Vondra
On 06/19/2018 02:33 PM, Konstantin Knizhnik wrote: On 19.06.2018 14:03, Tomas Vondra wrote: On 06/19/2018 11:08 AM, Konstantin Knizhnik wrote: ... >>> Also there are two points which makes prefetching into shared buffers more complex: 1. Need to spawn multiple workers to make

Re: WAL prefetch

2018-06-19 Thread Tomas Vondra
On 06/19/2018 04:50 PM, Konstantin Knizhnik wrote: On 19.06.2018 16:57, Ants Aasma wrote: On Tue, Jun 19, 2018 at 4:04 PM Tomas Vondra mailto:tomas.von...@2ndquadrant.com>> wrote: Right. My point is that while spawning bgworkers probably helps, I don't expect it to be

Re: [HACKERS] plpgsql - additional extra checks

2018-07-03 Thread Tomas Vondra
On 03/20/2018 01:35 PM, Tomas Vondra wrote: On 03/20/2018 05:36 AM, Pavel Stehule wrote: 2018-03-19 21:47 GMT+01:00 Tomas Vondra mailto:tomas.von...@2ndquadrant.com>>: Hi, I'm looking at the updated patch (plpgsql-extra-check-180316.patch), and this time it a

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2017-12-23 Thread Tomas Vondra
On 12/23/2017 03:03 PM, Erikjan Rijkers wrote: > On 2017-12-23 05:57, Tomas Vondra wrote: >> Hi all, >> >> Attached is a patch series that implements two features to the logical >> replication - ability to define a memory limit for the reorderbuffer >> (res

PATCH: logical_work_mem and logical streaming of large in-progress transactions

2017-12-22 Thread Tomas Vondra
o enforce the memory limit while restoring transactions spilled to disk, because we would not have the problem with restoring changes for many subtransactions. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & S

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2017-12-23 Thread Tomas Vondra
On 12/23/2017 11:23 PM, Erik Rijkers wrote: > On 2017-12-23 21:06, Tomas Vondra wrote: >> On 12/23/2017 03:03 PM, Erikjan Rijkers wrote: >>> On 2017-12-23 05:57, Tomas Vondra wrote: >>>> Hi all, >>>> >>>> Attached is a patch series that i

  1   2   3   4   5   6   >