pg_dumpall -r -c tries to drop user postgres

2017-12-02 Thread Pavel Stehule
Hi

I am not sure that user postgres should be removed, so it is probably a bug:

pg_dumpall -r -c | grep postgres

DROP ROLE postgres;
CREATE ROLE postgres;
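
For reference, the role being dropped here is the bootstrap superuser, which
always has OID 10. A minimal check (assuming a default installation, where
that role is named postgres):

-- the bootstrap superuser, which a role dump arguably should not drop
SELECT rolname, rolsuper FROM pg_roles WHERE oid = 10;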

Regards

Pavel


Re: [HACKERS] [PATCH] make pg_rewind not copy useless WAL files

2017-12-02 Thread chenhj


At 2017-12-01 12:27:09, "Michael Paquier"  wrote:
>On Tue, Oct 3, 2017 at 1:20 AM, chenhj  wrote:
>> I had filled the authors field of this patch in commitfest, and will rebase
>> this patch if needed. Thank you for your help!
>
>The documentation of the patch needs a rebase, so I am moving it to
>next CF with "waiting on author" as status.
>
>$ git diff master --check
>src/bin/pg_rewind/pg_rewind.c:292: trailing whitespace.
>+*
>There are whitespace complains.
>-- 
>Michael


Rebased and removed the trailing whitespace.



regards
Chen Huajun

pg_rewind_wal_copy_reduce_v9.patch
Description: Binary data


Re: [HACKERS] Proposal: Local indexes for partitioned table

2017-12-02 Thread Alvaro Herrera
David Rowley wrote:

> So, then this patch is only really intended as a syntax shortcut for
> mass adding of indexes to each partition?

This patch is intended to serve as a basis on which to construct further
features, just like every other patch we apply.

> I feel like we could do better here with little extra effort. The
> DETACH index feature does not really seem required for this patch. I
> think just ensuring a matching index exists on each leaf partition, and
> creating any which don't exist before creating the index on the target
> partitioned table, seems like the correct solution. That way we can
> make the index's indisvalid flag have a valid meaning all the time.
> Some later patch can invent some way to replace a bloated index.

What you're saying is that I've written code for A+B, and you're
"interested in C (which is incompatible with B), so can we have A+C and
drop B".  But in reality, there exists an (unwritten) D that solves the
incompatibility between B and C.  I'm just saying it's essentially the
same to postpone C+D as to postpone B+D, and I already have B written;
plus that way we don't have to come up with some novel way to handle
pg_dump support.  So can we get A+B committed and discuss C+D later?

A = partitioned indexes
B = pg_dump support based on ATTACH
C = your proposed planner stuff
D = correct indisvalid setting for partitioned indexes (set to false
when a partition does not contain the index)

The patch in this thread is A+B.
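
To make A+B concrete, here is a rough SQL-level sketch (the names are made
up, and the ON ONLY / ATTACH spelling is an assumption based on the
ATTACH-based pg_dump approach described above, not necessarily the patch's
final syntax):

CREATE TABLE parted (a int, b text) PARTITION BY RANGE (a);
CREATE TABLE parted_p1 PARTITION OF parted FOR VALUES FROM (1) TO (100);

-- A: an index defined on the partitioned table, covering its partitions
-- B: pg_dump reproduces that state piecewise, by creating the partition's
--    index separately and attaching it to the parent index
CREATE INDEX parted_a_idx ON ONLY parted (a);
CREATE INDEX parted_p1_a_idx ON parted_p1 (a);
ALTER INDEX parted_a_idx ATTACH PARTITION parted_p1_a_idx;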

> Perhaps later we can invent some generic way to replace a physical
> leaf index for a given partitioned index, maybe with the same patch
> that allows us to replace an index used by a constraint, which to me
> seems like a feature we should have had years ago.

This is a hypothetical feature E which would be nice (for partitioned
indexes and for ordinary indexes too) but is not strictly necessary.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Custom compression methods

2017-12-02 Thread Alvaro Herrera
Tomas Vondra wrote:

> On 12/01/2017 08:48 PM, Alvaro Herrera wrote:

> > Maybe our dependency code needs to be extended in order to support this.
> > I think the current logic would drop the column if you were to do "DROP
> > COMPRESSION .. CASCADE", but I'm not sure we'd see that as a feature.
> > I'd rather have DROP COMPRESSION always fail instead until no columns
> > use it.  Let's hear other's opinions on this bit though.
> 
> Why should this behave differently compared to data types? Seems quite
> against POLA, if you ask me ...

OK, DROP TYPE sounds good enough precedent, so +1 on that.
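
For reference, the DROP TYPE precedent in a nutshell (a minimal example,
with made-up names):

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
CREATE TABLE person (name text, current_mood mood);

DROP TYPE mood;           -- fails while a column still depends on the type
DROP TYPE mood CASCADE;   -- drops the dependent column along with the type

So DROP COMPRESSION would presumably behave the same way: fail by default,
and only take dependent columns down with an explicit CASCADE.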

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Custom compression methods

2017-12-02 Thread Tomas Vondra
On 12/02/2017 09:38 PM, Andres Freund wrote:
> Hi,
> 
> On 2017-12-02 16:04:52 +0100, Tomas Vondra wrote:
>> Firstly, it's going to be quite hard (or perhaps impossible) to find an
>> algorithm that is "universally better" than pglz. Some algorithms do
>> work better for text documents, some for binary blobs, etc. I don't
>> think there's a win-win option.
> 
> lz4 is pretty much there.
> 

That's a matter of opinion, I guess. It's a solid compression algorithm,
that's for sure ...

>> Secondly, all the previous attempts ran into some legal issues, i.e.
>> licensing and/or patents. Maybe the situation changed since then (no
>> idea, haven't looked into that), but in the past the "pluggable"
>> approach was proposed as a way to address this.
> 
> Those were pretty bogus.

IANAL, so I don't dare to judge the bogusness of such claims. I assume
that if we made it optional (e.g. a configure/initdb option), it'd be
much less of an issue. Of course, that has disadvantages too (when you
compile/initdb with one algorithm and then find that something else
would work better for your data, you have to start from scratch).

>
> I think we're not doing our users a favor if they have to download
> some external project, then fiddle with things, just to avoid a
> compression algorithm that has been known to be bad for at least 5+
> years. If we have a decent algorithm in-core *and* then allow
> extensibility, that's one thing, but keeping the bad one and telling
> forks "please take our users with this code we give you" is ...
> 

I don't understand what exactly your issue with external projects is,
TBH. I think extensibility is one of the great strengths of Postgres.
It's not all rainbows and unicorns, of course, and it has costs too.

FWIW I don't think pglz is a "known bad" algorithm. Perhaps there are
cases where other algorithms (e.g. lz4) run circles around it,
particularly when it comes to decompression speed, but I wouldn't say
it's "known bad".

Not sure which forks you're talking about ...

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Partition pruning for Star Schema

2017-12-02 Thread legrand legrand
Sorry, I apologize.
I thought (most) hackers were not reading the General list.

Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: [HACKERS] Custom compression methods

2017-12-02 Thread Tomas Vondra

On 12/02/2017 09:24 PM, konstantin knizhnik wrote:
> 
> On Dec 2, 2017, at 6:04 PM, Tomas Vondra wrote:
> 
>> On 12/01/2017 10:52 PM, Andres Freund wrote:
>> ...
>>
>> Other algorithms (e.g. zstd) got significantly better compression (25%)
>> compared to pglz, but in exchange for longer compression. I'm sure we
>> could lower compression level to make it faster, but that will of course
>> hurt the compression ratio.
>>
>> I don't think switching to a different compression algorithm is a way
>> forward - it was proposed and explored repeatedly in the past, and every
>> time it failed for a number of reasons, most of which are still valid.
>>
>>
>> Firstly, it's going to be quite hard (or perhaps impossible) to
>> find an algorithm that is "universally better" than pglz. Some
>> algorithms do work better for text documents, some for binary
>> blobs, etc. I don't think there's a win-win option.
>>
>> Sure, there are workloads where pglz performs poorly (I've seen
>> such cases too), but IMHO that's more an argument for the custom
>> compression method approach. pglz gives you good default
>> compression in most cases, and you can change it for columns where
>> it matters, and where a different space/time trade-off makes
>> sense.
>>
>>
>> Secondly, all the previous attempts ran into some legal issues, i.e.
>> licensing and/or patents. Maybe the situation changed since then (no
>> idea, haven't looked into that), but in the past the "pluggable"
>> approach was proposed as a way to address this.
>>
>>
> 
> Maybe it will be interesting for you to see the following results
> of applying page-level compression (CFS in PgPro-EE) to pgbench
> data:
> 

I don't follow. If I understand what CFS does correctly (and I'm mostly
guessing here, because I haven't seen the code published anywhere, and I
assume it's proprietary), it essentially compresses whole 8kB blocks.

I don't know whether it first reorganizes the data in some way to make
it more "columnar" (and thus more compressible), which would make it
somewhat similar to page-level compression in Oracle.

But it's clearly a very different approach from what the patch aims to
improve (compressing individual varlena values).

> 
> All algorithms (except zlib) were used with the best-speed option:
> using a better compression level usually does not have that large an
> impact on the compression ratio (<30%), but can significantly increase
> compression time (several times). Certainly pgbench is not the best
> candidate for testing compression algorithms: it generates a lot of
> artificial and redundant data. But we also measured it on real
> customer data, and zstd still seems to be the best compression method:
> it provides good compression with the smallest CPU overhead.
> 

I think this really depends on the dataset, and drawing conclusions
based on a single test is somewhat crazy, especially when it's synthetic
pgbench data with lots of inherent redundancy (sequential IDs, ...).

My takeaway from the results is rather that page-level compression may
be very beneficial in some cases, although I wonder how much of that
could be gained by simply using a compressed filesystem (thus making it
transparent to PostgreSQL).


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Partition pruning for Star Schema

2017-12-02 Thread Andres Freund
Hi,

On 2017-12-02 13:56:51 -0700, legrand legrand wrote:
> I have a typical star schema, having dimension tables "product", "calendar"
> and "country" and a fact table "sales".
> This fact table is partitionned by time (range by month) and country
> (list).

You've posted nearly the same thing a few days ago:
http://archives.postgresql.org/message-id/1511986639005-0.post%40n3.nabble.com

Please don't duplicate threads just because you haven't received an
answer within a few days. If you'd done additional research / provided
more context *and* linked to the last time you sent something, it'd be
different.

Greetings,

Andres Freund



Re: [HACKERS] Parallel Hash take II

2017-12-02 Thread Andres Freund
On 2017-12-02 15:54:29 +1300, Thomas Munro wrote:
> On Sat, Dec 2, 2017 at 1:55 PM, Andres Freund  wrote:
> > - Right now RemovePgTempFilesInDir() will recurse into appropriately
> >   named directories, and when it recurses it doesn't require the same
> >   name pattern checks. I think that's good, but I think it'd be prudent
> >   to be a bit more paranoid and prevent recursing into symlinked
> >   subdirectories.
> 
> That's why it uses lstat(), so that it sees symlinks rather than what
> they point to. It only recurses if S_ISDIR(), and it unlinks anything
> else.

Right. I'd somehow confused myself into thinking one would need an
explicit S_ISLNK check...


> Just a reminder: a couple of problems have come up recently in the
> Parallel Hash Join patch itself, so please don't consider that one
> ready for commit quite yet.  They are: (1) Handling the case where
> there is no DSA area because we're running a parallel-aware plan in
> non-parallel mode due to lack of resources; (2) Investigating a rare
> assertion failure.  For (1), that may depend on another patch that
> I'll post shortly to kill "es_query_dsa" and, come to think of it, for
> (2) it's possible that the problem is in either one of the remaining
> patches -- SharedTuplestore or Parallel Hash Join -- so please hold
> off on committing either of those until I've got to the bottom of
> that.

I'm a bit tempted to press ahead regardless of these issues, with your
consent obviously. ISTM we're pretty close to the point where this needs
to be exposed more widely, and that'll surely bring more issues to light.

Greetings,

Andres Freund



Re: [HACKERS] Custom compression methods

2017-12-02 Thread Andres Freund
Hi,

On 2017-12-02 16:04:52 +0100, Tomas Vondra wrote:
> Firstly, it's going to be quite hard (or perhaps impossible) to find an
> algorithm that is "universally better" than pglz. Some algorithms do
> work better for text documents, some for binary blobs, etc. I don't
> think there's a win-win option.

lz4 is pretty much there.

> Secondly, all the previous attempts ran into some legal issues, i.e.
> licensing and/or patents. Maybe the situation changed since then (no
> idea, haven't looked into that), but in the past the "pluggable"
> approach was proposed as a way to address this.

Those were pretty bogus.  I think we're not doing our users a favor if
they have to download some external project, then fiddle with things,
just to avoid a compression algorithm that has been known to be bad for
at least 5+ years.  If we have a decent algorithm in-core *and* then
allow extensibility, that's one thing, but keeping the bad one and
telling forks "please take our users with this code we give you" is ...

Greetings,

Andres Freund



Re: [HACKERS] Custom compression methods

2017-12-02 Thread Tomas Vondra
On 12/01/2017 10:52 PM, Andres Freund wrote:
> On 2017-12-01 16:14:58 -0500, Robert Haas wrote:
>> Honestly, if we can give everybody a 4% space reduction by
>> switching to lz4, I think that's totally worth doing -- but let's
>> not make people choose it, let's make it the default going forward,
>> and keep pglz support around so we don't break pg_upgrade
>> compatibility (and so people can continue to choose it if for some
>> reason it works better in their use case). That kind of improvement
>> is nothing special in a specific workload, but TOAST is a pretty
>> general-purpose mechanism. I have become, through a few bitter
>> experiences, a strong believer in the value of trying to reduce our
>> on-disk footprint, and knocking 4% off the size of every TOAST
>> table in the world does not sound worthless to me -- even though
>> context-aware compression can doubtless do a lot better.
> 
> +1. It's also a lot faster, and I've seen way, way too many workloads
> with 50%+ of the time spent in pglz.
> 

TBH the 4% figure is something I mostly made up (I'm fake news!). On the
mailing list archive (which I believe is pretty compressible) I observed
something like 2.5% size reduction with lz4 compared to pglz, at least
with the compression levels I've used ...

Other algorithms (e.g. zstd) got significantly better compression (25%)
compared to pglz, but in exchange for longer compression. I'm sure we
could lower compression level to make it faster, but that will of course
hurt the compression ratio.

I don't think switching to a different compression algorithm is a way
forward - it was proposed and explored repeatedly in the past, and every
time it failed for a number of reasons, most of which are still valid.


Firstly, it's going to be quite hard (or perhaps impossible) to find an
algorithm that is "universally better" than pglz. Some algorithms do
work better for text documents, some for binary blobs, etc. I don't
think there's a win-win option.

Sure, there are workloads where pglz performs poorly (I've seen such
cases too), but IMHO that's more an argument for the custom compression
method approach. pglz gives you good default compression in most cases,
and you can change it for columns where it matters, and where a
different space/time trade-off makes sense.
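
For illustration, "change it for columns where it matters" could look
something like this at the SQL level (the SET COMPRESSION spelling here is
an assumption made for the example, not necessarily the syntax the patch
proposes):

CREATE TABLE docs (id int, body text);

-- pick a different compression method for one column only;
-- columns without an explicit setting keep the default (pglz)
ALTER TABLE docs ALTER COLUMN body SET COMPRESSION lz4;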


Secondly, all the previous attempts ran into some legal issues, i.e.
licensing and/or patents. Maybe the situation changed since then (no
idea, haven't looked into that), but in the past the "pluggable"
approach was proposed as a way to address this.


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com

2017-12-02 Thread Amit Kapila
On Wed, Nov 29, 2017 at 2:04 PM, Amit Kapila  wrote:
> On Tue, Nov 28, 2017 at 9:42 PM, Robert Haas  wrote:
>> On Tue, Nov 28, 2017 at 2:23 AM, Amit Kapila  wrote:
>>> That is wrong and I think you have hit a bug.  It should be 2974 * 5 =
>>> 14870 as you have seen in other cases.  The problem is that during
>>> rescan, we generally reinitialize the required state, but we forgot to
>>> reinitialize the instrumentation-related memory used in the
>>> accumulation of stats. Changing that would fix part of this problem,
>>> namely that you won't see wrong values at the Parallel node anymore.
>>> However, we also need to ensure that the per-worker details also get
>>> accumulated across rescans.  Attached patch should fix the problem you
>>> are seeing.  I think this needs some more analysis and testing to see
>>> if everything works in the desired way.
>>>
>>> Is it possible for you to test the attached patch and see if you are
>>> still seeing any unexpected values?
>>
>> FWIW, this looks sensible to me.  Not sure if there's any good way to
>> write a regression test for it.
>>
>
> I think so, but not 100% sure.  I will give it a try and report back.
>

The attached patch contains a regression test as well.  Note that I have
carefully disabled all variable stats by using (analyze, timing off,
summary off, costs off) and placed the parallel sequential scan on the
right side of the join, so that only nloops and rows remain as variable
stats, and those should stay constant.
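
In other words, the test boils down to an EXPLAIN invocation of roughly
this shape (the GUC settings and table names are made up for illustration,
not copied from the patch):

-- force a parallel plan, then strip the timing/cost noise so that only
-- row counts and loop counts remain in the output
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size = 0;

EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF, COSTS OFF)
SELECT count(*) FROM t1 LEFT JOIN t2 ON t1.a = t2.a;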

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_accum_instr_parallel_workers_v2.patch
Description: Binary data


Re: [HACKERS] Runtime Partition Pruning

2017-12-02 Thread Beena Emerson
Hello Robert,

On Sat, Dec 2, 2017 at 12:34 AM, Robert Haas  wrote:
> On Fri, Dec 1, 2017 at 6:20 AM, Beena Emerson  wrote:
>> David Q1:
>> postgres=#  explain analyse execute ab_q1 (3,3); --const
>>QUERY PLAN
>> ----------------------------------------------------------------------
>>  Append  (cost=0.00..43.90 rows=1 width=8) (actual time=0.006..0.006
>> rows=0 loops=1)
>>->  Seq Scan on ab_a3_b3  (cost=0.00..43.90 rows=1 width=8) (actual
>> time=0.005..0.005 rows=0 loops=1)
>>  Filter: ((a = 3) AND (b = 3))
>>  Planning time: 0.588 ms
>>  Execution time: 0.043 ms
>> (5 rows)
>
> I think the EXPLAIN ANALYZE input should show something attached to
> the Append node so that we can tell that partition pruning is in use.
> I'm not sure if that is as simple as "Run-Time Partition Pruning: Yes"
> or if we can give a few more useful details.
>

The output above is shown for a const value, i.e. optimizer pruning; I
included it just to show which partition is the correct one for the
given values, for comparison with the run-time pruning case. So, taking
your suggestion, the output for run-time pruning could look something
like the following:

postgres=#  explain analyse execute ab_q1 (3,3);
   QUERY PLAN
----------------------------------------------------------------------
 Append  (cost=0.00..395.10 rows=9 width=8) (actual time=0.119..0.119
rows=0 loops=1) (run-time partition pruning: on)
   ->  Seq Scan on ab_a1_b1  (cost=0.00..43.90 rows=1 width=8) (never executed)
 Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on ab_a1_b2  (cost=0.00..43.90 rows=1 width=8) (never executed)
 Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on ab_a1_b3  (cost=0.00..43.90 rows=1 width=8) (never executed)
 Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on ab_a2_b1  (cost=0.00..43.90 rows=1 width=8) (never executed)
 Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on ab_a2_b2  (cost=0.00..43.90 rows=1 width=8) (never executed)
 Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on ab_a2_b3  (cost=0.00..43.90 rows=1 width=8) (never executed)
 Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on ab_a3_b1  (cost=0.00..43.90 rows=1 width=8) (never executed)
 Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on ab_a3_b2  (cost=0.00..43.90 rows=1 width=8) (never executed)
 Filter: ((a = $1) AND (b = $2))
   ->  Seq Scan on ab_a3_b3  (cost=0.00..43.90 rows=1 width=8) (actual
time=0.006..0.006 rows=0 loops=1)
 Filter: ((a = $1) AND (b = $2))
 Planning time: 0.828 ms
 Execution time: 0.234 ms
(21 rows)
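
The definition of ab_q1 isn't shown in this message; a plausible setup
that would produce a plan of this shape is something like:

CREATE TABLE ab (a int NOT NULL, b int NOT NULL) PARTITION BY LIST (a);
CREATE TABLE ab_a1 PARTITION OF ab FOR VALUES IN (1) PARTITION BY LIST (b);
CREATE TABLE ab_a1_b1 PARTITION OF ab_a1 FOR VALUES IN (1);
-- ... and likewise for ab_a1_b2 through ab_a3_b3 ...
PREPARE ab_q1 (int, int) AS SELECT a, b FROM ab WHERE a = $1 AND b = $2;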


Beena Emerson

EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company