[HACKERS] SELECT FOR UPDATE regression in 9.5

2016-09-06 Thread Marti Raudsepp
Hello list

While testing an application with PostgreSQL 9.5, we experienced an issue
involving aborted subtransactions and SELECT FOR UPDATE. In certain
situations, a locking query doesn't return rows that should be visible and
already locked by the current transaction.

I bisected this down to commit 27846f02c176eebe7e08ce51ed4d52140454e196
"Optimize locking a tuple already locked by another subxact"

This issue is also reproducible on the current master branch. In an
assertion-enabled build, it trips an assertion in HeapTupleHeaderGetCmax
called by heap_lock_tuple. The following test case demonstrates the issue:

CREATE TABLE IF NOT EXISTS testcase(
id int PRIMARY KEY,
balance numeric
);
INSERT INTO testcase VALUES (1, 0);


BEGIN;
SELECT * FROM testcase WHERE testcase.id = 1 FOR UPDATE;
UPDATE testcase SET balance = balance + 400 WHERE id=1;
SAVEPOINT subxact;
UPDATE testcase SET balance = balance - 100 WHERE id=1;
ROLLBACK TO SAVEPOINT subxact;

-- "division by zero" shouldn't occur because I never deleted any rows
SELECT 1/count(*) from (
SELECT * FROM testcase WHERE id=1 FOR UPDATE
)x;
ROLLBACK;
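
For the record, on an affected build without assertions, the final
SELECT sees no rows, so instead of returning 1 the query fails with
something like:

ERROR:  division by zero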

Regards,
Marti Raudsepp


Re: [HACKERS] Allowing NOT IN to use ANTI joins

2014-06-11 Thread Marti Raudsepp
On Wed, Jun 11, 2014 at 11:53 AM, David Rowley  wrote:
>> The only way to consistently guarantee nullability is through primary
>> key constraints. Fortunately that addresses most of the use cases of
>> NOT IN(), in my experience.

>> See the comment in check_functional_grouping:

> I saw that, but I have to say I've not fully got my head around why that's
> needed just yet.

I was wrong, see Tom's reply to my email. It's OK to rely on
attnotnull for optimization decisions. The plan will be invalidated
automatically when the nullability of a referenced column changes.

check_functional_grouping needs special treatment because it decides
whether to accept or reject views; and once it has allowed creating a
view, it must guarantee that the constraint the view depends on isn't
dropped later.

> as long as the transaction id
> is taken before we start planning in session 1 then it should not matter if
> another session drops the constraint and inserts a NULL value as we won't
> see that NULL value in our transaction... I'd assume that the transaction
> has to start before it grabs the table defs that are required for planning.
> Or have I got something wrong?

1. You're assuming that query plans can only survive for the length of
a transaction. That's not true, prepared query plans can span many
transactions.

2. Also a FOR UPDATE clause can return values "from the future", if
another transaction has modified the value and already committed.

But this whole issue is moot anyway, the plan will get invalidated
when the nullability changes.
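
To illustrate what I mean (a hypothetical sketch; the table and
statement names are mine): a long-lived prepared plan gets rebuilt when
the constraint changes, so it can never rely on stale nullability
information.

CREATE TABLE t (x int NOT NULL);
PREPARE q AS SELECT * FROM t WHERE x NOT IN (SELECT x FROM t);
EXECUTE q;  -- planned while x is known NOT NULL
ALTER TABLE t ALTER COLUMN x DROP NOT NULL;
EXECUTE q;  -- the invalidation forces a replan here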

Regards,
Marti




Re: [HACKERS] Allowing NOT IN to use ANTI joins

2014-06-11 Thread Marti Raudsepp
On Sun, Jun 8, 2014 at 3:36 PM, David Rowley  wrote:
> Currently pull_up_sublinks_qual_recurse only changes the plan for NOT EXISTS
> queries and leaves NOT IN alone. The reason for this is because the values
> returned by a subquery in the IN clause could have NULLs.

There's a bug in targetListIsGuaranteedNotToHaveNulls: you have to
drill deeper into the query to prove that a result column cannot be
NULL. If a table is OUTER JOINed, its columns can return NULLs even if
the original column definition says NOT NULL.

This test case produces incorrect results with your patch:

create table a (x int not null);
create table b (x int not null, y int not null);
insert into a values(1);
select * from a where x not in (select y from a left join b using (x));

Unpatched version correctly returns 0 rows since "y" will be NULL.
Your patch returns the value 1 from a.
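
The underlying reason is plain NULL semantics: NOT IN yields NULL
rather than true as soon as the subquery produces a NULL. A one-liner
to convince yourself:

select 1 not in (select null::int);  -- NULL, so a WHERE clause rejects the row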

Regards,
Marti




Re: [HACKERS] Quantify small changes to predicate evaluation

2014-06-13 Thread Marti Raudsepp
On Fri, Jun 13, 2014 at 12:46 PM, Dennis Butterstein  wrote:
> I expect my current changes to be responsible for about 0.2-0.3s for this
> query but because of the huge time differences I am not able to quantify my
> changes.
>
> Maybe somebody can tell me about a better approach to quantify my changes or
> give me some input on how to get more stable postgres time measurements.

There can be other reasons, but for read-only benchmarks, much of this
variability seems to come from the operating system's power management
and scheduler.

I had some luck in reducing variance on Linux with some tricks.
Disable CPU frequency scaling:
for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
echo performance > $i; done

Disable the turbo boost feature if your CPU supports it:
echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Always launch PostgreSQL and pgbench on a fixed CPU core, using "taskset -c3"
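
For example (the data directory and database name here are placeholders
of mine):
taskset -c 3 pg_ctl -D /path/to/data start
taskset -c 3 pgbench -S -T 60 bench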

And exclude all other processes from this core (locking them to cores 0, 1, 2):
ps -A -o pid h |xargs -n1 taskset -cp -a 0-2 >/dev/null

Transparent hugepage support may also interfere:
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled

I'm sure there are more tricks, but this should get you pretty far.

Regards,
Marti




Re: [HACKERS] Quantify small changes to predicate evaluation

2014-06-25 Thread Marti Raudsepp
On Tue, Jun 17, 2014 at 5:23 PM, Dennis Butterstein  wrote:
> I tried the proposed tweaks and
> see some differences regarding the measurements.
> Unfortunately the variance between the runs seems to remain high.

Using these techniques I managed to get the standard deviation below
1.5% in my read-only tests (most below 1%). Not all workloads may be
able to achieve that, but you should be able to do better than your
current results.

Maybe your cooling is not sufficient? It seems your 2nd run is always
slower than the first one; maybe the CPU is doing thermal throttling.
Lowering the maximum frequency might resolve that, e.g.
"cpupower frequency-set --max 2GHz"

How do you run your benchmark, are you using pgbench? Single threaded?
Is the client locked to the same CPU core?

Regards,
Marti




Re: [HACKERS] pg_xlogdump --stats

2014-07-01 Thread Marti Raudsepp
On Wed, Jun 4, 2014 at 1:47 PM, Abhijit Menon-Sen  wrote:
> Here's a patch to make pg_xlogdump print summary statistics instead of
> individual records.

Thanks! I had a use for this feature so I backported the (first) patch
to PostgreSQL 9.3. It's a rush job so it's ugly and may have bugs, but
it works for me. Just posting here, maybe someone else will find it
useful too.

Regards,
Marti


xlogdiff_stats_93.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] pg_xlogdump --stats

2014-07-01 Thread Marti Raudsepp
On Tue, Jul 1, 2014 at 11:13 PM, Abhijit Menon-Sen  wrote:
> In CF terms, did you form any opinion while porting the patch I posted
> about whether it's sensible/ready for inclusion in 9.5?

I didn't look at the code more than necessary to make the build work.

As far as functionality goes, it does exactly what I needed it to do;
the output is very clear. I wanted to see how much WAL is being
generated from SELECT FOR UPDATE locks.

Here are some thoughts:

The "FPI" acronym made me wonder at first. Perhaps "Full page image
size" would be clearer (exactly 20 characters long, so it fits).

But on the other hand, I find the table too wide; I always need to
stretch my terminal window to fit it. I don't think we need to
accommodate 10^20 bytes ~ 87 exabytes of WAL files. :)

You might also add units (kB/MB) to the table like pg_size_pretty,
although that would make the magnitudes harder to gauge.

> P.S. In your patch, the documentation changes are duplicated.

Oh right you are, I don't know how that happened. I also missed some
record types (Btree/0x80, Btree/0xb0).

Regards,
Marti




Re: [HACKERS] delta relations in AFTER triggers

2014-07-28 Thread Marti Raudsepp
On Sat, Jul 5, 2014 at 5:38 PM, Kevin Grittner  wrote:
> it seems to me that we need the full tuple to support triggers on
> FDWs, so the TID approach would be an optimization for a subset of
> the cases, and would probably be more appropriate, if we do it at
> all, in a follow-on patch
> If you disagree with that assessment, now would be a good
> time to explain your reasoning.

Maybe I just have a limited imagination because I've never found a use
for FDWs personally. But recording changes from a trigger on an FDW
table doesn't seem that useful, since you can only capture changes
made by the local node. I expect that in many situations there are
multiple writers accessing the same underlying remote table. Certainly
it can't guarantee the consistency of materialized views.

> I took a look at whether I could avoid making OLD and NEW
> non-reserved keywords, but I didn't see how to do that without
> making FOR at least partially reserved.  If someone sees a way to
> do this without creating three new unreserved keywords
> (REFERENCING, OLD, and NEW) I'm all ears.

Sorry, I know I am very late to make this point, so feel free to ignore this.

I'm not a fan of the SQL standard syntax for this feature. One nice
thing about PostgreSQL's triggers is that you can declare the trigger
function once and re-use it on many tables. It would make more sense
if the same function declaration could say what variable/relation
names it wants to use. They're more like function argument names, not
some metadata about a table-function relationship.

Putting these in the CREATE TRIGGER command means you have to repeat
them for each table you want to apply the trigger to. It introduces
the possibility of making more mistakes without any gain in
flexibility.

But then again, I understand that there's value in supporting standard syntax.

Regards,
Marti




Re: [HACKERS] delta relations in AFTER triggers

2014-07-28 Thread Marti Raudsepp
On Mon, Jul 28, 2014 at 6:24 PM, Kevin Grittner  wrote:
> Do you have some other suggestion?  Keep in mind that it must allow
> the code which will *generate* the transition tables to know
> whether any of the attached triggers use a given transition table
> for the specific operation, regardless of the language of the
> trigger function.

You will need to access the pg_proc record of the trigger function
anyway, so it's just a matter of coming up with syntax that makes
sense, right?

What I had in mind was that we could re-use function argument
declaration syntax. For instance, use the "argmode" specifier to
declare OLD and NEW. Shouldn't cause grammar conflicts because the
current OUT and INOUT aren't reserved keywords.

We could also re-use the refcursor type, which already has bindings in
some PLs, if that's not too much overhead. That would make the
behavior straightforward without introducing new constructs, plus you
can pass them around between functions. Though admittedly it's
annoying to integrate cursor results into queries.

Something like:

CREATE FUNCTION trig(OLD old_rows refcursor, NEW new_rows refcursor)
RETURNS trigger LANGUAGE plpgsql AS '...';

Or maybe if the grammar allows, we could spell out "NEW TABLE", "OLD
TABLE", but that's redundant since you can already deduce that from
the refcursor type.

It could also be extended for different types, like tid[], and maybe
"record" for the FOR EACH ROW variant (dunno if that can be made to
work).
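
To sketch how a function body could consume these under my proposal
(the argument declaration is entirely hypothetical syntax; the FETCH
loop itself is ordinary PL/pgSQL):

CREATE FUNCTION trig(OLD old_rows refcursor, NEW new_rows refcursor)
RETURNS trigger LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    LOOP
        FETCH old_rows INTO r;  -- plpgsql can already FETCH from refcursor variables
        EXIT WHEN NOT FOUND;
        RAISE NOTICE 'transition row: %', r;
    END LOOP;
    RETURN NULL;
END;
$$;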

Regards,
Marti




Re: [HACKERS] delta relations in AFTER triggers

2014-07-29 Thread Marti Raudsepp
On Tue, Jul 29, 2014 at 9:49 AM, Pavel Stehule  wrote:
> I dislike this proposal - it is strongly inconsistent with current trigger
> design

The real point I was trying to convey (in my previous email) is that
these declarations should be part of the trigger *function* not the
function-to-table relationship. CREATE TRIGGER shouldn't be in the
business of declaring new local variables for the trigger function.
Whether we define new syntax for that or re-use the argument list is
secondary.

But the inconsistency is deliberate, I find the current trigger API
horrible. Magic variables... Text-only TG_ARGV for arguments...
RETURNS trigger... No way to invoke trigger functions directly for
testing.

By not imitating past mistakes, maybe we can eventually arrive at a
language that makes sense.

Regards,
Marti




Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-07-31 Thread Marti Raudsepp
On Thu, Jul 31, 2014 at 9:41 PM, Robert Haas  wrote:
> I certainly like that better than poor-man; but proxy, to me, fails to
> convey inexactness.

Maybe "abbreviated", "abridged", "minified"?

Regards,
Marti




Re: [HACKERS] PostrgeSQL vs oracle doing 1 million sqrts am I doing it wrong?

2014-08-05 Thread Marti Raudsepp
On Mon, Aug 4, 2014 at 11:48 PM, testman1316  wrote:
> In both we ran code that did 1 million square roots (from 1 to 1 mill). Then
> did the same but within an If..Then statement.

> Note: once we started running queries on the exact same data in Oracle and
> PostgreSQL we saw a similar pattern. On basic queries little difference, but
> as they started to get more and more complex, Oracle was around 3-5x faster.

From the test cases you posted, it looks like you're not actually
benchmarking any *queries*; you're comparing the speed of the
procedural languages. And yes, PL/pgSQL is known to be a fairly slow
language.

If you want fair benchmark results, you should instead concentrate on
what databases are supposed to do: storing and retrieving data, and
finding efficient ways to execute complicated SQL queries. In most
setups that's where the majority of database processor time is spent,
not procedural code.

Regards,
Marti




Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-12 Thread Marti Raudsepp
On Fri, Aug 8, 2014 at 10:50 PM, Hannu Krosing  wrote:
> How hard and how expensive would it be to teach pg_lzcompress to
> apply a delta filter on suitable data ?
>
> So that instead of integers their deltas will be fed to the "real"
> compressor

Has anyone given this more thought? I know this might not be 9.4
material, but to me it sounds like the most promising approach, if
it's workable. This isn't a made up thing, the 7z and LZMA formats
also have an optional delta filter.

Of course with JSONB the problem is figuring out which parts to apply
the delta filter to, and which parts not.
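
The transform itself is trivial, for what it's worth; here's a minimal
sketch of a 4-byte-wide delta encoder (my own illustration, nothing to
do with the actual pg_lzcompress internals):

static void
delta_encode32(int32 *data, int n)
{
    int     i;

    /* walk backwards so each element still sees the original previous value */
    for (i = n - 1; i > 0; i--)
        data[i] -= data[i - 1];
}

The decoder is the same loop run forwards with += instead, and the
encoder's output is what would be fed to the "real" compressor.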

This would also help with integer arrays, containing for example
foreign key values to a serial column. There's bound to be some
redundancy, as nearby serial values are likely to end up close
together. In one of my past projects we used to store large arrays of
integer fkeys, deliberately sorted for duplicate elimination.

For an ideal case comparison, intar2 could compress to roughly the same
size as intar1 if a 4-byte wide delta filter were applied first:

create table intar1 as select array(select 1::int from
generate_series(1,1000000)) a;
create table intar2 as select array(select generate_series(1,1000000)::int) a;

In PostgreSQL 9.3 the sizes are:
select pg_column_size(a) from intar1;
  45810
select pg_column_size(a) from intar2;
  4000024

So a factor of 87 difference.

Regards,
Marti




Re: [HACKERS] pg_shmem_allocations view

2014-08-15 Thread Marti Raudsepp
Hi

On Thu, May 8, 2014 at 4:28 PM, Andres Freund  wrote:
> Ok. A new version of the patches implementing that are
> attached. Including a couple of small fixups and docs. The latter aren't
> extensive, but that doesn't seem to be warranted anyway.

Is it actually useful to expose the segment off(set) to users?
It seems to me like unnecessary internal details leaking out.

Regards,
Marti




Re: [HACKERS] [PATCH] pg_upgrade fails when postgres/template1 isn't in default tablespace

2015-07-20 Thread Marti Raudsepp
On Mon, Jun 22, 2015 at 9:20 PM, Robert Haas  wrote:
> On Fri, Jun 19, 2015 at 12:10 PM, Marti Raudsepp  wrote:
>> One of my databases failed to upgrade successfully and produced this error
>> in the copying phase:
>>
>> error while copying relation "pg_catalog.pg_largeobject"
>> ("/srv/ssd/PG_9.3_201306121/1/12023" to "/PG_9.4_201409291/1/12130"): No
>> such file or directory

> I haven't looked at the patch, but this seems like a good thing to
> fix.  Hopefully Bruce will take a look soon.

Ping? This has gone a month without a reply.

Should I add this patch to a commitfest? (I was under the impression
that bugfixes don't normally go through the commitfest process)

Regards,
Marti Raudsepp




Re: [HACKERS] [PATCH] pg_upgrade fails when postgres/template1 isn't in default tablespace

2015-07-22 Thread Marti Raudsepp
On Wed, Jul 22, 2015 at 5:00 PM, Robert Haas  wrote:
> +1.  I would recommend adding it to the CF *immediately* to have it
> get attention.   The CF app is basically our patch tracker.

Thanks, I have done so now: https://commitfest.postgresql.org/6/313/

Regards,
Marti




Re: [HACKERS] Capturing EXPLAIN ANALYZE for a query without discarding the normal result set

2014-01-13 Thread Marti Raudsepp
On Tue, Jan 14, 2014 at 5:06 AM, Dave Cole  wrote:
> It would be really cool if you could direct the EXPLAIN ANALYZE output to a
> temporary table so that the query being analyzed could execute normally.

You can use the auto_explain contrib module to log the query plans of
slow(er) queries:
http://www.postgresql.org/docs/current/static/auto-explain.html

If you Really Need To, you can use the csvlog log format and import
that to a table, but really it's easier to use less/grep/etc.
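
For example, something like this in postgresql.conf enables it (see the
auto_explain documentation for the full parameter list):

shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '500ms'  # log plans of statements slower than this
auto_explain.log_analyze = on            # include actual timings; adds some overhead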

Regards,
Marti




Re: [HACKERS] Capturing EXPLAIN ANALYZE for a query without discarding the normal result set

2014-01-14 Thread Marti Raudsepp
On Tue, Jan 14, 2014 at 5:13 AM, Marti Raudsepp  wrote:
> You can use the auto_explain contrib module

I just remembered that there's also the pg_stat_plans extension, which
is closer to what you asked:
https://github.com/2ndQuadrant/pg_stat_plans . This one you'll have to
build yourself (it's not in contrib).

Regards,
Marti




Re: [HACKERS] plpgsql.consistent_into

2014-01-14 Thread Marti Raudsepp
I've always hated INTO in procedures since it makes the code harder to
follow and has very different behavior on the SQL level, in addition
to the multi-row problem you bring up. If we can make assignment
syntax more versatile and eventually replace INTO, then that solves
multiple problems in the language without breaking backwards
compatibility.

On Tue, Jan 14, 2014 at 4:30 AM, Marko Tiikkaja  wrote:
> On 2014-01-14 02:54, Marti Raudsepp wrote:
>> But PL/pgSQL already has an assignment syntax with the behavior you want:
>
> According to the docs, that doesn't set FOUND which would make this a pain
> to deal with..

Right you are. If we can extend the syntax then we could make it such
that "= SELECT" sets FOUND and other diagnostics, and a simple
assignment doesn't. Which makes sense IMO:

a = 10; -- simple assignments really shouldn't affect FOUND

With explicit SELECT, clearly the intent is to perform a query:
  a = SELECT foo FROM table;
And this could also work:
  a = INSERT INTO table (foo) VALUES (10) RETURNING foo_id;

AFAICT the fact that this works is more of an accident and should be
discouraged. We can leave it as is for compatibility's sake:
  a = foo FROM table;

Now, another question is whether it's possible to make the syntax
work. Is this an assignment from the result of a subquery, or is it a
query by itself?
  a = (SELECT foo FROM table);

Does anyone consider this proposal workable?

Regards,
Marti




Re: [HACKERS] Inheritance and indexes

2014-01-14 Thread Marti Raudsepp
On Tue, Jan 14, 2014 at 12:07 PM, knizhnik  wrote:
> But is it possible to use index for derived table at all?

Yes, the planner will do an index scan when it makes sense.

> Why sequential search is used for derived table in the example below:

> insert into derived_table values (2,2);
> create index derived_index on derived_table(x);
> explain select * from base_table where x>=0;

With only 1 row in the table, the planner decides there's no point in
scanning the index. Try with more realistic data.
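
For example, something like this (row count picked arbitrarily) should
flip the derived_table scan to an index scan:

insert into derived_table select i, i from generate_series(1, 100000) i;
analyze derived_table;
explain select * from base_table where x >= 99990;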

Regards,
Marti




Re: [HACKERS] PoC: Partial sort

2014-01-14 Thread Marti Raudsepp
On Tue, Jan 14, 2014 at 5:49 PM, Alexander Korotkov 
wrote:
>> I implemented a new
>> enable_partialsort GUC to make it easier to turn on/off

> I though about such option. Generally not because of testing convenience,
> but because of overhead of planning. This way you implement it is quite
> naive :) For instance, merge join rely on partial sort which will be
> replaced with simple sort.

Oh, this actually highlights a performance regression with the partial sort
patch. I assumed the planner would discard the full sort because of its
higher cost, but it looks like the new code always assumes that a Partial
sort will be cheaper than a Join Filter, without comparing costs. When doing
a join USING (unique_indexed_value, something), the new plan is significantly
worse.

Unpatched:
marti=# explain analyze select * from release a join release b using (id,
name);
 Merge Join  (cost=0.85..179810.75 rows=12 width=158) (actual
time=0.011..1279.596 rows=1232610 loops=1)
   Merge Cond: (a.id = b.id)
   Join Filter: ((a.name)::text = (b.name)::text)
   ->  Index Scan using release_id_idx on release a  (cost=0.43..79120.04
rows=1232610 width=92) (actual time=0.005..211.928 rows=1232610 loops=1)
   ->  Index Scan using release_id_idx on release b  (cost=0.43..79120.04
rows=1232610 width=92) (actual time=0.004..371.592 rows=1232610 loops=1)
 Total runtime: 1309.049 ms

Patched:
 Merge Join  (cost=0.98..179810.87 rows=12 width=158) (actual
time=0.037..5034.158 rows=1232610 loops=1)
   Merge Cond: ((a.id = b.id) AND ((a.name)::text = (b.name)::text))
   ->  Partial sort  (cost=0.49..82201.56 rows=1232610 width=92) (actual
time=0.013..955.938 rows=1232610 loops=1)
 Sort Key: a.id, a.name
 Presorted Key: a.id
 Sort Method: quicksort  Memory: 25kB
 ->  Index Scan using release_id_idx on release a
 (cost=0.43..79120.04 rows=1232610 width=92) (actual time=0.007..449.332
rows=1232610 loops=1)
   ->  Materialize  (cost=0.49..85283.09 rows=1232610 width=92) (actual
time=0.019..1352.377 rows=1232610 loops=1)
 ->  Partial sort  (cost=0.49..82201.56 rows=1232610 width=92)
(actual time=0.018..1223.251 rows=1232610 loops=1)
   Sort Key: b.id, b.name
   Presorted Key: b.id
   Sort Method: quicksort  Memory: 25kB
   ->  Index Scan using release_id_idx on release b
 (cost=0.43..79120.04 rows=1232610 width=92) (actual time=0.004..597.258
rows=1232610 loops=1)
 Total runtime: 5166.906 ms


There's another "wishlist" kind of thing with top-N heapsort bounds: if I
do a query with LIMIT 1000, then every sort batch has Tuplesortstate.bound
set to 1000, but it could be reduced after each batch. If the first batch
produces 900 rows, the 2nd batch only needs the top 100 rows at most.

Also, I find the name "partial sort" a bit confusing; this feature doesn't
actually sort *partially*, it finishes the sort of partially-sorted
data. Perhaps "batched sort" would explain the feature better, because it
does the sort in multiple batches instead of all at once. But maybe that's
just me.

Regards,
Marti


Re: [HACKERS] PoC: Partial sort

2014-01-14 Thread Marti Raudsepp
On Tue, Jan 14, 2014 at 9:28 PM, Alexander Korotkov wrote:

> On Tue, Jan 14, 2014 at 11:16 PM, Marti Raudsepp  wrote:
>
>> Oh, this actually highlights a performance regression with the partial
>> sort patch.
>>
>
> Interesting. Could you share the dataset?
>

It occurs with many datasets if work_mem is sufficiently low (10MB in my
case). Here's a quicker way to reproduce a similar issue:

create table foo as select i, i as j from generate_series(1,1000000) i;
create index on foo(i);
explain analyze select * from foo a join foo b using (i, j);

The real data is from the "release" table from MusicBrainz database dump:
https://musicbrainz.org/doc/MusicBrainz_Database/Download . It's nontrivial
to set up though, so if you still need the real data, I can upload a pgdump
for you.

Regards,
Marti


Re: [HACKERS] plpgsql.consistent_into

2014-01-17 Thread Marti Raudsepp
On Wed, Jan 15, 2014 at 8:23 AM, Jim Nasby  wrote:
> Do we actually support = right now? We already support
>
> v_field := field FROM table ... ;
>
> and I think it's a bad idea to have different meaning for = and :=.

That was already discussed before. Yes, we support both = and := and
they have exactly the same behavior; I agree, we should keep them
equivalent.

Regards,
Marti




Re: [HACKERS] PoC: Partial sort

2014-01-18 Thread Marti Raudsepp
Hi,

There's another small regression with this patch when used with expensive
comparison functions, such as long text fields.

If we go through all this trouble in cmpSortSkipCols to prove that the
first N sortkeys are equal, it would be nice if Tuplesort could skip their
comparisons entirely; that's another nice optimization this patch can
provide.

I've implemented that in the attached patch, which applies on top of your
partial-sort-5.patch

Should the "Sort Key" field in EXPLAIN output be changed as well? I'd say
no, I think that makes the partial sort steps harder to read.

Generate test data:
create table longtext as select (select repeat('a', 1000*100)) a,
generate_series(1,1000) i;
create index on longtext(a);

Unpatched (using your original partial-sort-5.patch):
=# explain analyze select * from longtext order by a, i limit 10;
 Limit  (cost=2.34..19.26 rows=10 width=1160) (actual time=13477.739..13477.756
rows=10 loops=1)
   ->  Partial sort  (cost=2.34..1694.15 rows=1000 width=1160) (actual time=
13477.737..13477.742 rows=10 loops=1)
 Sort Key: a, i
 Presorted Key: a
 Sort Method: top-N heapsort  Memory: 45kB
 ->  Index Scan using longtext_a_idx on longtext  (cost=0.65..1691.65
rows=1000 width=1160) (actual time=0.015..2.364 rows=1000 loops=1)
 Total runtime: 13478.158 ms
(7 rows)

=# set enable_indexscan=off;
=# explain analyze select * from longtext order by a, i limit 10;
 Limit  (cost=198.61..198.63 rows=10 width=1160) (actual
time=6970.439..6970.458 rows=10 loops=1)
   ->  Sort  (cost=198.61..201.11 rows=1000 width=1160) (actual
time=6970.438..6970.444 rows=10 loops=1)
 Sort Key: a, i
 Sort Method: top-N heapsort  Memory: 45kB
 ->  Seq Scan on longtext  (cost=0.00..177.00 rows=1000 width=1160)
(actual time=0.007..1.763 rows=1000 loops=1)
 Total runtime: 6970.491 ms

Patched:
=# explain analyze select * from longtext order by a, i ;
 Partial sort  (cost=2.34..1694.15 rows=1000 width=1160) (actual
time=0.024..4.603 rows=1000 loops=1)
   Sort Key: a, i
   Presorted Key: a
   Sort Method: quicksort  Memory: 27kB
   ->  Index Scan using longtext_a_idx on longtext  (cost=0.65..1691.65
rows=1000 width=1160) (actual time=0.013..2.094 rows=1000 loops=1)
 Total runtime: 5.418 ms

Regards,
Marti
From fbc6c31528018bca64dc54f65e1cd292f8de482a Mon Sep 17 00:00:00 2001
From: Marti Raudsepp 
Date: Sat, 18 Jan 2014 19:16:15 +0200
Subject: [PATCH] Batched sort: skip comparisons for known-equal columns

---
 src/backend/executor/nodeSort.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index cf1f79e..5abda1d 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -125,10 +125,10 @@ ExecSort(SortState *node)
 	{
 		tuplesortstate = tuplesort_begin_heap(tupDesc,
 			  plannode->numCols - skipCols,
-			  &(plannode->sortColIdx)[skipCols],
-			  plannode->sortOperators,
-			  plannode->collations,
-			  plannode->nullsFirst,
+			  &(plannode->sortColIdx[skipCols]),
+			  &(plannode->sortOperators[skipCols]),
+			  &(plannode->collations[skipCols]),
+			  &(plannode->nullsFirst[skipCols]),
 			  work_mem,
 			  node->randomAccess);
 		if (node->bounded)
-- 
1.8.5.3




Re: [HACKERS] PoC: Partial sort

2014-01-18 Thread Marti Raudsepp
Funny, I just wrote a patch to do that some minutes ago (didn't see your
email yet).

http://www.postgresql.org/message-id/CABRT9RCK=wmFUYZdqU_+MOFW5PDevLxJmZ5B=etjjnubvya...@mail.gmail.com

Regards,
Marti



On Sat, Jan 18, 2014 at 7:10 PM, Jeremy Harris  wrote:

> On 13/01/14 18:01, Alexander Korotkov wrote:
>
>> Thanks. It's included into attached version of patch. As wall as
>> estimation
>> improvements, more comments and regression tests fix.
>>
>
> Would it be possible to totally separate the two sets of sort-keys,
> only giving the non-index set to the tuplesort?  At present tuplesort
> will, when it has a group to sort, make wasted compares on the
> indexed set of keys before starting on the non-indexed set.
> --
> Cheers,
>   Jeremy
>
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] PoC: Partial sort

2014-01-18 Thread Marti Raudsepp
On Sat, Jan 18, 2014 at 7:22 PM, Marti Raudsepp  wrote:
> Total runtime: 5.418 ms

Oops, shouldn't have rushed this. Clearly the timings should have
tipped me off that it's broken. I didn't notice that cmpSortSkipCols
was re-using tuplesort's sortkeys.

Here's a patch that actually works; I added a new skipKeys attribute
to SortState. I had to extract the SortSupport-creation code from
tuplesort_begin_heap to a new function; but that's fine, because it
was already duplicated in ExecInitMergeAppend too.

I reverted the addition of tuplesort_get_sortkeys, which is not needed now.

Now the timings are:
Unpatched partial sort: 13478.158 ms
Full sort: 6802.063 ms
Patched partial sort: 6618.962 ms

Regards,
Marti
From 7d9f34c09e7836504725ff11be7e63a2fc438ae9 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp 
Date: Mon, 13 Jan 2014 20:38:45 +0200
Subject: [PATCH] Partial sort: skip comparisons for known-equal columns

---
 src/backend/executor/nodeMergeAppend.c | 18 +-
 src/backend/executor/nodeSort.c| 26 +-
 src/backend/utils/sort/sortsupport.c   | 29 +
 src/backend/utils/sort/tuplesort.c | 31 +--
 src/include/nodes/execnodes.h  |  1 +
 src/include/utils/sortsupport.h|  3 +++
 src/include/utils/tuplesort.h  |  2 --
 7 files changed, 60 insertions(+), 50 deletions(-)

diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 74fa40d..db6ec23 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -126,19 +126,11 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
 	 * initialize sort-key information
 	 */
 	mergestate->ms_nkeys = node->numCols;
-	mergestate->ms_sortkeys = palloc0(sizeof(SortSupportData) * node->numCols);
-
-	for (i = 0; i < node->numCols; i++)
-	{
-		SortSupport sortKey = mergestate->ms_sortkeys + i;
-
-		sortKey->ssup_cxt = CurrentMemoryContext;
-		sortKey->ssup_collation = node->collations[i];
-		sortKey->ssup_nulls_first = node->nullsFirst[i];
-		sortKey->ssup_attno = node->sortColIdx[i];
-
-		PrepareSortSupportFromOrderingOp(node->sortOperators[i], sortKey);
-	}
+	mergestate->ms_sortkeys = MakeSortSupportKeys(mergestate->ms_nkeys,
+  node->sortColIdx,
+  node->sortOperators,
+  node->collations,
+  node->nullsFirst);
 
 	/*
 	 * initialize to show we have not run the subplans yet
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 55cdb05..7645645 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -28,20 +28,19 @@ static bool
 cmpSortSkipCols(SortState *node, TupleDesc tupDesc, HeapTuple a, TupleTableSlot *b)
 {
 	int n = ((Sort *)node->ss.ps.plan)->skipCols, i;
-	SortSupport sortKeys = tuplesort_get_sortkeys(node->tuplesortstate);
 
 	for (i = 0; i < n; i++)
 	{
 		Datum datumA, datumB;
 		bool isnullA, isnullB;
-		AttrNumber attno = sortKeys[i].ssup_attno;
+		AttrNumber attno = node->skipKeys[i].ssup_attno;
 
 		datumA = heap_getattr(a, attno, tupDesc, &isnullA);
 		datumB = slot_getattr(b, attno, &isnullB);
 
 		if (ApplySortComparator(datumA, isnullA,
-  datumB, isnullB,
-  &sortKeys[i]))
+datumB, isnullB,
+&node->skipKeys[i]))
 			return false;
 	}
 	return true;
@@ -123,12 +122,21 @@ ExecSort(SortState *node)
 		tuplesort_reset((Tuplesortstate *) node->tuplesortstate);
 	else
 	{
+		/* Support structures for cmpSortSkipCols - already sorted columns */
+		if (skipCols)
+			node->skipKeys = MakeSortSupportKeys(skipCols,
+ plannode->sortColIdx,
+ plannode->sortOperators,
+ plannode->collations,
+ plannode->nullsFirst);
+
+		/* Only pass on remaining columns that are unsorted */
 		tuplesortstate = tuplesort_begin_heap(tupDesc,
-			  plannode->numCols,
-			  plannode->sortColIdx,
-			  plannode->sortOperators,
-			  plannode->collations,
-			  plannode->nullsFirst,
+			  plannode->numCols - skipCols,
+			  &(plannode->sortColIdx[skipCols]),
+			  &(plannode->sortOperators[skipCols]),
+			  &(plannode->collations[skipCols]),
+			  &(plannode->nullsFirst[skipCols]),
 			  work_mem,
 			  node->randomAccess);
 		if (node->bounded)
diff --git a/src/backend/utils/sort/sortsupport.c b/src/backend/utils/sort/sortsupport.c
index 347f448..df82f5f 100644
--- a/src/backend/utils/sort/sortsupport.c
+++ b/src/backend/utils/sort/sortsupport.c
@@ -85,6 +85,35 @@ PrepareSortSupportComparisonShim(Oid cmpFunc, SortSupport ssup)
 }
 
 /*

[HACKERS] [patch] Potential relcache leak in get_object_address_attribute

2014-01-18 Thread Marti Raudsepp
Hi list,

It looks like the get_object_address_attribute() function has a
potential relcache leak: when missing_ok=true and the relation is found
but the attribute is not, the relation isn't closed, nor is it
returned to the caller.
I couldn't figure out any ways to trigger this, but it's best to fix anyway.

Regards,
Marti
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index f8fd4f8..e22aa66 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -1024,6 +1024,7 @@ get_object_address_attribute(ObjectType objtype, List *objname,
 		address.classId = RelationRelationId;
 		address.objectId = InvalidOid;
 		address.objectSubId = InvalidAttrNumber;
+		relation_close(relation, lockmode);
 		return address;
 	}
 



Re: [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-18 Thread Marti Raudsepp
On Wed, Jan 15, 2014 at 5:34 AM, Jim Nasby  wrote:
> it's very common to create temporary file data that will never, ever, ever
> actually NEED to hit disk. Where I work being able to tell the kernel to
> avoid flushing those files unless the kernel thinks it's got better things
> to do with that memory would be EXTREMELY valuable

Windows has the FILE_ATTRIBUTE_TEMPORARY flag for this purpose.

ISTR that there was discussion about implementing something analogous
in Linux when ext4 got delayed allocation support, but I don't think
it got anywhere and I can't find the discussion now. I think the
proposed interface was to create and then unlink the file immediately,
which serves as a hint that the application doesn't care about
persistence.
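
In other words, roughly this pattern (a sketch of mine, not a real
kernel interface):

#include <fcntl.h>
#include <unistd.h>

int
make_scratch_file(void)
{
    int     fd = open("scratch.tmp", O_CREAT | O_EXCL | O_RDWR, 0600);

    if (fd >= 0)
        unlink("scratch.tmp");  /* data stays reachable only through fd */
    return fd;                  /* blocks could be dropped instead of written back */
}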

Postgres is far from being the only application that wants this; many
people resort to tmpfs because of this:
https://lwn.net/Articles/499410/

Regards,
Marti




Re: [HACKERS] Add value to error message when size extends

2014-01-19 Thread Marti Raudsepp
On Sun, Jan 19, 2014 at 8:10 PM, Tom Lane  wrote:
> Complaining about a too-long varchar string in this style
> seems practically guaranteed to violate that.

Agreed. But I think it would be useful to add the length of the string
being inserted; we already have it in the len variable.

> One thing that occurs to me just now is that perhaps we could report
> the failure as if it were a syntax error

That would be cool, if it can be made to work.

Regards,
Marti




Re: [HACKERS] improve the help message about psql -F

2014-01-19 Thread Marti Raudsepp
2014/1/17 Jov 
> but in the psql --help,-F say:
>
>> set field separator (default: "|")

> if the user doesn't read the official docs carefully, he can use:
>
>> psql -F , -c 'select ...'
>
> but won't get what he wants.
> It is a bad user experience.

+1 from me, patch applies and is helpful.

After patching, this line in psql --help is 82 characters long; I think
it's best to keep help screens below 80 characters wide (although
there's already 1 other line violating this rule).

I think the word "set" is pretty useless there anyway; maybe remove it
so the message becomes "field separator for unaligned output
(default: "|")"
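
For the record, the separator does work once unaligned mode is actually
enabled:

$ psql -A -F , -c 'select 1 as a, 2 as b'
a,b
1,2
(1 row)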

PS: There isn't an open CommitFest to add this to. Shouldn't we always
open a new CF when the last one goes into progress? If there's no date
yet, it could simply be called "next".

Regards,
Marti




Re: [HACKERS] better atomics - v0.2

2014-01-19 Thread Marti Raudsepp
On Tue, Nov 19, 2013 at 6:38 PM, Andres Freund  wrote:
> The attached patches compile and make check successfully on linux x86,
> amd64 and msvc x86 and amd64. This time and updated configure is
> included. The majority of operations still rely on CAS emulation.

Note that the atomics-generic-acc.h file has ® characters in the
Latin-1 encoding ("Intel ® Itanium ®"). If you have to use non-ASCII
characters, I think you should make sure they're UTF-8.

Regards,
Marti




Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-20 Thread Marti Raudsepp
On Mon, Jan 20, 2014 at 1:51 AM, Dave Chinner  wrote:
>> Postgres is far from being the only application that wants this; many
>> people resort to tmpfs because of this:
>> https://lwn.net/Articles/499410/
>
> Yes, we covered the possibility of using tmpfs much earlier in the
> thread, and came to the conclusion that temp files can be larger
> than memory so tmpfs isn't the solution here. :)

What I meant is: lots of applications want this behavior. If Linux
filesystems had support for delaying writeback for temporary files,
then there would be no point in mounting tmpfs on /tmp at all and we'd
get the best of both worlds.

Right now people resort to tmpfs because of this missing feature. And
then have their machines end up in swap hell if they overuse it.

Regards,
Marti




Re: [HACKERS] PoC: Partial sort

2014-01-20 Thread Marti Raudsepp
Hi,

On Tue, Jan 14, 2014 at 5:49 PM, Alexander Korotkov
 wrote:
>On Tue, Jan 14, 2014 at 12:54 AM, Marti Raudsepp  wrote:
>> I've been trying it out in a few situations. I implemented a new
>> enable_partialsort GUC to make it easier to turn on/off, this way it's a lot
>> easier to test. The attached patch applies on top of partial-sort-5.patch
>
> I though about such option. Generally not because of testing convenience,
> but because of overhead of planning. This way you implement it is quite
> naive :)

I don't understand. I had another look at this and cost_sort still
seems like the best place to implement this, since that's where the
patch decides how many pre-sorted columns to use. Both mergejoin and
simple order-by plans call into it. If enable_partialsort=false then I
skip all pre-sorted options except full sort, making cost_sort behave
pretty much like it did before the patch.

I could change pathkeys_common to return 0, but that seems like a
generic function that shouldn't be tied to partialsort. The old code
paths called pathkeys_contained_in anyway, which has similar
complexity. (Apart for initial_cost_mergejoin, but that doesn't seem
special enough to make an exception for).

Or should I use something like:
  enable_partialsort ? pathkeys_common(...) : 0

> For instance, merge join rely on partial sort which will be
> replaced with simple sort.

Are you saying that enable_partialsort=off should keep
partialsort-based mergejoins enabled?

Or are you saying that merge joins shouldn't use "simple sort" at all?
But merge join was already able to use full Sort nodes before your
patch.

What am I missing?

Regards,
Marti




Re: [HACKERS] improve the help message about psql -F

2014-01-21 Thread Marti Raudsepp
On Mon, Jan 20, 2014 at 2:04 PM, Jov  wrote:
> reasonable,I removed the "set",patch attached.

Hi Jov,

A new commitfest was just opened, due on 2014-06. Please add your patch here:
https://commitfest.postgresql.org/action/commitfest_view?id=22

(You'll need a community account if you don't already have one)

Sometimes simple fixes like yours are merged outside a CommitFest, but
adding it there makes sure it won't get lost.

Regards,
Marti




Re: [HACKERS] Proposal: variant of regclass

2014-01-22 Thread Marti Raudsepp
On Tue, Jan 14, 2014 at 9:28 AM, Yugo Nagata  wrote:
> Here is the patch to implement to_regclass, to_regproc, to_regoper,
> and to_regtype.

+ static Datum regclass_guts(char *class_name_or_oid, bool raiseError);

Minor bikeshedding: a lot of code currently uses an argument named
"missing_ok" for this purpose (with the inverse meaning, of course). Any
reason you chose "raiseError" instead?

I only had a brief look at the patch, so maybe I'm missing something.
But I don't think you should create 3 variants of these functions:
* parseTypeString calls parseTypeString_guts with false
* parseTypeStringMissingOk calls parseTypeString_guts with true
* parseTypeString_guts

And this is just silly:

if (raiseError)
    parseTypeString(typ_name_or_oid, &result, &typmod);
else
    parseTypeStringMissingOk(typ_name_or_oid, &result, &typmod);

Just add an argument to parseTypeString and patch all the callers.

> if requested object is not found,
> returns InvalidOid, rather than raises an error.

I thought the consensus was that returning NULL is better than
InvalidOid? From an earlier message:

On Thu, Dec 5, 2013 at 5:25 PM, Robert Haas  wrote:
> Another advantage of this approach is that, IIUC, type input functions
> can't return a NULL value.  So 'pg_klass'::regclass could return 0,
> but not NULL.  On the other hand, toregclass('pg_klass') *could*
> return NULL, which seems conceptually cleaner.
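
In other words, with the NULL-returning variant one could write, say:

SELECT to_regclass('pg_klass');  -- NULL instead of an error
SELECT to_regclass('pg_class');  -- pg_class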

Regards,
Marti




Re: [HACKERS] Proposal: variant of regclass

2014-01-22 Thread Marti Raudsepp
On Wed, Jan 22, 2014 at 1:44 PM, Yugo Nagata  wrote:
> On Wed, 22 Jan 2014 20:04:12 +0900 (JST)
> Tatsuo Ishii  wrote:
> parseTypeString() is called by some other functions and I avoided
> influences of modifying the definition on them, since this should
> raise errors in most cases. This is same reason for other *MissingOk
> functions in parse_type.c.
>
> Is it better to write definitions of these function and all there callers?

Yes, for parseTypeString certainly. There have been many refactorings
like that in the past and all of them use this pattern.

typenameTypeIdAndMod is less clear since the code paths differ so
much, maybe keep 2 versions (merging back to 1 function is OK too, but
in any case you don't need 3).

typenameTypeIdAndModMissingOk(...)
{
    Type tup = LookupTypeName(pstate, typeName, typmod_p);

    if (tup == NULL || !((Form_pg_type) GETSTRUCT(tup))->typisdefined)
        *typeid_p = InvalidOid;
    else
        *typeid_p = HeapTupleGetOid(tup);

    if (tup)
        ReleaseSysCache(tup);
}

typenameTypeIdAndMod(...)
{
    Type tup = typenameType(pstate, typeName, typmod_p);

    *typeid_p = HeapTupleGetOid(tup);
    ReleaseSysCache(tup);
}



Also, there's no need for "else" here:
if (raiseError)
    ereport(ERROR, ...);
else
    return InvalidOid;

Regards,
Marti




Re: [HACKERS] PoC: Partial sort

2014-01-26 Thread Marti Raudsepp
On Mon, Jan 20, 2014 at 2:43 PM, Alexander Korotkov
 wrote:
> Another changes in this version of patch:
> 1) Applied patch to don't compare skipCols in tuplesort by Marti Raudsepp
> 2) Adjusting sort bound after processing buckets.

Hi,

Here's a patch with some whitespace and coding style fixes for
partial-sort-6.patch

I tried to understand the mergejoin regression, but this code still
looks like Chinese to me. Can anyone else have a look at it?

Test case: 
http://www.postgresql.org/message-id/cabrt9rdd-p2rlrdhsmq8rcob46k4a5o+bgz_up2brgeeh4r...@mail.gmail.com
Original report:
http://www.postgresql.org/message-id/CABRT9RCLLUyJ=bkeB132aVA_mVNx5==lvvvqmvuqdgufztw...@mail.gmail.com

Regards,
Marti
From a3cedb922c5a12e43ee94b9d6f5a2aefba701708 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp 
Date: Sun, 26 Jan 2014 16:25:45 +0200
Subject: [PATCH 1/2] Whitespace & coding style fixes

---
 src/backend/executor/nodeSort.c | 17 +
 src/backend/optimizer/path/costsize.c   |  8 
 src/backend/optimizer/path/pathkeys.c   | 18 +-
 src/backend/optimizer/plan/createplan.c |  2 +-
 src/backend/optimizer/plan/planner.c|  6 +++---
 src/backend/utils/sort/tuplesort.c  |  2 +-
 6 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index f38190d..2e50497 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -27,13 +27,14 @@
 static bool
 cmpSortSkipCols(SortState *node, TupleDesc tupDesc, HeapTuple a, TupleTableSlot *b)
 {
-	int n = ((Sort *)node->ss.ps.plan)->skipCols, i;
+	int			i,
+n = ((Sort *)node->ss.ps.plan)->skipCols;
 
 	for (i = 0; i < n; i++)
 	{
-		Datum datumA, datumB;
-		bool isnullA, isnullB;
-		AttrNumber attno = node->skipKeys[i].ssup_attno;
+		Datum		datumA, datumB;
+		bool		isnullA, isnullB;
+		AttrNumber	attno = node->skipKeys[i].ssup_attno;
 
 		datumA = heap_getattr(a, attno, tupDesc, &isnullA);
 		datumB = slot_getattr(b, attno, &isnullB);
@@ -147,7 +148,7 @@ ExecSort(SortState *node)
 		tuplesort_set_bound(tuplesortstate, node->bound - node->bound_Done);
 
 	/*
-	 * Put next group of tuples where skipCols" sort values are equal to
+	 * Put next group of tuples where skipCols' sort values are equal to
 	 * tuplesort.
 	 */
 	for (;;)
@@ -177,10 +178,10 @@ ExecSort(SortState *node)
 			}
 			else
 			{
-bool cmp;
-cmp = cmpSortSkipCols(node, tupDesc, node->prev, slot);
+bool		equal;
+equal = cmpSortSkipCols(node, tupDesc, node->prev, slot);
 node->prev = ExecCopySlotTuple(slot);
-if (!cmp)
+if (!equal)
 	break;
 			}
 		}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 9e79c6d..3a18632 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1331,13 +1331,13 @@ cost_sort(Path *path, PlannerInfo *root,
 	 */
 	if (presorted_keys > 0)
 	{
-		List *groupExprs = NIL;
-		ListCell *l;
-		int i = 0;
+		List	   *groupExprs = NIL;
+		ListCell   *l;
+		int			i = 0;
 
 		foreach(l, pathkeys)
 		{
-			PathKey *key = (PathKey *)lfirst(l);
+			PathKey	   *key = (PathKey *) lfirst(l);
 			EquivalenceMember *member = (EquivalenceMember *)
 lfirst(list_head(key->pk_eclass->ec_members));
 
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 55d8ef4..1e1a09a 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -319,10 +319,9 @@ compare_pathkeys(List *keys1, List *keys2)
 int
 pathkeys_common(List *keys1, List *keys2)
 {
-	int n;
+	int 		n = 0;
 	ListCell   *key1,
 			   *key2;
-	n = 0;
 
 	forboth(key1, keys1, key2, keys2)
 	{
@@ -460,7 +459,7 @@ get_cheapest_fractional_path_for_pathkeys(List *paths,
 	num_groups = (double *)palloc(sizeof(double) * list_length(pathkeys));
 	foreach(l, pathkeys)
 	{
-		PathKey *key = (PathKey *)lfirst(l);
+		PathKey *key = (PathKey *) lfirst(l);
 		EquivalenceMember *member = (EquivalenceMember *)
 			lfirst(list_head(key->pk_eclass->ec_members));
 
@@ -1085,7 +1084,6 @@ find_mergeclauses_for_pathkeys(PlannerInfo *root,
 	List	   *mergeclauses = NIL;
 	ListCell   *i;
 	bool	   *used = (bool *)palloc0(sizeof(bool) * list_length(restrictinfos));
-	int			k;
 	List	   *unusedRestrictinfos = NIL;
 	List	   *usedPathkeys = NIL;
 
@@ -1103,6 +1101,7 @@ find_mergeclauses_for_pathkeys(PlannerInfo *root,
 		EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
 		List	   *matched_restrictinfos = NIL;
 		ListCell   *j;
+		int			k = 0;
 
 		/*--
 		 * A mergejoin clause matches a pathkey if it has the same EC.
@@ -1140,7 +1139,6 @@ find_mergeclauses_for_pathkeys(PlannerInfo *root,
 		 * deal with the case in create_mergejoin_plan().
 		 *--
 		 */
-		k = 0;
 		foreach(j, restrictinfos)
 		{
 			Restric

Re: [HACKERS] plpgsql.warn_shadow

2014-01-27 Thread Marti Raudsepp
On Sun, Jan 26, 2014 at 11:19 AM, Simon Riggs  wrote:
> For 9.4, we should cut down the patch so it has
>   plpgsql.warnings = none (default) | all | [individual item list]

>   plpgsql.warnings_as_errors = off (default) | on

I hope I'm not late for the bikeshedding :)

Why not have 2 similar options:
plpgsql.warnings = none (default) | all | [individual item list]
plpgsql.errors = none (default) | all | [individual item list]

That would be cleaner, more flexible, and one less option to
set if you just want errors and no warnings.
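
To make it concrete, hypothetically (values invented to match the
option shapes above):

plpgsql.warnings = 'all'
plpgsql.errors = 'shadow'   # only shadowed variables become hard errors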

Regards,
Marti




Re: [HACKERS] PoC: Partial sort

2014-01-27 Thread Marti Raudsepp
On Mon, Jan 27, 2014 at 9:26 PM, Alexander Korotkov
 wrote:
> For now, I have attempt to fix extra columns in mergejoin problem. It would
> be nice if you test it.

Yes, it solves the test cases I was trying with, thanks.

> 1) With enable_partialsort = off all mergejoin logic should behave as
> without partial sort patch.
> 2) With partial sort patch get_cheapest_fractional_path_for_pathkeys
> function is much more expensive to execute. With enable_partialsort = off it
> should be as cheap as without partial sort patch.

When it comes to planning time, I really don't think you should
bother. The planner enable_* settings are meant for troubleshooting,
debugging and learning about the planner; you shouldn't expect people
to disable them in a production setting. It's not worth complicating
the code for that rare case.

This is stated in the documentation
(http://www.postgresql.org/docs/current/static/runtime-config-query.html)
and repeatedly on the mailing lists.

But some benchmarks of planning performance are certainly warranted.

Regards,
Marti




Re: [HACKERS] PoC: Partial sort

2014-01-28 Thread Marti Raudsepp
On Tue, Jan 28, 2014 at 7:51 AM, Alexander Korotkov
 wrote:
> I didn't test it, but I worry that overhead might be high.
> If it's true then it could be like the constraint_exclusion option which is off
> by default because of planning overhead.

I see, that makes sense.

I will try to find the time to run some benchmarks in the coming few days.

Regards,
Marti




Re: [HACKERS] PoC: Partial sort

2014-02-05 Thread Marti Raudsepp
On Tue, Jan 28, 2014 at 7:51 AM, Alexander Korotkov wrote:

> On Tue, Jan 28, 2014 at 7:41 AM, Marti Raudsepp  wrote:
>
>> But some benchmarks of planning performance are certainly warranted.
>>
>
> I didn't test it, but I worry that overhead might be high.
> If it's true then it could be like the constraint_exclusion option which is
> off by default because of planning overhead.
>

Sorry I didn't get around to this before.

I ran some synthetic benchmarks with single-column inner joins between 5
tables, with indexes on both joined columns, using only EXPLAIN (so
measuring planning time, not execution) in 9 scenarios to exercise
different code paths. According to these measurements, the overhead ranges
between 1.0% and 4.5% depending on the scenario.


Merge join with partial sort children seems like a fairly obscure use case
(though I'm sure it can help a lot in those cases). The default should
definitely allow partial sort in normal ORDER BY queries. What's in
question here is whether to enable partial sort for merge joins.

So I see 3 possible resolutions:
1. The overhead is deemed acceptable to enable by default, in which case
we're done here.
2. Add a three-value runtime setting like: enable_partialsort = [ off |
no_mergejoin | on ], defaulting to no_mergejoin (just to get the point
across, clearly we need better naming). This is how constraint_exclusion
works.
3. Remove the partialsort mergejoin code entirely, keeping the rest of the
cases.
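
To illustrate option 2, usage might look like this (the GUC values here
are hypothetical placeholders, per the naming caveat above):

SET enable_partialsort = off;           -- planner behaves as without the patch
SET enable_partialsort = no_mergejoin;  -- proposed default: partial sort for ORDER BY only
SET enable_partialsort = on;            -- also consider partial sort inside mergejoins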

What do you think?


All the tests are available here:
https://github.com/intgr/benchjunk/tree/master/partial_sort (using script
run2.sh)

Overhead by test (partial-sort-7.patch.gz):
join5.sql 2.9% (all joins on the same column)
star5.sql 1.7% ("star schema" kind of join)
line5.sql 1.9% (joins chained to each other)
lim_join5.sql 4.5% (same as above, with LIMIT 1)
lim_star5.sql 2.8%
lim_line5.sql 1.8%
limord_join5.sql 4.3% (same as above, with ORDER BY & LIMIT 1)
limord_star5.sql 3.9%
limord_line5.sql 1.0%

Full data:
PostgreSQL @ git ac8bc3b
join5.sql tps = 499.490173 (excluding connections establishing)
join5.sql tps = 503.756335 (excluding connections establishing)
join5.sql tps = 504.814072 (excluding connections establishing)
star5.sql tps = 492.799230 (excluding connections establishing)
star5.sql tps = 492.570615 (excluding connections establishing)
star5.sql tps = 491.949985 (excluding connections establishing)
line5.sql tps = 773.945050 (excluding connections establishing)
line5.sql tps = 773.858068 (excluding connections establishing)
line5.sql tps = 774.551240 (excluding connections establishing)
lim_join5.sql tps = 392.539745 (excluding connections establishing)
lim_join5.sql tps = 391.867549 (excluding connections establishing)
lim_join5.sql tps = 393.361655 (excluding connections establishing)
lim_star5.sql tps = 418.431804 (excluding connections establishing)
lim_star5.sql tps = 419.258985 (excluding connections establishing)
lim_star5.sql tps = 419.434697 (excluding connections establishing)
lim_line5.sql tps = 713.852506 (excluding connections establishing)
lim_line5.sql tps = 713.636694 (excluding connections establishing)
lim_line5.sql tps = 712.971719 (excluding connections establishing)
limord_join5.sql tps = 381.068465 (excluding connections establishing)
limord_join5.sql tps = 380.379359 (excluding connections establishing)
limord_join5.sql tps = 381.182385 (excluding connections establishing)
limord_star5.sql tps = 412.997935 (excluding connections establishing)
limord_star5.sql tps = 411.401352 (excluding connections establishing)
limord_star5.sql tps = 413.209784 (excluding connections establishing)
limord_line5.sql tps = 688.906406 (excluding connections establishing)
limord_line5.sql tps = 689.445483 (excluding connections establishing)
limord_line5.sql tps = 688.758042 (excluding connections establishing)

partial-sort-7.patch.gz
join5.sql tps = 479.508034 (excluding connections establishing)
join5.sql tps = 488.263674 (excluding connections establishing)
join5.sql tps = 490.127433 (excluding connections establishing)
star5.sql tps = 482.106063 (excluding connections establishing)
star5.sql tps = 484.179687 (excluding connections establishing)
star5.sql tps = 483.027372 (excluding connections establishing)
line5.sql tps = 758.092993 (excluding connections establishing)
line5.sql tps = 759.697814 (excluding connections establishing)
line5.sql tps = 759.792792 (excluding connections establishing)
lim_join5.sql tps = 375.517211 (excluding connections establishing)
lim_join5.sql tps = 375.539109 (excluding connections establishing)
lim_join5.sql tps = 375.841645 (excluding connections establishing)
lim_star5.sql tps = 407.683110 (excluding connections establishing)
lim_star5.sql tps = 407.414409 (excluding connections establishing)
lim_star5.sql tps = 407.526613 (excluding connections establishing)
lim_line5.sql tps = 699.905101 (excluding connections est

Re: [HACKERS] PoC: Partial sort

2014-02-06 Thread Marti Raudsepp
On Thu, Feb 6, 2014 at 5:31 AM, Robert Haas  wrote:
> Hmm, sounds a little steep.  Why is it so expensive?  I'm probably
> missing something here, because I would have thought that planner
> support for partial sorts would consist mostly of considering the same
> sorts we consider today, but with the costs reduced by the batching.

I guess it's because the patch undoes some optimizations in the
mergejoin planner wrt caching merge clauses and adds a whole lot of
code to find_mergeclauses_for_pathkeys. In other code paths the
overhead does seem to be negligible.

Notice the removal of:
/* Select the right mergeclauses, if we didn't already */
/*
 * Avoid rebuilding clause list if we already made one;
 * saves memory in big join trees...
 */

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Fwd: [HACKERS] Proposal: variant of regclass

2014-02-06 Thread Marti Raudsepp
On Tue, Jan 28, 2014 at 10:38 AM, Yugo Nagata  wrote:
> I revised the patch. Could you please review this?

I didn't test the patch due to the duplicate OID compilation error.
But a few things stuck out from the diffs:
* You added some unnecessary spaces at the beginning of the line in
OpernameGetCandidates.
* regclass_guts and regtype_guts can be simplified further by moving
the ereport() code into regclassin, thereby saving the "if
(missing_ok)"
* I would rephrase the documentation paragraph as:

to_regclass, to_regproc, to_regoper and to_regtype are functions
similar to the regclass, regproc, regoper and regtype casts, except
that they return InvalidOid (0) when the object is not found, instead
of raising an error.

On Wed, Jan 22, 2014 at 1:04 PM, Tatsuo Ishii  wrote:
>> I thought the consensus was that returning NULL is better than
>> InvalidOid? From an earlier message:

> Not sure. There's already at least one counter example:
>
> pg_my_temp_schema() oid OID of session's temporary schema, or 0 if 
> none

And there are pg_relation_filenode, pg_stat_get_backend_dbid,
pg_stat_get_backend_userid which return NULL::oid. In general I don't
like magic values like 0 in SQL. For example, this query would give
unexpected results because InvalidOid has some other meaning:

select * from pg_aggregate where aggfinalfn=to_regproc('typo');
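
To spell the pitfall out: aggregates that have no final function store
InvalidOid (0) in aggfinalfn, so the query above would match all of them,
while a NULL return matches nothing -- which is the sane outcome:

select count(*) from pg_aggregate where aggfinalfn = 0;         -- every aggregate without a finalfn
select count(*) from pg_aggregate where aggfinalfn = null::oid; -- 0, the comparison is never true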

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Partial sort

2014-02-06 Thread Marti Raudsepp
On Thu, Feb 6, 2014 at 9:15 PM, Robert Haas  wrote:
> It may be that having the capability to do a
> partial sort makes it seem worth spending more CPU looking for merge
> joins, but I'd vote for making any such change a separate patch.

Agreed.

Alexander, should I work on splitting up the patch in two, or do you
want to do it yourself?

Should I merge my coding style and enable_partialsort patches while at
it, or do you still have reservations about those?

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Partial sort

2014-02-10 Thread Marti Raudsepp
On Sun, Feb 9, 2014 at 7:37 PM, Alexander Korotkov  wrote:
> This is not only place that worry me about planning overhead. See
> get_cheapest_fractional_path_for_pathkeys. I had to estimate number of
> groups for each sorting column in order to get right fractional path.

AFAICT this only happens once per plan and the overhead is O(n) to the
number of pathkeys? I can't get worried about that, but I guess it's
better to test anyway.

PS: You didn't answer my questions about splitting the patch. I guess
I'll have to do that anyway to run the tests.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Partial sort

2014-02-12 Thread Marti Raudsepp
On Mon, Feb 10, 2014 at 8:59 PM, Alexander Korotkov
 wrote:
> Done. Patch is splitted.

Thanks!

I think the 1st patch now has a bug in initial_cost_mergejoin; you
still pass the "presorted_keys" argument to cost_sort, making it
calculate a partial sort cost, but generated plans never use partial
sort. I think 0 should be passed instead. Patch attached, needs to be
applied on top of partial-sort-basic-1 and then reverse-applied on
partial-sort-merge-1.

With partial-sort-basic-1 and this fix on the same test suite, the
planner overhead is now a more manageable 0.5% to 1.3%; one test is
faster by 0.5%. Built with asserts disabled, ran on Intel i5-3570K. In
an effort to reduce variance, I locked the server and pgbench to a
single CPU core (taskset -c 3), but there are still noticeable
run-to-run differences, so these numbers are a bit fuzzy. The faster
result is definitely not a fluke, however; it happens every time.

> On Mon, Feb 10, 2014 at 2:33 PM, Marti Raudsepp  wrote:
>> AFAICT this only happens once per plan and the overhead is O(n) to the
>> number of pathkeys?

I was of course wrong about that, it also adds extra overhead when
iterating over the paths list.


Test "taskset -c 3 run2.sh" from
https://github.com/intgr/benchjunk/tree/master/partial_sort

Overhead percentages (between best of each 3 runs):
join5.sql 0.7
star5.sql 0.8
line5.sql 0.5
lim_join5.sql -0.5
lim_star5.sql 1.3
lim_line5.sql 0.5
limord_join5.sql 0.6
limord_star5.sql 0.5
limord_line5.sql 0.7

Raw results:
git 48870dd
join5.sql tps = 509.328070 (excluding connections establishing)
join5.sql tps = 509.772190 (excluding connections establishing)
join5.sql tps = 510.651517 (excluding connections establishing)
star5.sql tps = 499.208698 (excluding connections establishing)
star5.sql tps = 498.200314 (excluding connections establishing)
star5.sql tps = 496.269315 (excluding connections establishing)
line5.sql tps = 797.968831 (excluding connections establishing)
line5.sql tps = 797.011690 (excluding connections establishing)
line5.sql tps = 796.379258 (excluding connections establishing)
lim_join5.sql tps = 394.946024 (excluding connections establishing)
lim_join5.sql tps = 395.417689 (excluding connections establishing)
lim_join5.sql tps = 395.482958 (excluding connections establishing)
lim_star5.sql tps = 423.434393 (excluding connections establishing)
lim_star5.sql tps = 423.774305 (excluding connections establishing)
lim_star5.sql tps = 424.386099 (excluding connections establishing)
lim_line5.sql tps = 733.007330 (excluding connections establishing)
lim_line5.sql tps = 731.794731 (excluding connections establishing)
lim_line5.sql tps = 732.356280 (excluding connections establishing)
limord_join5.sql tps = 385.317921 (excluding connections establishing)
limord_join5.sql tps = 385.915870 (excluding connections establishing)
limord_join5.sql tps = 384.747848 (excluding connections establishing)
limord_star5.sql tps = 417.992615 (excluding connections establishing)
limord_star5.sql tps = 416.944685 (excluding connections establishing)
limord_star5.sql tps = 418.262647 (excluding connections establishing)
limord_line5.sql tps = 708.979203 (excluding connections establishing)
limord_line5.sql tps = 710.926866 (excluding connections establishing)
limord_line5.sql tps = 710.928907 (excluding connections establishing)

48870dd + partial-sort-basic-1.patch.gz + fix-cost_sort.patch
join5.sql tps = 505.488181 (excluding connections establishing)
join5.sql tps = 507.222759 (excluding connections establishing)
join5.sql tps = 506.549654 (excluding connections establishing)
star5.sql tps = 495.432915 (excluding connections establishing)
star5.sql tps = 494.906793 (excluding connections establishing)
star5.sql tps = 492.623808 (excluding connections establishing)
line5.sql tps = 789.315968 (excluding connections establishing)
line5.sql tps = 793.875456 (excluding connections establishing)
line5.sql tps = 790.545990 (excluding connections establishing)
lim_join5.sql tps = 396.956732 (excluding connections establishing)
lim_join5.sql tps = 397.515213 (excluding connections establishing)
lim_join5.sql tps = 397.578669 (excluding connections establishing)
lim_star5.sql tps = 417.459963 (excluding connections establishing)
lim_star5.sql tps = 418.024803 (excluding connections establishing)
lim_star5.sql tps = 418.830234 (excluding connections establishing)
lim_line5.sql tps = 729.186915 (excluding connections establishing)
lim_line5.sql tps = 726.288788 (excluding connections establishing)
lim_line5.sql tps = 728.123296 (excluding connections establishing)
limord_join5.sql tps = 383.484767 (excluding connections establishing)
limord_join5.sql tps = 383.021960 (excluding connections establishing)
limord_join5.sql tps = 383.722051 (excluding connections establishing)
limord_star5.sql tps = 414.138460 (excluding connections establishing)
limord_star5.sql tps = 414.063766 (excluding connections establish

Re: [HACKERS] Draft release notes up for review

2014-02-19 Thread Marti Raudsepp
On Sun, Feb 16, 2014 at 10:41 PM, Tom Lane  wrote:
> Any comments before I start transposing them into the back branches?

Sorry I'm late.

> Shore up GRANT ... WITH ADMIN OPTION restrictions (Noah Misch)

I'm not familiar with the phrase "Shore up"; I think it should use
more precise language: are the privilege checks getting more strict or
less strict?



Wow, there are quite a lot of items this time. Have you considered
grouping the items by their impact, for example security/data
corruption/crash/correctness/other? I think that would make it easier
for readers to find items they're interested in. Most changes seem
pretty straightforward to categorize; there are always outliers, but
even if a few items are miscategorized, that's an improvement over
what we have now. Of course someone has to be willing to do that work.

If this warrants more discussion, I can draft out a proposal in a new topic.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Partial sort

2014-02-19 Thread Marti Raudsepp
On Wed, Feb 12, 2014 at 11:54 PM, Marti Raudsepp  wrote:
> With partial-sort-basic-1 and this fix on the same test suite, the
> planner overhead is now a more manageable 0.5% to 1.3%; one test is
> faster by 0.5%.

Ping, Robert or anyone, does this overhead seem bearable or is that
still too much?

Do these numbers look conclusive enough or should I run more tests?

> I think the 1st patch now has a bug in initial_cost_mergejoin; you
> still pass the "presorted_keys" argument to cost_sort, making it
> calculate a partial sort cost

Ping, Alexander?

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Selecting large tables gets killed

2014-02-20 Thread Marti Raudsepp
On Thu, Feb 20, 2014 at 12:07 PM, Ashutosh Bapat
 wrote:
> That seems a good idea. We will get rid of FETCH_COUNT then, wouldn't we?

No, I don't think we want to do that. FETCH_COUNT values greater than
1 are still useful to get reasonably tabulated output without hogging
too much memory. For example:

db=# \set FETCH_COUNT 3
db=# select repeat('a', i) a, 'x' x from generate_series(1,9) i;
  a  | x
-----+---
 a   | x
 aa  | x
 aaa | x
 aaaa   | x
 aaaaa  | x
 aaaaaa | x
 aaaaaaa   | x
 aaaaaaaa  | x
 aaaaaaaaa | x


Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Display oprcode and its volatility in \do+

2014-02-21 Thread Marti Raudsepp
On Thu, Jan 16, 2014 at 5:22 PM, Tom Lane  wrote:
> but adding
> volatility here seems like probably a waste of valuable terminal width.
> I think that the vast majority of operators have immutable or at worst
> stable underlying functions, so this doesn't seem like the first bit
> of information I'd need about the underlying function.

For a data point, just today I wanted to look up the volatility of
pg_trgm operators, which made me remember this patch. The \do+ output
is narrow enough that an extra volatility column wouldn't be too
bad.

But even just having the function name is a huge improvement, at least
that allows looking up volatility using \commands without accessing
pg_operator directly.
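
For reference, this is roughly the catalog query that having the function
name in \do+ saves you from writing (using pg_trgm's % operator as an
example; it returns nothing if the extension isn't installed):

select o.oprname, p.proname, p.provolatile
from pg_operator o
join pg_proc p on p.oid = o.oprcode
where o.oprname = '%';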

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions "make install"

2013-09-10 Thread Marti Raudsepp
On Tue, May 14, 2013 at 4:12 AM, Marti Raudsepp  wrote:
> While testing out PostgreSQL 9.3beta1, I stumbled upon a problem

> % make DESTDIR=/tmp/foo install

> /usr/bin/install: will not overwrite just-created
> ‘/tmp/foo/usr/share/postgresql/extension/semver--0.3.0.sql’ with
> ‘./sql/semver--0.3.0.sql’
> make: *** [install] Error 1

On Wed, May 15, 2013 at 4:49 PM, Peter Eisentraut  wrote:
> That said, I'm obviously outnumbered here.  What about the following
> compromise:  Use the configure-selected install program inside
> PostgreSQL (which we can test easily), and use install-sh under
> USE_PGXS?  Admittedly, the make install time of extensions is probably
> not an issue.

Did we ever do anything about this? It looks like the thread got
distracted with VPATH builds and now I'm seeing this problem in 9.3.0.
:(

This occurs in Arch Linux, but for some odd reason not on Ubuntu when
using apt.postgresql.org. Somehow the pgxs.mk supplied by
apt.postgresql.org differs from the one shipped in PostgreSQL.

Is there a chance of getting this resolved in PostgreSQL or should we
get extension writers to fix their makefiles instead?

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions "make install"

2013-09-13 Thread Marti Raudsepp
On Fri, Sep 13, 2013 at 6:42 PM, Cédric Villemain
 wrote:
> Marti Raudsepp  a écrit :
>>Did we ever do anything about this? It looks like the thread got
>>distracted with VPATH builds and now I'm seeing this problem in 9.3.0.

> Andrew is about to commit (well...I hope) a doc patch about that and also a 
> little fix.
> Imho this is a bugfix so I hope it will be applyed in older branches.

Oh I see, indeed commit 6697aa2bc25c83b88d6165340348a31328c35de6
"Improve support for building PGXS modules with VPATH" fixes the
problem and I see it's not present in REL9_3_0.

Andrew and others, does this seem safe enough to backport to 9.3.1?

> Apt.pgdg got the patch present in postgresql head applyed.

Erm, isn't apt.postgresql.org supposed to ship the *official*
PostgreSQL versions? Given that this issue affects all distros, I
don't see why Ubuntu/Debian need to be patched separately.

Does anyone else think this is problematic? By slipping patches into
distro-specific packages, you're confusing users (like me) and
bypassing the PostgreSQL QA process.

PS: Where are the sources used to build packages on apt.postgresql.org?

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions "make install"

2013-09-18 Thread Marti Raudsepp
On Tue, Sep 17, 2013 at 10:37 AM, Cédric Villemain
 wrote:
>> Erm, isn't apt.postgresql.org supposed to ship the *official*
>> PostgreSQL versions? Given that this issue affects all distros, I
>> don't see why Ubuntu/Debian need to be patched separately.

> Well, the patches are applyed on the debian packages (not only in
> apt.pgdg.org).
> The packages provided by apt.postgresql.org are based on the 'official
> packages' from debian. (if you allow me this circle)

Oh I see, that makes sense.

>> PS: Where are the sources used to build packages on apt.postgresql.org?

> 9.3:
> http://anonscm.debian.org/loggerhead/pkg-postgresql/postgresql-9.3/trunk/changes

Thanks!

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions "make install"

2013-09-23 Thread Marti Raudsepp
On Fri, Sep 13, 2013 at 8:17 PM, Marti Raudsepp  wrote:
> On Fri, Sep 13, 2013 at 6:42 PM, Cédric Villemain  
> wrote:
>> Andrew is about to commit (well...I hope) a doc patch about that and also a 
>> little fix.
>> Imho this is a bugfix so I hope it will be applyed in older branches.
>
> Oh I see, indeed commit 6697aa2bc25c83b88d6165340348a31328c35de6
> "Improve support for building PGXS modules with VPATH" fixes the
> problem and I see it's not present in REL9_3_0.
>
> Andrew and others, does this seem safe enough to backport to 9.3.1?

Ping? Will this be backported to 9.3 or should I report to extension
authors to fix their Makefiles?

I understand that the 9.3.1 release might still be weeks away, I'd
just like to get a vague confirmation about what committers think.

Note that this patch is already applied to Debian/Ubuntu packages
(including those on apt.postgresql.org).

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions "make install"

2013-11-01 Thread Marti Raudsepp
Hi Andrew,

On Mon, Sep 23, 2013 at 6:43 PM, Andrew Dunstan  wrote:
> I'm working on it. It appears to have a slight problem or two I want to fix
> at the same time, rather than backpatch something broken.

Any progress on this? I notice that the fixes didn't make it into 9.3.1.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Less than ideal error reporting in pg_stat_statements

2015-09-25 Thread Marti Raudsepp
On Wed, Sep 23, 2015 at 3:01 AM, Peter Geoghegan  wrote:
> I think that the real problem here is that garbage collection needs to
> deal with OOM more appropriately.

+1

I've also been seeing lots of log messages saying "LOG:  out of
memory" on a server that's hosting development databases. I put off
debugging this until now because it didn't seem to have any adverse
effects on the system.

The file on my system is currently 5.1GB (!). I don't know how it got
there -- under normal circumstances we don't have any enormous
queries, but perhaps our application bugs during development triggered
that.

The configuration on this system is pg_stat_statements.max = 1 and
pg_stat_statements.track = all.


The comment near gc_qtexts says:
* This won't be called often in the typical case, since it's likely that
* there won't be too much churn, and besides, a similar compaction process
* occurs when serializing to disk at shutdown or as part of resetting.
* Despite this, it seems prudent to plan for the edge case where the file
* becomes unreasonably large, with no other method of compaction likely to
* occur in the foreseeable future.
[...]
* Load the old texts file.  If we fail (out of memory, for instance) just
* skip the garbage collection.

So, as I understand it: if the system runs low on memory for an
extended period, and/or the file grows beyond 1GB (MaxAlloc), garbage
collection stops entirely, meaning it starts leaking disk space until
a manual intervention.

It's very frustrating when debugging aids cause further problems on a
system. If the in-line compaction doesn't materialize (or it's decided
not to backport it), I would propose instead to add a test to
pgss_store() to avoid growing the file beyond MaxAlloc (or perhaps
even a lower limit). Surely dropping some statistics is better than
this pathology.
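
In the meantime, affected systems can at least detect and clear the
pathology by hand. A rough check, assuming the query texts file sits at
its usual location under pg_stat_tmp (note that resetting discards all
collected statistics):

select (pg_stat_file('pg_stat_tmp/pgss_query_texts.stat')).size;  -- bytes in the external texts file
select pg_stat_statements_reset();  -- the "manual intervention": truncates the file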

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Skip ALTER x SET SCHEMA if the schema didn't change

2015-09-28 Thread Marti Raudsepp
Hi list

The attached patch changes the behavior of the various ALTER x SET SCHEMA
commands to skip, rather than fail, when the old and new schema are the
same.

The advantage is that it's now easier to write DDL scripts that are idempotent.

This already matches the behavior of ALTER EXTENSION SET SCHEMA in
earlier versions, as well as many other SET-ish commands, e.g. ALTER
TABLE SET TABLESPACE, OWNER TO, CLUSTER ON, SET (storage_parameter...)
etc. I don't see why SET SCHEMA should be treated any differently.
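
A minimal illustration of the changed behavior (error text approximate):

CREATE SCHEMA app;
CREATE TABLE app.t (i int);
ALTER TABLE app.t SET SCHEMA app;
-- before: ERROR: table t is already in schema "app"
-- with this patch: a no-op, the command succeeds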

The code is written such that object_access_hook is still called for
each object.

Regression tests included. I couldn't find any documentation that
needs changing.

Regards,
Marti


0001-Skip-ALTER-x-SET-SCHEMA-if-the-schema-didn-t-change.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes for MAX, MIN, ORDER BY?

2015-09-28 Thread Marti Raudsepp
Hi Gavin

Note that Alexander Korotkov already started work in 2013 on a
somewhat similar feature called partial sort:
http://www.postgresql.org/message-id/CAPpHfdscOX5an71nHd8WSUH6GNOCf=V7wgDaTXdDd9=gon-...@mail.gmail.com

In particular, see the 2nd patch for KNN sort -- it uses known
bounding boxes from the GiST index for sorting, which in many ways is
similar to min/max BRIN ranges.
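
For comparison, the min/max shortcut you describe would serve queries of
this shape (the BRIN index is ordinary 9.5 syntax; the shortcut itself is
the proposal, not current behavior):

CREATE TABLE measurements (ts timestamptz);
CREATE INDEX ON measurements USING brin (ts);
SELECT max(ts) FROM measurements;  -- today a full scan; the idea is to visit only the best page range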

> *partial-knn-1.patch*
>
> KNN-GiST provides ability to get ordered results from index, but this order
> is based only on index information. For instance, GiST index contains
> bounding rectangles for polygons, and we can't get exact distance to
> polygon from index (similar situation is in PostGIS). In attached patch,
> GiST distance method can set recheck flag (similar to consistent method).
> This flag means that distance method returned lower bound of distance and
> we should recheck it from heap.

Unfortunately this work has stalled.

Regards,
Marti

On Sun, Sep 27, 2015 at 11:58 PM, Gavin Wahl  wrote:
> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
> just find the page range with the largest/smallest value, and then only scan
> that one. Would that be hard to implement? I'm interested in working on it
> if someone can give me some pointers.
>
> Somewhat harder but still possible would be using BRIN indexes to accelerate
> ORDER BY. This would require a sorting algorithm that can take advantage of
> mostly-sorted inputs. You would sort the page ranges by their minimum or
> maximum value, then feed the sorting algorithm in that order.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix float8 parsing of denormal values (on some platforms?)

2012-02-02 Thread Marti Raudsepp
On Wed, Feb 1, 2012 at 20:17, Tom Lane  wrote:
> Applied with minor revisions.

Thanks! :)

We're already seeing the first buildfarm failures, on system "narwhal",
which uses an msys/mingw compiler.
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=narwhal&dt=2012-02-02%2005%3A00%3A02

No idea which libc it uses though. The MSVC libc looks fine because
"currawong" passes these tests.

PS: it would be useful to have the output of 'cc -v' in buildfarm
reports since the "Compiler" field may be out of date.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch pg_is_in_backup()

2012-02-02 Thread Marti Raudsepp
On Mon, Jan 30, 2012 at 20:33, Gilles Darold  wrote:
> After some time searching for a Pg system administration function like
> pg_is_in_recovery(), let's say pg_is_in_backup(), I couldn't find one.
> The minor patch attached here adds this administrative function that can
> be used with others backup control functions. This is a very little
> patch outside documentation because the function is only a wrapper to
> the internal BackupInProgress() function, just like pg_is_in_recovery()
> is calling RecoveryInProgress().

I think it would be more useful to have a function that returns a
timestamp when the backup started. That way, it's possible to write a
generic monitoring script that alerts the sysadmin only when a backup
has been running for too long, but is silent for well-behaved backups.
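
For example, with such a function (call it pg_backup_start_time(), purely
as a sketch), the monitoring check reduces to one query:

SELECT now() - pg_backup_start_time() > interval '2 hours' AS backup_stuck;
-- NULL (no alert) when no backup is running, true only once one has run too long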

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Caching for stable expressions with constant arguments v6

2012-02-04 Thread Marti Raudsepp
On Sat, Feb 4, 2012 at 09:49, Jaime Casanova  wrote:
> i little review...

Thanks! By the way, you should update to the v7 patch.

> first, i notice a change of behaviour... i'm not sure if i can say
> this is good or not.

> if you execute: select *, cached_random() from (select
> generate_series(1, 10) ) i;

Yeah, this is pretty much expected.

> seems you are moving code in simplify_function(), do you think is
> useful to do that independently? at least if it provides some clarity
> to the code

I think so, yeah, but separating it from the patch would be quite a bit of work.

>  shared_buffers | 4096

Note that the "ts" table is 36MB so it doesn't fit in your
shared_buffers now. If you increase shared_buffers, you will see
better gains for tests #1 and #2.

> from what i see there is no too much gain for the amount of complexity
> added... i can see there should be cases which a lot more gain

Yeah, the goal of this test script wasn't to demonstrate the best
cases of the patch, but to display how it behaves in different
scenarios. Here's what they do:

Test case #1 is a typical "real world" query that benefits from this
patch. With a higher shared_buffers you should see a 6-7x improvement
there, which I think is a compelling speedup for a whole class of
queries.

Test #2 doesn't benefit much from the patch since now() is already a
very fast function, but the point is that even saving the function
call overhead helps noticeably.

Tests #3 and #4 show the possible *worst* case of the patch -- it's
processing a complex expression, but there's just one row in the
table, so no opportunity for caching.


Besides stable functions, this patch also improves the performance of
expressions involving placeholder parameters, such as variable
references from PL/pgSQL, since these are not constant-folded. E.g.:
DECLARE
  foo timestamptz = '2012-02-04';
BEGIN
  SELECT * FROM ts WHERE ts>(foo - interval '1 day');

Before this patch, the interval subtraction was executed once per row;
now it's once per execution.
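
A rough plain-SQL analogue of the same effect (assuming the "ts" table has
a timestamptz column also named ts; the exact queries in the test script
differ):

SELECT * FROM ts WHERE ts > now() - interval '1 day';
-- now() is stable, so the subtraction can't be folded at plan time;
-- without the patch a seqscan filter re-evaluates it for every row,
-- with the patch it is computed once per execution and reused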

> a benchmark with pgbench scale 20 (average of 3 runs, -T 300 clients
> =1, except on second run)
> -scale 20

I think the differences here are mostly noise because the queries
pgbench generates aren't affected by caching.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Text-any concatenation volatility acting as optimization barrier

2012-02-07 Thread Marti Raudsepp
Hi list,

Andrew Dunstan reported an awkward-seeming case on IRC where shifting
around a concatenation expression in a view made the planner choose a
good or a bad execution plan.

Simplified, it boils down to this:

db=# create table foo(i int);
db=# explain verbose select i from (select i, i::text || 'x' as asd
from foo) as subq;
Seq Scan on public.foo  (cost=0.00..34.00 rows=2400 width=4)
  Output: foo.i

db=# explain verbose select i from (select i, i || 'x'::text as asd
from foo) as subq;
Subquery Scan on subq  (cost=0.00..76.00 rows=2400 width=4)
  Output: subq.i
  ->  Seq Scan on public.foo  (cost=0.00..52.00 rows=2400 width=4)
Output: foo.i, ((foo.i)::text || 'x'::text)

Case #1 uses the normal textcat(text, text) operator by automatically
coercing 'x' to text.
However, case #2 uses anytextcat(anynonarray, text), which is
marked volatile and thus acts as an optimization barrier. Later, the
anytextcat SQL function is inlined and the EXPLAIN VERBOSE output has
no trace of what happened.

Is this something we can, or want, to fix?

One way would be doing preprocess_expression() before
pull_up_subqueries() so function inlining happens earlier, but I can't
imagine what unintended consequences that might have.

Another option would be creating explicit immutable text || foo
operators for common types, but that sounds pretty hacky.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Text-any concatenation volatility acting as optimization barrier

2012-02-07 Thread Marti Raudsepp
On Tue, Feb 7, 2012 at 22:31, Andrew Dunstan  wrote:
> It gets worse if you replace the expression with a call to a (non-sql)
> function returning text, which was in fact the original use case. Then
> you're pretty  much hosed.

Oh, if it's a non-SQL function then marking it as STABLE/IMMUTABLE
should solve it -- did you try that?

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Text-any concatenation volatility acting as optimization barrier

2012-02-08 Thread Marti Raudsepp
On Wed, Feb 8, 2012 at 06:21, Tom Lane  wrote:
> Marti Raudsepp  writes:
>> Case #1 uses the normal textcat(text, text) operator by automatically
>> coercing 'x' as text.
>> However, case #2 uses the anytextcat(anynonarray, text), which is
>> marked as volatile thus acts as an optimization barrier.
>
> Hmm ... since those operators were invented (in 8.3), we have adopted a
> policy that I/O functions are presumed to be no worse than stable:
> http://archives.postgresql.org/pgsql-committers/2010-07/msg00307.php
> ISTM that would justify relabeling anytextcat/textanycat as stable,
> which should fix this.

Yes, we should definitely take advantage of that.

I scanned through all of pg_proc; there are 4 functions like this that
can be changed: textanycat, anytextcat, quote_literal and
quote_nullable. All of these have SQL wrappers to cast their argument
to ::text.

 quote_literal  | select pg_catalog.quote_literal($1::pg_catalog.text)
 quote_nullable | select pg_catalog.quote_nullable($1::pg_catalog.text)
 textanycat | select $1 || $2::pg_catalog.text
 anytextcat | select $1::pg_catalog.text || $2

Patch attached (in git am format). Passes all regression tests (except
'json' which fails on my machine even on git master).

No documentation changes necessary AFAICT.

Regards,
Marti
From e1943868d21316ff9126283efec54146c14e00fc Mon Sep 17 00:00:00 2001
From: Marti Raudsepp 
Date: Wed, 8 Feb 2012 11:26:03 +0200
Subject: [PATCH] Mark textanycat/quote_literal/quote_nullable functions as
 stable

These are SQL functions that rely on immutable functions/operators, but
were previously marked volatile, since they relied on unknown ::text
casts. As of commit aab353a60b95aadc00f81da0c6d99bde696c4b75, all text
I/O functions are guaranteed to be stable or immutable.

Author: Marti Raudsepp 
---
 src/include/catalog/catversion.h |    2 +-
 src/include/catalog/pg_proc.h    |    8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index ae4e5f5..1d92ee3 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -53,6 +53,6 @@
  */
 
/*			yyyymmddN */
-#define CATALOG_VERSION_NO	201202072
+#define CATALOG_VERSION_NO	201202081
 
 #endif
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f8d01fb..d4206f1 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2271,11 +2271,11 @@ DATA(insert OID =  1282 ( quote_ident	   PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0
 DESCR("quote an identifier for usage in a querystring");
DATA(insert OID =  1283 ( quote_literal    PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 25 "25" _null_ _null_ _null_ _null_ quote_literal _null_ _null_ _null_ ));
DESCR("quote a literal for usage in a querystring");
-DATA(insert OID =  1285 ( quote_literal    PGNSP PGUID 14 1 0 0 0 f f f t f v 1 0 25 "2283" _null_ _null_ _null_ _null_ "select pg_catalog.quote_literal($1::pg_catalog.text)" _null_ _null_ _null_ ));
+DATA(insert OID =  1285 ( quote_literal    PGNSP PGUID 14 1 0 0 0 f f f t f s 1 0 25 "2283" _null_ _null_ _null_ _null_ "select pg_catalog.quote_literal($1::pg_catalog.text)" _null_ _null_ _null_ ));
 DESCR("quote a data value for usage in a querystring");
 DATA(insert OID =  1289 ( quote_nullable   PGNSP PGUID 12 1 0 0 0 f f f f f i 1 0 25 "25" _null_ _null_ _null_ _null_ quote_nullable _null_ _null_ _null_ ));
 DESCR("quote a possibly-null literal for usage in a querystring");
-DATA(insert OID =  1290 ( quote_nullable   PGNSP PGUID 14 1 0 0 0 f f f f f v 1 0 25 "2283" _null_ _null_ _null_ _null_ "select pg_catalog.quote_nullable($1::pg_catalog.text)" _null_ _null_ _null_ ));
+DATA(insert OID =  1290 ( quote_nullable   PGNSP PGUID 14 1 0 0 0 f f f f f s 1 0 25 "2283" _null_ _null_ _null_ _null_ "select pg_catalog.quote_nullable($1::pg_catalog.text)" _null_ _null_ _null_ ));
 DESCR("quote a possibly-null data value for usage in a querystring");
 
 DATA(insert OID = 1798 (  oidin			   PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 26 "2275" _null_ _null_ _null_ _null_ oidin _null_ _null_ _null_ ));
@@ -2747,8 +2747,8 @@ DESCR("adjust time precision");
 DATA(insert OID = 1969 (  timetz		   PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 1266 "1266 23" _null_ _null_ _null_ _null_ timetz_scale _null_ _null_ _null_ ));
 DESCR("adjust time with time zone precision");
 
-DATA(insert OID = 2003 (  textanycat	   PGNSP PGUID 14 1 0 0 0 f f f t f v 2 0 25 "25 2776" _null_ _null_ _null_ _null_ "select $1 || $2::pg_catalog.text" _null_ _null_ _null_ ));
DATA(insert OID = 2004 (  anytextcat	   PGNSP PGUID 14 1 0 0 0 f f f t f v 2 0 25 "2776 25" _null_ _null_ _null_ _null_ "select $1::pg_catalog.text || $2" _null_ _null_ _null_ ));

Re: [HACKERS] Text-any concatenation volatility acting as optimization barrier

2012-02-08 Thread Marti Raudsepp
On Wed, Feb 8, 2012 at 18:20, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Feb 8, 2012 at 4:53 AM, Marti Raudsepp  wrote:
>>> Patch attached (in git am format). Passes all regression tests (except
>>> 'json' which fails on my machine even on git master).
>
>> Can you provide the diffs from the json test on your machine?  I don't
>> see any build-farm failures off-hand...
>
> I'm seeing diffs too after applying Marti's patch: instead of "z", "b",
> etc, the field labels in the json values look like "f1", "f2", etc in
> the output of queries such as

Sorry, my bad. I guess I got the two versions (patched and unpatched)
mixed up. Indeed this difference disappears when I remove my patch.
I'll have to be more careful when checking regression diffs in the
future. :)

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Enable min/max optimization for bool_and/bool_or/every

2012-02-08 Thread Marti Raudsepp
On Wed, Feb 8, 2012 at 19:48, Tom Lane  wrote:
> I applied this patch, since I was busy applying catalog changes from you
> anyway ;-).

Thanks :)

> I did think of a possible reason to reject the patch: with this change,
> the planner will take longer to plan queries involving bool_and() et al,
> since planagg.c will spend time looking (usually fruitlessly) for an
> index-based plan.

Good point, I should have done those measurements up front. Anyway,
since I've often noticed \timing to be unreliable for short queries, I
decided to retry your test with pgbench.

Long story short, I measured 27% overhead in the un-indexed column
case and 33% overhead for an indexed column. That's a lot more than I
expected. I even rebuilt and retried a few times to make sure I hadn't
botched something. The benchmark script is attached.

UNPATCHED
select bool_and(b) from unindexed;
tps = 13787.023113 (excluding connections establishing)
tps = 13880.484788 (excluding connections establishing)
tps = 13784.654542 (excluding connections establishing)
select bool_and(b) from indexed;
tps = 12536.650703 (excluding connections establishing)
tps = 12647.767993 (excluding connections establishing)
tps = 12500.956407 (excluding connections establishing)

PATCHED
select bool_and(b) from unindexed;
tps = 10096.834678 (excluding connections establishing)
tps = 10110.182425 (excluding connections establishing)
tps = 10103.904500 (excluding connections establishing)
select bool_and(b) from indexed;
tps = 8373.631407 (excluding connections establishing)
tps = 8659.917173 (excluding connections establishing)
tps = 8473.899896 (excluding connections establishing)

Regards,
Marti


bench_bool_and.sh
Description: Bourne shell script

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] RFC: Making TRUNCATE more "MVCC-safe"

2012-02-09 Thread Marti Raudsepp
Hi!

I've always been a little wary of using the TRUNCATE command due to
the warning in the documentation about it not being "MVCC-safe":
queries may silently give wrong results and it's hard to tell when
they are affected.

That got me thinking, why can't we handle this like a standby server
does -- if some query has data removed from underneath it, it aborts
with a serialization failure.

Does this solution sound like a good idea?

The attached patch is a lame attempt at implementing this. I added a
new pg_class.relvalidxmin attribute which tracks the Xid of the last
TRUNCATE (maybe it should be called reltruncatexid?). Whenever
starting a relation scan with a snapshot older than relvalidxmin, an
error is thrown. This seems to work out well since TRUNCATE updates
pg_class anyway, and anyone who sees the new relfilenode automatically
knows when it was truncated.

Am I on the right track? Are there any better ways to attach this
information to a relation?
Should I also add another counter to pg_stat_database_conflicts?
Currently this table is only used on standby servers.

Since I wrote it just this afternoon, there are a few things still
wrong with the patch (it doesn't handle xid wraparound for one), so
don't be too picky about the code yet. :)

Example:
  CREATE TABLE foo (i int);
Session A:
  BEGIN ISOLATION LEVEL REPEATABLE READ;
  SELECT txid_current(); -- Force snapshot open
Session B:
  TRUNCATE TABLE foo;
Session A:
  SELECT * FROM foo;
ERROR:  canceling statement due to conflict with TRUNCATE TABLE on foo
DETAIL:  Rows visible to this transaction have been removed.


Patch also available in my github 'truncate' branch:
https://github.com/intgr/postgres/commits/truncate

Regards,
Marti
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
new file mode 100644
index 99a431a..e909b17
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
*** heap_beginscan_internal(Relation relatio
*** 1175,1180 
--- 1175,1196 
  	HeapScanDesc scan;
  
  	/*
+ 	 * If the snapshot is older than relvalidxmin then that means TRUNCATE has
+ 	 * removed data that we should still see. Abort rather than giving
+ 	 * potentially bogus results.
+ 	 */
+ 	if (relation->rd_rel->relvalidxmin != InvalidTransactionId &&
+ 		snapshot->xmax != InvalidTransactionId &&
+ 		NormalTransactionIdPrecedes(snapshot->xmax, relation->rd_rel->relvalidxmin))
+ 	{
+ 		ereport(ERROR,
+ (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+  errmsg("canceling statement due to conflict with TRUNCATE TABLE on %s",
+ 		 NameStr(relation->rd_rel->relname)),
+  errdetail("Rows visible to this transaction have been removed.")));
+ 	}
+ 
+ 	/*
  	 * increment relation ref count while scanning relation
  	 *
  	 * This is just to make really sure the relcache entry won't go away while
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
new file mode 100644
index aef410a..3f9bd5d
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
*** InsertPgClassTuple(Relation pg_class_des
*** 787,792 
--- 787,793 
  	values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers);
  	values[Anum_pg_class_relhassubclass - 1] = BoolGetDatum(rd_rel->relhassubclass);
  	values[Anum_pg_class_relfrozenxid - 1] = TransactionIdGetDatum(rd_rel->relfrozenxid);
+ 	values[Anum_pg_class_relvalidxmin - 1] = TransactionIdGetDatum(rd_rel->relvalidxmin);
  	if (relacl != (Datum) 0)
  		values[Anum_pg_class_relacl - 1] = relacl;
  	else
*** AddNewRelationTuple(Relation pg_class_de
*** 884,889 
--- 885,891 
  		new_rel_reltup->relfrozenxid = InvalidTransactionId;
  	}
  
+ 	new_rel_reltup->relvalidxmin = InvalidTransactionId;
  	new_rel_reltup->relowner = relowner;
  	new_rel_reltup->reltype = new_type_oid;
  	new_rel_reltup->reloftype = reloftype;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
new file mode 100644
index bfbe642..0d96a6a
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
*** reindex_index(Oid indexId, bool skip_con
*** 2854,2860 
  		}
  
  		/* We'll build a new physical relation for the index */
! 		RelationSetNewRelfilenode(iRel, InvalidTransactionId);
  
  		/* Initialize the index and rebuild */
  		/* Note: we do not need to re-establish pkey setting */
--- 2854,2860 
  		}
  
  		/* We'll build a new physical relation for the index */
! 		RelationSetNewRelfilenode(iRel, InvalidTransactionId, InvalidTransactionId);
  
  		/* Initialize the index and rebuild */
  		/* Note: we do not need to re-establish pkey setting */
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
new file mode 100644
index 1f95abc..ab4aec2
*** a/src/backend/commands/sequence.c
--- b/src/backend/commands/sequence.c
*** ResetSequence(Oid seq_relid)
*** 287,293 
  	 * Create a new storage file for the sequence.	We want to keep the

Re: [HACKERS] bitfield and gcc

2012-02-13 Thread Marti Raudsepp
On Sat, Feb 11, 2012 at 01:54, Gaetano Mendola  wrote:
> I wonder if somewhere in Postgres source "we" are relying on the GCC
> "correct behaviour" regarding the read-modify-write of bitfield in
> structures.

Probably not. I'm pretty sure that we don't have any bitfields, since
not all compilers are happy with them. And it looks like this behavior
doesn't affect other kinds of struct fields.

It sounds like the GCC guys are saying that it's theoretically
possible that the compiler will generate 64-bit read-modify-writes
regardless of the struct member types. In this light, PostgreSQL code
is not "correct" -- our slock_t uses a char type on i386/AMD64/SPARC
and 32-bit int on IA-64/PPC64. There are plenty of places where it's
adjacent to other small fields.

However, I don't think the latter is a problem with any compilers in
practice, as that would break a lot more code than just btrfs and
Postgres.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing special case OID generation

2012-02-13 Thread Marti Raudsepp
On Mon, Feb 13, 2012 at 15:08, Robert Haas  wrote:
> Honestly, I think the biggest hassle of the OID code is not so much
> the way they're generated as the way they're stored within heap
> tuples.  I've wondered whether we should go through the system
> catalogs and replace all of the hidden OID columns with a normal
> column called "oid" of type OID

Do we have a clear idea about what to do with user tables that are
created WITH OIDS? Do we care about compatibility with that at all? Do
we generate this explicit "oid" column manually or do we just tell
users to use a serial or global sequence instead?

Personally I'd also like to see us get rid of the default_with_oids
setting -- I assume the existence of that is the reason why pgAdmin
and TOAD still generate table DDL with an explicit "WITH (OIDS=FALSE)"

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CUDA Sorting

2012-02-15 Thread Marti Raudsepp
On Mon, Feb 13, 2012 at 20:48, Greg Stark  wrote:
> I don't think we should be looking at either CUDA or OpenCL directly.
> We should be looking for a generic library that can target either and
> is well maintained and actively developed.

I understand your point about using some external library for the
primitives, but I don't see why it needs to support both CUDA and
OpenCL. Libraries for GPU-accelerated primitives generally target
OpenCL *or* CUDA, not both.

As far as I understand (and someone correct me if I'm wrong), the
difference between them is mostly the API and the fact that CUDA had a
head start, and thus a larger developer community around it. (All the
early adopters went to CUDA.)

But OpenCL already acts as an abstraction layer. CUDA is
NVIDIA-specific, but OpenCL is supported by AMD, Intel as well as
NVIDIA. It's pretty rare for servers to have separate graphics cards,
but recent Intel and AMD CPUs already have a GPU included on die,
which is another bonus for OpenCL.

So I'd say, the way things are heading, it's only a matter of time
before OpenCL takes over and there will be little reason to look back.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] possible new option for wal_sync_method

2012-02-16 Thread Marti Raudsepp
On Thu, Feb 16, 2012 at 19:18, Dan Scales  wrote:
> fsync/fdatasync can be very slow on ext3, because it seems to have to
> always wait for the current filesystem meta-data transaction to complete,
> even if that meta-data operation is completely unrelated to the file
> being fsync'ed.

Use the data=writeback mount option to remove this restriction. This
is actually the suggested setting for PostgreSQL file systems:
http://www.postgresql.org/docs/current/static/wal-intro.html

(Note that this is unsafe for some other applications, so I wouldn't
use it on the root file system)

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Copyright notice for contrib/cube?

2012-02-17 Thread Marti Raudsepp
On Fri, Feb 17, 2012 at 17:42, Jay Levitt  wrote:
> Should it be something like
>
> Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
> Portions Copyright (c) 2012, TipTap Inc.

Please don't add that, just change 2011 to 2012. This is what the wiki says:

Q: May I add my own copyright notice where appropriate?
A: No, please don't. We like to keep the legal information short and
crisp. Additionally, we've heard that could possibly pose problems for
corporate users.

Q: Doesn't the PostgreSQL license itself require to keep the copyright
notice intact?
A: Yes, it does. And it is, because the PostgreSQL Global Development
Group covers all copyright holders. Also note that US law doesn't
require any copyright notice for getting the copyright granted, just
like most European laws.

https://wiki.postgresql.org/wiki/Developer_FAQ

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_test_timing tool for EXPLAIN ANALYZE overhead

2012-02-22 Thread Marti Raudsepp
On Wed, Feb 22, 2012 at 18:44, Greg Smith  wrote:
> As far as I've been able to tell, there aren't any issues unique to Windows
> there.  Multiple cores can have their TSC results get out of sync on Windows
> for the same reason they do on Linux systems, and there's also the same
> frequency/temperature issues.

Not on recent Linux kernel versions. Linux automatically detects when
the TSC is unstable (due to power management or out-of-sync
cores/sockets) and falls back to the more expensive HPET
or ACPI methods.

e.g:
% dmesg |grep -i tsc
[0.00] Fast TSC calibration using PIT
[0.164075] checking TSC synchronization [CPU#0 -> CPU#1]: passed.
[0.197062] Switching to clocksource tsc
[0.260960] Marking TSC unstable due to TSC halts in idle

Whether these tests cover 100% of the possible conditions, and whether
the detection has race conditions or not, I don't know.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_test_timing tool for EXPLAIN ANALYZE overhead

2012-02-22 Thread Marti Raudsepp
On Wed, Feb 22, 2012 at 19:36, Greg Smith  wrote:
> From the patch:
>
> Newer operating systems may check for the known TSC problems and
> switch to a slower, more stable clock source when they are seen.
> If your system supports TSC time but doesn't default to that, it
> may be disabled for a good reason.

Sorry, I was under the impression that the stability of Windows's
QueryPerformanceCounter() API is hardware-dependent, but I haven't
coded under Windows for a long time -- maybe it's improved in recent
versions.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Cleanup: use heap_open/heap_close consistently

2012-02-27 Thread Marti Raudsepp
Hi,

Here's a tiny cleanup: currently get_tables_to_cluster uses
heap_open() to open the relation, but then closes it with
relation_close(). Currently relation_close=heap_close, but it seems
like a good idea to be consistent -- in case these functions need to
diverge in the future.

Regards,
Marti

diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 349d130..a10ec2d 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1551,7 +1551,7 @@ get_tables_to_cluster(MemoryContext cluster_context)
 	}
 	heap_endscan(scan);
 
-	relation_close(indRelation, AccessShareLock);
+	heap_close(indRelation, AccessShareLock);
 
 	return rvs;
 }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RFC: Making TRUNCATE more "MVCC-safe"

2012-03-03 Thread Marti Raudsepp
On Sat, Mar 3, 2012 at 14:53, Simon Riggs  wrote:
> Thanks Noah for drawing attention to this thread. I hadn't been
> watching. As you say, this work would allow me to freeze rows at load
> time and avoid the overhead of hint bit setting, which avoids
> performance issues from hint bit setting in checksum patch.
>
> I've reviewed this patch and it seems OK to me. Good work Marti.

Thanks! This approach wasn't universally liked, but if it gets us
tangible benefits (COPY with frozenxid) then I guess it's a reason to
reconsider.

> v2 patch attached, updated to latest HEAD. Patch adds
> * relvalidxid is reset by VACUUM/ANALYZE to avoid wraparound failure

Personally I'd rather keep this out of ANALYZE, since its purpose is
to collect stats; VACUUM is responsible for correctness (xid
wraparound etc.). But I don't feel strongly about this.

A more important consideration is how this interacts with hot standby.
Currently you compare OldestXmin to relvalidxmin to decide when to
reset it. But the standby's OldestXmin may be older than the master's.
(If VACUUM removes rows then this causes a recovery conflict, but
AFAICT that won't happen if only relvalidxmin changes.)

It might be more robust to wait until relfrozenxid exceeds
relvalidxmin -- by then, recovery conflict mechanisms will have taken
care of killing all older snapshots, or am I mistaken?

And a few typos in the code...

+ gettext_noop("When enabled viewing a truncated or newly created table "
+ "will throw a serialization error to prevent MVCC "
+ "discrepancies that were possible priot to 9.2.")

"prior" not "priot"

+ * Reset relvalidxmin if its old enough

Should be "it's" in this context.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PGXS ignores SHLIB_LINK when linking modules

2012-03-07 Thread Marti Raudsepp
Hi,

I'm trying to write my first PostgreSQL C extension. I used the
pgxn-utils skeleton as a base and specified some external libraries in
SHLIB_LINK. However, this variable was ignored when linking the
library (using pgxs from current git master).

After spending quite a bit of time trying to understand the Makefile
spaghetti, I noticed that the contrib extensions are using MODULE_big
instead of MODULES. When I converted my makefile, it indeed started
working as expected.

Is there any reason that pgxs ignores SHLIB_LINK when building
MODULES? It seems like a pretty unlikely use case that someone would
want to use MODULE_big and MODULES in the same makefile, so there
should be no ambiguity -- if SHLIB_LINK is set then it should be used.

This does NOT work:
MODULES  = src/pg_journal
SHLIB_LINK  = -lsystemd-journal -lsystemd-id128

And this does:
OBJS  = src/pg_journal.o
MODULE_big= pg_journal
SHLIB_LINK  = -lsystemd-journal -lsystemd-id128

Here's the github repository for this extension:
https://github.com/intgr/pg_journal (the makefile there currently uses
a workaround)

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Optimize IS DISTINCT FROM NULL => IS NOT NULL

2012-03-08 Thread Marti Raudsepp
Hi list,

This patch enables a simple optimization in
eval_const_expressions_mutator. If we know that one argument to
DistinctExpr is NULL then we can optimize it to a NullTest, which can
be an indexable expression.

For example the query:
EXPLAIN (costs off) SELECT * FROM foo WHERE j IS NOT DISTINCT FROM NULL;

Old behavior:
Seq Scan on foo
  Filter: (NOT (j IS DISTINCT FROM NULL::integer))

New behavior:
Index Scan using foo_j_idx on foo
  Index Cond: (j IS NULL)

Regards,
Marti
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index cd3da46..d9568ca 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2436,7 +2436,7 @@ eval_const_expressions_mutator(Node *node,
 ListCell   *arg;
 bool		has_null_input = false;
 bool		all_null_input = true;
-bool		has_nonconst_input = false;
+Expr	   *nonconst_expr = NULL;
 Expr	   *simple;
 DistinctExpr *newexpr;
 
@@ -2463,11 +2463,11 @@ eval_const_expressions_mutator(Node *node,
 		all_null_input &= ((Const *) lfirst(arg))->constisnull;
 	}
 	else
-		has_nonconst_input = true;
+		nonconst_expr = lfirst(arg);
 }
 
 /* all constants? then can optimize this out */
-if (!has_nonconst_input)
+if (nonconst_expr == NULL)
 {
 	/* all nulls? then not distinct */
 	if (all_null_input)
@@ -2512,6 +2512,24 @@ eval_const_expressions_mutator(Node *node,
 		return (Node *) csimple;
 	}
 }
+else if (has_null_input)
+{
+	/*
+	 * We can optimize: (expr) IS DISTINCT FROM NULL
+	 * into: (expr) IS NOT NULL
+	 */
+	NullTest   *newntest;
+
+	newntest = makeNode(NullTest);
+	newntest->nulltesttype = IS_NOT_NULL;
+	newntest->arg = (Expr *) nonconst_expr;
+
+	/* make_row_distinct_op already flattens row comparisons */
+	Assert(! type_is_rowtype(exprType((Node *) nonconst_expr)));
+	newntest->argisrow = false;
+
+	return (Node *) newntest;
+}
 
 /*
  * The expression cannot be simplified any further, so build
diff --git a/src/test/regress/expected/select_distinct.out b/src/test/regress/expected/select_distinct.out
index 38107a0..e8ddc49 100644
--- a/src/test/regress/expected/select_distinct.out
+++ b/src/test/regress/expected/select_distinct.out
@@ -195,6 +195,13 @@ SELECT null IS DISTINCT FROM null as "no";
  f
 (1 row)
 
+EXPLAIN (costs off) SELECT * FROM disttable WHERE f1 IS DISTINCT FROM NULL;
+ QUERY PLAN 
+
+ Seq Scan on disttable
+   Filter: (f1 IS NOT NULL)
+(2 rows)
+
 -- negated form
 SELECT 1 IS NOT DISTINCT FROM 2 as "no";
  no 
@@ -220,3 +227,10 @@ SELECT null IS NOT DISTINCT FROM null as "yes";
  t
 (1 row)
 
+EXPLAIN (costs off) SELECT * FROM disttable WHERE f1 IS NOT DISTINCT FROM NULL;
+   QUERY PLAN   
+
+ Seq Scan on disttable
+   Filter: (f1 IS NULL)
+(2 rows)
+
diff --git a/src/test/regress/sql/select_distinct.sql b/src/test/regress/sql/select_distinct.sql
index 328ba51..9767eed 100644
--- a/src/test/regress/sql/select_distinct.sql
+++ b/src/test/regress/sql/select_distinct.sql
@@ -56,9 +56,11 @@ SELECT 1 IS DISTINCT FROM 2 as "yes";
 SELECT 2 IS DISTINCT FROM 2 as "no";
 SELECT 2 IS DISTINCT FROM null as "yes";
 SELECT null IS DISTINCT FROM null as "no";
+EXPLAIN (costs off) SELECT * FROM disttable WHERE f1 IS DISTINCT FROM NULL;
 
 -- negated form
 SELECT 1 IS NOT DISTINCT FROM 2 as "no";
 SELECT 2 IS NOT DISTINCT FROM 2 as "yes";
 SELECT 2 IS NOT DISTINCT FROM null as "no";
 SELECT null IS NOT DISTINCT FROM null as "yes";
+EXPLAIN (costs off) SELECT * FROM disttable WHERE f1 IS NOT DISTINCT FROM NULL;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Optimize IS DISTINCT FROM NULL => IS NOT NULL

2012-03-08 Thread Marti Raudsepp
On Thu, Mar 8, 2012 at 19:35, Tom Lane  wrote:
> Uh ... how much do we care about that?  I can't say that I've heard many
> people complain about the fact that IS [NOT] DISTINCT FROM is poorly
> optimized -- which it is, in general, and this patch chips away at that
> only a tiny bit, not enough to make it recommendable.

Agreed, but it was very simple to code, so I figured why not.

> Plus I don't see why anyone would write the specific case
> "IS [NOT] DISTINCT FROM NULL" when they could write half as much.

Well I can see how it might be useful in generated queries, when
comparing a column to a parameter. If they're using IS DISTINCT FROM
then it's reasonable to expect that the parameter could be NULL
sometimes.

But I don't feel strongly about this, maybe it's not worth
complicating this big function further. :)

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Caching for stable expressions with constant arguments v6

2012-03-10 Thread Marti Raudsepp
On Sat, Mar 10, 2012 at 02:05, Tom Lane  wrote:
> Marti Raudsepp  writes:
>> [ cacheexpr-v8.patch ]
>
> A few comments

Whoa, that's quite a few. Thanks for the review.

> * There's a lot of stuff that seems wrong in detail in
> eval_const_expressions_mutator, eg it'll try to wrap a plain vanilla
> Const with a CacheExpr.  I see you've hacked that case inside
> insert_cache itself, but that seems like evidence of a poorly thought
> out recursion invariant.  The function should have a better notion than
> that of whether caching is useful for a given subtree.

Well my logic was this: accessing params and consts is just as fast as
accessing the cache -- there's no evaluation involved. So there's no
point in inserting CacheExpr in front of those. All other kinds of
expressions require some sort of evaluation, so they are cached, if
they weren't already constant-folded.

> In general I'd
> not expect it to insert a CacheExpr unless there is a Param or stable
> function call somewhere below the current point, but it seems not to be
> tracking that.

I think if you continue on your "Param or stable function" train of
thought, then it boils down to "expressions that can't be
constant-folded". And that's how it already works now --
constant-foldable expressions get reduced to a Const, which isn't
cached. There's no need to track it specially since we can check
whether we've got a Const node.
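
To make that concrete (sketch only, made-up table name):

-- now() is stable so it can't be folded into a Const, but a CacheExpr
-- lets us evaluate the right-hand side just once per execution:
SELECT * FROM events WHERE ts > now() - interval '1 hour';

-- whereas here 2 + 2 is reduced to a Const up front, so wrapping it
-- in a CacheExpr would buy nothing:
SELECT * FROM events WHERE id > 2 + 2;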

> You probably need at least a three-way indicator
> returned from each recursion level: subexpression is not cacheable
> (eg it contains a Var or volatile function); subexpression is cacheable
> but there is no reason to do so (default situation); or subexpression is
> cacheable and should be cached (it contains a Param or stable function).

That could be done -- give a special cacheability result in code paths
that return a constant -- but that seems like complicating the logic
and doesn't tell us anything we don't already know.



> * I'm having a hard time understanding what you did to simplify_function
> and friends; that's been whacked around quite a bit, in ways that are
> neither clearly correct nor clearly related to the goal of the patch.

Sure, I'll split this out as a separate patch and send it up.

> * I believe the unconditional datumCopy call in ExecEvalCacheExpr will
> dump core if the value is null and the type is pass-by-reference.

datumCopy already has a check for a NULL pointer. But good point about
skipping type properties lookup -- will change.

> * I think you should consider getting rid of the extra argument to
> ExecInitExpr, as well as the enable flag in CacheExprState, and instead
> driving cache-enable off a new flag added to EState

Will do.

> * In the same vein, I think it would be better to avoid adding
> the extra argument to eval_const_expressions_mutator and instead pass
> that state around in the existing "context" struct argument.

I'll have to see how ugly it gets to save & restore the cacheability
field in recursive calls.

> I am entirely unimpressed with the fact that you can't pass
> the extra argument through expression_tree_mutator and thus have to dumb
> the code down to fail to cache underneath any node type for which
> there's not explicit coding in eval_const_expressions_mutator.

Well I deliberately chose a whitelisting approach. Expression types
that aren't important enough to be constant-folded probably aren't
that important for caching either.

Do you think it's safe to assume that expression types we don't know
about are inherently cacheable?

> +  * NOTE: This function is not indempotent. Calling it twice over an 
> expression
> * It seems like some of the ugliness here is due to thinking that
> selectivity functions can't cope with CacheExprs.  Wouldn't it be a lot
> better to make them cope?

I thought centralizing this CacheExpr-stripping in one place was a
better idea than spraying it all around the code base. We can skip the
copy *and* the check this way.

Hmm, if I passed the context to insert_cache then we could avoid this problem.

We could do the same for RelabelType in estimation mode, but I don't
know how safe that is.

> * I don't think it's a good idea for
> simplify_and_arguments/simplify_or_arguments to arbitrarily reorder the
> arguments based on cacheability.  I know that we tell people not to
> depend on order of evaluation in AND/OR lists, but that doesn't mean
> they listen

Fair enough, will change.

> * I don't believe CacheExprs can be treated as always simple by
> ruleutils' isSimpleNode

Ok.

> * I don't think I believe either of the changes you made to plpgsql's
> exec_simple_check_node.

Will change.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Refactoring simplify_function (was: Caching constant stable expressions)

2012-03-10 Thread Marti Raudsepp
Hi list,

Per Tom's request, I split out this refactoring from my CacheExpr patch.

Basically I'm just centralizing the eval_const_expressions_mutator()
call on function arguments, from multiple different places to a single
location. Without this, it would be a lot harder to implement argument
caching logic in the CacheExpr patch.

The old call tree goes like:
case T_FuncExpr:
-> eval_const_expressions_mutator(args)
-> simplify_function
   -> reorder_function_arguments
   -> eval_const_expressions_mutator(args)
   -> add_function_defaults
   -> eval_const_expressions_mutator(args)

New call tree:
case T_FuncExpr:
-> simplify_function
   -> simplify_copy_function_arguments
  -> reorder_function_arguments
  -> add_function_defaults
   -> eval_const_expressions_mutator(args)

Assumptions being made:
* The code didn't depend on expanding existing arguments before adding defaults
* Operator calls don't need to expand default arguments -- it's
currently impossible to CREATE OPERATOR with a function that has
unspecified arguments

Other changes:
* simplify_function no longer needs a 'bool has_named_args' argument
(it finds out itself).
* I added 'bool mutate_args' in its place, since some callers need to
mutate args beforehand.
* reorder_function_arguments no longer needs to keep track of which
default args were added.

Regards,
Marti
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
new file mode 100644
index cd3da46..9485e95
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*** static List *simplify_and_arguments(List
*** 109,124 
  static Node *simplify_boolean_equality(Oid opno, List *args);
  static Expr *simplify_function(Expr *oldexpr, Oid funcid,
    Oid result_type, int32 result_typmod, Oid result_collid,
!   Oid input_collid, List **args,
!   bool has_named_args,
!   bool allow_inline,
    eval_const_expressions_context *context);
  static List *reorder_function_arguments(List *args, Oid result_type,
! 		   HeapTuple func_tuple,
! 		   eval_const_expressions_context *context);
  static List *add_function_defaults(List *args, Oid result_type,
! 	  HeapTuple func_tuple,
! 	  eval_const_expressions_context *context);
  static List *fetch_function_defaults(HeapTuple func_tuple);
  static void recheck_cast_function_args(List *args, Oid result_type,
  		   HeapTuple func_tuple);
--- 109,123 
  static Node *simplify_boolean_equality(Oid opno, List *args);
  static Expr *simplify_function(Expr *oldexpr, Oid funcid,
    Oid result_type, int32 result_typmod, Oid result_collid,
!   Oid input_collid, List **args_p,
!   bool mutate_args, bool allow_inline,
    eval_const_expressions_context *context);
+ static List *simplify_copy_function_arguments(List *old_args, Oid result_type,
+ 			HeapTuple func_tuple);
  static List *reorder_function_arguments(List *args, Oid result_type,
! 		   HeapTuple func_tuple);
  static List *add_function_defaults(List *args, Oid result_type,
! 	  HeapTuple func_tuple);
  static List *fetch_function_defaults(HeapTuple func_tuple);
  static void recheck_cast_function_args(List *args, Oid result_type,
  		   HeapTuple func_tuple);
*** eval_const_expressions_mutator(Node *nod
*** 2303,2336 
  		case T_FuncExpr:
  			{
  FuncExpr   *expr = (FuncExpr *) node;
! List	   *args;
! bool		has_named_args;
  Expr	   *simple;
  FuncExpr   *newexpr;
- ListCell   *lc;
- 
- /*
-  * Reduce constants in the FuncExpr's arguments, and check to
-  * see if there are any named args.
-  */
- args = NIL;
- has_named_args = false;
- foreach(lc, expr->args)
- {
- 	Node	   *arg = (Node *) lfirst(lc);
- 
- 	arg = eval_const_expressions_mutator(arg, context);
- 	if (IsA(arg, NamedArgExpr))
- 		has_named_args = true;
- 	args = lappend(args, arg);
- }
  
  /*
   * Code for op/func reduction is pretty bulky, so split it out
   * as a separate function.	Note: exprTypmod normally returns
   * -1 for a FuncExpr, but not when the node is recognizably a
   * length coercion; we want to preserve the typmod in the
!  * eventual Const if so.
   */
  simple = simplify_function((Expr *) expr,
  		   expr->funcid,
--- 2302,2318 
  		case T_FuncExpr:
  			{
  FuncExpr   *expr = (FuncExpr *) node;
! List	   *args = expr->args;
  Expr	   *simple;
  FuncExpr   *newexpr;
  
  /*
   * Code for op/func reduction is pretty bulky, so split it out
   * as a separate function.	Note: exprTypmod normally returns
   * -1 for a FuncExpr, but not when the node is recognizably a
   * length coercion; we want to preserve the typmod in the
!  * eventual Const if so. This function also mutates the
!  * argument list.
   */
  simple = simplify_function((Expr *) expr,
  		  

[HACKERS] GitHub mirror is not updating

2012-03-12 Thread Marti Raudsepp
Hi list,

I don't know who's maintaining the PostgreSQL GitHub mirror, but it
hasn't been updated for 6 days now:
https://github.com/postgres/postgres

Just letting you know.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Refactoring simplify_function (was: Caching constant stable expressions)

2012-03-23 Thread Marti Raudsepp
On Sat, Mar 24, 2012 at 01:17, Tom Lane  wrote:
> I've applied a slightly-modified version of this after reconciling it
> with the protransform fixes.

Cool, thanks!

> I assume you are going to submit a rebased
> version of the main CacheExpr patch?

Yeah, I'm still working on addressing the comments from your last email.

Haven't had much time to work on it for the last 2 weeks, but I hope
to finish most of it this weekend.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bison 3.0 updates

2013-07-29 Thread Marti Raudsepp
Hi,

> On 07/29/2013 01:05 AM, Tom Lane wrote:
>> Buildfarm member anchovy has been failing for the last couple of days,
>> evidently because its owner just couldn't wait to adopt bison 3.0,
>> which is all of 3 days old.

Hmm? Anchovy upgrades automatically to the newest Arch Linux packages daily.

I assumed that's a good thing -- the purpose of the buildfarm is to
test PostgreSQL in a variety of real-life environments? Arch Linux is
one such environment that adopts new packages very quickly. If Arch
users are unable to build PostgreSQL then surely it's good to be
notified by the buildfarm before real users start reporting problems?

I don't mean to sound reluctant, I'm open to suggestions, but please
help me understand why this is bad.

On Mon, Jul 29, 2013 at 4:59 PM, Andrew Dunstan  wrote:
> Reminder to buildfarm animal owners: if you upgrade software please make
> sure your buildfarm members are still working.

If I had checked the machine's status manually after upgrading, the
best course of action would have been to report the incompatibility to
the PostgreSQL mailing lists. The end result would be the same -- in
that sense, it was "working". Manual upgrades would only delay that
report, not prevent it?

I realize there have been problems with anchovy that are my own fault
and I'm sorry about that. I check the buildfarm status every week or
two.

> And if the upgrade involves
> the OS or the compiler, please use the update_personality.pl script to update
> the server.

Is it OK to run update_personality.pl automatically every day from crontab?

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bison 3.0 updates

2013-07-29 Thread Marti Raudsepp
On Mon, Jul 29, 2013 at 5:53 PM, Andres Freund  wrote:
> Both the
> gcc 4.8 and the bison 3.0 problems are things the project needs to know
> about

Perl 5.18 too: 
http://www.postgresql.org/message-id/2825.1370226...@sss.pgh.pa.us

Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bison 3.0 updates

2013-07-29 Thread Marti Raudsepp
On Mon, Jul 29, 2013 at 9:15 PM, Andrew Dunstan  wrote:
> There has to be something between "set in stone and never changes" and
> "auto-updates everything every 24 hours" that would suit us.

Well, sure, I could change the update frequency. But again, it seems
like delaying the inevitable.

> I'm toying with the idea of a check_upgrade mode for the buildfarm client
> where it wouldn't do a git pull, but would report changes if the build
> result was different from the previous result. You'd run this immediately
> after pulling new changes into your OS. Other, brighter ideas welcome.

What would be the right course of action if check_upgrade fails? Is it
reasonable to expect buildfarm volunteers to downgrade the system and
postpone upgrading until the problem is resolved?

Or do you think the member should be removed from the farm until the
build succeeds again? Sounds like worst of both worlds.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bison 3.0 updates

2013-07-29 Thread Marti Raudsepp
On Tue, Jul 30, 2013 at 3:56 AM, Noah Misch  wrote:
> Agreed.  Let's stick an "Updates OS packages daily, regularly acquiring
> bleeding-edge upstream releases" note on the buildfarm status page

FWIW, I added "[updated daily]" to the OS version field.

I haven't changed other configuration yet, will wait until there's a consensus.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] delta relations in AFTER triggers

2014-08-28 Thread Marti Raudsepp
On Thu, Aug 28, 2014 at 12:03 AM, Kevin Grittner  wrote:
>> In essence, make the relations work like PL/pgSQL
>> variables do. If you squint a little, the new/old relation is a variable
>> from the function's point of view, and a parameter from the
>> planner/executor's point of view. It's just a variable/parameter that
>> holds a set of tuples, instead of a single Datum.

> will be surprised if someone doesn't latch onto this to create a
> new "declared temporary table" that only exists within the scope of
> a compound statement (i.e., a BEGIN/END block).  You would DECLARE
> them just like you would a scalar variable in a PL, and they would
> have the same scope.
>
> I'll take a look at doing this in the next couple days, and see
> whether doing it that way is as easy as it seems on the face of it.

We already have all this with refcursor variables. Admittedly cursors
have some weird semantics (mutable position) and currently they're
cumbersome to combine into queries, but would a new "relation
variable" be sufficiently better to warrant a new type?

Why not extend refcursors and make them easier to use instead?
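
To show what I mean by cumbersome, here's how passing a set of rows
around looks today (just a sketch, made-up names):

CREATE FUNCTION log_changes(changed refcursor) RETURNS void AS $$
DECLARE
    r record;
BEGIN
    LOOP
        FETCH changed INTO r;  -- FETCH is the only way to consume it
        EXIT WHEN NOT FOUND;
        RAISE NOTICE 'changed row: %', r;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

You can only walk the cursor row by row; there's no way to join or
aggregate over it in a plain query. If cursor variables could appear
in a FROM clause, they would cover the transition-table use case too.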

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] delta relations in AFTER triggers

2014-09-02 Thread Marti Raudsepp
On Mon, Sep 1, 2014 at 9:09 PM, Tom Lane  wrote:
> OTOH, I agree with Kevin that the things we're talking about are
> lightweight relations not variables.

My worry is that PL/pgSQL and Postgres's SQL dialect are turning into a
Frankenstein monster with many ways to do the same thing, each having
different semantics that require effort to reason about.

Variables and function arguments are non-controversial; every
experienced coder understands their semantics without thinking twice
-- even if they're not familiar with Postgres.

The concept of "lightweight relations" that pop into existence when a
certain kind of trigger definition is used somewhere in the function
stack, without a CREATE TABLE, without being discoverable in
information_schema etc., I find needs some more justification than
I've seen in this thread. So far I've only heard that it's more
convenient to implement in the current PostgreSQL code base.

I'm sure more questions would pop up in practice, but as Heikki
mentioned: Are such relations also visible to other functions called
by the trigger function?
* If yes, this introduces non-obvious dependencies between functions.
What happens when one trigger with delta relations invokes another
trigger, does the previous one get shadowed or overwritten? What are
the interactions with search_path? Can an unprivileged function
override relation names when calling a SECURITY DEFINER function?

* If not, this further inhibits developers from properly modularizing
their trigger code (this is already a problem due to the current magic
trigger variables).

Even if these questions have reasonable answers, it takes mental
effort to remember the details. Procedure code debugging, especially
triggers, is hard enough due to poor tooling; increasing the cognitive
load should not be done lightly.

You could argue that CREATE TEMP TABLE already has some of these
problems, but it's very rare that people actually need to use that. If
delta relations get built on this new mechanism, avoiding it won't be
an option anymore.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] delta relations in AFTER triggers

2014-09-03 Thread Marti Raudsepp
On Wed, Sep 3, 2014 at 10:49 PM, Kevin Grittner  wrote:
> Marti Raudsepp  wrote:
>> The concept of "lightweight relations" that pop into existence when a
>> certain kind of trigger definition is used somewhere in the function
>> stack, without a CREATE TABLE, without being discoverable in
>> information_schema etc., I find needs some more justification than
>> I've seen in this thread. So far I've only heard that it's more
>> convenient to implement in the current PostgreSQL code base.
>
> It is required by the SQL standard.

I had a cursory read of the SQL 20nn draft and I don't get this
impression. The only place I could find discussing the behavior of
"transition tables" is in Foundation "4.39.1 General description of
triggers", which says:

"Special variables make the data in the transition table(s) available
to the triggered action. For a statement-level
trigger the variable is one whose value is a transition table."

There is no information about the scoping of such variables, so I
assume it refers to a regular locally scoped variable.

Did I miss something? Are you reading a different version of the spec?

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Join consolidation / Removing duplicate joins

2014-09-17 Thread Marti Raudsepp
On Wed, Sep 17, 2014 at 2:00 PM, David Rowley  wrote:
> Anyway... I've been thinking of writing some code that converts these sub
> plans into left joins where it can be proved that the subquery would only at
> most produce 1 row

> Does anyone have any thoughts on this?

+1, I've thought about this part before. There is already precedent
for inlining FROM clause subqueries into the main query; it would be
nice to do that for correlated subqueries as well. It seems we've been
adding features to the planner without fully exploiting opportunities
for normalization and consolidation of optimization techniques.

I think it's not even necessary to prove uniqueness of the subquery as
you describe. Now that 9.3 added LATERAL, a correlated subquery can be
seen as a special case of LATERAL LEFT JOIN with an additional check
to raise an error if >1 rows are returned from the inner side. And you
could optionally elide the error check if you can prove uniqueness.
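
Schematically (untested, made-up tables), a scalar correlated subquery
like

SELECT (SELECT t2.y FROM t2 WHERE t2.id = t1.id) AS y FROM t1;

would be treated as

SELECT x.y FROM t1
LEFT JOIN LATERAL (SELECT t2.y FROM t2 WHERE t2.id = t1.id) x ON true;

plus the runtime check that x yields at most one row per outer row.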

Advantages:
1. Sufficiently simple lateral subqueries are already normalized into
ordinary JOINs with hash/merge support, so you would get that for free
(probably requires eliding the 1-row check).
2. We get rid of silliness like the explosion of SubPlan nodes for
each reference (see examples below).
3. Based on some naive testing, it seems that 9.5devel performs
slightly better with NestLoop LATERAL subqueries than SubPlan
correlated ones.
4. EXPLAIN output is easier to read, I find.

I suppose EXISTS/IN with correlated subqueries needs some different
treatment, as it can currently take advantage of the "hashed SubPlan"
optimization.
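
(The case I mean is something like

SELECT * FROM tenk1 WHERE unique1 NOT IN (SELECT unique2 FROM tenk2);

where EXPLAIN currently shows a "(NOT (hashed SubPlan 1))" filter: the
inner side is loaded into an in-memory hash table once instead of being
rescanned for every outer row.)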

Can anyone see any downsides? Perhaps one day we can get rid of
SubPlan entirely, would anyone miss it?


Example of SubPlan explosion:

regression=# create view foo1 as select *, (select ten as ten2 from
tenk2 where tenk1.unique1=tenk2.unique1) from tenk1;

regression=# explain analyze select * from foo1 where ten2 between 1 and 3;
 Seq Scan on tenk1  (cost=0.00..175782.08 rows= width=244) (actual
time=0.052..49.288 rows=3000 loops=1)
   Filter: (((SubPlan 2) >= 1) AND ((SubPlan 3) <= 3))
   Rows Removed by Filter: 7000
   SubPlan 1
 ->  Index Scan using tenk2_unique1 on tenk2  (cost=0.29..8.30
rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=3000)
   Index Cond: (tenk1.unique1 = unique1)
   SubPlan 2
 ->  Index Scan using tenk2_unique1 on tenk2 tenk2_1
(cost=0.29..8.30 rows=1 width=4) (actual time=0.002..0.002 rows=1
loops=1)
   Index Cond: (tenk1.unique1 = unique1)
   SubPlan 3
 ->  Index Scan using tenk2_unique1 on tenk2 tenk2_2
(cost=0.29..8.30 rows=1 width=4) (actual time=0.001..0.002 rows=1
loops=9000)
   Index Cond: (tenk1.unique1 = unique1)
 Execution time: 49.508 ms


LATERAL is a win even when using OFFSET 0 to prevent inlining:

regression=# create view foo3 as select * from tenk1 left join lateral
(select ten as ten2 from tenk2 where tenk1.unique1=tenk2.unique1
offset 0) x on true;

regression=# explain analyze select * from foo3 where ten2 between 1 and 3;
 Nested Loop  (cost=0.29..83733.00 rows=1 width=248) (actual
time=0.043..28.963 rows=3000 loops=1)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
(actual time=0.008..1.177 rows=1 loops=1)
   ->  Subquery Scan on x  (cost=0.29..8.32 rows=1 width=4) (actual
time=0.002..0.002 rows=0 loops=1)
 Filter: ((x.ten2 >= 1) AND (x.ten2 <= 3))
 Rows Removed by Filter: 1
 ->  Index Scan using tenk2_unique1 on tenk2  (cost=0.29..8.30
rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
   Index Cond: (tenk1.unique1 = unique1)
 Execution time: 29.186 ms


And if you could prove uniqueness of the inner side and inline it,
WHERE clauses can also be pushed down trivially:

regression=# create view foo2 as select * from tenk1 left join lateral
(select ten as ten2 from tenk2 where tenk1.unique1=tenk2.unique1) x on
true;

regression=# explain analyze select * from foo2 where ten2 between 1 and 3;
 Hash Join  (cost=532.50..1083.00 rows=3000 width=248) (actual
time=1.848..4.480 rows=3000 loops=1)
   Hash Cond: (tenk1.unique1 = tenk2.unique1)
   ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
(actual time=0.002..0.617 rows=1 loops=1)
   ->  Hash  (cost=495.00..495.00 rows=3000 width=8) (actual
time=1.837..1.837 rows=3000 loops=1)
 Buckets: 1024  Batches: 1  Memory Usage: 118kB
 ->  Seq Scan on tenk2  (cost=0.00..495.00 rows=3000 width=8)
(actual time=0.004..1.562 rows=3000 loops=1)
   Filter: ((ten >= 1) AND (ten <= 3))
   Rows Removed by Filter: 7000
 Execution time: 4.591 ms


Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Final Patch for GROUPING SETS

2014-09-17 Thread Marti Raudsepp
On Fri, Sep 12, 2014 at 9:41 PM, Andrew Gierth
 wrote:
> gsp1.patch - phase 1 code patch (full syntax, limited functionality)
> gsp2.patch - phase 2 code patch (adds full functionality using the
>  new chained aggregate mechanism)

I gave these a try by converting my current CTE-based queries into
CUBEs and it works as expected; query time is cut in half and the line
count is a quarter of the original. Thanks!

I only have a few trivial observations; if I'm getting too nitpicky
let me know. :)


Since you were asking for feedback on the EXPLAIN output on IRC, I'd
weigh in and say that having the groups on separate lines would be
significantly more readable, if that can be made to work with the
EXPLAIN printer without too much trouble. It took me a while to
understand what's going on in my queries due to longer table and column
names and wrapping; the comma separators between groups are hard to
distinguish.

So instead of:
 GroupAggregate
   Output: four, ten, hundred, count(*)
   Grouping Sets: (onek.four, onek.ten, onek.hundred), (onek.four,
onek.ten), (onek.four), ()

Perhaps print:
   Grouping Sets: (onek.four, onek.ten, onek.hundred)
  (onek.four, onek.ten)
  (onek.four)
  ()

Or maybe:
   Grouping Set: (onek.four, onek.ten, onek.hundred)
   Grouping Set: (onek.four, onek.ten)
   Grouping Set: (onek.four)
   Grouping Set: ()

Both seem to work with the explain.depesz.com parser, although the 1st
won't be aligned as nicely.


Do you think it would be reasonable to normalize single-set grouping
sets into a normal GROUP BY? Such queries would be capable of using
HashAggregate, but the current code doesn't allow that. For example:

set enable_sort=off;
explain select two, count(*) from onek group by grouping sets (two);
Could be equivalent to:
explain select two, count(*) from onek group by two;


I'd expect GROUP BY () to be fully equivalent to having no GROUP BY
clause, but there's a difference in explain output. The former
displays "Grouping Sets: ()" which is odd, since none of the grouping
set keywords were used.

# explain select count(*) from onek group by ();
 Aggregate  (cost=77.78..77.79 rows=1 width=0)
   Grouping Sets: ()
   ->  Index Only Scan using onek_stringu1 on onek  (cost=0.28..75.28
rows=1000 width=0)

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Final Patch for GROUPING SETS

2014-09-19 Thread Marti Raudsepp
On Fri, Sep 19, 2014 at 4:45 AM, Andrew Gierth
 wrote:
>  GroupAggregate  (cost=1122.39..1197.48 rows=9 width=8)
>Group Key: two, four
>Group Key: two
>Group Key: ()

>   "Grouping Sets": [
> ["two", "four"],
> ["two"],
> []

+1 looks good to me.

> (yaml format)
> Grouping Sets:
>   - - "two"
> - "four"
>   - - "two"
>   -

Now this is weird. But is anyone actually using the YAML output format, or
was it implemented simply "because we can"?

>  Marti> Do you think it would be reasonable to normalize single-set
>  Marti> grouping sets into a normal GROUP BY?
> It's certainly possible, though it would seem somewhat odd to write
> queries that way.

The reason I bring this up is that queries are frequently dynamically
generated by programs. Coders are unlikely to special-case SQL
generation when there's just a single grouping set. And that's the
power of relational databases: the optimization work is done in the
database pretty much transparently to the coder (when it works, that
is).

> would you want the original syntax preserved in views

Doesn't matter IMO.

>  Marti> I'd expect GROUP BY () to be fully equivalent to having no
>  Marti> GROUP BY clause, but there's a difference in explain
>  Marti> output. The former displays "Grouping Sets: ()" which is odd,
>  Marti> since none of the grouping set keywords were used.
> That's an implementation artifact, in the sense that we preserve the
> fact that GROUP BY () was used by using an empty grouping set. Is it
> a problem, really, that it shows up that way in explain?

No, not really a problem. :)

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Simplify EXISTS subqueries containing LIMIT

2014-10-02 Thread Marti Raudsepp
Hi list,

Attached patch allows semijoin/antijoin/hashed SubPlan optimization
when an EXISTS subquery contains a LIMIT clause with a positive
constant. It seems to be a fairly common meme to put LIMIT 1 into
EXISTS() subqueries, and it even makes sense when you're not aware
that the database already does this optimization.
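
That is, with this patch a query of this shape (schematic, made-up
tables):

SELECT * FROM orders o
WHERE EXISTS (
    SELECT 1 FROM order_items i WHERE i.order_id = o.id LIMIT 1
);

can be planned as a semijoin, exactly as if the LIMIT 1 weren't there.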

Do we want this?

It has come up in #postgresql, and twice on mailing lists:
http://www.postgresql.org/message-id/53279529.2070...@freemail.hu
http://www.postgresql.org/message-id/50a36820.4030...@pingpong.net

And there may even be good reasons, such as writing performant
portable SQL code for Other Databases:
https://dev.mysql.com/doc/refman/5.1/en/optimizing-subqueries.html


The code is fairly straightforward. The only ugly part is that I need
to call eval_const_expressions() on the LIMIT expression because
subquery_planner() does subquery optimizations before constant
folding. A "LIMIT 1" clause will actually produce an int8(1)
expression. And I have to drag along PlannerInfo for that.

If it fails to yield a constant we've done some useless work, but it
should be nothing compared to the caller doing a deep copy of the
whole subquery.

Regards,
Marti
From 3448408121e7e32a12fc16403c9d48bce63503f5 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp 
Date: Wed, 1 Oct 2014 02:17:21 +0300
Subject: [PATCH] Simplify EXISTS subqueries containing LIMIT

---
 src/backend/optimizer/plan/subselect.c  | 40 +++--
 src/test/regress/expected/subselect.out | 14 
 src/test/regress/sql/subselect.sql  |  7 ++
 3 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 3e7dc85..09b153e 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -70,7 +70,7 @@ static Node *convert_testexpr_mutator(Node *node,
 static bool subplan_is_hashable(Plan *plan);
 static bool testexpr_is_hashable(Node *testexpr);
 static bool hash_ok_operator(OpExpr *expr);
-static bool simplify_EXISTS_query(Query *query);
+static bool simplify_EXISTS_query(PlannerInfo *root, Query *query);
 static Query *convert_EXISTS_to_ANY(PlannerInfo *root, Query *subselect,
 	  Node **testexpr, List **paramIds);
 static Node *replace_correlation_vars_mutator(Node *node, PlannerInfo *root);
@@ -452,7 +452,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	 * If it's an EXISTS subplan, we might be able to simplify it.
 	 */
 	if (subLinkType == EXISTS_SUBLINK)
-		simple_exists = simplify_EXISTS_query(subquery);
+		simple_exists = simplify_EXISTS_query(root, subquery);
 
 	/*
 	 * For an EXISTS subplan, tell lower-level planner to expect that only the
@@ -518,7 +518,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 		/* Make a second copy of the original subquery */
 		subquery = (Query *) copyObject(orig_subquery);
 		/* and re-simplify */
-		simple_exists = simplify_EXISTS_query(subquery);
+		simple_exists = simplify_EXISTS_query(root, subquery);
 		Assert(simple_exists);
 		/* See if it can be converted to an ANY query */
 		subquery = convert_EXISTS_to_ANY(root, subquery,
@@ -1359,7 +1359,7 @@ convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 * targetlist, we have to fail, because the pullup operation leaves us
 	 * with noplace to evaluate the targetlist.
 	 */
-	if (!simplify_EXISTS_query(subselect))
+	if (!simplify_EXISTS_query(root, subselect))
 		return NULL;
 
 	/*
@@ -1486,11 +1486,11 @@ convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,
  * Returns TRUE if was able to discard the targetlist, else FALSE.
  */
 static bool
-simplify_EXISTS_query(Query *query)
+simplify_EXISTS_query(PlannerInfo *root, Query *query)
 {
 	/*
 	 * We don't try to simplify at all if the query uses set operations,
-	 * aggregates, modifying CTEs, HAVING, LIMIT/OFFSET, or FOR UPDATE/SHARE;
+	 * aggregates, modifying CTEs, HAVING, OFFSET, or FOR UPDATE/SHARE;
 	 * none of these seem likely in normal usage and their possible effects
 	 * are complex.
 	 */
@@ -1501,7 +1501,6 @@ simplify_EXISTS_query(Query *query)
 		query->hasModifyingCTE ||
 		query->havingQual ||
 		query->limitOffset ||
-		query->limitCount ||
 		query->rowMarks)
 		return false;
 
@@ -1513,6 +1512,33 @@ simplify_EXISTS_query(Query *query)
 		return false;
 
 	/*
+	 * LIMIT clause can be removed if it's a positive constant or ALL, to
+	 * prevent it from being an optimization barrier. It's a common meme to put
+	 * LIMIT 1 within EXISTS subqueries.
+	 */
+	if (query->limitCount)
+	{
+		/*
+		 * eval_const_expressions has not been called yet by subquery_planner,
+		 * may still contain int64 coercions etc.
+		 */
+		Node	   *node = eval_const_expressions(root, query->limitCount);
+		Const	   *limit;
+
+		if (! IsA(node, Const))
+			return false;
+
+		limit = (Const *) node;
+		Asser

Re: [HACKERS] CREATE IF NOT EXISTS INDEX

2014-10-02 Thread Marti Raudsepp
On Wed, Oct 1, 2014 at 2:42 PM, Fabrízio de Royes Mello
 wrote:
> So, what's the correct/best grammar?
> CREATE [ IF NOT EXISTS ] [ UNIQUE ] INDEX index_name
> or
> CREATE [ UNIQUE ] INDEX [ IF NOT EXISTS ] index_name

I've elected myself as the reviewer for this patch. Here are some
preliminary comments...

I agree with José. The 2nd is more consistent given the other syntaxes:
  CREATE { TABLE | SCHEMA | EXTENSION | ... } IF NOT EXISTS name ...
It's also compatible with SQLite's grammar:
https://www.sqlite.org/lang_createindex.html

Do we want to enforce an order on the keywords or allow both?
  CREATE INDEX IF NOT EXISTS CONCURRENTLY foo ...
  CREATE INDEX CONCURRENTLY IF NOT EXISTS foo ...

It's probably very rare to use both keywords at the same time, so I'd
prefer only the 2nd, unless someone else chimes in.

Documentation: I would prefer if the explanation were consistent with
the description for ALTER TABLE/EXTENSION; just copy it and replace
"relation" with "index".

+ ereport(NOTICE,
+ (errcode(ERRCODE_DUPLICATE_TABLE),
+  errmsg("relation \"%s\" already exists, skipping",
+ indexRelationName)));

1. Clearly "relation" should be "index".
2. Use ERRCODE_DUPLICATE_OBJECT not TABLE

+ if (n->if_not_exists && n->idxname == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+  errmsg("IF NOT EXISTS requires that you
name the index."),

I think ERRCODE_SYNTAX_ERROR makes more sense, it's something that we
decided we *don't want* to support.

- write_msg(NULL, "reading row-security enabled for table \"%s\"",
+ write_msg(NULL, "reading row-security enabled for table \"%s\"\n",

???

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of "IF NOT EXISTS" to others "CREATE" statements

2014-10-02 Thread Marti Raudsepp
On Tue, Aug 26, 2014 at 4:20 PM, Heikki Linnakangas
 wrote:
> On 04/14/2014 10:31 PM, Fabrízio de Royes Mello wrote:
>> The attached patch contains CINE for sequences.
>>
>> I just strip this code from the patch rejected before.
>
> Committed with minor changes

Hmm, the CommitFest app lists Marko Tiikkaja as the reviewer, but I
can't find his review anywhere...

The documentation claims:
CREATE [ IF NOT EXISTS ] SEQUENCE name
But grammar implements it the other way around:
CREATE SEQUENCE IF NOT EXISTS name;

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE IF NOT EXISTS INDEX

2014-10-02 Thread Marti Raudsepp
On Fri, Oct 3, 2014 at 2:15 AM, Marti Raudsepp  wrote:
> + ereport(NOTICE,
> + (errcode(ERRCODE_DUPLICATE_TABLE),
> +  errmsg("relation \"%s\" already exists, skipping",
> + indexRelationName)));
>
> 1. Clearly "relation" should be "index".
> 2. Use ERRCODE_DUPLICATE_OBJECT not TABLE

My bad, this code is OK. The current code already uses "relation" and
TABLE elsewhere because indexes share the same namespace with tables.

+ /*
+  * Throw an exception when IF NOT EXISTS is used without a named
+  * index
+  */

I'd say "without an index name". And the line goes beyond 80 characters.

I would also move this check to after all the attributes have been
assigned, rather than splitting the assignments in half.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE IF NOT EXISTS INDEX

2014-10-03 Thread Marti Raudsepp
"On Fri, Oct 3, 2014 at 6:25 AM, Fabrízio de Royes Mello
 wrote:
>> Documentation: I would prefer if the explanation were consistent with

> "Do not throw an error if the index already exists. A notice is issued in
> this case."
> Fixed in that way. Ok?

And also "Note that there is no guarantee that the existing index is
anything like the one that would have been created."

>> I think ERRCODE_SYNTAX_ERROR makes more sense, it's something that we
>> decided we *don't want* to support.
> I don't think so. It's the same as CREATE SCHEMA IF NOT EXISTS that not
> support to include schema elements.

IMO that's wrong too; the CREATE SCHEMA documentation doesn't list it
as valid syntax.

But now that you split the syntax in two, you can simply replace
"opt_index_name" with "index_name" and it will naturally cause a
syntax error without the need for an if(). What do you think?
Patch attached, which applies on top of your v4 patch.

On Fri, Oct 3, 2014 at 6:29 AM, Fabrízio de Royes Mello
 wrote:
>> I would also move this check to after all the attributes have been
>> assigned, rather than splitting the assignments in half.
> Why? If you see other places in gram.y it's a common usage...

Looks cleaner to me: first input all the fields, then validate. And
there are examples like this too, like "COPY select_with_parens". But
this is moot now, if you agree with my grammar change.

On Fri, Oct 3, 2014 at 6:35 AM, Fabrízio de Royes Mello
 wrote:
> And just to remember please add your review to commitfest

Thanks for reminding, I always forget to update the CommitFest... :(

Regards,
Marti


0001-Simplify-CREATE-INDEX-IF-NOT-EXISTS-grammar.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE IF NOT EXISTS INDEX

2014-10-05 Thread Marti Raudsepp
On Fri, Oct 3, 2014 at 7:25 PM, Fabrízio de Royes Mello
 wrote:
> I agree with your grammar change.

Cool.

> The version 5 (attached) contains all discussed until now.

From the documentation:

CREATE INDEX ... [ IF NOT EXISTS name | name ] ON ...

Maybe I'm just slow, but it took me a few minutes to understand what
this means. :)

I would add a human-language explanation to the IF NOT EXISTS description:
  Index name is required when IF NOT EXISTS is specified


You have resurrected this bit again, which now conflicts with git master...

- write_msg(NULL, "reading row-security enabled for table \"%s\"",
+ write_msg(NULL, "reading row-security enabled for table \"%s\"\n",


  n->concurrent = $4;
+ n->if_not_exists = false;
  n->idxname = $5;

Minor stylistic thing: now that this is a constant, I would move it to
the end together with other constant assignments, and follow the
struct's field ordering (in both code paths):

n->isconstraint = false;
n->deferrable = false;
n->initdeferred = false;
n->if_not_exists = false;

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE IF NOT EXISTS INDEX

2014-10-05 Thread Marti Raudsepp
On Mon, Oct 6, 2014 at 4:17 AM, Fabrízio de Royes Mello
 wrote:
> On Sun, Oct 5, 2014 at 9:52 AM, Marti Raudsepp  wrote:
>> CREATE INDEX ... [ IF NOT EXISTS name | name ] ON ...
>>
>> Maybe I'm just slow, but it took me a few minutes to understand what
>> this means. :)
>
> Well, I try to show that "IF NOT EXISTS" require the "name". Is this wrong?

No, I'm sorry, you misunderstood me. It was totally correct before, it
just wasn't easy to understand at first.

> CREATE INDEX ... [ IF NOT EXISTS [ name ] ] ON ...

I think this one is wrong now. It suggests these are valid syntaxes:
CREATE INDEX ... ON ...
CREATE INDEX ... IF NOT EXISTS ON ... <-- wrong
CREATE INDEX ... IF NOT EXISTS name ON ...

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

