Re: JIT compiling with LLVM v9.1

2018-02-03 Thread Andres Freund
Hi,

On 2018-02-02 18:21:12 -0800, Jeff Davis wrote:
> On Mon, Jan 29, 2018 at 1:53 AM, Andres Freund  wrote:
> >>   https://git.postgresql.org/git/users/andresfreund/postgres.git
> 
> There's a patch in there to change the scan order.

Yes - note it's "deactivated" at the moment in the series. I primarily
have it in there because I found profiles to be a lot more useful if
it's enabled, as otherwise the number of cache misses and related stalls
from heap accesses completely swamp everything else.

FWIW, there's 
http://archives.postgresql.org/message-id/20161030073655.rfa6nvbyk4w2kkpk%40alap3.anarazel.de


> I suggest that you rename the GUC "synchronize_seqscans" to something
> more generic like "optimize_scan_order", and use it to control your
> feature as well (after all, it's the same trade-off: weird scan order
> vs.  performance). Then, go ahead and commit it. FWIW I see about a 7%
> boost on my laptop[1] from that patch on master, without JIT or
> anything else.

Yea, that's roughly the same magnitude as what I'm seeing; some queries
show even bigger gains.

I'm not sure I want to commit this right now - ISTM we couldn't default
this to on without annoying a lot of people, and leaving the performance
wins on the table by default seems like a shame.  I think we should
probably either change the order we store things on the page by default
or only use the faster order if the scan above doesn't care about order
- the planner could figure that out easily.
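For illustration, the idea can be sketched outside PostgreSQL (every name and structure below is invented for the sketch; none of it mirrors the actual page layout code):

```python
# Sketch: return page items in physical (byte offset) order instead of
# line-pointer order. Structures are invented for illustration only.

def scan_in_physical_order(line_pointers):
    """line_pointers: list of (item_number, byte_offset) pairs.

    Sorting by byte_offset makes the subsequent heap accesses
    sequential and prefetchable, at the cost of a "weird" result
    order -- which only matters if the plan above cares about order.
    """
    return [item for item, _off in sorted(line_pointers, key=lambda lp: lp[1])]

# Items whose line-pointer order differs from their physical order:
lps = [(1, 7800), (2, 7600), (3, 7900), (4, 7500)]
print(scan_in_physical_order(lps))  # -> [4, 2, 1, 3]
```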

I personally don't think it is necessary to get this committed at the
same time as the JIT stuff, so I'm not planning to push very hard on
that front. Should you be interested in taking it up, please feel
entirely free.


> I also see you dropped "7ae518bf Centralize slot deforming logic a
> bit.". Was that intentional? Do we want it?

The problem is that there are probably some controversial things in
there. I think the checks I dropped largely make no sense, but I don't
really want to push hard for that.  I suspect we probably still want it,
but I do not want to put it into the critical path right now.


> I think I saw about a 2% gain here over master, but when I applied it
> on top of the fast scans it did not seem to add anything on top of
> fast scans. Seems reproducible, but I don't have an explanation.

Yea, that makes sense. The primary reason the patch is beneficial is
that it centralizes the place where the HeapTupleHeader is accessed to a
single piece of code (slot_deform_tuple()). In a lot of cases that first
access will result in a cache miss in all layers, requiring a memory
access. In slot_getsomeattrs() there's very little that can be done in
an out-of-order manner, whereas slot_deform_tuple() can continue
execution a bit further. Also, the latter will then go and sequentially
access the rest (or a significant part) of the tuple, so a centralized
access is more prefetchable.
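A rough sketch of that centralization, with all names and structures invented for illustration (this is not PostgreSQL's actual slot code):

```python
# Sketch of the centralization idea: getsomeattrs() does almost no work
# itself and immediately delegates to a single deforming routine, which
# walks the raw tuple sequentially and caches what it has extracted.

class TupleSlot:
    def __init__(self, raw_attrs):
        self.raw = raw_attrs     # stand-in for the on-page tuple
        self.values = []         # deformed attribute cache
        self.nvalid = 0          # how many attributes are deformed

    def deform_tuple(self, natts):
        # The single place that touches the raw tuple: one potential
        # cache miss here, then a sequential, prefetch-friendly walk.
        for i in range(self.nvalid, natts):
            self.values.append(self.raw[i])
        self.nvalid = max(self.nvalid, natts)

    def getsomeattrs(self, attnum):
        if attnum > self.nvalid:
            self.deform_tuple(attnum)   # delegate; no raw access here
        return self.values[:attnum]

slot = TupleSlot(["a", "b", "c", "d"])
print(slot.getsomeattrs(2))  # -> ['a', 'b']
print(slot.getsomeattrs(4))  # -> ['a', 'b', 'c', 'd'] (only c, d newly deformed)
```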


> And you are probably already working on this, but it would be helpful
> to get the following two patches in also:
> * 3c22065f Do execGrouping via expression eval
> * a9dde4aa Allow tupleslots to have a fixed tupledesc

Yes, I plan to resume working on whipping them into shape as soon as
I've finished the move to a shared library. This weekend I'm at FOSDEM,
so that's going to be after...

Thanks for looking!

Andres Freund



Re: JIT compiling with LLVM v9.1

2018-02-03 Thread Andres Freund
Hi,

On 2018-02-03 01:13:21 -0800, Andres Freund wrote:
> On 2018-02-02 18:21:12 -0800, Jeff Davis wrote:
> > I think I saw about a 2% gain here over master, but when I applied it
> > on top of the fast scans it did not seem to add anything on top of
> > fast scans. Seems reproducible, but I don't have an explanation.
> 
> Yea, that makes sense. The primary reason the patch is beneficial is
> that it centralizes the place where the HeapTupleHeader is accessed to a
> single piece of code (slot_deform_tuple()). In a lot of cases that first
> access will result in a cache miss in all layers, requiring a memory
> access. In slot_getsomeattrs() there's very little that can be done in
> an out-of-order manner, whereas slot_deform_tuple() can continue
> execution a bit further. Also, the latter will then go and sequentially
> access the rest (or a significant part) of the tuple, so a centralized
> access is more prefetchable.

Oops, I missed part of the argument here: the reason that isn't as large
an effect anymore with the scan order patch applied is that, due to the
better scan order, the accesses suddenly are much more likely to be
cacheable and prefetchable. So in that case the few additional
instructions and branches in slot_getsomeattrs/slot_getattr don't hurt
as much anymore. IIRC I could still see the effect, but it's a much
smaller win.

Greetings,

Andres Freund



Regexp named capture groups

2018-02-03 Thread Joel Jacobson
Hi hackers,

Is anyone working on this feature[1] also for PostgreSQL's regex engine?

I'm thinking it could work something like this:

joel=# \df regexp_match
                                 List of functions
   Schema   |     Name     | Result data type | Argument data types |  Type
------------+--------------+------------------+---------------------+--------
 pg_catalog | regexp_match | jsonb            | text, text          | normal
 pg_catalog | regexp_match | jsonb            | text, text, text    | normal
(2 rows)

joel=#* SELECT regexp_match_named(
joel(#*'2018-12-31',
joel(#*'(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})'
joel(#* );
              regexp_match_named
----------------------------------------------
 {"day": "31", "year": "2018", "month": "12"}
(1 row)
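For comparison, engines with named-group support already expose exactly this mapping; a minimal Python equivalent of the proposed function:

```python
import re

# Hypothetical regexp_match_named() behavior, reproduced with Python's
# re module, which supports the (?P<name>...) named-group syntax:
m = re.match(r'(?P<year>[0-9]{4})-(?P<month>[0-9]{2})-(?P<day>[0-9]{2})',
             '2018-12-31')
print(m.groupdict())  # -> {'year': '2018', 'month': '12', 'day': '31'}
```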

I think this feature would be awesome, for the reasons mentioned in [1], quote:

"Referring to capture groups via numbers has several disadvantages:
1. Finding the number of a capture group is a hassle: you have to
count parentheses.
2. You need to see the regular expression if you want to understand
what the groups are for.
3. If you change the order of the capture groups, you also have to
change the matching code."

[1] http://2ality.com/2017/05/regexp-named-capture-groups.html

Best regards,

Joel Jacobson



Re: Regexp named capture groups

2018-02-03 Thread Pavel Stehule
2018-02-03 11:19 GMT+01:00 Joel Jacobson :

> Hi hackers,
>
> Is anyone working on this feature[1] also for PostgreSQL's regex engine?
>
> I'm thinking it could work something like this:
>
> joel=# \df regexp_match
>                                  List of functions
>    Schema   |     Name     | Result data type | Argument data types |  Type
> ------------+--------------+------------------+---------------------+--------
>  pg_catalog | regexp_match | jsonb            | text, text          | normal
>  pg_catalog | regexp_match | jsonb            | text, text, text    | normal
> (2 rows)
>
> joel=#* SELECT regexp_match_named(
> joel(#*'2018-12-31',
> joel(#*'(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})'
> joel(#* );
>           regexp_match_named
> ----------------------------------------------
>  {"day": "31", "year": "2018", "month": "12"}
> (1 row)
>
> I think this feature would be awesome, for the reasons mentioned in [1],
> quote:
>
> "Referring to capture groups via numbers has several disadvantages:
> 1. Finding the number of a capture group is a hassle: you have to
> count parentheses.
> 2. You need to see the regular expression if you want to understand
> what the groups are for.
> 3. If you change the order of the capture groups, you also have to
> change the matching code."
>
> [1] http://2ality.com/2017/05/regexp-named-capture-groups.html


looks like a nice feature

Pavel


>
>
> Best regards,
>
> Joel Jacobson
>
>


Re: Regexp named capture groups

2018-02-03 Thread Michael Paquier
On Sat, Feb 03, 2018 at 01:55:31PM +0100, Pavel Stehule wrote:
> 2018-02-03 11:19 GMT+01:00 Joel Jacobson :
>> Is anyone working on this feature[1] also for PostgreSQL's regex
>> engine?

Not that I know of.

>> I think this feature would be awesome, for the reasons mentioned in [1],
>> quote:
>>
>> "Referring to capture groups via numbers has several disadvantages:
>> 1. Finding the number of a capture group is a hassle: you have to
>> count parentheses.
>> 2. You need to see the regular expression if you want to understand
>> what the groups are for.
>> 3. If you change the order of the capture groups, you also have to
>> change the matching code."
>>
>> [1] http://2ality.com/2017/05/regexp-named-capture-groups.html
> 
> looks like nice feature

Yes, it looks like this could allow the simplification of equivalent
queries, which I guess would otherwise use a CTE to achieve the same.
--
Michael




Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2018-02-03 Thread Andreas Karlsson

On 02/01/2018 04:17 PM, Mark Rofail wrote:
> Since it's a polymorphic function it only passes if the `anyarray` is
> the same type as the `anyelement`, so we are sure they are the same type.
> Can't we get the type from the anyarray? The type is already stored in
> `arr_type`.


In theory, yes, but I saw no obvious way to get it without major changes 
to the code. Unless I am missing something, of course.


If I am right, my recommendation for getting this patch in is to
initially skip the new operators and go back to the version of the patch
which uses @>.


Andreas



Re: pie-in-sky idea: 'sensitive' function parameters

2018-02-03 Thread Chapman Flack
On 02/02/18 22:46, Tom Lane wrote:

> ... the problem with this idea is that knowledge that the item ought to be
> hidden would be obtained only very late in the parsing process.  So for
> example if you fat-fingered something just to the left of the function
> call in the query text, or the name of the function itself, your password
> would still get exposed in the log.

That didn't seem like a deal-breaker to me, as the common case I had
imagined for it would be where the query is coded into something (a webapp,
say) that just has some bits of sensitive data to pass in, and would surely
just be tested on dummy data until it was clear the canned query was at
least free of fat-finger issues.

Indeed, should the feature have to be restricted for practicality's sake
to only working with bound parameters and certain protocol variants, it
might not be easy or even possible to use it improvisationally from psql,
but that might be an acceptable limitation.

On 02/03/18 02:14, Craig Ringer wrote:

> About the only time I think it's really viable to pursue is if it's
> restricted to bind parameters. We only log those later and more
> selectively as it is, so it seems much more reasonable to say "I never
> want  to appear in the logs".

And I think that could be an acceptable restriction. One way could be to
use a simple flag to accompany the parameter binding from the client, which
would be recognized early enough to keep the parameter out of the logs,
but also checked later when the function info is available, throwing an
error if it was used for a parameter not declared sensitive, or not used
for a parameter that is.
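A toy sketch of that check (entirely hypothetical -- no such protocol field or server-side check exists today; all names are invented):

```python
# Hypothetical server-side handling of a per-parameter "sensitive" flag.
# No such protocol field exists; names are invented for the sketch.

def bind_and_check(param_values, client_flags, declared_sensitive):
    # Flagged values are kept out of the log text unconditionally,
    # before anything about the target function is known.
    logged = ['<hidden>' if flagged else value
              for value, flagged in zip(param_values, client_flags)]
    # Later, once the function definition is available, mismatches
    # between the client's flags and the declaration are an error.
    if client_flags != declared_sensitive:
        raise ValueError("sensitivity flags do not match declaration")
    return logged

print(bind_and_check(['alice', 's3cret'], [False, True], [False, True]))
# -> ['alice', '<hidden>']
```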

Or it could even have to be used in a prepared statement. A little
clunkier and one more round trip to the server, but probably not onerous,
and by the time PREPARE completes, surely it can be known which parameter
slots are declared that way.

Still wouldn't account for internal statements logged or errors thrown
by whatever the function does, but maybe the existing machinery for
declaring functions leakproof is at least a start on that?

-Chap



Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2018-02-03 Thread Mark Rofail
On Sat, 3 Feb 2018 at 5:14 pm, Andreas Karlsson  wrote:

> If I am right my recommendation for getting this patch in is to
> initially skip the new operators and go back to the version of the patch
> which uses @>.


We really need an initial patch to get accepted and postpone anything that
won't affect the core mechanics of the patch.

So yes, I move towards delaying the introduction of @>> to the patch.

A new patch will be ready by tomorrow.

Best Regards,
Mark Rofail



Re: [HACKERS] MERGE SQL Statement for PG11

2018-02-03 Thread Andreas Seltenreich
Hi,

as promised in Brussels, I taught sqlsmith about MERGE and did some
testing with merge.v14.patch applied on master at 9aef173163.

So far, it found a couple of failing assertions and a suspicious error
message.  Details and testcases against the regression database below.

regards,
Andreas

-- TRAP: FailedAssertion("!(!const Node*)(node))->type) == T_SubLink))", 
File: "clauses.c", Line: 440)
MERGE INTO public.brin_test as target_0
USING pg_catalog.pg_database as ref_0
  left join pg_catalog.pg_user_mapping as sample_0 tablesample system (2.3)
  on (pg_catalog.mul_d_interval(
cast(pg_catalog.pi() as float8),
cast(case when sample_0.umoptions is not NULL then (select write_lag 
from pg_catalog.pg_stat_replication limit 1 offset 2)
 else (select write_lag from pg_catalog.pg_stat_replication limit 1 
offset 2)
 end
   as "interval")) = (select intervalcol from public.brintest limit 1 
offset 2)
)
ON target_0.a = ref_0.encoding
WHEN NOT MATCHED AND cast(null as "timestamp") < cast(null as date)
   THEN INSERT VALUES ( 62, 6)
WHEN NOT MATCHED
  AND false
   THEN DO NOTHING;

-- TRAP: FailedAssertion("!(!const Node*)(node))->type) == T_SubLink))", 
File: "prepunion.c", Line: 2246)
MERGE INTO public.onek2 as target_0
USING public.prt1 as ref_0
  inner join public.tenk1 as ref_1
  on ((select t from public.btree_tall_tbl limit 1 offset 63)
 is not NULL)
ON target_0.stringu1 = ref_1.stringu1
WHEN NOT MATCHED THEN DO NOTHING;

-- TRAP: FailedAssertion("!(!const Node*)(node))->type) == T_Query))", 
File: "var.c", Line: 248)
MERGE INTO public.clstr_tst_inh as target_0
USING pg_catalog.pg_statio_sys_tables as ref_0
  left join public.rule_and_refint_t3 as ref_1
  on (((ref_0.heap_blks_hit is not NULL)
or (((select f1 from public.path_tbl limit 1 offset 5)
   >= (select thepath from public.shighway limit 1 offset 33)
  )
  or (cast(null as tsvector) <> cast(null as tsvector
  and (ref_0.toast_blks_read is not NULL))
ON target_0.d = ref_1.data
WHEN NOT MATCHED
  AND cast(null as int2) = pg_catalog.lastval()
   THEN DO NOTHING;

-- ERROR:  unrecognized node type: 114
MERGE INTO public.found_test_tbl as target_0
USING public.itest7a as ref_0
ON target_0.a = ref_0.a
WHEN NOT MATCHED
   THEN INSERT VALUES ((select a from public.rtest_t3 limit 1 offset 6));



Re: [HACKERS] MERGE SQL Statement for PG11

2018-02-03 Thread Simon Riggs
On 2 February 2018 at 01:59, Peter Geoghegan  wrote:
> On Thu, Feb 1, 2018 at 11:39 AM, Peter Geoghegan  wrote:
>> There is also the matter of subselects in the update targetlist, which
>> you similarly claim is only a problem of having the wrong error
>> message. The idea that those are unsupported for any principled reason
>> doesn't have any justification (unlike WHEN ... AND quals, and their
>> restrictions, which I agree are necessary). It clearly works with
>> Oracle, for example:
>>
>> http://sqlfiddle.com/#!4/2d5405/10
>>
>> You're reusing set_clause_list in the grammar, so I don't see why it
>> shouldn't work within MERGE in just the same way as it works in
>> UPDATE.
>
> Actually, I now wonder if there is a good reason for restrictions
> (e.g. no subselects) on WHEN ... AND quals, too. See this SQL fiddle
> from SQL Server:
>
> http://sqlfiddle.com/#!18/8acef/27
>
> I started looking at SQL Server's MERGE to verify that it also does
> not impose any restrictions on subselects in a MERGE UPDATE's
> targetlist, just like Oracle. Unsurprisingly, it does not. More
> surprisingly, I noticed that it also doesn't seem to impose
> restrictions on what can appear in WHEN ... AND quals.

You earlier agreed that subselects were not part of the Standard.

> Most
> surprisingly of all, even the main join ON condition itself can have
> subselects (though that's probably a bad idea).

That should be supported, though I can't think of why you'd want that either.

> What this boils down to is that I don't think that this part of your
> design is committable (from your recent v14):

So your opinion is that because the v14 patch doesn't include a feature
extension that is in Oracle and SQL Server, we cannot commit this
patch.

There are quite a few minor additional things in that category, and the
syntax of those two differs, so it's clearly impossible to match both
exactly.

That seems like poor reasoning on why we should block the patch.

If you would like to say how you think the design should look, it
might be possible to change it for this release. Changing it in the
future would not be blocked by committing without that.

>>> +* As top-level statements INSERT, UPDATE and DELETE have a Query,
>>> +* whereas with MERGE the individual actions do not require
>>> +* separate planning, only different handling in the executor.
>>> +* See nodeModifyTable handling of commandType CMD_MERGE.
>>> +*
>>> +* A sub-query can include the Target, but otherwise the sub-query
>>> +* cannot reference the outermost Target table at all.
>>> +*/
>>> +   qry->querySource = QSRC_PARSER;
>>> +   joinexpr = makeNode(JoinExpr);
>>> +   joinexpr->isNatural = false;
>>> +   joinexpr->alias = NULL;
>>> +   joinexpr->usingClause = NIL;
>>> +   joinexpr->quals = stmt->join_condition;
>
> I am willing to continue to engage with you on the concurrency issues
> for the time being, since that is the most pressing issue for the
> patch. We can get to this stuff later.

There are no concurrency issues. The patch gives the correct answer in
all cases, or an error to avoid problems. We've agreed that it is
desirable we remove some of those if possible, though they are there
as a result of our earlier discussions.

> Note that I consider cleaning
> up the Query and plan representations to be prerequisite to commit.

You'll need to be more specific. Later patches do move some things.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-02-03 Thread Simon Riggs
On 3 February 2018 at 19:57, Andreas Seltenreich  wrote:

> as promised in Brussels, I taught sqlsmith about MERGE and did some
> testing with merge.v14.patch applied on master at 9aef173163.
>
> So far, it found a couple of failing assertions and a suspicious error
> message.  Details and testcases against the regression database below.

Brilliant work, thank you.

It will likely take some time to work through these and the current
work items but will fix.

Do you have the DDL so we can recreate these easily?

Thanks

> regards,
> Andreas
>
> -- TRAP: FailedAssertion("!(!const Node*)(node))->type) == T_SubLink))", 
> File: "clauses.c", Line: 440)
> MERGE INTO public.brin_test as target_0
> USING pg_catalog.pg_database as ref_0
>   left join pg_catalog.pg_user_mapping as sample_0 tablesample system (2.3)
>   on (pg_catalog.mul_d_interval(
> cast(pg_catalog.pi() as float8),
> cast(case when sample_0.umoptions is not NULL then (select write_lag 
> from pg_catalog.pg_stat_replication limit 1 offset 2)
>  else (select write_lag from pg_catalog.pg_stat_replication limit 
> 1 offset 2)
>  end
>as "interval")) = (select intervalcol from public.brintest limit 1 
> offset 2)
> )
> ON target_0.a = ref_0.encoding
> WHEN NOT MATCHED AND cast(null as "timestamp") < cast(null as date)
>THEN INSERT VALUES ( 62, 6)
> WHEN NOT MATCHED
>   AND false
>THEN DO NOTHING;
>
> -- TRAP: FailedAssertion("!(!const Node*)(node))->type) == T_SubLink))", 
> File: "prepunion.c", Line: 2246)
> MERGE INTO public.onek2 as target_0
> USING public.prt1 as ref_0
>   inner join public.tenk1 as ref_1
>   on ((select t from public.btree_tall_tbl limit 1 offset 63)
>  is not NULL)
> ON target_0.stringu1 = ref_1.stringu1
> WHEN NOT MATCHED THEN DO NOTHING;
>
> -- TRAP: FailedAssertion("!(!const Node*)(node))->type) == T_Query))", 
> File: "var.c", Line: 248)
> MERGE INTO public.clstr_tst_inh as target_0
> USING pg_catalog.pg_statio_sys_tables as ref_0
>   left join public.rule_and_refint_t3 as ref_1
>   on (((ref_0.heap_blks_hit is not NULL)
> or (((select f1 from public.path_tbl limit 1 offset 5)
>>= (select thepath from public.shighway limit 1 offset 33)
>   )
>   or (cast(null as tsvector) <> cast(null as tsvector
>   and (ref_0.toast_blks_read is not NULL))
> ON target_0.d = ref_1.data
> WHEN NOT MATCHED
>   AND cast(null as int2) = pg_catalog.lastval()
>THEN DO NOTHING;
>
> -- ERROR:  unrecognized node type: 114
> MERGE INTO public.found_test_tbl as target_0
> USING public.itest7a as ref_0
> ON target_0.a = ref_0.a
> WHEN NOT MATCHED
>THEN INSERT VALUES ((select a from public.rtest_t3 limit 1 offset 6));



-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-02-03 Thread Simon Riggs
On 1 February 2018 at 19:39, Peter Geoghegan  wrote:

> Finally, I noticed a problem with your new EXPLAIN ANALYZE instrumentation:
>
> Is it 4 rows inserted, or 0 inserted?
>
> postgres=# merge into testoids a using (select i "key", 'foo' "data"
> from generate_series(0,3) i) b on a.key = b.key when matched and 1=0
> then update set data = b.data when not matched then insert (key, data)
> values (b.key, 'foo');
> MERGE 0

Got it. I'm reporting the number of rows processed instead of the
number of rows inserted. My test happened to have those values set
equal.

Minor bug, thanks for spotting.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-02-03 Thread Peter Geoghegan
On Sat, Feb 3, 2018 at 2:15 PM, Simon Riggs  wrote:
>> I started looking at SQL Server's MERGE to verify that it also does
>> not impose any restrictions on subselects in a MERGE UPDATE's
>> targetlist, just like Oracle. Unsurprisingly, it does not. More
>> surprisingly, I noticed that it also doesn't seem to impose
>> restrictions on what can appear in WHEN ... AND quals.
>
> You earlier agreed that subselects were not part of the Standard.

You know that I didn't say that, Simon.

>> What this boils down to is that I don't think that this part of your
>> design is committable (from your recent v14):
>
> So your opinion is that because v14 patch doesn't include a feature
> extension that is in Oracle and SQLServer that we cannot commit this
> patch.
>
> There are quite a few minor additional things in that category and the
> syntax of those two differ, so its clearly impossible to match both
> exactly.
>
> That seems like poor reasoning on why we should block the patch.

It certainly is. Good thing I never said anything of the sort.

There are 3 specific issues on query structure, that together paint a
picture about what you're not doing in the optimizer:

1. Whether or not subselects in the UPDATE targetlist are supported.

2. Whether or not the WHEN ... AND quals support subselects.

3. Whether or not subselects are possible within the main ON () join.

I gave a lukewarm endorsement of not supporting #3, was unsure with
#2, and was very clear on #1 as soon as I saw the restriction: UPDATE
targetlist in a MERGE are *not* special, and so must support
subselects, just like ON CONFLICT DO UPDATE, for example.

> If you would like to say how you think the design should look, it
> might be possible to change it for this release. Changing it in the
> future would not be blocked by commiting without that.

I can tell you right now that you need to support subselects in the
targetlist. They are *not* an extension to the standard, AFAICT. And
even if they were, every other system supports them, and there is
absolutely no logical reason to not support them other than the fact
that doing so requires significant changes to the data structures in
the parser, planner, and executor. Reworking that will probably turn
out to be necessary for other reasons that I haven't thought of.

I think that restrictions like this are largely an accident of how
your patch evolved. It would be a lot easier to work with you if you
acknowledged that.

>> I am willing to continue to engage with you on the concurrency issues
>> for the time being, since that is the most pressing issue for the
>> patch. We can get to this stuff later.
>
> There are no concurrency issues. The patch gives the correct answer in
> all cases, or an error to avoid problems. We've agreed that it is
> desirable we remove some of those if possible, though they are there
> as a result of our earlier discussions.

You seem to presume to be in charge of the parameters of this
discussion. I don't see it that way. I think that READ COMMITTED
conflict handling semantics are by far the biggest issue for the
patch, and that we should prioritize reaching agreement there. This
needs to be worked out through community consensus, since it concerns
fundamental semantics much more than implementation details. (In
contrast, the optimizer issues I mentioned are fairly heavy on
relatively uncontentious implementation questions.)

The problem with how you've represented MERGE in the parser,
optimizer, and executor is not that it's "half-baked crap", as you
suggested others might think at the FOSDEM developer meeting [1]. I
wouldn't say that at all. What I'd say is that it's *unfinished*. It's
definitely sufficient to prototype different approaches to
concurrency, as well as to determine how triggers should work, and
many other such things. That's a good start.

I am willing to mostly put aside the other issues for the time being,
to get the difficult questions on concurrency out of the way first.
But if you don't make some broad concessions on the secondary issues
pretty quickly, then I will have to conclude that our positions are
irreconcilable. I will have nothing further to contribute to the
discussion.

[1] https://wiki.postgresql.org/wiki/FOSDEM/PGDay_2018_Developer_Meeting#Minutes
-- 
Peter Geoghegan



Re: JIT compiling with LLVM v9.1

2018-02-03 Thread Andreas Karlsson

On 02/02/2018 10:48 AM, Pierre Ducroquet wrote:

> I have successfully built the JIT branch against LLVM 4.0.1 on Debian
> testing. This is not enough for Debian stable (LLVM 3.9 is the latest
> available there), but it's a first step.
> I've split the patch into four files. The first three fix the build
> issues, the last one fixes a runtime issue.
> I think they are small enough to not be a burden for you in your
> developments. But if you don't want to carry these ifdefs right now, I
> maintain them in a branch on a personal git and rebase as frequently as
> I can.


I tested these patches, and while the code built for me and passed the
test suite on Debian testing, I see a weird bug where the very first
query fails to JIT while the rest work as they should. I think I need to
dig into LLVM's codebase to see what it is, but can you reproduce this
bug on your machine?


Code to reproduce:

SET jit_expressions = true;
SET jit_above_cost = 0;
SELECT 1;
SELECT 1;

Output:

postgres=# SELECT 1;
ERROR:  failed to jit module
postgres=# SELECT 1;
 ?column?
----------
        1
(1 row)

Config:

Version: Your patches applied on top of 302b7a284d30fb0e00eb5f0163aa933d4d9bea10

OS: Debian testing
llvm/clang: 4.0.1-8

Andreas



Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-02-03 Thread Amit Kapila
On Fri, Feb 2, 2018 at 2:11 PM, amul sul  wrote:
> On Fri, Jan 26, 2018 at 11:58 AM, Amit Kapila  wrote:
> []
>> I think you can manually (via debugger) hit this by using
>> PUBLICATION/SUBSCRIPTION syntax for logical replication.  I think what
>> you need to do is in node-1, create a partitioned table and subscribe
>> it on node-2.  Now, perform an Update on node-1, then stop the logical
>> replication worker before it calls heap_lock_tuple.  Now, in node-2,
>> update the same row such that it moves the row.  Now, continue the
>> logical replication worker.  I think it should hit your new code, if
>> not then we need to think of some other way.
>>
>
> I am able to hit the change log using above steps. Thanks a lot for the
> step by step guide, I really needed that.
>
> One strange behavior I found in the logical replication, which is
> reproducible without the attached patch as well -- when I updated on
> node2 while keeping a breakpoint before the heap_lock_tuple call in the
> replication worker, I could see that a duplicate row was inserted on
> node2, see this:
>
..
>
> I am thinking to report this in a separate thread, but not sure if
> this is already known behaviour or not.
>

I think it is worth discussing this behavior in a separate thread.
However, if possible, try to reproduce it without partitioning and
then report it.

>
> Updated patch attached -- correct changes in execReplication.c.
>

Your changes look correct to me.

I wonder what the behavior of this patch will be with
wal_consistency_checking [1].  I think it will generate a failure, as
there is nothing in the WAL to replay it.  Can you try it once?  If we
see a failure with the wal consistency checker, then we need to think
about whether (a) we want to deal with it by logging this information,
(b) we want to mask it, or (c) something else.


[1] -  
https://www.postgresql.org/docs/devel/static/runtime-config-developer.html
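For option (b), masking would be in the spirit of what the existing consistency-checking masking functions do: zero out fields that may legitimately differ before comparing. A toy sketch, with invented field names (real pages are byte buffers, not dicts):

```python
# Toy version of option (b): mask fields that may legitimately differ
# (e.g. hint bits) before comparing the replayed page to the original.
# Field names are invented; real pages are byte buffers, not dicts.

def masked(page, fields_allowed_to_differ):
    return {k: (None if k in fields_allowed_to_differ else v)
            for k, v in page.items()}

def pages_consistent(primary_page, replayed_page, mask_fields):
    return masked(primary_page, mask_fields) == masked(replayed_page, mask_fields)

primary  = {'lsn': 100, 'hint_bits': 1, 'tuple_data': 'abc'}
replayed = {'lsn': 100, 'hint_bits': 0, 'tuple_data': 'abc'}
print(pages_consistent(primary, replayed, {'hint_bits'}))  # -> True
```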


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] MERGE SQL Statement for PG11

2018-02-03 Thread Amit Kapila
On Wed, Jan 31, 2018 at 11:37 PM, Peter Geoghegan  wrote:
> On Wed, Jan 31, 2018 at 7:17 AM, Robert Haas  wrote:
>> I don't fully grok merge but suppose you have:
>>
>> WHEN MATCHED AND a = 0 THEN UPDATE ...
>> WHEN MATCHED AND a = 1 THEN UPDATE ...
>> WHEN NOT MATCHED THEN INSERT ...
>>
>> Suppose you match a tuple with a = 0 but, upon trying to update it,
>> find that it's been updated to a = 1.  It seems like there are a few
>> possible behaviors:
>>
>> 1. Throw an error!  I guess this is what the patch does now.
>
> Right.
>
>> 2. Do absolutely nothing.  I think this is what would happen with an
>> ordinary UPDATE; the tuple fails the EPQ recheck and so is not
>> updated, but that doesn't trigger anything else.
>
> I think #2 is fine if you're talking about join quals. Which, of
> course, you're not. These WHEN quals really do feel like
> tuple-at-a-time procedural code, more than set-orientated quals (if
> that wasn't true, we'd have to allow cardinality violations, which we
> at least try to avoid). Simon said something like "the SQL standard
> requires that WHEN quals be evaluated first" at one point, which makes
> sense to me.
>

It is not clear to me what exactly your concern is if we try to follow
#2?  To me, #2 seems like a natural choice.
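As a toy illustration of #2 (not the patch's code): re-check the qual that matched the old row version against the re-fetched version, and skip the action if it no longer passes:

```python
# Toy illustration of behavior #2: the WHEN qual that matched the old
# row version is re-evaluated against the re-fetched version; if it no
# longer passes, the action is skipped, as with an ordinary UPDATE.

def epq_recheck_behavior2(matched_qual, refetched_row):
    if matched_qual(refetched_row):
        return 'apply-action'
    return 'do-nothing'   # qual no longer passes: take no action at all

qual_a_is_0 = lambda r: r['a'] == 0

# Matched with a = 0, but the row was concurrently updated to a = 1:
print(epq_recheck_behavior2(qual_a_is_0, {'a': 1}))  # -> do-nothing
print(epq_recheck_behavior2(qual_a_is_0, {'a': 0}))  # -> apply-action
```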


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com