Re: [HACKERS] Refactoring identifier checks to consistently use strcmp

2017-09-05 Thread Daniel Gustafsson
> On 17 Aug 2017, at 11:08, Daniel Gustafsson  wrote:
> 
>> On 16 Aug 2017, at 17:51, Tom Lane  wrote:
>> 
>> Heikki Linnakangas  writes:
>>> This no longer works:
>> 
>>> postgres=# CREATE TEXT SEARCH DICTIONARY public.simple_dict (
>>>TEMPLATE = pg_catalog.simple,
>>>"STOPWORds" = english
>>> );
>>> ERROR:  unrecognized simple dictionary parameter: "STOPWORds"
>> 
>>> In hindsight, perhaps we should always have been more strict about that 
>>> to begin with, but let's not break backwards-compatibility without a 
>>> better reason. I didn't thoroughly check all of the cases here, to see 
>>> if there are more like this.
>> 
>> You have a point, but I'm not sure that this is such a bad compatibility
>> break as to be a reason not to change things to be more consistent.
> 
> I agree with this, but I admittedly have no idea how common the above case
> would be in the wild.
> 
>>> It'd be nice to have some kind of a rule on when pg_strcasecmp should be 
>>> used and when strcmp() is enough. Currently, by looking at the code, I 
>>> can't tell.
>> 
>> My thought is that if we are looking at words that have been through the
>> parser, then it should *always* be plain strcmp; we should expect that
>> the parser already did the appropriate case-folding.
> 
> +1
> 
>> pg_strcasecmp would be appropriate, perhaps, if we're dealing with stuff
>> that somehow came in without going through the parser.
> 
> In that case it would be up to the consumer of the data to handle required
> case-folding for the expected input, so pg_strcasecmp or strcmp depending on
> situation.

This patch has been marked “Waiting on Author”, but I’m not sure what
consensus this thread reached with regard to quoted keywords and backwards
compatibility.  There seems to be a 2-1 vote for allowing a break and forcing
all keywords coming out of the parser to be case-folded. Any other opinions?

cheers ./daniel



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Refactoring identifier checks to consistently use strcmp

2017-08-17 Thread Daniel Gustafsson
> On 16 Aug 2017, at 17:51, Tom Lane  wrote:
> 
> Heikki Linnakangas  writes:
>> This no longer works:
> 
>> postgres=# CREATE TEXT SEARCH DICTIONARY public.simple_dict (
>> TEMPLATE = pg_catalog.simple,
>> "STOPWORds" = english
>> );
>> ERROR:  unrecognized simple dictionary parameter: "STOPWORds"
> 
>> In hindsight, perhaps we should always have been more strict about that 
>> to begin with, but let's not break backwards-compatibility without a 
>> better reason. I didn't thoroughly check all of the cases here, to see 
>> if there are more like this.
> 
> You have a point, but I'm not sure that this is such a bad compatibility
> break as to be a reason not to change things to be more consistent.

I agree with this, but I admittedly have no idea how common the above case
would be in the wild.

>> It'd be nice to have some kind of a rule on when pg_strcasecmp should be 
>> used and when strcmp() is enough. Currently, by looking at the code, I 
>> can't tell.
> 
> My thought is that if we are looking at words that have been through the
> parser, then it should *always* be plain strcmp; we should expect that
> the parser already did the appropriate case-folding.

+1

> pg_strcasecmp would be appropriate, perhaps, if we're dealing with stuff
> that somehow came in without going through the parser.

In that case it would be up to the consumer of the data to handle required
case-folding for the expected input, so pg_strcasecmp or strcmp depending on
situation.

cheers ./daniel




Re: [HACKERS] Refactoring identifier checks to consistently use strcmp

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 11:51 AM, Tom Lane  wrote:
> You have a point, but I'm not sure that this is such a bad compatibility
> break as to be a reason not to change things to be more consistent.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring identifier checks to consistently use strcmp

2017-08-16 Thread Tom Lane
Heikki Linnakangas  writes:
> This no longer works:

> postgres=# CREATE TEXT SEARCH DICTIONARY public.simple_dict (
>  TEMPLATE = pg_catalog.simple,
>  "STOPWORds" = english
> );
> ERROR:  unrecognized simple dictionary parameter: "STOPWORds"

> In hindsight, perhaps we should always have been more strict about that 
> to begin with, but let's not break backwards-compatibility without a 
> better reason. I didn't thoroughly check all of the cases here, to see 
> if there are more like this.

You have a point, but I'm not sure that this is such a bad compatibility
break as to be a reason not to change things to be more consistent.

> It'd be nice to have some kind of a rule on when pg_strcasecmp should be 
> used and when strcmp() is enough. Currently, by looking at the code, I 
> can't tell.

My thought is that if we are looking at words that have been through the
parser, then it should *always* be plain strcmp; we should expect that
the parser already did the appropriate case-folding.  If the user
prevented case-folding by double-quoting, I don't have a lot of sympathy
for any complaints about it.  Generally speaking, what we're dealing with
here is things that are logically keywords but we did not wish to make
them real parser keywords.  But in SQL, once you quote a keyword, it's
not a keyword at all anymore.  So I think the argument that quoting
"stopwords" should be legal at all in this context is pretty weak,
and the argument that quoting a weirdly-cased version of it should
work is even weaker.

pg_strcasecmp would be appropriate, perhaps, if we're dealing with stuff
that somehow came in without going through the parser.

regards, tom lane




Re: [HACKERS] Refactoring identifier checks to consistently use strcmp

2017-08-16 Thread Heikki Linnakangas

On 04/04/2017 10:13 AM, Daniel Gustafsson wrote:

On 04 Apr 2017, at 05:52, Alvaro Herrera  wrote:

Daniel Gustafsson wrote:

Testing DefElem options is done with both strcmp() and pg_strcasecmp() a bit
mixed.  Since the option defnames are all lowercased, either via IDENT, keyword
rules or “by hand” with makeString(), using strcmp() is safe (and assumed to be
so in quite a lot of places).

While it’s not incorrect per se to use pg_strcasecmp(), it has the potential to
hide a DefElem created with a mixed-case defname where other places expect it
to be in lowercase, which may lead to subtle bugs.

The attached patch refactors to use strcmp() consistently for option processing
in the command code as a pre-emptive belts+suspenders move against such subtle
bugs and to make the code more consistent.  Also reorders a few checks to have
all in the same “format” and removes a comment related to the above.

Tested with randomizing case on options in make check (not included in patch).


Does it work correctly in the Turkish locale?


Yes, running make check with random case defnames under tr_TR works fine.


This no longer works:

postgres=# CREATE TEXT SEARCH DICTIONARY public.simple_dict (
TEMPLATE = pg_catalog.simple,
"STOPWORds" = english
);
ERROR:  unrecognized simple dictionary parameter: "STOPWORds"

In hindsight, perhaps we should always have been more strict about that 
to begin with, but let's not break backwards-compatibility without a 
better reason. I didn't thoroughly check all of the cases here, to see 
if there are more like this.


It'd be nice to have some kind of a rule on when pg_strcasecmp should be 
used and when strcmp() is enough. Currently, by looking at the code, I 
can't tell.


- Heikki




Re: [HACKERS] Refactoring identifier checks to consistently use strcmp

2017-04-04 Thread Daniel Gustafsson
> On 04 Apr 2017, at 05:52, Alvaro Herrera  wrote:
> 
> Daniel Gustafsson wrote:
>> Testing DefElem options is done with both strcmp() and pg_strcasecmp() a bit
>> mixed.  Since the option defnames are all lowercased, either via IDENT,
>> keyword rules or “by hand” with makeString(), using strcmp() is safe (and
>> assumed to be so in quite a lot of places).
>> 
>> While it’s not incorrect per se to use pg_strcasecmp(), it has the potential
>> to hide a DefElem created with a mixed-case defname where other places
>> expect it to be in lowercase, which may lead to subtle bugs.
>> 
>> The attached patch refactors to use strcmp() consistently for option
>> processing in the command code as a pre-emptive belts+suspenders move
>> against such subtle bugs and to make the code more consistent.  Also
>> reorders a few checks to have all in the same “format” and removes a
>> comment related to the above.
>> 
>> Tested with randomizing case on options in make check (not included in
>> patch).
> 
> Does it work correctly in the Turkish locale?

Yes, running make check with random case defnames under tr_TR works fine.

cheers ./daniel



Re: [HACKERS] Refactoring identifier checks to consistently use strcmp

2017-04-03 Thread Alvaro Herrera
Daniel Gustafsson wrote:
> Testing DefElem options is done with both strcmp() and pg_strcasecmp() a bit
> mixed.  Since the option defnames are all lowercased, either via IDENT,
> keyword rules or “by hand” with makeString(), using strcmp() is safe (and
> assumed to be so in quite a lot of places).
> 
> While it’s not incorrect per se to use pg_strcasecmp(), it has the potential
> to hide a DefElem created with a mixed-case defname where other places
> expect it to be in lowercase, which may lead to subtle bugs.
> 
> The attached patch refactors to use strcmp() consistently for option
> processing in the command code as a pre-emptive belts+suspenders move
> against such subtle bugs and to make the code more consistent.  Also
> reorders a few checks to have all in the same “format” and removes a
> comment related to the above.
> 
> Tested with randomizing case on options in make check (not included in patch).

Does it work correctly in the Turkish locale?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Refactoring identifier checks to consistently use strcmp

2017-04-03 Thread Daniel Gustafsson
Testing DefElem options is done with both strcmp() and pg_strcasecmp() a bit
mixed.  Since the option defnames are all lowercased, either via IDENT, keyword
rules or “by hand” with makeString(), using strcmp() is safe (and assumed to be
so in quite a lot of places).

While it’s not incorrect per se to use pg_strcasecmp(), it has the potential to
hide a DefElem created with a mixed-case defname where other places expect it
to be in lowercase, which may lead to subtle bugs.

The attached patch refactors to use strcmp() consistently for option processing
in the command code as a pre-emptive belts+suspenders move against such subtle
bugs and to make the code more consistent.  Also reorders a few checks to have
all in the same “format” and removes a comment related to the above.

Tested with randomizing case on options in make check (not included in patch).

cheers ./daniel



defname_strcmp.patch
Description: Binary data



Re: [HACKERS] Refactoring of replication commands using printsimple

2017-02-01 Thread Michael Paquier
On Thu, Feb 2, 2017 at 4:05 AM, Robert Haas  wrote:
> On Tue, Jan 31, 2017 at 8:26 PM, Michael Paquier
>  wrote:
>> pq_sendcountedtext() does some encoding conversion, which is why I
>> haven't used it here, because we deal only with integers in this patch... Now
>> if you wish to switch to that I have really no arguments against.
>
> OK, I see.  Well, I guess it's sensible either way, but I've committed
> this version.  Thanks.

Thanks.
-- 
Michael




Re: [HACKERS] Refactoring of replication commands using printsimple

2017-02-01 Thread Robert Haas
On Tue, Jan 31, 2017 at 8:26 PM, Michael Paquier
 wrote:
> pq_sendcountedtext() does some encoding conversion, which is why I
> haven't used it here, because we deal only with integers in this patch... Now
> if you wish to switch to that I have really no arguments against.

OK, I see.  Well, I guess it's sensible either way, but I've committed
this version.  Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring of replication commands using printsimple

2017-01-31 Thread Michael Paquier
On Tue, Jan 31, 2017 at 11:59 PM, Robert Haas  wrote:
> Sorry, I have a little more nitpicking.

Thanks for the input.

> How about having
> printsimple() use pq_sendcountedtext() instead of pq_sendint()
> followed by pq_sendbytes(), as it does for TEXTOID?
>
> Other than that, this looks fine to me now.

pq_sendcountedtext() does some encoding conversion, which is why I
haven't used it here, because we deal only with integers in this patch... Now
if you wish to switch to that I have really no arguments against.
-- 
Michael


refactor-repl-cmd-output-v4.patch
Description: Binary data



Re: [HACKERS] Refactoring of replication commands using printsimple

2017-01-31 Thread Robert Haas
On Tue, Jan 31, 2017 at 12:19 AM, Michael Paquier
 wrote:
> This is a follow up of the refactoring that has been discussed in the
> thread to increase the default size of WAL segments:
> https://www.postgresql.org/message-id/cab7npqq4hynrlq+w1jrryvsysoxuqa40pyb2uw5uqkkag4h...@mail.gmail.com
>
> The discussion has resulted in the creation of a84069d9 that has
> introduced a new DestReceiver method called printsimple that does not
> need any catalog access. After some investigation, I have noticed that
> a couple of messages used in the replication protocol could be
> refactored as well:
> - IDENTIFY_SYSTEM
> - TIMELINE_HISTORY
> - CREATE_REPLICATION_SLOT
> This results in the following code reduction:
>  3 files changed, 115 insertions(+), 162 deletions(-)
>
> A commit fest entry has been created:
> https://commitfest.postgresql.org/13/978/

Sorry, I have a little more nitpicking.  How about having
printsimple() use pq_sendcountedtext() instead of pq_sendint()
followed by pq_sendbytes(), as it does for TEXTOID?

Other than that, this looks fine to me now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Refactoring of replication commands using printsimple

2017-01-30 Thread Michael Paquier
Hi all,

This is a follow up of the refactoring that has been discussed in the
thread to increase the default size of WAL segments:
https://www.postgresql.org/message-id/cab7npqq4hynrlq+w1jrryvsysoxuqa40pyb2uw5uqkkag4h...@mail.gmail.com

The discussion has resulted in the creation of a84069d9 that has
introduced a new DestReceiver method called printsimple that does not
need any catalog access. After some investigation, I have noticed that
a couple of messages used in the replication protocol could be
refactored as well:
- IDENTIFY_SYSTEM
- TIMELINE_HISTORY
- CREATE_REPLICATION_SLOT
This results in the following code reduction:
 3 files changed, 115 insertions(+), 162 deletions(-)

A commit fest entry has been created:
https://commitfest.postgresql.org/13/978/

Thanks,
-- 
Michael


refactor-repl-cmd-output-v3.patch
Description: Binary data



Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-09-27 Thread Peter Geoghegan
On Tue, Sep 27, 2016 at 2:13 PM, Tom Lane  wrote:
> I can see the value of processing unique indexes before non-unique ones.
> I'm pretty suspicious of trying to prioritize primary keys first, though,
> because (a) it's not clear why bother, and (b) PG is a tad squishy about
> whether an index is a primary key or not, so that I'd be worried about
> different backends sometimes choosing different orders.  I'd simplify
> this to "uniques in OID order then non-uniques in OID order".

I see your point. A more considered ordering of indexes for processing
by the executor (prepared for it by the relcache), including something
more that goes further than your proposal is useful in the context of
fixing the bug I mentioned [1], but for non-obvious reasons. I would
like to clarify what I meant there specifically. I am repeating
myself, but maybe I just wasn't clear before.

The theory of putting the PK first there is that we then have a
well-defined (uh, better defined) user-visible ordering *across unique
indexes* such that the problem case would *reliably* be fixed. With
only this refactoring patch applied (and no change to the relcache
ordering thing), it is then only a sheer accident of OID assignment
ordering that the INSERT ON CONFLICT problem case happens to take the
alternative path on the OP's *inferred* index (which, as it happens,
was the PK for him), rather than the other unique index that was
involved (the one that is not actually inferred, and yet is equivalent
to the PK, UPSERT-semantics-wise). So, the reporter of the bug [1] is
happy with his exact case now working, at least.

You might now counter: "But why prefer one convention over the other?
Prioritizing the PK would reliably fix that particular problem case,
but that's still pretty arbitrary."

It's true that it's somewhat arbitrary to always (speculatively)
insert into the PK first. But, I think that it's more likely that the
PK is inferred in general, and so it's more likely that users will
fall on the right side of that in practice. Besides, at least we now
have a consistent behavior.

You might also reasonably ask: "But what if there are multiple unique
indexes, none of which happen to be the PK? Isn't that subject to the
same vagaries of OID ordering anyway?"

I must admit that it is. But I don't really know where to draw the
line here. Is it worth contemplating a more complicated scheme still?
For example, trigger-style ordering; a sort order that considers index
name as a "secondary attribute", in order to ensure perfectly
consistent behavior? I must admit that I don't really have a clue
whether or not that's a good idea. It's an idea.

[1] 
https://www.postgresql.org/message-id/CAM3SWZTFTbK_Y%3D7uWJaXYLHnYQ99pV4KFpmjTKbNJR5_%2BQThzA%40mail.gmail.com
-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-09-27 Thread Tom Lane
Heikki Linnakangas  writes:
> On 09/24/2016 02:34 PM, Peter Geoghegan wrote:
>> I go on to explain how this patch represents a partial solution to
>> that [1]. That's what I mean by "useful practical consequences". As I
>> say in [1], I think we could even get a full solution, if we applied
>> this patch and *also* made the ordering in which the relcache returns
>> a list of index OIDs more useful (it should still be based on
>> something stable, to avoid deadlocks, but more than just OID order.
>> Instead, relcache.c can sort indexes such that we insert into primary
>> keys first, then unique indexes, then all other indexes. This also
>> avoids bloat if there is a unique violation, by getting unique indexes
>> out of the way first during ExecInsert()).

> Hmm, yeah, that'd be nice to fix. I'd like to see a patch specifically 
> to fix that. I'm not convinced that we need all the complications of 
> this patch, to get that fixed. (In particular, index AMs still wouldn't 
> need to care about the difference between CHECK_UNIQUE_PARTIAL and 
> CHECK_UNIQUE_SPECULATIVE, I think.)

I can see the value of processing unique indexes before non-unique ones.
I'm pretty suspicious of trying to prioritize primary keys first, though,
because (a) it's not clear why bother, and (b) PG is a tad squishy about
whether an index is a primary key or not, so that I'd be worried about
different backends sometimes choosing different orders.  I'd simplify
this to "uniques in OID order then non-uniques in OID order".

regards, tom lane




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-09-27 Thread Heikki Linnakangas

On 09/27/2016 11:38 AM, Peter Geoghegan wrote:

On Tue, Sep 27, 2016 at 9:25 AM, Heikki Linnakangas  wrote:

I didn't think very hard about it, but yeah, think so..


Do you think it's worth a backpatch? Or, too early to tell?


Too early to tell..

- Heikki





Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-09-27 Thread Peter Geoghegan
On Tue, Sep 27, 2016 at 9:25 AM, Heikki Linnakangas  wrote:
> I didn't think very hard about it, but yeah, think so..

Do you think it's worth a backpatch? Or, too early to tell?


-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-09-27 Thread Heikki Linnakangas

On 09/27/2016 11:22 AM, Peter Geoghegan wrote:

On Tue, Sep 27, 2016 at 9:10 AM, Heikki Linnakangas  wrote:

Hmm, yeah, that'd be nice to fix. I'd like to see a patch specifically to
fix that. I'm not convinced that we need all the complications of this
patch, to get that fixed. (In particular, index AMs still wouldn't need to
care about the difference between CHECK_UNIQUE_PARTIAL and
CHECK_UNIQUE_SPECULATIVE, I think.)


But you are convinced that everything else I describe would do it?


I didn't think very hard about it, but yeah, think so..

- Heikki





Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-09-27 Thread Peter Geoghegan
On Tue, Sep 27, 2016 at 9:10 AM, Heikki Linnakangas  wrote:
> Hmm, yeah, that'd be nice to fix. I'd like to see a patch specifically to
> fix that. I'm not convinced that we need all the complications of this
> patch, to get that fixed. (In particular, index AMs still wouldn't need to
> care about the difference between CHECK_UNIQUE_PARTIAL and
> CHECK_UNIQUE_SPECULATIVE, I think.)

But you are convinced that everything else I describe would do it?


-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-09-27 Thread Heikki Linnakangas

On 09/24/2016 02:34 PM, Peter Geoghegan wrote:

On Tue, Sep 20, 2016 at 8:55 AM, Heikki Linnakangas  wrote:

Likewise, in ExecInsertIndexTuples(), this makes the deferred-index
case work slightly differently from speculative insertion. It's not a big
difference, but it again forces you to keep one more scenario in mind, when
reading the code


This actually does have useful practical consequences, although I
didn't point that out earlier (though I should have). To see what I
mean, consider the complaint in this recent thread, which is based on
an actual user application problem:

https://www.postgresql.org/message-id/flat/1472134358.649524...@f146.i.mail.ru#1472134358.649524...@f146.i.mail.ru

I go on to explain how this patch represents a partial solution to
that [1]. That's what I mean by "useful practical consequences". As I
say in [1], I think we could even get a full solution, if we applied
this patch and *also* made the ordering in which the relcache returns
a list of index OIDs more useful (it should still be based on
something stable, to avoid deadlocks, but more than just OID order.
Instead, relcache.c can sort indexes such that we insert into primary
keys first, then unique indexes, then all other indexes. This also
avoids bloat if there is a unique violation, by getting unique indexes
out of the way first during ExecInsert()).


Hmm, yeah, that'd be nice to fix. I'd like to see a patch specifically 
to fix that. I'm not convinced that we need all the complications of 
this patch, to get that fixed. (In particular, index AMs still wouldn't 
need to care about the difference between CHECK_UNIQUE_PARTIAL and 
CHECK_UNIQUE_SPECULATIVE, I think.)


- Heikki





Re: [HACKERS] Refactoring of heapam code.

2016-09-26 Thread Pavan Deolasee
On Tue, Sep 6, 2016 at 8:39 PM, Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

> 06.09.2016 07:44, Pavan Deolasee:
>
>
> I don't know what to do with the CF entry itself. I could change the
> status to "waiting on author" but then the design draft may not get enough
> attention. So I'm leaving it in the current state for others to look at.
>
>
> I'll try to update the patches as soon as possible. A little code cleanup
> will be useful regardless of the final refactoring design.
>
>
I'm marking the patch as "Returned with Feedback". I assume you'll submit
fresh set of patches for the next CF anyways.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-09-24 Thread Peter Geoghegan
On Tue, Sep 20, 2016 at 8:55 AM, Heikki Linnakangas  wrote:
> Thanks for working on this, and sorry for disappearing and dropping this on
> the floor earlier. This doesn't apply anymore, thanks to 9c810a2e. Shouldn't
> be hard to fix.

Thanks for looking at it again.

> I was in support of this earlier in general, even though I had some
> questions on the details. But now that I look at the patch again, I'm not so
> sure anymore.

Honestly, I almost withdrew this patch myself, simply because it has
dragged on so long. It has become ridiculous.

> I don't think this actually makes things clearer. It adds new
> cases that the code needs to deal with. Index AM writers now have to care
> about the difference between a UNIQUE_CHECK_PARTIAL and
> UNIQUE_CHECK_SPECULATIVE. You can have the exact same implementation for
> both, but at the very least the index AM author needs to read the paragraph
> you added to the docs to understand the difference. That adds some cognitive
> load.

I think it gives us the opportunity to discuss the differences, and in
particular to explain why this "speculative insertion" thing exists at
all. Besides, we can imagine an amcanunique implementation in which
the difference might matter. Honestly, since it is highly unlikely
that there ever will be another amcanunique access method, the
cognitive burden to implementers of new amcanunique AMs is not a
concern to me. Rather, the concern with that part of the patch is
educating people about how the whole speculative insertion thing works
in general, and drawing specific attention to it in a few key places
in the code.

Speculative insertion is subtle and complicated enough that I feel
that that should be well described, as I've said many times. Remember
how hard it was for us to come to agreement on the basic requirements
in the first place!

> Likewise, in ExecInsertIndexTuples(), this makes the deferred-index
> case work slightly differently from speculative insertion. It's not a big
> difference, but it again forces you to keep one more scenario in mind, when
> reading the code

This actually does have useful practical consequences, although I
didn't point that out earlier (though I should have). To see what I
mean, consider the complaint in this recent thread, which is based on
an actual user application problem:

https://www.postgresql.org/message-id/flat/1472134358.649524...@f146.i.mail.ru#1472134358.649524...@f146.i.mail.ru

I go on to explain how this patch represents a partial solution to
that [1]. That's what I mean by "useful practical consequences". As I
say in [1], I think we could even get a full solution, if we applied
this patch and *also* made the ordering in which the relcache returns
a list of index OIDs more useful (it should still be based on
something stable, to avoid deadlocks, but more than just OID order.
Instead, relcache.c can sort indexes such that we insert into primary
keys first, then unique indexes, then all other indexes. This also
avoids bloat if there is a unique violation, by getting unique indexes
out of the way first during ExecInsert()).

> So overall, I think we should just drop this. Maybe a comment somewhere
> would be in order, to point out that ExecInsertIndexTuples() and
> index_insert might perform some unnecessary work, by inserting index tuples
> for a doomed heap tuple, if a speculative insertion fails. But that's all.

Perhaps. I'm curious about what your thoughts are on what I've said
about "useful practical consequences" of finishing insertion earlier
for speculative inserters. If you're still not convinced about this
patch, having considered that as well, then I will drop the patch (or
maybe we just add some comments, as you suggest).

[1] 
https://www.postgresql.org/message-id/CAM3SWZTFTbK_Y%3D7uWJaXYLHnYQ99pV4KFpmjTKbNJR5_%2BQThzA%40mail.gmail.com
-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-09-20 Thread Heikki Linnakangas

On 03/17/2016 01:43 AM, Peter Geoghegan wrote:

I attach a revision, that makes all the changes that Heikki suggested,
except one. As already noted several times, following this suggestion
would have added a bug. Alvaro preferred my original approach here in
any case. I refer to my original approach of making the new
UNIQUE_CHECK_SPECULATIVE case minimally different from the existing
UNIQUE_CHECK_PARTIAL case currently used for deferred unique
constraints and speculative insertion, as opposed to making the new
UNIQUE_CHECK_SPECULATIVE "like CHECK_UNIQUE_YES, but return FALSE
instead of throwing an error on conflict". That was broken because
CHECK_UNIQUE_YES waits for the outcome of an xact, which
UNIQUE_CHECK_PARTIAL never does, and so UNIQUE_CHECK_SPECULATIVE must
never do.

Any and all waits happen in the first phase of speculative insertion,
and never the second. I could give a complicated explanation for why,
involving a deadlock scenario, but a simple explanation will do: it
has always worked that way, and was tested to work that way.

Feedback from Heikki led to these changes for this revision:

* The use of arguments within ExecInsert() was simplified.

* More concise AM documentation.

The docs essentially describe two new concepts:

- What unique index insertion needs to know about speculative
insertion in general. This doesn't just apply to speculative inserters
themselves, of course.

- What speculative insertion is. Why it exists (why we don't just wait
on xact). In other words, what "unprincipled deadlocks" are, and how
they are avoided (from a relatively high level).

I feel like I have a responsibility to make sure that this mechanism
is well documented, especially given that not that many people were
involved in its design. It's possible that no more than the 3 original
authors of UPSERT fully understand speculative insertion -- it's easy
to miss some of the subtleties.


Thanks for working on this, and sorry for disappearing and dropping this 
on the floor earlier. This doesn't apply anymore, thanks to 9c810a2e. 
Shouldn't be hard to fix.


I was in support of this earlier in general, even though I had some 
questions on the details. But now that I look at the patch again, I'm 
not so sure anymore. I don't think this actually makes things clearer. 
It adds new cases that the code needs to deal with. Index AM writers now 
have to care about the difference between a UNIQUE_CHECK_PARTIAL and 
UNIQUE_CHECK_SPECULATIVE. You can have the exact same implementation for 
both, but at the very least the index AM author needs to read the 
paragraph you added to the docs to understand the difference. That adds 
some cognitive load. Likewise, in ExecInsertIndexTuples(), this makes 
the deferred-index case work slightly differently from speculative 
insertion. It's not a big difference, but it again forces you to keep 
one more scenario in mind, when reading the code.


So overall, I think we should just drop this. Maybe a comment somewhere 
would be in order, to point out that ExecInsertIndexTuples() and 
index_insert might perform some unnecessary work, by inserting index 
tuples for a doomed heap tuple, if a speculative insertion fails. But 
that's all.


- Heikki





Re: [HACKERS] Refactoring of heapam code.

2016-09-13 Thread Michael Paquier
On Mon, Sep 12, 2016 at 7:12 PM, Pavan Deolasee
 wrote:
> I was thinking about locking, bulk reading (like page-mode API) etc. While
> you've added an API for vacuuming, we would probably also need an API to
> collect dead tuples, pruning etc (not sure if that can be combined with
> vacuum). Of course, these are probably very specific to current
> implementation of heap/MVCC and not all storages will need that.

Likely not, but it is likely that people would like to be able to get
that if some custom method needs it. I think that what would be a good
first step here is to brainstorm a potential set of custom AMs for
tables, get the smallest, meaningful, one from the set of ideas
proposed, and define a basic set of relation/storage/table AM routines
that we can come up with to support it, and then build up a prototype
using it. We have been talking a lot about refactoring and reordering
stuff, which would be good in the long term to make things more generic
with heap, but getting an idea of "we want that" may help in designing a
minimum set of features that we'd like to have here.

This would likely anyway need extending CREATE TABLE to be able to
pass a custom AM for a given relation and store it in pg_catalog, but
that's a detail in the whole picture...
-- 
Michael




Re: [HACKERS] Refactoring of heapam code.

2016-09-12 Thread Pavan Deolasee
On Tue, Sep 6, 2016 at 8:39 PM, Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

> 06.09.2016 07:44, Pavan Deolasee:
>
> 2. I don't understand the difference between PageGetItemHeapHeaderOnly()
> and PageGetItemHeap(). They seem to do exactly the same thing and can be
> used interchangeably.
>
>
> The only difference between these two macros is that
> PageGetItemHeapHeaderOnly() doesn't touch any key fields of a tuple,
> it only checks header fields (see HeapTupleHeaderData). I divided it into
> two separate functions, while I was working on page compression and
> I wanted to avoid unnecessary decompression calls. These names are
> just a kind of 'markers' to make the code reading and improving easier.
>
>
Ok. I still don't get it, but that's probably because I haven't seen a real
use case of that. Right now, both macros look exactly the same.


> 3. While I like the idea of using separate interfaces to get heap/index
> tuple from a page, may be they can internally use a common interface
> instead of duplicating what PageGetItem() does already.
>
>
> I'm not sure I get it right. Do you suggest leaving PageGetItem and writing
> PageGetItemHeap() and PageGetItemIndex() as wrappers on it?
> It sounds reasonable while we have similar layout for both heap and index
> pages.
> In any case, it'll be easy to separate them when necessary.
>
>
Yes, that's what I was thinking.


> 4. Should we rename PageGetItemHeap() to PageGetHeapTuple() or something
> like that?
>
>
> I don't feel like doing that, because HeapTuple is a different structure.
> What we do get from page is a HeapTupleHeaderData structure
> followed by user's data.
>

Ok, makes sense.


>
>
> I also looked at the refactoring design doc. Looks like a fine approach to
> me, but the API will need much more elaborate discussions. I am not sure if
> the interfaces as presented are enough to support everything that even
> heapam can do today.
>
>
> What features of heapam do you think could be unsupportable in this API?
> Maybe I've just missed them.
>

I was thinking about locking, bulk reading (like page-mode API) etc. While
you've added an API for vacuuming, we would probably also need an API to
collect dead tuples, pruning etc (not sure if that can be combined with
vacuum). Of course, these are probably very specific to current
implementation of heap/MVCC and not all storages will need that.


>
> I suggest refactoring that will allow us to develop new heap-like access
> methods.
> For the first version, they must have methods to "heapify" a tuple, i.e.
> turn the internal representation of the data into a regular HeapTuple, for
> example by putting stubs into the MVCC fields. Of course this approach has
> its disadvantages, such as performance issues.
> It definitely won't be enough to write a column store or to implement
> other great data structures. But it allows us not to depend on the
> Executor's code.
>
>
Ok, understood.


>
> - There are many upper modules that need access to system attributes
> (xmin, xmax for starters). How do you plan to handle that? You think we can
> provide enough abstraction so that the callers don't need to know the tuple
> structures of individual PAM?
>
>
> To be honest, I didn't think about it.
> Do you mean external modules or upper levels of abstraction in Postgres?
>

I meant upper levels of abstraction like the executor. For example, while
inserting a new tuple, the executor (the index AM's insert routine to be
precise) may need to wait for another transaction to finish. Today, it can
easily get that information by looking at the xmax of the conflicting
tuple. How would we handle that with abstraction since not every PAM will
have a notion of xmax?

Thanks,
Pavan

 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Refactoring of heapam code.

2016-09-06 Thread Anastasia Lubennikova

06.09.2016 07:44, Pavan Deolasee:




On Mon, Aug 8, 2016 at 3:13 PM, Anastasia Lubennikova wrote:




Thank you for the review, I'll fix these problems in final version.

Posting the first message I intended to raise the discussion. Patches
were attached mainly to illustrate the problem and to be more
concrete.


I started looking at the first patch posted above, but it seems you'll 
rewrite these patches after discussion on the heapam refactoring ideas 
you posted here https://wiki.postgresql.org/wiki/HeapamRefactoring concludes.




Thank you for the review.


Some comments anyways on the first patch:

1. It does not apply cleanly with git-apply - many white space errors


I'll fix all merge issues and attach it to the following message.

2. I don't understand the difference 
between PageGetItemHeapHeaderOnly() and PageGetItemHeap(). They seem 
to do exactly the same thing and can be used interchangeably.


The only difference between these two macros is that
PageGetItemHeapHeaderOnly() doesn't touch any key fields of a tuple,
it only checks header fields (see HeapTupleHeaderData). I divided it into
two separate functions, while I was working on page compression and
I wanted to avoid unnecessary decompression calls. These names are
just a kind of 'markers' to make the code reading and improving easier.

3. While I like the idea of using separate interfaces to get 
heap/index tuple from a page, may be they can internally use a common 
interface instead of duplicating what PageGetItem() does already.


I'm not sure I get it right. Do you suggest leaving PageGetItem and writing
PageGetItemHeap() and PageGetItemIndex() as wrappers on it?
It sounds reasonable while we have similar layout for both heap and 
index pages.

In any case, it'll be easy to separate them when necessary.

4. Should we rename PageGetItemHeap() to PageGetHeapTuple() or 
something like that?


I don't feel like doing that, because HeapTuple is a different structure.
What we do get from page is a HeapTupleHeaderData structure
followed by user's data.

5. If we do this, we should probably have corresponding wrapper 
functions/macros for remaining callers of PageGetItem()



There are some calls of PageGetItem() left in BRIN and SP-gist indexes,
I can write simple wrappers for them.
I left them just because they were out of interest for my compression 
prototype.



I also looked at the refactoring design doc. Looks like a fine 
approach to me, but the API will need much more elaborate discussions. 
I am not sure if the interfaces as presented are enough to support 
everything that even heapam can do today.


What features of heapam do you think could be unsupportable in this API?
Maybe I've just missed them.

My point here is that heapam is too complicated for many simpler use cases:
read-only and append-only tables do not need many MVCC-related tricks like
vacuum and complicated locking, tables with fixed-length attributes can use
a more optimal page layout. And so on.

I suggest refactoring that will allow us to develop new heap-like access
methods.
For the first version, they must have methods to "heapify" a tuple, i.e.
turn the internal representation of the data into a regular HeapTuple, for
example by putting stubs into the MVCC fields. Of course this approach has
its disadvantages, such as performance issues.
It definitely won't be enough to write a column store or to implement
other great data structures. But it allows us not to depend on the
Executor's code.

And much more thoughts will be required to ensure we don't miss out 
things that new primary access method may need.



Do you have any particular ideas of new access methods?



A few points regarding how the proposed API maps to heapam:

- How do we support parallel scans on heap?


As far as I understand, a parallel heap scan requires one more API function,
heap_beginscan_parallel(). It uses the same API function, heap_getnext(),
but unlike an ordinary seq scan it selects the next page to read in a
synchronized manner.



- Does the primary AM need to support locking of rows?


That's a good question. I'd say it should be an option.

- Would there be separate API to interpret the PAM tuple itself? 
(something that htup_details.h does today). It seems natural that 
every PAM will have its own interpretation of tuple structure and we 
would like to hide that inside PAM implementation.


As I already wrote, for the first prototype I'd use a function to "heapify"
the tuple and not care about executor issues at all. A more detailed
discussion of this question can be found in another thread [1].

- There are many upper modules that need access to system attributes 
(xmin, xmax for starters). How do you plan to handle that? You think 
we can provide enough abstraction so that the callers don't need to 
know the tuple structures of individual PAM?


To be 

Re: [HACKERS] Refactoring of heapam code.

2016-09-05 Thread Pavan Deolasee
On Mon, Aug 8, 2016 at 3:13 PM, Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

>
>>
> Thank you for the review, I'll fix these problems in final version.
>
> Posting the first message I intended to raise the discussion. Patches
> were attached mainly to illustrate the problem and to be more concrete.
>

I started looking at the first patch posted above, but it seems you'll
rewrite these patches after discussion on the heapam refactoring ideas you
posted here https://wiki.postgresql.org/wiki/HeapamRefactoring concludes.

Some comments anyways on the first patch:

1. It does not apply cleanly with git-apply - many white space errors
2. I don't understand the difference between PageGetItemHeapHeaderOnly()
and PageGetItemHeap(). They seem to do exactly the same thing and can be
used interchangeably.
3. While I like the idea of using separate interfaces to get heap/index
tuple from a page, may be they can internally use a common interface
instead of duplicating what PageGetItem() does already.
4. Should we rename PageGetItemHeap() to PageGetHeapTuple() or something
like that?
5. If we do this, we should probably have corresponding wrapper
functions/macros for remaining callers of PageGetItem()

I also looked at the refactoring design doc. Looks like a fine approach to
me, but the API will need much more elaborate discussions. I am not sure if
the interfaces as presented are enough to support everything that even
heapam can do today. And much more thoughts will be required to ensure we
don't miss out things that new primary access method may need.

A few points regarding how the proposed API maps to heapam:

- How do we support parallel scans on heap?
- Does the primary AM need to support locking of rows?
- Would there be separate API to interpret the PAM tuple itself? (something
that htup_details.h does today). It seems natural that every PAM will have
its own interpretation of tuple structure and we would like to hide that
inside PAM implementation.
- There are many upper modules that need access to system attributes (xmin,
xmax for starters). How do you plan to handle that? You think we can
provide enough abstraction so that the callers don't need to know the tuple
structures of individual PAM?

I don't know what to do with the CF entry itself. I could change the status
to "waiting on author" but then the design draft may not get enough
attention. So I'm leaving it in the current state for others to look at.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Refactoring of heapam code.

2016-08-15 Thread Anastasia Lubennikova

08.08.2016 12:43, Anastasia Lubennikova:

08.08.2016 03:51, Michael Paquier:
On Sat, Aug 6, 2016 at 2:56 AM, Alvaro Herrera 
 wrote:

Anastasia Lubennikova wrote:
So there are a couple of patches. They do not cover all the mentioned
problems, but I'd like to get feedback before continuing.

I agree that we could improve things in this general area, but I do not
endorse any particular changes you propose in these patches; I haven't
reviewed your patches.
Well, I am interested in the topic. And just had a look at the 
patches proposed.


+ /*
+ * Split PageGetItem into set of different macros
+ * in order to make code more readable.
+ */
+#define PageGetItemHeap(page, itemId) \
+( \
+   AssertMacro(PageIsValid(page)), \
+   AssertMacro(ItemIdHasStorage(itemId)), \
+   (HeapTupleHeader)(((char *)(page)) + ItemIdGetOffset(itemId)) \
+)
Placing into bufpage.h macros that are specific to heap and indexes is
not that much a good idea. I'd suggest having the heap ones in
heapam.h, and the index ones in a more generic place, as
src/access/common as already mentioned by Alvaro.

+   onpage_tup = PageGetItemHeapHeaderOnly(page, newitemid);
Just PageGetItemHeapHeader would be fine.

@@ -2324,7 +2087,6 @@ FreeBulkInsertState(BulkInsertState bistate)
 pfree(bistate);
  }

-
  /*
   * heap_insert - insert tuple into a heap
Those patches have some noise.

Patch 0002 is doing two things: reorganizing the order of the routines
in heapam.c and move some routines from heapam.c to hio.c. Those two
could be separate patches/

If the final result is to be able to extend custom AMs for tables, we
should have a structure like src/access/common/index.c,
src/access/common/table.c and src/access/common/relation.c for
example, and have headers dedicated to them. But yes, there is
definitely a lot of room for refactoring as we are aiming, at least I
guess so, at having more various code paths for access methods.


Thank you for the review, I'll fix these problems in final version.

Posting the first message I intended to raise the discussion. Patches
were attached mainly to illustrate the problem and to be more concrete.

Thank you for attention and feedback.
Since there are no objections to the idea in general, I'll write an
exhaustive README about the current state of the code and then propose
an API design to discuss details.

Stay tuned.



Here is the promised design draft.
https://wiki.postgresql.org/wiki/HeapamRefactoring

Looking forward to your comments.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Refactoring of heapam code.

2016-08-08 Thread Alvaro Herrera
Anastasia Lubennikova wrote:

> By the way, I thought about looking at the mentioned patch and
> probably reviewing it, but didn't find any of your patches on the
> current commitfest. Could you point out the thread?

Sorry, I haven't posted anything yet.

> >Agreed.  But changing its name while keeping it in heapam.c does not
> >really improve things enough.  I'd rather have it moved elsewhere that's
> >not specific to "heaps" (somewhere in access/common perhaps).  However,
> >renaming the functions would break third-party code for no good reason.
> >I propose that we only rename things if we also break it for other
> >reasons (say because the API changes in a fundamental way).
> 
> Yes, I agree that it should be moved to another file.
> Just to be more accurate: it's not in heapam.c now, it is in
> "src/backend/catalog/heap.c" which requires much more changes
> than I did.

Argh.  Clearly, the code organization in this area is not good at all.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Refactoring of heapam code.

2016-08-08 Thread Anastasia Lubennikova

08.08.2016 03:51, Michael Paquier:

On Sat, Aug 6, 2016 at 2:56 AM, Alvaro Herrera  wrote:

Anastasia Lubennikova wrote:

So there are a couple of patches. They do not cover all the mentioned
problems, but I'd like to get feedback before continuing.

I agree that we could improve things in this general area, but I do not
endorse any particular changes you propose in these patches; I haven't
reviewed your patches.

Well, I am interested in the topic. And just had a look at the patches proposed.

+ /*
+ * Split PageGetItem into set of different macros
+ * in order to make code more readable.
+ */
+#define PageGetItemHeap(page, itemId) \
+( \
+   AssertMacro(PageIsValid(page)), \
+   AssertMacro(ItemIdHasStorage(itemId)), \
+   (HeapTupleHeader)(((char *)(page)) + ItemIdGetOffset(itemId)) \
+)
Placing into bufpage.h macros that are specific to heap and indexes is
not that much a good idea. I'd suggest having the heap ones in
heapam.h, and the index ones in a more generic place, as
src/access/common as already mentioned by Alvaro.

+   onpage_tup = PageGetItemHeapHeaderOnly(page, newitemid);
Just PageGetItemHeapHeader would be fine.

@@ -2324,7 +2087,6 @@ FreeBulkInsertState(BulkInsertState bistate)
 pfree(bistate);
  }

-
  /*
   * heap_insert - insert tuple into a heap
Those patches have some noise.

Patch 0002 is doing two things: reorganizing the order of the routines
in heapam.c and move some routines from heapam.c to hio.c. Those two
could be separate patches/

If the final result is to be able to extend custom AMs for tables, we
should have a structure like src/access/common/index.c,
src/access/common/table.c and src/access/common/relation.c for
example, and have headers dedicated to them. But yes, there is
definitely a lot of room for refactoring as we are aiming, at least I
guess so, at having more various code paths for access methods.


Thank you for the review, I'll fix these problems in final version.

Posting the first message I intended to raise the discussion. Patches
were attached mainly to illustrate the problem and to be more concrete.

Thank you for attention and feedback.
Since there are no objections to the idea in general, I'll write an
exhaustive README about the current state of the code and then propose
an API design to discuss details.

Stay tuned.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Refactoring of heapam code.

2016-08-08 Thread Anastasia Lubennikova

05.08.2016 20:56, Alvaro Herrera:

Anastasia Lubennikova wrote:

Working on page compression and some other issues related to
access methods, I found out that the code related to heap
looks too complicated. Much more complicated, than it should be.
Since I anyway got into this area, I want to suggest a set of improvements.

Hm.  I'm hacking on heapam.c pretty heavily ATM.  Your first patch
causes no problem I think, or if it does it's probably pretty minor, so
that's okay.  I'm unsure about the second one -- I think it's okay too,
because it mostly seems to be about moving stuff from heapam.c to hio.c
and shuffling some code around that I don't think I'm modifying.


I'm sure these patches will not cause any problems. The second one is
mostly about moving unrelated stuff from heapam.c to hio.c and renaming
a couple of functions in heap.c.
Anyway, they are not a final solution, just a proof of concept.

By the way, I thought about looking at the mentioned patch and
probably reviewing it, but didn't find any of your patches on the
current commitfest. Could you point out the thread?


Also I think that it's quite strange to have a function heap_create(), that
is called
from index_create(). It has absolutely nothing to do with heap access
method.

Agreed.  But changing its name while keeping it in heapam.c does not
really improve things enough.  I'd rather have it moved elsewhere that's
not specific to "heaps" (somewhere in access/common perhaps).  However,
renaming the functions would break third-party code for no good reason.
I propose that we only rename things if we also break it for other
reasons (say because the API changes in a fundamental way).


Yes, I agree that it should be moved to another file.
Just to be more accurate: it's not in heapam.c now, it is in
"src/backend/catalog/heap.c" which requires much more changes
than I did.

Concerning the third-party code: it's a good observation and we
definitely should keep it in mind. Nevertheless, I doubt that there are a
lot of external calls to these functions. And I hope this thread will end
up with fundamental changes introducing a clear API for all the mentioned
problems.




One more thing that requires refactoring is "RELKIND_RELATION" name.
We have a type of relation called "relation"...

I don't see us fixing this bit, really.  Historical baggage and all
that.  For example, a lot of SQL queries use "WHERE relkind = 'r'".  If
we change the name, we should probably also change the relkind constant,
and that would break the queries.  If we change the name and do not
change the constant, then we have to have a comment "we call them
RELKIND_TABLE but the char is r because it was called RELATION
previously", which isn't any better.


Good point.
I'd rather change it to RELKIND_BASIC_RELATION or something like that,
which is more specific and still goes well with the 'r' constant. But
minor changes like that can wait until we clarify the API in general.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Refactoring of heapam code.

2016-08-07 Thread Michael Paquier
On Sat, Aug 6, 2016 at 2:56 AM, Alvaro Herrera  wrote:
> Anastasia Lubennikova wrote:
>> So there are a couple of patches. They do not cover all the mentioned
>> problems, but I'd like to get feedback before continuing.
>
> I agree that we could improve things in this general area, but I do not
> endorse any particular changes you propose in these patches; I haven't
> reviewed your patches.

Well, I am interested in the topic. And just had a look at the patches proposed.

+ /*
+ * Split PageGetItem into set of different macros
+ * in order to make code more readable.
+ */
+#define PageGetItemHeap(page, itemId) \
+( \
+   AssertMacro(PageIsValid(page)), \
+   AssertMacro(ItemIdHasStorage(itemId)), \
+   (HeapTupleHeader)(((char *)(page)) + ItemIdGetOffset(itemId)) \
+)
Placing into bufpage.h macros that are specific to heap and indexes is
not that much a good idea. I'd suggest having the heap ones in
heapam.h, and the index ones in a more generic place, as
src/access/common as already mentioned by Alvaro.

+   onpage_tup = PageGetItemHeapHeaderOnly(page, newitemid);
Just PageGetItemHeapHeader would be fine.

@@ -2324,7 +2087,6 @@ FreeBulkInsertState(BulkInsertState bistate)
pfree(bistate);
 }

-
 /*
  * heap_insert - insert tuple into a heap
Those patches have some noise.

Patch 0002 is doing two things: reorganizing the order of the routines
in heapam.c and move some routines from heapam.c to hio.c. Those two
could be separate patches/

If the final result is to be able to extend custom AMs for tables, we
should have a structure like src/access/common/index.c,
src/access/common/table.c and src/access/common/relation.c for
example, and have headers dedicated to them. But yes, there is
definitely a lot of room for refactoring as we are aiming, at least I
guess so, at having more various code paths for access methods.
-- 
Michael




Re: [HACKERS] Refactoring of heapam code.

2016-08-05 Thread Alvaro Herrera
Anastasia Lubennikova wrote:
> Working on page compression and some other issues related to
> access methods, I found out that the code related to heap
> looks too complicated. Much more complicated, than it should be.
> Since I anyway got into this area, I want to suggest a set of improvements.

Hm.  I'm hacking on heapam.c pretty heavily ATM.  Your first patch
causes no problem I think, or if it does it's probably pretty minor, so
that's okay.  I'm unsure about the second one -- I think it's okay too,
because it mostly seems to be about moving stuff from heapam.c to hio.c
and shuffling some code around that I don't think I'm modifying.

> Also I think that it's quite strange to have a function heap_create(), that
> is called
> from index_create(). It has absolutely nothing to do with heap access
> method.

Agreed.  But changing its name while keeping it in heapam.c does not
really improve things enough.  I'd rather have it moved elsewhere that's
not specific to "heaps" (somewhere in access/common perhaps).  However,
renaming the functions would break third-party code for no good reason.
I propose that we only rename things if we also break it for other
reasons (say because the API changes in a fundamental way).

> One more thing that requires refactoring is "RELKIND_RELATION" name.
> We have a type of relation called "relation"...

I don't see us fixing this bit, really.  Historical baggage and all
that.  For example, a lot of SQL queries use "WHERE relkind = 'r'".  If
we change the name, we should probably also change the relkind constant,
and that would break the queries.  If we change the name and do not
change the constant, then we have to have a comment "we call them
RELKIND_TABLE but the char is r because it was called RELATION
previously", which isn't any better.

> So there are a couple of patches. They do not cover all the mentioned
> problems, but I'd like to get feedback before continuing.

I agree that we could improve things in this general area, but I do not
endorse any particular changes you propose in these patches; I haven't
reviewed your patches.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Refactoring of heapam code.

2016-08-05 Thread Anastasia Lubennikova

05.08.2016 16:30, Kevin Grittner:

On Fri, Aug 5, 2016 at 2:54 AM, Anastasia Lubennikova
 wrote:


They can be logically separated into three categories:
"primary storage" - r, S, t, v. They store data and visibility information.
The only implementation is heapam.c
"secondary index" - i. They store some data and pointers to primary storage.
Various existing AMs and managed by AM API.
"virtual relations" - c, f, m. They have no physical storage, only entries
in caches and catalogs.

Materialized views (relkind == "m") have physical storage, and may have indexes.


Good point. I confused the letters for views (virtual relations) and
materialized views (primary storage).
Should be:

"primary storage" - r, S, t, m. They store data and visibility information.
The only implementation is heapam.c
"secondary index" - i. They store some data and pointers to primary storage.
Various existing AMs and managed by AM API.
"virtual relations" - c, f, v. They have no physical storage, only entries
in caches and catalogs.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Refactoring of heapam code.

2016-08-05 Thread Kevin Grittner
On Fri, Aug 5, 2016 at 2:54 AM, Anastasia Lubennikova
 wrote:

> They can be logically separated into three categories:
> "primary storage" - r, S, t, v. They store data and visibility information.
> The only implementation is heapam.c
> "secondary index" - i. They store some data and pointers to primary storage.
> Various existing AMs, managed by the AM API.
> "virtual relations" - c, f, m. They have no physical storage, only entries
> in caches and catalogs.

Materialized views (relkind == "m") have physical storage, and may have indexes.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring of heapam code.

2016-08-05 Thread Thom Brown
On 5 August 2016 at 08:54, Anastasia Lubennikova
 wrote:
> Working on page compression and some other issues related to
> access methods, I found out that the code related to heap
> looks too complicated, much more complicated than it should be.
> Since I anyway got into this area, I want to suggest a set of improvements.
>
> There are a number of problems I see in the existing code:
> Problem 1. Heap != Relation.
>
> File heapam.c has a note:
>  *  This file contains the heap_ routines which implement
>  *  the POSTGRES heap access method used for all POSTGRES
>  *  relations.
> This statement is wrong, since we also have index relations,
> that are implemented using other access methods.
>
> Also, I think it's quite strange to have a function heap_create() that is
> called from index_create(). It has absolutely nothing to do with the heap
> access method.
>
> And so on, and so on.
>
> One more thing that requires refactoring is "RELKIND_RELATION" name.
> We have a type of relation called "relation"...
>
> Problem 2.
> Some functions are shared between heap and indexes access methods.
> Maybe it used to be handy once, but now, I think, it's time to separate
> them, because IndexTuples and HeapTuples have really little in common.
>
> Problem 3. The word "heap" is used in many different contexts.
> Heap - a storage where we have tuples and visibility information.
> Generally speaking, it can be implemented over any data structure,
> not just a plain table as it is done now.
>
> Heap - an access method, that provides a set of routines for plain table
> storage, such as insert, delete, update, gettuple, vacuum and so on.
> We use this access method not only for user's tables.
>
> There are many types of relations (pg_class.h):
> #define  RELKIND_RELATION'r'/* ordinary table */
> #define  RELKIND_INDEX   'i'/* secondary index */
> #define  RELKIND_SEQUENCE'S'/* sequence object */
> #define  RELKIND_TOASTVALUE  't'/* for out-of-line values */
> #define  RELKIND_VIEW'v'/* view */
> #define  RELKIND_COMPOSITE_TYPE  'c'/* composite type */
> #define  RELKIND_FOREIGN_TABLE   'f'/* foreign table */
> #define  RELKIND_MATVIEW 'm'/* materialized view */
>
> They can be logically separated into three categories:
> "primary storage" - r, S, t, v. They store data and visibility information.
> The only implementation is heapam.c
> "secondary index" - i. They store some data and pointers to primary storage.
> Various existing AMs, managed by the AM API.
> "virtual relations" - c, f, m. They have no physical storage, only entries
> in caches and catalogs.
>
> Now, for some reason, we sometimes use the name "heap" for both
> "primary storage" and "virtual relations". See src/backend/catalog/heap.c.
> But some functions work only with "primary storage". See pgstat_relation().
> And we constantly have to check relkind everywhere.
> I'd like this to be clear from the code interface and naming.
>
> So here are a couple of patches. They do not cover all of the problems
> mentioned, but I'd like to get feedback before continuing.
>
> Patch 1:
> There is a macro "PageGetItem" in bufpage.h that is used to retrieve an item
> from the given page. It is used for both heap and index tuples.
> It doesn't seem like a problem until you try to find anything in this
> code.
>
> The first patch "PageGetItem_refactoring.patch" is intended to fix it.
> It is just renaming:
> (IndexTuple) PageGetItem ---> PageGetItemIndex
> (HeapTupleHeader) PageGetItem ---> PageGetItemHeap
> Other types of tuples (BrinTuple, SpGistInnerTuple, SpGistLeafTuple,
> SpGistDeadTuple)
> still use PageGetItem.
>
> I also changed functions that do not access tuple's data, but only need
> HeapTupleHeader fields. They use a macro PageGetItemHeapHeaderOnly.
>
> I do not insist on these particular names, by the way.
>
> Patch 2.
> heapam.c, hio.c and src/backend/catalog/heap.c have a lot of confusing names
> and outdated comments. The patch is intended to improve it.
> Fun fact: I found several "check it later" comments unchanged since
>  "Postgres95 1.01 Distribution - Virgin Sources" commit.
>
> I wonder if we can find a better name for relation_drop_with_catalog()
> since, again, it's not about all kinds of relations. We use another
> function to drop index relations.
>
> I hope these changes will be useful for the future development.
> Suggested patches are mostly about renaming and rearrangement of functions
> between files, so, if nobody has conceptual objections, all I need from
> reviewers is an attentive look to find typos, grammar mistakes and
> overlooked areas.

Could you add this to the commitfest?

Thom



[HACKERS] Refactoring of heapam code.

2016-08-05 Thread Anastasia Lubennikova

Working on page compression and some other issues related to
access methods, I found out that the code related to heap
looks too complicated, much more complicated than it should be.
Since I anyway got into this area, I want to suggest a set of improvements.

There are a number of problems I see in the existing code:
Problem 1. Heap != Relation.

File heapam.c has a note:
 *  This file contains the heap_ routines which implement
 *  the POSTGRES heap access method used for all POSTGRES
 *  relations.
This statement is wrong, since we also have index relations,
that are implemented using other access methods.

Also, I think it's quite strange to have a function heap_create() that is
called from index_create(). It has absolutely nothing to do with the heap
access method.


And so on, and so on.

One more thing that requires refactoring is "RELKIND_RELATION" name.
We have a type of relation called "relation"...

Problem 2.
Some functions are shared between heap and indexes access methods.
Maybe it used to be handy once, but now, I think, it's time to separate
them, because IndexTuples and HeapTuples have really little in common.

Problem 3. The word "heap" is used in many different contexts.
Heap - a storage where we have tuples and visibility information.
Generally speaking, it can be implemented over any data structure,
not just a plain table as it is done now.

Heap - an access method, that provides a set of routines for plain table
storage, such as insert, delete, update, gettuple, vacuum and so on.
We use this access method not only for user's tables.

There are many types of relations (pg_class.h):
#define  RELKIND_RELATION'r'/* ordinary table */
#define  RELKIND_INDEX   'i'/* secondary index */
#define  RELKIND_SEQUENCE'S'/* sequence object */
#define  RELKIND_TOASTVALUE  't'/* for out-of-line values */
#define  RELKIND_VIEW'v'/* view */
#define  RELKIND_COMPOSITE_TYPE  'c'/* composite type */
#define  RELKIND_FOREIGN_TABLE   'f'/* foreign table */
#define  RELKIND_MATVIEW 'm'/* materialized view */

They can be logically separated into three categories:
"primary storage" - r, S, t, v. They store data and visibility information.
The only implementation is heapam.c
"secondary index" - i. They store some data and pointers to primary storage.
Various existing AMs, managed by the AM API.
"virtual relations" - c, f, m. They have no physical storage, only entries
in caches and catalogs.

Now, for some reason, we sometimes use the name "heap" for both
"primary storage" and "virtual relations". See src/backend/catalog/heap.c.
But some functions work only with "primary storage". See pgstat_relation().
And we constantly have to check relkind everywhere.
I'd like this to be clear from the code interface and naming.

So here are a couple of patches. They do not cover all of the problems
mentioned, but I'd like to get feedback before continuing.

Patch 1:
There is a macro "PageGetItem" in bufpage.h that is used to retrieve an item
from the given page. It is used for both heap and index tuples.
It doesn't seem like a problem until you try to find anything in this
code.


The first patch "PageGetItem_refactoring.patch" is intended to fix it.
It is just renaming:
(IndexTuple) PageGetItem ---> PageGetItemIndex
(HeapTupleHeader) PageGetItem ---> PageGetItemHeap
Other types of tuples (BrinTuple, SpGistInnerTuple, SpGistLeafTuple,
SpGistDeadTuple) still use PageGetItem.

I also changed functions that do not access tuple's data, but only need
HeapTupleHeader fields. They use a macro PageGetItemHeapHeaderOnly.

I do not insist on these particular names, by the way.
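To make the proposed renaming concrete, here is a rough sketch of what the wrappers might look like. These are hypothetical definitions with stand-in types, written only to show the shape of the idea; the real macro lives in bufpage.h and the patch's actual wrappers add typed casts:

```c
#include <assert.h>

/* Stand-in types, for illustration only; the real ones are in bufpage.h. */
typedef char *Page;
typedef struct ItemIdData
{
	unsigned	lp_off;			/* offset of the item within the page */
}			ItemIdData, *ItemId;

/* The existing generic accessor: raw pointer arithmetic into the page. */
#define PageGetItem(page, itemId) \
	((void *) ((char *) (page) + (itemId)->lp_off))

/* Proposed typed wrappers: identical address arithmetic, but the name
 * (and, in the real patch, the result cast) documents what kind of tuple
 * the caller expects to find at that position. */
#define PageGetItemIndex(page, itemId)	PageGetItem(page, itemId)
#define PageGetItemHeap(page, itemId)	PageGetItem(page, itemId)
```

The point of the rename is grep-ability: a reader can immediately tell whether a call site is dealing with IndexTuples or HeapTupleHeaders without tracing the surrounding code.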

Patch 2.
heapam.c, hio.c and src/backend/catalog/heap.c have a lot of confusing names
and outdated comments. The patch is intended to improve it.
Fun fact: I found several "check it later" comments unchanged since
 "Postgres95 1.01 Distribution - Virgin Sources" commit.

I wonder if we can find a better name for relation_drop_with_catalog()
since, again, it's not about all kinds of relations. We use another
function to drop index relations.


I hope these changes will be useful for future development.
The suggested patches are mostly about renaming and rearranging functions
between files, so, if nobody has conceptual objections, all I need from
reviewers is an attentive look to find typos, grammar mistakes and
overlooked areas.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 4983bbb..257b609 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -130,7 +130,7 @@ GetBTPageStatistics(BlockNumber blkno, Buffer buffer, BTPageStat *stat)
 
 		ItemId		id = PageGetItemId(page, off);
 
-		itup = 

Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-04-08 Thread Robert Haas
On Tue, Apr 5, 2016 at 9:47 PM, Robert Haas  wrote:
>> I do not pursue something like this without good reason. I'm
>> optimistic that the patch will be accepted if it is carefully
>> considered.
>
> This patch has attracted no reviewers.  Any volunteers?

Since nobody has emerged to review this, I have moved it to the next CommitFest.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-04-05 Thread Robert Haas
On Wed, Mar 16, 2016 at 7:43 PM, Peter Geoghegan  wrote:
> On Wed, Mar 16, 2016 at 11:25 AM, Robert Haas  wrote:
>> Sure, and if everybody does that, then there will be 40 patches that
>> get updated in the last 2 days of the CommitFest, and that will be
>> impossible.  Come on.  You're demanding a degree of preferential
>> treatment which is unsupportable.
>
> It's unexpected that an entirely maintenance-orientated patch like
> this would be received this way. I'm not demanding anything, or
> applying any real pressure. Let's just get on with it.
>
> I attach a revision, that makes all the changes that Heikki suggested,
> except one. As already noted several times, following this suggestion
> would have added a bug. Alvaro preferred my original approach here in
> any case. I refer to my original approach of making the new
> UNIQUE_CHECK_SPECULATIVE case minimally different from the existing
> UNIQUE_CHECK_PARTIAL case currently used for deferred unique
> constraints and speculative insertion, as opposed to making the new
> UNIQUE_CHECK_SPECULATIVE "like CHECK_UNIQUE_YES, but return FALSE
> instead of throwing an error on conflict". That was broken because
> CHECK_UNIQUE_YES waits for the outcome of an xact, which
> UNIQUE_CHECK_PARTIAL never does, and so UNIQUE_CHECK_SPECULATIVE must
> never do.
>
> Any and all waits happen in the first phase of speculative insertion,
> and never the seconds. I could give a complicated explanation for why,
> involving a deadlock scenario, but a simple explanation will do: it
> has always worked that way, and was tested to work that way.
>
> Feedback from Heikki led to these changes for this revision:
>
> * The use of arguments within ExecInsert() was simplified.
>
> * More concise AM documentation.
>
> The docs essentially describe two new concepts:
>
> - What unique index insertion needs to know about speculative
> insertion in general. This doesn't just apply to speculative inserters
> themselves, of course.
>
> - What speculative insertion is. Why it exists (why we don't just wait
> on xact). In other words, what "unprincipled deadlocks" are, and how
> they are avoided (from a relatively high level).
>
> I feel like I have a responsibility to make sure that this mechanism
> is well documented, especially given that not that many people were
> involved in its design. It's possible that no more than the 3 original
> authors of UPSERT fully understand speculative insertion -- it's easy
> to miss some of the subtleties.
>
> I do not pursue something like this without good reason. I'm
> optimistic that the patch will be accepted if it is carefully
> considered.

This patch has attracted no reviewers.  Any volunteers?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-19 Thread Peter Geoghegan
On Wed, Mar 16, 2016 at 11:25 AM, Robert Haas  wrote:
> Sure, and if everybody does that, then there will be 40 patches that
> get updated in the last 2 days of the CommitFest, and that will be
> impossible.  Come on.  You're demanding a degree of preferential
> treatment which is unsupportable.

It's unexpected that an entirely maintenance-orientated patch like
this would be received this way. I'm not demanding anything, or
applying any real pressure. Let's just get on with it.

I attach a revision, that makes all the changes that Heikki suggested,
except one. As already noted several times, following this suggestion
would have added a bug. Alvaro preferred my original approach here in
any case. I refer to my original approach of making the new
UNIQUE_CHECK_SPECULATIVE case minimally different from the existing
UNIQUE_CHECK_PARTIAL case currently used for deferred unique
constraints and speculative insertion, as opposed to making the new
UNIQUE_CHECK_SPECULATIVE "like CHECK_UNIQUE_YES, but return FALSE
instead of throwing an error on conflict". That was broken because
CHECK_UNIQUE_YES waits for the outcome of an xact, which
UNIQUE_CHECK_PARTIAL never does, and so UNIQUE_CHECK_SPECULATIVE must
never do.

Any and all waits happen in the first phase of speculative insertion,
and never the second. I could give a complicated explanation for why,
involving a deadlock scenario, but a simple explanation will do: it
has always worked that way, and was tested to work that way.
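As a hedged sketch, the checkUnique modes being discussed look roughly like this. The existing constants come from src/include/access/genam.h; the placement and comment for UNIQUE_CHECK_SPECULATIVE are my reading of the proposal, not a committed definition:

```c
#include <assert.h>

/* Sketch of the IndexUniqueCheck modes under discussion.  The NO, YES,
 * PARTIAL and EXISTING constants exist today in genam.h;
 * UNIQUE_CHECK_SPECULATIVE is the proposed addition. */
typedef enum IndexUniqueCheck
{
	UNIQUE_CHECK_NO,			/* no uniqueness checking */
	UNIQUE_CHECK_YES,			/* error, after waiting, on a conflict */
	UNIQUE_CHECK_PARTIAL,		/* deferred check: note the conflict but
								 * physically insert anyway */
	UNIQUE_CHECK_SPECULATIVE,	/* proposed: report the conflict, skip the
								 * physical insertion, and never wait in
								 * this second phase */
	UNIQUE_CHECK_EXISTING		/* recheck an already-inserted tuple */
} IndexUniqueCheck;
```

The distinction Peter describes maps directly onto the comments: only UNIQUE_CHECK_YES may wait out a rival transaction, which is why the speculative mode cannot simply be "YES but return false".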

Feedback from Heikki led to these changes for this revision:

* The use of arguments within ExecInsert() was simplified.

* More concise AM documentation.

The docs essentially describe two new concepts:

- What unique index insertion needs to know about speculative
insertion in general. This doesn't just apply to speculative inserters
themselves, of course.

- What speculative insertion is. Why it exists (why we don't just wait
on xact). In other words, what "unprincipled deadlocks" are, and how
they are avoided (from a relatively high level).

I feel like I have a responsibility to make sure that this mechanism
is well documented, especially given that not that many people were
involved in its design. It's possible that no more than the 3 original
authors of UPSERT fully understand speculative insertion -- it's easy
to miss some of the subtleties.

I do not pursue something like this without good reason. I'm
optimistic that the patch will be accepted if it is carefully
considered.

-- 
Peter Geoghegan
From 2b2a4c40a5e60ac1f28a75f11204ce88eb48cc73 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Tue, 2 Jun 2015 17:34:16 -0700
Subject: [PATCH] Refactor speculative insertion into unique indexes

Add a dedicated IndexUniqueCheck constant for the speculative insertion
case, UNIQUE_CHECK_SPECULATIVE, rather than reusing
UNIQUE_CHECK_PARTIAL, which should now only be used for deferrable
unique constraints.

This change allows btinsert() (and, in principle, any amcanunique
aminsert function) to avoid physically inserting an IndexTuple in the
event of detecting a conflict during speculative insertion's second
phase.  With nbtree, this avoidance now occurs at the critical point in
_bt_doinsert() immediately after establishing that there is a conflict,
but immediately before actually calling _bt_insertonpg() to proceed with
physical IndexTuple insertion.

At that point during UNIQUE_CHECK_PARTIAL insertion it makes sense to
soldier on, because the possibility remains that the conflict will later
go away and everything will happen to work out because the conflicting
insertion's transaction aborted.  Speculative inserters, in contrast,
have no chance of working out a way to proceed without first deleting
the would-be-pointed-to heap tuple already physically inserted.  For the
current row proposed for insertion, no useful progress will have been
made at this point.

This patch has nothing to do with performance; it is intended to clarify
how amcanunique AMs perform speculative insertion, and the general
theory of operation.  It is natural to avoid an unnecessary index tuple
insertion.  That that could happen before was quite misleading, because
it implied that it was necessary, and didn't acknowledge the differing
requirements in each case.
---
 doc/src/sgml/indexam.sgml  | 101 ++---
 src/backend/access/nbtree/nbtinsert.c  |  49 +---
 src/backend/executor/execIndexing.c|  34 +--
 src/backend/executor/nodeModifyTable.c |   2 +-
 src/include/access/genam.h |   8 +++
 5 files changed, 148 insertions(+), 46 deletions(-)

diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 5f7befb..1b26dd0 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -271,10 +271,13 @@ aminsert (Relation indexRelation,
 
   
The function's Boolean result value is significant only when
-   checkUnique is 

Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-19 Thread Robert Haas
On Mon, Mar 14, 2016 at 8:17 PM, Peter Geoghegan  wrote:
> On Mon, Mar 14, 2016 at 5:04 PM, Robert Haas  wrote:
>> There hasn't been a new version of this patch in 9 months, you're
>> clearly not in a hurry to produce one, and nobody else seems to feel
>> strongly that this is something that needs to be done at all.  I think
>> we could just let this go and be just fine, but instead of doing that
>> and moving onto patches that people do feel strongly about, we're
>> arguing about this.  Bummer.
>
> I'm busy working on fixing an OpenSSL bug affecting all released
> versions right at the moment, but have a number of complex 9.6 patches
> to review, most of which are in need of support. I'm very busy.
>
> I said that I'd get to this patch soon. I might be kicking the can
> down the road a little with this patch; if so, I'm sorry. I suggest we
> leave it at that, until the CF is almost over or until I produce a
> revision.

Sure, and if everybody does that, then there will be 40 patches that
get updated in the last 2 days of the CommitFest, and that will be
impossible.  Come on.  You're demanding a degree of preferential
treatment which is unsupportable.  You're also assuming that anybody's
going to be willing to commit that revision when you produce it, which
looks to me like an unproven hypothesis.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-14 Thread Peter Geoghegan
On Mon, Mar 14, 2016 at 5:21 PM, Andres Freund  wrote:
> So? You're not the only one. I don't see why we shouldn't move this to
> 'returned with feedback' until there's a new version.

I don't see any point in that; I intend to get a revision in to the
ongoing CF. But fine.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-14 Thread Andres Freund
On 2016-03-14 17:17:02 -0700, Peter Geoghegan wrote:
> On Mon, Mar 14, 2016 at 5:04 PM, Robert Haas  wrote:
> > There hasn't been a new version of this patch in 9 months, you're
> > clearly not in a hurry to produce one, and nobody else seems to feel
> > strongly that this is something that needs to be done at all.  I think
> > we could just let this go and be just fine, but instead of doing that
> > and moving onto patches that people do feel strongly about, we're
> > arguing about this.  Bummer.
> 
> I'm busy working on fixing an OpenSSL bug affecting all released
> versions right at the moment, but have a number of complex 9.6 patches
> to review, most of which are in need of support. I'm very busy.

So? You're not the only one. I don't see why we shouldn't move this to
'returned with feedback' until there's a new version.

Andres




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-14 Thread Peter Geoghegan
On Mon, Mar 14, 2016 at 5:04 PM, Robert Haas  wrote:
> There hasn't been a new version of this patch in 9 months, you're
> clearly not in a hurry to produce one, and nobody else seems to feel
> strongly that this is something that needs to be done at all.  I think
> we could just let this go and be just fine, but instead of doing that
> and moving onto patches that people do feel strongly about, we're
> arguing about this.  Bummer.

I'm busy working on fixing an OpenSSL bug affecting all released
versions right at the moment, but have a number of complex 9.6 patches
to review, most of which are in need of support. I'm very busy.

I said that I'd get to this patch soon. I might be kicking the can
down the road a little with this patch; if so, I'm sorry. I suggest we
leave it at that, until the CF is almost over or until I produce a
revision.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-14 Thread Robert Haas
On Sat, Mar 12, 2016 at 5:53 PM, Peter Geoghegan  wrote:
> On Sat, Mar 12, 2016 at 2:43 PM, Michael Paquier
>  wrote:
>> Only one version of this patch has been sent at the beginning of this
>> thread, and Heikki has clearly expressed his disagreement about at
>> least a portion of it at the beginning of this thread, so I find it
>> hard to define it as an "uncontroversial" thing and something that is
>> clear to have as things stand. Seeing a new version soon would be a
>> good next step I guess.
>
> What is the point in saying this, Michael? What purpose does it serve?

Gee, I think Michael is right on target.  What purpose does writing
him an email that sounds annoyed serve?

There hasn't been a new version of this patch in 9 months, you're
clearly not in a hurry to produce one, and nobody else seems to feel
strongly that this is something that needs to be done at all.  I think
we could just let this go and be just fine, but instead of doing that
and moving onto patches that people do feel strongly about, we're
arguing about this.  Bummer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-12 Thread Peter Geoghegan
On Sat, Mar 12, 2016 at 2:53 PM, Peter Geoghegan  wrote:
> I said "basically uncontroversial", not "uncontroversial". That is a
> perfectly accurate characterization of the patch, and if you disagree
then I suggest you re-read the thread.

In particular, note that Alvaro eventually sided with me against the
thing that Heikki argued for:

http://www.postgresql.org/message-id/20160118195643.GA117199@alvherre.pgsql

Describing what happened that way is unfair on Heikki, because I don't
think he was at all firm in what he said about making the new
UNIQUE_CHECK_SPECULATIVE "like CHECK_UNIQUE_YES, but return FALSE
instead of throwing an error on conflict". We were working through the
design, and it didn't actually come to any kind of impasse.

It's surprising and disappointing to me that this supposed
disagreement has been blown out of all proportion.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-12 Thread Peter Geoghegan
On Sat, Mar 12, 2016 at 2:43 PM, Michael Paquier
 wrote:
> Only one version of this patch has been sent at the beginning of this
> thread, and Heikki has clearly expressed his disagreement about at
> least a portion of it at the beginning of this thread, so I find it
> hard to define it as an "uncontroversial" thing and something that is
> clear to have as things stand. Seeing a new version soon would be a
> good next step I guess.

What is the point in saying this, Michael? What purpose does it serve?

I said "basically uncontroversial", not "uncontroversial". That is a
perfectly accurate characterization of the patch, and if you disagree
then I suggest you re-read the thread. Andres and Heikki were both in
favor of this patch. Heikki and I discussed one particular aspect of
it, and then it trailed off. The only thing that Heikki categorically
stated was that he disliked one narrow aspect of the style of one
thing in one function. I've already said I'm happy to do that.

As things stand, the documentation for amcanunique methods, and the
way they are described internally, is fairly misleading.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-12 Thread Michael Paquier
On Sat, Mar 12, 2016 at 11:24 PM, Peter Geoghegan  wrote:
> On Fri, Mar 11, 2016 at 5:26 AM, Robert Haas  wrote:
>> This patch was reviewed during CF 2016-01 and has not been updated for
>> CF 2016-03.  I think we should mark it Returned with Feedback.
>
> I have a full plate at the moment, Robert, both as a reviewer and as a
> patch author. This patch is basically uncontroversial, and is built to
> make the AM interface clearer, and the design of speculative insertion
> easier to understand. It's clear we should have it. I'll get around to
> revising it before too long.

Only one version of this patch has been sent at the beginning of this
thread, and Heikki has clearly expressed his disagreement about at
least a portion of it at the beginning of this thread, so I find it
hard to define it as an "uncontroversial" thing and something that is
clear to have as things stand. Seeing a new version soon would be a
good next step I guess.
-- 
Michael




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-12 Thread Peter Geoghegan
On Fri, Mar 11, 2016 at 5:26 AM, Robert Haas  wrote:
> This patch was reviewed during CF 2016-01 and has not been updated for
> CF 2016-03.  I think we should mark it Returned with Feedback.

I have a full plate at the moment, Robert, both as a reviewer and as a
patch author. This patch is basically uncontroversial, and is built to
make the AM interface clearer, and the design of speculative insertion
easier to understand. It's clear we should have it. I'll get around to
revising it before too long.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-03-11 Thread Robert Haas
On Mon, Jan 18, 2016 at 3:14 PM, Peter Geoghegan  wrote:
> On Mon, Jan 18, 2016 at 11:56 AM, Alvaro Herrera
>  wrote:
>>> That's because I believe this is quite broken, as already pointed out.
>>
>> I think I like your approach better.
>
> That makes things far simpler, then.
>
>>> Your premise here is that what Heikki said in passing months ago is
>>> incontrovertibly the right approach. That's ridiculous. I think Heikki
>>> and I could work this out quite quickly, if he engaged, but for
>>> whatever reason he appears unable to. I doubt that Heikki thinks that
>>> about what he said, so why do you?
>>
>> I don't -- I just think you could have sent a patch that addressed all
>> the other points, leave this one as initially submitted, and note that
>> the new submission left it unaddressed because you disagreed.
>
> I'll try to do that soon. I've got a very busy schedule over the next
> couple of weeks, though.

This patch was reviewed during CF 2016-01 and has not been updated for
CF 2016-03.  I think we should mark it Returned with Feedback.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-01-18 Thread Alvaro Herrera
Peter Geoghegan wrote:
> On Fri, Dec 18, 2015 at 12:55 PM, Robert Haas  wrote:

> > I only see one patch version on the thread.
> 
> I'm not going to post a revision until I thrash out the tiny issues
> with Heikki. He kind of trailed off. So maybe that kills it
> immediately, which is a shame.

If you refuse to post an updated version of the patch until Heikki
weighs in some more, and given that Heikki has (for the purposes of this
patch) completely vanished, I think we should mark this rejected.

If somebody else is open to reviewing the patch, I think that'd be
another way to move forward, but presumably they would start from a
version with the discussed changes already fixed.  Otherwise it's a
waste of time.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-01-18 Thread Peter Geoghegan
On Mon, Jan 18, 2016 at 8:36 AM, Alvaro Herrera
 wrote:
> If you refuse to post an updated version of the patch until Heikki
> weighs in some more, and given that Heikki has (for the purposes of this
> patch) completely vanished, I think we should mark this rejected.

I don't refuse. I just don't want to waste anyone's time. I will
follow all of Heikki's feedback immediately, except this:

"I think it'd be better to define it as "like CHECK_UNIQUE_YES, but
return FALSE instead of throwing an error on conflict". The difference
is that the aminsert would not be allowed to return FALSE when there
is no conflict".

That's because I believe this is quite broken, as already pointed out.

> If somebody else is open to reviewing the patch, I think that'd be
> another way to move forward, but presumably they would start from a
> version with the discussed changes already fixed.  Otherwise it's a
> waste of time.

Your premise here is that what Heikki said in passing months ago is
incontrovertibly the right approach. That's ridiculous. I think Heikki
and I could work this out quite quickly, if he engaged, but for
whatever reason he appears unable to. I doubt that Heikki thinks that
about what he said, so why do you?

The point about CHECK_UNIQUE_YES I highlighted above felt like a
temporary misunderstanding to me, and not even what you might call a
real disagreement. It wasn't as if Heikki was insistent at the time. I
pointed out that what he said was broken according to an established
definition of broken (it would result in unprincipled deadlocks). He
didn't respond to that point. I think he didn't get back quickly in
part because I gave him something to think about.

If any other committer wants to engage with me on this, then I will of
course work with them. But that will not be predicated on my first
revising the patch in a way that this other committer does not
understand. That would be profoundly unfair.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-01-18 Thread Alvaro Herrera
Peter Geoghegan wrote:
> On Mon, Jan 18, 2016 at 8:36 AM, Alvaro Herrera
>  wrote:
> > If you refuse to post an updated version of the patch until Heikki
> > weighs in some more, and given that Heikki has (for the purposes of this
> > patch) completely vanished, I think we should mark this rejected.
> 
> I don't refuse. I just don't want to waste anyone's time. I will
> follow all of Heikki's feedback immediately, except this:
> 
> "I think it'd be better to define it as "like CHECK_UNIQUE_YES, but
> return FALSE instead of throwing an error on conflict". The difference
> is that the aminsert would not be allowed to return FALSE when there
> is no conflict".
> 
> That's because I believe this is quite broken, as already pointed out.

I think I like your approach better.

> > If somebody else is open to reviewing the patch, I think that'd be
> > another way to move forward, but presumably they would start from a
> > version with the discussed changes already fixed.  Otherwise it's a
> > waste of time.
> 
> Your premise here is that what Heikki said in passing months ago is
> incontrovertibly the right approach. That's ridiculous. I think Heikki
> and I could work this out quite quickly, if he engaged, but for
> whatever reason he appears unable to. I doubt that Heikki thinks that
> about what he said, so why do you?

I don't -- I just think you could have sent a patch that addressed all
the other points, leave this one as initially submitted, and note that
the new submission left it unaddressed because you disagreed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2016-01-18 Thread Peter Geoghegan
On Mon, Jan 18, 2016 at 11:56 AM, Alvaro Herrera
 wrote:
>> That's because I believe this is quite broken, as already pointed out.
>
> I think I like your approach better.

That makes things far simpler, then.

>> Your premise here is that what Heikki said in passing months ago is
>> incontrovertibly the right approach. That's ridiculous. I think Heikki
>> and I could work this out quite quickly, if he engaged, but for
>> whatever reason he appears unable to. I doubt that Heikki thinks that
>> about what he said, so why do you?
>
> I don't -- I just think you could have sent a patch that addressed all
> the other points, leave this one as initially submitted, and note that
> the new submission left it unaddressed because you disagreed.

I'll try to do that soon. I've got a very busy schedule over the next
couple of weeks, though.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-19 Thread Peter Geoghegan
On Sat, Dec 19, 2015 at 3:26 PM, Michael Paquier
 wrote:
> +1. There are folks around doing tests using 9.5 now, it is not
> correct to impact the effort they have been putting on it until now.

This is a total misrepresentation of what I've said.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-19 Thread Michael Paquier
On Sat, Dec 19, 2015 at 1:58 AM, Robert Haas wrote:
> No, it's far too late to be pushing this into 9.5.  We are at RC1 now
> and hoping to cut a final release right after Christmas.  I think it's
> quite wrong to argue that these changes have no risk of destabilizing
> 9.5.  Nobody is exempt from having bugs in their code - not me, not
> you, not Tom Lane.  But quite apart from that, there seems to be no
> compelling benefit to having these changes in 9.5.  You say that the
> branches will diverge needlessly, but the whole point of having
> branches is that we do need things to diverge.  The question isn't
> "why shouldn't these go into 9.5?" but "do these fix something that is
> clearly broken in 9.5 and must be fixed to avoid hurting users?".
> Andres has said clearly that he doesn't think so, and Heikki didn't
> seem convinced that we wanted the changes at all.  I've read over the
> thread and I think that even if all the good things you say about this
> patch are 100% true, it doesn't amount to a good reason to back-patch.
> Code that does something possibly non-sensical or sub-optimal isn't a
> reason to back-patch in the absence of a clear, user-visible
> consequence.

+1. There are folks around doing tests using 9.5 now, it is not
correct to impact the effort they have been putting on it until now.

> I think it's a shame that we haven't gotten this patch dealt with just
> because when somebody submits a patch in June, it's not very nice for
> it to still be pending in December, but since this stuff is even
> further outside my area of expertise than the sorting stuff, and since
> me and my split personalities only have so many hours in the day, I'm
> going to have to leave it to somebody else to pick up anyhow.  But
> that's a separate issue from whether this should be back-patched.

Many patches wait in the queue for months; that's not new. Some of
them wait even longer than that.

For those reasons, and because you are visibly still willing to work
on it, I am moving this patch to the next CF.
Regards,
-- 
Michael




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-18 Thread Robert Haas
On Thu, Dec 17, 2015 at 2:55 AM, Peter Geoghegan  wrote:
> On Wed, Dec 16, 2015 at 11:44 PM, Peter Geoghegan  wrote:
>>> In any case, at this point 9.5 is really aimed to be stabilized, so
>>> targeting only master is a far saner approach IMO for this patch.
>>> Pushing that in 9.5 a couple of months back may have given enough
>>> reason to do so... But well life is life.
>>
>> No, this really isn't an optimization at all.
>
> I should add: I think that the chances of this patch destabilizing the
> code are very slim, once it receives the proper review. Certainly, I
> foresee no possible downside to not inserting the doomed IndexTuple,
> since it's guaranteed to have its heap tuple super-deleted immediately
> afterwards.
>
> That's the only real behavioral change proposed here. So, I would
> prefer it if we got this in before the first stable release of 9.5.

No, it's far too late to be pushing this into 9.5.  We are at RC1 now
and hoping to cut a final release right after Christmas.  I think it's
quite wrong to argue that these changes have no risk of destabilizing
9.5.  Nobody is exempt from having bugs in their code - not me, not
you, not Tom Lane.  But quite apart from that, there seems to be no
compelling benefit to having these changes in 9.5.  You say that the
branches will diverge needlessly, but the whole point of having
branches is that we do need things to diverge.  The question isn't
"why shouldn't these go into 9.5?" but "do these fix something that is
clearly broken in 9.5 and must be fixed to avoid hurting users?".
Andres has said clearly that he doesn't think so, and Heikki didn't
seem convinced that we wanted the changes at all.  I've read over the
thread and I think that even if all the good things you say about this
patch are 100% true, it doesn't amount to a good reason to back-patch.
Code that does something possibly non-sensical or sub-optimal isn't a
reason to back-patch in the absence of a clear, user-visible
consequence.

I think it's a shame that we haven't gotten this patch dealt with just
because when somebody submits a patch in June, it's not very nice for
it to still be pending in December, but since this stuff is even
further outside my area of expertise than the sorting stuff, and since
me and my split personalities only have so many hours in the day, I'm
going to have to leave it to somebody else to pick up anyhow.  But
that's a separate issue from whether this should be back-patched.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-18 Thread Peter Geoghegan
On Fri, Dec 18, 2015 at 8:58 AM, Robert Haas  wrote:
> No, it's far too late to be pushing this into 9.5.  We are at RC1 now
> and hoping to cut a final release right after Christmas.  I think it's
> quite wrong to argue that these changes have no risk of destabilizing
> 9.5.  Nobody is exempt from having bugs in their code - not me, not
> you, not Tom Lane.  But quite apart from that, there seems to be no
> compelling benefit to having these changes in 9.5.  You say that the
> branches will diverge needlessly, but the whole point of having
> branches is that we do need things to diverge.  The question isn't
> "why shouldn't these go into 9.5?" but "do these fix something that is
> clearly broken in 9.5 and must be fixed to avoid hurting users?".
> Andres has said clearly that he doesn't think so, and Heikki didn't
> seem convinced that we wanted the changes at all.

It isn't true that Heikki was not basically in favor of this. This
should have been committed as part of the original patch, really.

I hope to avoid needless confusion about the documented (by the
official documentation) AM interface. Yes, that is

> I think it's a shame that we haven't gotten this patch dealt with just
> because when somebody submits a patch in June, it's not very nice for
> it to still be pending in December, but since this stuff is even
> further outside my area of expertise than the sorting stuff, and since
> me and my split personalities only have so many hours in the day, I'm
> going to have to leave it to somebody else to pick up anyhow.  But
> that's a separate issue from whether this should be back-patched.

Note that I've already proposed a compromise, even though I don't
think my original position was at all unreasonable. There'd be zero
real changes (only the addition of the new constant name,
documentation updates, comment updates, etc) under that compromise (as
against one change).

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-18 Thread Robert Haas
On Fri, Dec 18, 2015 at 2:04 PM, Peter Geoghegan  wrote:
> It isn't true that Heikki was not basically in favor of this. This
> should have been committed as part of the original patch, really.

Maybe he wasn't against the whole thing, but he's posted two messages
to this thread and they can't be read as unequivocally in favor of
these changes.  He clearly didn't like at least some of it.

> I hope to avoid needless confusion about the documented (by the
> official documentation) AM interface. Yes, that is

Something maybe got cut off here?

>> I think it's a shame that we haven't gotten this patch dealt with just
>> because when somebody submits a patch in June, it's not very nice for
>> it to still be pending in December, but since this stuff is even
>> further outside my area of expertise than the sorting stuff, and since
>> me and my split personalities only have so many hours in the day, I'm
>> going to have to leave it to somebody else to pick up anyhow.  But
>> that's a separate issue from whether this should be back-patched.
>
> Note that I've already proposed a compromise, even though I don't
> think my original position was at all unreasonable. There'd be zero
> real changes (only the addition of the new constant name,
> documentation updates, comment updates, etc) under that compromise (as
> against one change).

I only see one patch version on the thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-18 Thread Peter Geoghegan
On Fri, Dec 18, 2015 at 12:55 PM, Robert Haas  wrote:
> On Fri, Dec 18, 2015 at 2:04 PM, Peter Geoghegan  wrote:
>> It isn't true that Heikki was not basically in favor of this. This
>> should have been committed as part of the original patch, really.
>
> Maybe he wasn't against the whole thing, but he's posted two messages
> to this thread and they can't be read as unequivocally in favor of
> these changes.  He clearly didn't like at least some of it.

The issues were very trivial.

> I only see one patch version on the thread.

I'm not going to post a revision until I thrash out the tiny issues
with Heikki. He kind of trailed off. So maybe that kills it
immediately, which is a shame.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-17 Thread Michael Paquier
On Fri, Dec 18, 2015 at 4:31 PM, Peter Geoghegan  wrote:
> On Thu, Dec 17, 2015 at 11:06 PM, Michael Paquier
>  wrote:
>> Now per the two points listed in
>> the first sentence in this paragraph, perhaps this opinion is fine as
>> moot :) I didn't get much involved in the development of this code
>> after all.
>
> I'm concerned that Heikki's apparent unavailability will become a
> blocker to this.

Yeah, not only this patch... The second committer with enough
background on the matter is Andres.
-- 
Michael




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-17 Thread Michael Paquier
On Thu, Dec 17, 2015 at 4:55 PM, Peter Geoghegan  wrote:
> On Wed, Dec 16, 2015 at 11:44 PM, Peter Geoghegan  wrote:
>>> In any case, at this point 9.5 is really aimed to be stabilized, so
>>> targeting only master is a far saner approach IMO for this patch.
>>> Pushing that in 9.5 a couple of months back may have given enough
>>> reason to do so... But well life is life.
>>
>> No, this really isn't an optimization at all.
>
> I should add: I think that the chances of this patch destabilizing the
> code are very slim, once it receives the proper review. Certainly, I
> foresee no possible downside to not inserting the doomed IndexTuple,
> since it's guaranteed to have its heap tuple super-deleted immediately
> afterwards.

I am no committer, just a guy giving an opinion about this patch and I
have to admit that I have not studied the speculative insertion code
enough to have a clear technical point of view on the matter, but even
if the chances of destabilizing the code are slim, it does not seem a
wise idea to me to do that post-rc1 and before a GA as there are
people testing the code as it is now. Now per the two points listed in
the first sentence in this paragraph, perhaps this opinion is fine as
moot :) I didn't get much involved in the development of this code
after all.

> That's the only real behavioral change proposed here. So, I would
> prefer it if we got this in before the first stable release of 9.5.

Yeah, I got this one.
-- 
Michael




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-17 Thread Peter Geoghegan
On Thu, Dec 17, 2015 at 11:06 PM, Michael Paquier
 wrote:
>> I should add: I think that the chances of this patch destabilizing the
>> code are very slim, once it receives the proper review. Certainly, I
>> foresee no possible downside to not inserting the doomed IndexTuple,
>> since it's guaranteed to have its heap tuple super-deleted immediately
>> afterwards.
>
> I am no committer, just a guy giving an opinion about this patch and I
> have to admit that I have not studied the speculative insertion code
> enough to have a clear technical point of view on the matter, but even
> if the chances of destabilizing the code are slim, it does not seem a
> wise idea to me to do that post-rc1 and before a GA as there are
> people testing the code as it is now.

You can express doubt about almost anything. In this case, you haven't
voiced any particular concern about any particular risk. The risk of
not proceeding is that 9.5 will remain in a divergent state for no
reason, with substantial differences in the documentation of the AM
interface, and that has a cost. Why should the risk of destabilizing
9.5 become more palatable when 9.5 has been out for 6 months or a
year? You can take it that this will probably be now-or-never.

This is mostly just comment changes, and changes to documentation. If
it comes down to it, we could leave the existing 9.5 code intact (i.e.
still unnecessarily insert the IndexTuple), while commenting that it
is unnecessary, and still changing everything else. That would have an
unquantifiably tiny risk, certainly smaller than the risk of
committing the patch as-is (which, to reiterate, is a risk that I
think is very small).

FWIW, I tend to doubt that users have tested speculative insertion/ON
CONFLICT much this whole time. There were a couple of bug reports, but
that's it.

> Now per the two points listed in
> the first sentence in this paragraph, perhaps this opinion is fine as
> moot :) I didn't get much involved in the development of this code
> after all.

I'm concerned that Heikki's apparent unavailability will become a
blocker to this.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-16 Thread Michael Paquier
On Sun, Oct 4, 2015 at 2:07 AM, Peter Geoghegan  wrote:
> On Sat, Oct 3, 2015 at 6:38 AM, Andres Freund  wrote:
>> I'm not arguing against any of this - but I don't think this needs to be
>> on the 9.5 open items list. I plan to remove from there.
>
> Obviously I don't think that this is a critical fix. I do think that
> it would be nice to keep the branches in sync, and that might become a
> bit more difficult after 9.5 is released.

(A couple of months later)
This is not an actual fix, but an optimization, no?
UNIQUE_CHECK_SPECULATIVE is just used to optimize a couple of code
paths in the case of an insert conflicting during btree insertion.

In any case, at this point 9.5 is really aimed to be stabilized, so
targeting only master is a far saner approach IMO for this patch.
Pushing that in 9.5 a couple of months back may have given enough
reason to do so... But well life is life.

+* it later
Missing a dot here :)

+* Set checkedIndex here, since partial unique index will still count
+* as a found arbiter index despite being skipped due to predicate not
+* being satisfied
Ditto.
-- 
Michael




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-16 Thread Peter Geoghegan
On Wed, Dec 16, 2015 at 11:44 PM, Peter Geoghegan  wrote:
>> In any case, at this point 9.5 is really aimed to be stabilized, so
>> targeting only master is a far saner approach IMO for this patch.
>> Pushing that in 9.5 a couple of months back may have given enough
>> reason to do so... But well life is life.
>
> No, this really isn't an optimization at all.

I should add: I think that the chances of this patch destabilizing the
code are very slim, once it receives the proper review. Certainly, I
foresee no possible downside to not inserting the doomed IndexTuple,
since it's guaranteed to have its heap tuple super-deleted immediately
afterwards.

That's the only real behavioral change proposed here. So, I would
prefer it if we got this in before the first stable release of 9.5.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-12-16 Thread Peter Geoghegan
On Wed, Dec 16, 2015 at 11:20 PM, Michael Paquier
 wrote:
> (A couple of months later)
> This is not an actual fix, but an optimization, no?
> UNIQUE_CHECK_SPECULATIVE is just used to optimize a couple of code
> paths in the case of an insert conflicting during btree insertion.
>
> In any case, at this point 9.5 is really aimed to be stabilized, so
> targeting only master is a far saner approach IMO for this patch.
> Pushing that in 9.5 a couple of months back may have given enough
> reason to do so... But well life is life.

No, this really isn't an optimization at all. It has nothing to do
with performance, since I think that unnecessarily inserting an index
tuple in practice has very little or no appreciable overhead.

The point of avoiding that is that it makes no sense, while doing it
implies that it does make sense. (i.e. It does not make sense to
insert into a B-Tree leaf page an IndexTuple pointing to a speculative
heap tuple with the certain knowledge that we'll have to super-delete
the speculative heap tuple in the end anyway).

This is 100% about clarifying the intent and design of the code.
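For readers following along, the distinction can be sketched roughly as follows. This is an illustrative reconstruction, not the patch itself: the real enum is IndexUniqueCheck in PostgreSQL's src/include/access/genam.h, and UNIQUE_CHECK_SPECULATIVE is the name proposed in this thread.

```c
#include <assert.h>

/*
 * Illustrative sketch (not the actual patch) of the checkUnique modes
 * discussed here.  UNIQUE_CHECK_SPECULATIVE is the proposed addition.
 */
typedef enum IndexUniqueCheck
{
    UNIQUE_CHECK_NO,            /* don't do any uniqueness checking */
    UNIQUE_CHECK_YES,           /* error on conflict */
    UNIQUE_CHECK_PARTIAL,       /* don't error; report possible conflict */
    UNIQUE_CHECK_SPECULATIVE    /* like PARTIAL, but skip inserting the
                                 * doomed index tuple on conflict */
} IndexUniqueCheck;

/*
 * Toy model of the one behavioral change proposed: whether the AM still
 * physically inserts the index tuple after detecting a conflict.
 */
int
inserts_index_tuple_on_conflict(IndexUniqueCheck mode)
{
    /* Only PARTIAL (the mode speculative insertion currently borrows)
     * goes on to insert an index tuple that is known to be doomed. */
    return mode == UNIQUE_CHECK_PARTIAL;
}
```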

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-10-03 Thread Andres Freund
Hi,

On 2015-06-10 16:19:27 -0700, Peter Geoghegan wrote:
> Currently, speculative insertion (the INSERT ... ON CONFLICT DO UPDATE
> executor/storage infrastructure) uses checkUnique ==
> UNIQUE_CHECK_PARTIAL for unique indexes, which is a constant
> originally only used by deferred unique constraints. It occurred to me
> that this has a number of disadvantages:
> ...
> Attached patch updates speculative insertion along these lines.
> ...
> The patch also updates the AM interface documentation (the part
> covering unique indexes). It seems quite natural to me to document the
> theory of operation for speculative insertion there.

I'm not arguing against any of this - but I don't think this needs to be
on the 9.5 open items list. I plan to remove from there.

Greetings,

Andres Freund




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-10-03 Thread Peter Geoghegan
On Sat, Oct 3, 2015 at 6:38 AM, Andres Freund  wrote:
> I'm not arguing against any of this - but I don't think this needs to be
> on the 9.5 open items list. I plan to remove from there.

Obviously I don't think that this is a critical fix. I do think that
it would be nice to keep the branches in sync, and that might become a
bit more difficult after 9.5 is released.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-07-19 Thread Peter Geoghegan
On Thu, Jul 2, 2015 at 2:58 PM, Peter Geoghegan p...@heroku.com wrote:
> As with row locking, with insertion, if there is a conflict, any
> outcome (UPDATE or INSERT) is then possible.

Where are we on this? It would be nice to get this out of the way
before a 9.5 beta is put out.

Thanks
-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-07-02 Thread Heikki Linnakangas

On 07/01/2015 09:19 PM, Peter Geoghegan wrote:
> On Wed, Jul 1, 2015 at 10:45 AM, Peter Geoghegan p...@heroku.com wrote:
>> On Tue, Jun 30, 2015 at 11:19 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
>>> I agree it would be cleaner to have a separate CHECK_UNIQUE_XXX code for
>>> speculative insertions. You've defined CHECK_UNIQUE_SPECULATIVE as like
>>> CHECK_UNIQUE_PARTIAL, but you don't have to insert the index tuple if
>>> there's a conflict. I think it'd be better to define it as like
>>> CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on
>>> conflict. The difference is that the aminsert would not be allowed to
>>> return FALSE when there is no conflict.
>>
>> Suppose we do it that way. Then what's the difference between
>> CHECK_UNIQUE_SPECULATIVE and CHECK_UNIQUE_PARTIAL? You've just
>> effectively required the CHECK_UNIQUE_YES case to not physically
>> insert a physical tuple before throwing an error, which does not seem
>> essential to the existing definition of CHECK_UNIQUE_YES -- you've
>> redefined CHECK_UNIQUE_YES in a way that nbtree happens to meet at the
>> moment. If we had an amcanunique AM that worked a bit like exclusion
>> constraints, this new obligation for CHECK_UNIQUE_YES might make it
>> impossible for that to work.
>
> Another more obvious and important thing: CHECK_UNIQUE_YES waits for
> conflicts to be resolved before returning to its caller. If you don't
> get an error, you're done. CHECK_UNIQUE_PARTIAL never waits, and if we
> add a CHECK_UNIQUE_SPECULATIVE, it ought to not wait either.
>
> Sure, if a speculative inserter detects a conflict, it still has to
> wait. But not in the aminsert call, and not until it cleans up its
> physical insertion (by super-deleting). Clearly a
> CHECK_UNIQUE_SPECULATIVE would have to be much closer to
> CHECK_UNIQUE_PARTIAL than to CHECK_UNIQUE_YES.

Why is it not OK for aminsert to do the waiting, with
CHECK_UNIQUE_SPECULATIVE?

- Heikki





Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-07-02 Thread Peter Geoghegan
On Thu, Jul 2, 2015 at 1:30 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
>> Sure, if a speculative inserter detects a conflict, it still has to
>> wait. But not in the aminsert call, and not until it cleans up its
>> physical insertion (by super-deleting). Clearly a
>> CHECK_UNIQUE_SPECULATIVE would have to be much closer to
>> CHECK_UNIQUE_PARTIAL than to CHECK_UNIQUE_YES.
>
> Why is it not OK for aminsert to do the waiting, with
> CHECK_UNIQUE_SPECULATIVE?

Well, waiting means getting a ShareLock on the other session's XID.
You can't do that without first releasing your locks, unless you're
okay with unprincipled deadlocks.
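The ordering constraint being described can be made concrete with a toy model (step names are illustrative, not real PostgreSQL functions): the inserter must drop its own speculative insertion before blocking on the other transaction, otherwise two sessions can each sleep while still holding what the other is waiting for.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy model (not PostgreSQL code) of the conflict path described above.
 * The essential point: super-deleting our own speculative tuple -- i.e.
 * releasing what the other session might be waiting on -- must happen
 * BEFORE we take a ShareLock on the conflicting transaction's XID, or
 * two concurrent inserters can deadlock on each other "unprincipledly".
 */
static const char *conflict_path[] = {
    "aminsert_reports_conflict",   /* aminsert itself never waits */
    "super_delete_own_tuple",      /* release our speculative insertion */
    "wait_on_conflicting_xid",     /* ShareLock on other session's XID */
    "retry_insert_from_scratch",   /* other xact may have aborted */
};

/* Position of a step in the required ordering, or -1 if unknown. */
int
step_index(const char *step)
{
    for (int i = 0; i < 4; i++)
        if (strcmp(conflict_path[i], step) == 0)
            return i;
    return -1;
}
```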

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-07-02 Thread Peter Geoghegan
On Thu, Jul 2, 2015 at 10:49 AM, Peter Geoghegan p...@heroku.com wrote:
> Well, waiting means getting a ShareLock on the other session's XID.
> You can't do that without first releasing your locks, unless you're
> okay with unprincipled deadlocks.

Besides, if the other session wins the race and inserts a physical
heap tuple, but then aborts its xact, we might have to insert again
(although not before super-deleting), with that insert going on to be
successful.

As with row locking, with insertion, if there is a conflict, any
outcome (UPDATE or INSERT) is then possible.

-- 
Peter Geoghegan




Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-07-01 Thread Peter Geoghegan
On Wed, Jul 1, 2015 at 10:45 AM, Peter Geoghegan p...@heroku.com wrote:
 On Tue, Jun 30, 2015 at 11:19 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
 I agree it would be cleaner to have a separate CHECK_UNIQUE_XXX code for
 speculative insertions. You've defined CHECK_UNIQUE_SPECULATIVE as like
 CHECK_UNIQUE_PARTIAL, but you don't have to insert the index tuple if
 there's a conflict. I think it'd be better to define it as like
 CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on
 conflict. The difference is that the aminsert would not be allowed to
 return FALSE when there is no conflict.

 Suppose we do it that way. Then what's the difference between
 CHECK_UNIQUE_SPECULATIVE and CHECK_UNIQUE_PARTIAL? You've just
 effectively required the CHECK_UNIQUE_YES case to not physically
 insert an index tuple before throwing an error, which does not seem
 essential to the existing definition of CHECK_UNIQUE_YES -- you've
 redefined CHECK_UNIQUE_YES in a way that nbtree happens to meet at the
 moment. If we had an amcanunique AM that worked a bit like exclusion
 constraints, this new obligation for CHECK_UNIQUE_YES might make it
 impossible for that to work.

Another more obvious and important thing: CHECK_UNIQUE_YES waits for
conflicts to be resolved before returning to its caller. If you don't
get an error, you're done. CHECK_UNIQUE_PARTIAL never waits, and if we
add a CHECK_UNIQUE_SPECULATIVE, it ought to not wait either.

Sure, if a speculative inserter detects a conflict, it still has to
wait. But not in the aminsert call, and not until it cleans up its
physical insertion (by super-deleting). Clearly a
CHECK_UNIQUE_SPECULATIVE would have to be much closer to
CHECK_UNIQUE_PARTIAL than to CHECK_UNIQUE_YES.
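
A minimal model of the wait semantics being contrasted here (the enum names mirror the thread's; the helper function is purely illustrative, not real nbtree code):

```c
#include <stdbool.h>

/* Simplified model of the checkUnique modes under discussion. */
typedef enum
{
	CHECK_UNIQUE_NO,			/* no uniqueness checking */
	CHECK_UNIQUE_YES,			/* wait for conflicting xacts; error on conflict */
	CHECK_UNIQUE_PARTIAL,		/* report possible conflict, insert anyway, never wait */
	CHECK_UNIQUE_SPECULATIVE	/* proposed: like PARTIAL, but may skip the
								 * physical index insert on conflict */
} CheckUniqueMode;

/* Whether aminsert itself may block waiting on another session. */
static bool
aminsert_may_wait(CheckUniqueMode mode)
{
	return mode == CHECK_UNIQUE_YES;
}
```

In this model only CHECK_UNIQUE_YES blocks inside aminsert, which is the property that puts the proposed mode on the PARTIAL side of the fence.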

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-07-01 Thread Peter Geoghegan
On Tue, Jun 30, 2015 at 11:19 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
 I agree it would be cleaner to have a separate CHECK_UNIQUE_XXX code for
 speculative insertions. You've defined CHECK_UNIQUE_SPECULATIVE as like
 CHECK_UNIQUE_PARTIAL, but you don't have to insert the index tuple if
 there's a conflict. I think it'd be better to define it as like
 CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on
 conflict. The difference is that the aminsert would not be allowed to
 return FALSE when there is no conflict.

Suppose we do it that way. Then what's the difference between
CHECK_UNIQUE_SPECULATIVE and CHECK_UNIQUE_PARTIAL? You've just
effectively required the CHECK_UNIQUE_YES case to not physically
insert an index tuple before throwing an error, which does not seem
essential to the existing definition of CHECK_UNIQUE_YES -- you've
redefined CHECK_UNIQUE_YES in a way that nbtree happens to meet at the
moment. If we had an amcanunique AM that worked a bit like exclusion
constraints, this new obligation for CHECK_UNIQUE_YES might make it
impossible for that to work.

I'm not necessarily in disagreement. I just didn't do it that way for
that reason.

 Actually, even without this patch, a dummy implementation of aminsert that
 always returns FALSE in CHECK_UNIQUE_PARTIAL mode would be legal per the
 docs, but would loop forever. So that would be nice to fix or document away,
 even though it's not a problem for B-tree currently.

UNIQUE_CHECK_PARTIAL calls to an aminsert function are allowed to
report false positives (of not being unique, by returning FALSE as you
say), where there may not have been a conflict, but it's not clear
(e.g. because the conflicting xact has yet to commit/abort). You still
need to insert, though, so that the recheck at end-of-xact works for
deferred constraints. At that point, it's really not too different to
ordinary unique enforcement, except that the other guy is guaranteed
to notice you, too.
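
The insert-then-recheck pattern referred to here can be sketched as pseudocode (names and call shapes are illustrative, not the real AM interface):

```c
/*
 * Pseudocode: UNIQUE_CHECK_PARTIAL as used by deferred constraints.
 * aminsert may report a possible conflict, but the index tuple is
 * inserted regardless, so the end-of-xact recheck can see it.
 */
if (!aminsert(index, values, tid, UNIQUE_CHECK_PARTIAL))
	remember_potential_conflict(index, values, tid);
/* ... later, at end of transaction (or SET CONSTRAINTS IMMEDIATE): */
if (recheck_finds_duplicate(index, values))
	ereport(ERROR, unique_violation);
```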

You can construct a theoretical case where lock starvation occurs with
unique constraint enforcement. I think it helps with nbtree here that
someone will reliably *not* see a conflict when concurrently
inserting, because unique index value locking exists (by locking the
first leaf page a value could be on with a buffer lock). But even if
that wasn't the case, the insert + recheck thing would be safe, just
as with exclusion constraints...provided you insert to begin with,
that is.

 In passing, I have made ExecInsertIndexTuples() give up when a
 speculative insertion conflict is detected. Again, this is not about
 bloat prevention; it's about making the code easier to understand by
 not doing something that is unnecessary, and in avoiding that also
 avoiding the implication that it is necessary. There are already
 enough complicated interactions that *are* necessary (e.g. livelock
 avoidance for exclusion constraints). Let us make our intent clearer.

 Hmm, not sure I like those changes. ExecInsertIndexTuples's behaviour now
 depends on whether specConflict is passed, but that seems weird as
 specConflict is an output parameter.

Your statement is ambiguous and confusing to me. Are you objecting to
the idea of returning from ExecInsertIndexTuples() early in general,
or to one narrow aspect of how that was proposed -- the fact that it
occurs due to specConflict != NULL rather than noDupErr? Obviously
if it's just the latter, then that's a small adjustment. But AFAICT
your remark about specConflict could be one detail of a broader complaint
about an idea that you dislike generally.

 The patch also updates the AM interface documentation (the part
 covering unique indexes). It seems quite natural to me to document the
 theory of operation for speculative insertion there.


 Yeah, although I find the explanation pretty long-winded and difficult to
 understand ;-).

I don't know what you mean. You say things like this to me fairly
regularly, but I'm always left wondering what the take-away should be.
Please be more specific.

Long-winded generally means that more words than necessary were used.
I think that the documentation proposed is actually very information
dense, if anything. As always, I aimed to keep it consistent with the
surrounding documentation.

I understand that the material might be a little hard to follow, but
it concerns how someone can externally implement a Postgres index
access method that is amcanunique. That's a pretty esoteric subject
-- so far, we've had exactly zero takers (even without the
amcanunique part). It's damn hard to make these ideas accessible.

I feel that I have an obligation to share information about (say) how
the speculative insertion mechanism works -- the theory of operation
seems very useful. Like any one of us, I might not be around to
provide input on something like that in the future, so it makes sense
to be thorough and unambiguous. There are very few people (3?) who
have a good sense of how it works currently, and there are some 

Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-07-01 Thread Peter Geoghegan
On Wed, Jul 1, 2015 at 10:45 AM, Peter Geoghegan p...@heroku.com wrote:
 You can construct a theoretical case where lock starvation occurs with
 unique constraint enforcement. I think it helps with nbtree here that
 someone will reliably *not* see a conflict when concurrently
 inserting, because unique index value locking exists (by locking the
 first leaf page a value could be on with a buffer lock). But even if
 that wasn't the case, the insert + recheck thing would be safe, just
 as with exclusion constraints...provided you insert to begin with,
 that is.

Of course, the fact that UNIQUE_CHECK_PARTIAL aminsert callers
(including deferred constraint inserters and, currently, speculative
inserters) can rely on this is very useful to UPSERT. It guarantees
progress of one session without messy exclusion constraint style fixes
(for deadlock and livelock avoidance). You cannot talk about
speculative insertion without talking about unprincipled deadlocks
(and maybe livelocks).

If I had to bet where we might find some bugs in the executor parts of
UPSERT, my first guess would be the exclusion constraint edge-case
handling (livelocking stuff). Those are probably relatively benign
bugs, but recent bugs in exclusion constraints (in released branches)
show that they can hide for a long time.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Refactoring speculative insertion with unique indexes a little

2015-06-30 Thread Heikki Linnakangas

On 06/11/2015 02:19 AM, Peter Geoghegan wrote:

Currently, speculative insertion (the INSERT ... ON CONFLICT DO UPDATE
executor/storage infrastructure) uses checkUnique ==
UNIQUE_CHECK_PARTIAL for unique indexes, which is a constant
originally only used by deferred unique constraints. It occurred to me
that this has a number of disadvantages:

* It confuses separation of concerns. Pushing down this information to
the nbtree AM makes it clear why it's slightly special from a
speculative insertion point of view. For example, the nbtree AM does
not actually require livelock insurance (for unique indexes,
although in principle not for nbtree-based exclusion constraints,
which are possible).

* UNIQUE_CHECK_PARTIAL is not only not the same thing as
UNIQUE_CHECK_SPECULATIVE (a new constant for the enum). It's also
naturally mutually exclusive with it (since we do not and cannot
support deferred unique constraints as arbiters). Let's represent this
directly.

* It makes a conflict not detected by the pre-check always insert an
index tuple, even though that occurs after a point where it's already
been established that the pointed-to TID is doomed -- it must go on to
be super deleted. Why bother bloating the index?


I'm actually not really motivated by wanting to reduce bloat here
(that was always something that I thought was a non-issue with *any*
implemented speculative insertion prototype [1]). Rather, by actually
physically inserting an index tuple unnecessarily, the implication is
that it makes sense to do so (perhaps for roughly the same reason it
makes sense with deferred unique constraints, or some other
complicated and subtle reason). AFAICT that implication is incorrect,
though; I see no reason why inserting that index tuple serves any
purpose, and it clearly *can* be avoided with little effort.


I agree it would be cleaner to have a separate CHECK_UNIQUE_XXX code for 
speculative insertions. You've defined CHECK_UNIQUE_SPECULATIVE as like 
CHECK_UNIQUE_PARTIAL, but you don't have to insert the index tuple if 
there's a conflict. I think it'd be better to define it as like 
CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on 
conflict. The difference is that the aminsert would not be allowed to 
return FALSE when there is no conflict.


Actually, even without this patch, a dummy implementation of aminsert 
that always returns FALSE in CHECK_UNIQUE_PARTIAL mode would be legal 
per the docs, but would loop forever. So that would be nice to fix or 
document away, even though it's not a problem for B-tree currently.



Attached patch updates speculative insertion along these lines.

In passing, I have made ExecInsertIndexTuples() give up when a
speculative insertion conflict is detected. Again, this is not about
bloat prevention; it's about making the code easier to understand by
not doing something that is unnecessary, and in avoiding that also
avoiding the implication that it is necessary. There are already
enough complicated interactions that *are* necessary (e.g. livelock
avoidance for exclusion constraints). Let us make our intent clearer.


Hmm, not sure I like those changes. ExecInsertIndexTuples's behaviour 
now depends on whether specConflict is passed, but that seems weird as 
specConflict is an output parameter.



The patch also updates the AM interface documentation (the part
covering unique indexes). It seems quite natural to me to document the
theory of operation for speculative insertion there.


Yeah, although I find the explanation pretty long-winded and difficult 
to understand ;-).


- Heikki





Re: [HACKERS] Refactoring pgbench.c

2015-06-28 Thread Tomas Vondra

Hi,

On 06/28/2015 08:10 AM, Fabien COELHO wrote:


Hello Tatsuo,


Main pgbench logic consists of a single file, pgbench.c, which is 4036
lines of code as of today. This is not a small number and I think
it would be nice if it is divided into smaller files because it
will make it easier to maintain, add or change features of pgbench.


I do not think that this large file is so big a problem (good
editors help navigation in the code), and I'm not sure that splitting
it would achieve much: there are not that many functions, some of
them are maybe long (main, threadRun, doCustom) but mostly for good
reasons.


My thoughts, exactly. I don't think just splitting the file into 
multiple pieces will achieve anything - the problem is that we've 
extended the original pgbench code in rather hackish way, without any 
major design changes, so IMHO what should be done is refactoring ...



I've submitted a patch to remove fork-emulation, which I think
would really help simplify the code (maybe -10% source in
pgbench.c, less #ifs, avoid double implementations or
more-complex-than-necessary implementations or not-implemented
features).


... and cleanup of dead code.

--
Tomas Vondra   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Refactoring pgbench.c

2015-06-28 Thread Fabien COELHO


Hello Tatsuo,

Main pgbench logic consists of a single file, pgbench.c, which is 4036 lines
of code as of today. This is not a small number and I think it would be 
nice if it is divided into smaller files because it will make it easier 
to maintain, add or change features of pgbench.


I do not think that this large file is so big a problem (good editors 
help navigation in the code), and I'm not sure that splitting it would 
achieve much: there are not that many functions, some of them are maybe 
long (main, threadRun, doCustom) but mostly for good reasons.


I've submitted a patch to remove fork-emulation, which I think would 
really help simplify the code (maybe -10% source in pgbench.c, less 
#ifs, avoid double implementations or more-complex-than-necessary 
implementations or not-implemented features).


--
Fabien.




Re: [HACKERS] Refactoring pgbench.c

2015-06-28 Thread Tatsuo Ishii
 I do not think that this large file is so big a problem (good
 editors help navigation in the code), and I'm not sure that splitting
 it would achieve much: there are not that many functions, some of
 them are maybe long (main, threadRun, doCustom) but mostly for good
 reasons.
 
 My thoughts, exactly. I don't think just splitting the file into
 multiple pieces will achieve anything - the problem is that we've
 extended the original pgbench code in rather hackish way, without any
 major design changes, so IMHO what should be done is refactoring ...

It is kind of surprising to me that two people are against the
refactoring proposal (I understand that Fabien has pending patches for
pgbench and that could be a motivation for this, though).

Splitting a single large file into smaller files of strongly related
functions would be a benefit in itself, since people would more often
need to look only into smaller places than before. This is just the
basic idea of modular programming, and would be a long-term benefit to
the PostgreSQL project; that was my thought.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Refactoring pgbench.c

2015-06-28 Thread Tomas Vondra

On 06/28/15 18:56, Tatsuo Ishii wrote:

I do not think that this large file is so big a problem (good
editors help navigation in the code), and I'm not sure that splitting
it would achieve much: there are not that many functions, some of
them are maybe long (main, threadRun, doCustom) but mostly for good
reasons.


My thoughts, exactly. I don't think just splitting the file into
multiple pieces will achieve anything - the problem is that we've
extended the original pgbench code in rather hackish way, without any
major design changes, so IMHO what should be done is refactoring ...


It is kind of surprising to me that two people are against the
refactoring proposal (I understand that Fabien has pending patches
for pgbench and that could be a motivation for this, though).


I think that's a misunderstanding. I'm not against refactoring - not at 
all, and neither is Fabien I believe. However the first paragraph of 
your post seems to suggest that we get better code by merely splitting 
a large file into several smaller ones.


I don't think that alone helps - I think we need to refactor the 
code first, and then possibly split the file into several modules.



Splitting a single large file into smaller files of strongly related
functions would be a benefit in itself, since people would more often
need to look only into smaller places than before. This is just the
basic idea of modular programming, and would be a long-term benefit
to the PostgreSQL project; that was my thought.


I respectfully disagree. I think that for this to be true, the files 
need to be somehow logically related (which I'd expect to be the output 
of the refactoring). But maybe you have some particular refactoring and 
split in mind?


best regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Refactoring pgbench.c

2015-06-28 Thread Fabien COELHO


It is kind of surprising to me that two people are against the 
refactoring proposal (I understand that Fabien has pending patches for 
pgbench and that could be a motivation for this, though).


I think that's a misunderstanding. I'm not against refactoring - not at all, 
and neither is Fabien I believe.


Indeed I'm not, I was writing about splitting pgbench.c into several 
files. Although, as Tatsuo-san noted, I'm probably against a refactoring 
before the patches I already submitted get considered:-)


Real refactoring would mean breaking apart the long functions, but these 
are long for a reason (say, a global goto to handle errors...), so it is 
not clear to me how to do something very useful along those lines.


--
Fabien.




Re: [HACKERS] Refactoring pgbench.c

2015-06-28 Thread Jeff Janes
On Sat, Jun 27, 2015 at 5:10 PM, Tatsuo Ishii is...@postgresql.org wrote:

 Main pgbench logic consists of a single file, pgbench.c, which is 4036
 lines of code as of today. This is not a small number and I think it
 would be nice if it is divided into smaller files because it will make
 it easier to maintain, add or change features of pgbench.  I will come
 up with an idea of how to split pgbench.c later. In the meantime I
 attached a call graph of pgbench.c generated by egypt, from which we
 can get a basic idea of how to split and modularize pgbench.c.


I've never found the raw line count of a file to be a problem, at least not
at the 4000 line mark.

If some functions could be moved to existing libraries (like the functions
strtoint64 or doConnect or read_line_from_file or the statistical
distribution functions) that would be great, but I assume that would have
been done already if there was already a logical place to hold them.  I
don't think inventing new libraries just to hold these would be an
improvement.

If you provided a suggested dissection, maybe I would find it compelling.
But without seeing a concrete proposal I think the process of refactoring
is going to impede other improvements more than the result of refactoring
it will promote them.

Cheers,

Jeff


[HACKERS] Refactoring pgbench.c

2015-06-27 Thread Tatsuo Ishii
Main pgbench logic consists of a single file, pgbench.c, which is 4036
lines of code as of today. This is not a small number and I think it
would be nice if it is divided into smaller files because it will make
it easier to maintain, add or change features of pgbench.  I will come
up with an idea of how to split pgbench.c later. In the meantime I
attached a call graph of pgbench.c generated by egypt, from which we
can get a basic idea of how to split and modularize pgbench.c.

BTW, while looking at pgbench.c I noticed subtle coding problems:

1) strtoint64() should be declared as static

2) 
static
void
agg_vals_init(AggVals *aggs, instr_time start)

"static" and "void" should be on one line, rather than on 2 separate
lines, according to our coding style.
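
For reference, a minimal compilable fragment of the style being described (the types are stand-ins so the fragment is self-contained; only the declaration layout matters here):

```c
#include <string.h>

/* Stand-ins for pgbench's real types, only so this fragment compiles. */
typedef struct { long cnt; double sum; } AggVals;
typedef double instr_time;

/*
 * Per the project style being cited, the storage class and return type
 * share one line, and the function name starts the next line:
 */
static void
agg_vals_init(AggVals *aggs, instr_time start)
{
	memset(aggs, 0, sizeof(*aggs));
	(void) start;				/* body is illustrative only */
}
```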

I will commit fix for these if there's no objection.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: [HACKERS] Refactoring GUC unit conversions

2015-02-26 Thread Fujii Masao
On Fri, Feb 13, 2015 at 10:26 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 In the redesign checkpoint_segments patch, Robert suggested keeping the
 settings' base unit as number of segments, but allow conversions from MB,
 GB etc. I started looking into that and found that adding a new unit to
 guc.c is quite messy. The conversions are done with complicated
 if-switch-case constructs.

 Attached is a patch to refactor that, making the conversions table-driven.
 This doesn't change functionality, it just makes the code nicer.

Isn't it a good idea to allow even wal_keep_segments to be converted from MB, GB etc.?

Regards,

-- 
Fujii Masao




[HACKERS] Refactoring GUC unit conversions

2015-02-13 Thread Heikki Linnakangas
In the redesign checkpoint_segments patch, Robert suggested keeping 
the settings' base unit as number of segments, but allow conversions 
from MB, GB etc. I started looking into that and found that adding a new 
unit to guc.c is quite messy. The conversions are done with complicated 
if-switch-case constructs.


Attached is a patch to refactor that, making the conversions 
table-driven. This doesn't change functionality, it just makes the code 
nicer.


Any objections?

- Heikki
From a053c61c333687224d33a18a2a299c4dc2eb6bfe Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas heikki.linnakangas@iki.fi
Date: Fri, 13 Feb 2015 15:24:50 +0200
Subject: [PATCH 1/1] Refactor unit conversions code in guc.c.

Replace the if-switch-case constructs with two conversion tables,
containing all the supported conversions between human-readable unit
strings and the base units used in GUC variables. This makes the code
easier to read, and makes adding new units simpler.
---
 src/backend/utils/misc/guc.c | 425 +++
 src/include/utils/guc.h  |   2 +
 2 files changed, 188 insertions(+), 239 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9572777..59e25af 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -97,20 +97,6 @@
 #define CONFIG_EXEC_PARAMS_NEW "global/config_exec_params.new"
 #endif
 
-#define KB_PER_MB (1024)
-#define KB_PER_GB (1024*1024)
-#define KB_PER_TB (1024*1024*1024)
-
-#define MS_PER_S 1000
-#define S_PER_MIN 60
-#define MS_PER_MIN (1000 * 60)
-#define MIN_PER_H 60
-#define S_PER_H (60 * 60)
-#define MS_PER_H (1000 * 60 * 60)
-#define MIN_PER_D (60 * 24)
-#define S_PER_D (60 * 60 * 24)
-#define MS_PER_D (1000 * 60 * 60 * 24)
-
 /*
  * Precision with which REAL type guc values are to be printed for GUC
  * serialization.
@@ -666,6 +652,88 @@ const char *const config_type_names[] =
	 /* PGC_ENUM */ "enum"
 };
 
+/*
+ * Unit conversions tables.
+ *
+ * There are two tables, one for memory units, and another for time units.
+ * For each supported conversion from one unit to another, we have an entry
+ * in the conversion table.
+ *
+ * To keep things simple, and to avoid intermediate-value overflows,
+ * conversions are never chained. There needs to be a direct conversion
+ * between all units.
+ *
+ * The conversions from each base unit must be kept in order from greatest
+ * to smallest unit; convert_from_base_unit() relies on that. (The order of
+ * the base units does not matter.)
+ */
+#define MAX_UNIT_LEN		3	/* length of longest recognized unit string */
+
+typedef struct
+{
+	char	unit[MAX_UNIT_LEN + 1];	/* unit, as a string, like "kB" or "min" */
+	int		base_unit;		/* GUC_UNIT_XXX */
+	int		multiplier;		/* If positive, multiply the value with this for
+			 * unit -> base_unit conversion. If negative,
+			 * divide (with the absolute value) */
+} unit_conversion;
+
+/* Ensure that the constants in the tables don't overflow or underflow */
+#if BLCKSZ < 1024 || BLCKSZ > (1024*1024)
+#error BLCKSZ must be between 1KB and 1MB
+#endif
+#if XLOG_BLCKSZ < 1024 || XLOG_BLCKSZ > (1024*1024)
+#error XLOG_BLCKSZ must be between 1KB and 1MB
+#endif
+
+static const char *memory_units_hint =
+	gettext_noop("Valid units for this parameter are \"kB\", \"MB\", \"GB\", and \"TB\".");
+
+static const unit_conversion memory_unit_conversion_table[] =
+{
+	{ "TB",		GUC_UNIT_KB,	 	1024*1024*1024 },
+	{ "GB",		GUC_UNIT_KB,	 	1024*1024 },
+	{ "MB",		GUC_UNIT_KB,	 	1024 },
+	{ "kB",		GUC_UNIT_KB,	 	1 },
+
+	{ "TB",		GUC_UNIT_BLOCKS,	(1024*1024*1024) / (BLCKSZ / 1024) },
+	{ "GB",		GUC_UNIT_BLOCKS,	(1024*1024) / (BLCKSZ / 1024) },
+	{ "MB",		GUC_UNIT_BLOCKS,	1024 / (BLCKSZ / 1024) },
+	{ "kB",		GUC_UNIT_BLOCKS,	-(BLCKSZ / 1024) },
+
+	{ "TB",		GUC_UNIT_XBLOCKS,	(1024*1024*1024) / (XLOG_BLCKSZ / 1024) },
+	{ "GB",		GUC_UNIT_XBLOCKS,	(1024*1024) / (XLOG_BLCKSZ / 1024) },
+	{ "MB",		GUC_UNIT_XBLOCKS,	1024 / (XLOG_BLCKSZ / 1024) },
+	{ "kB",		GUC_UNIT_XBLOCKS,	-(XLOG_BLCKSZ / 1024) },
+
+	{ "" }		/* end of table marker */
+};
+
+static const char *time_units_hint =
+	gettext_noop("Valid units for this parameter are \"ms\", \"s\", \"min\", \"h\", and \"d\".");
+
+static const unit_conversion time_unit_conversion_table[] =
+{
+	{ "d",		GUC_UNIT_MS,	1000 * 60 * 60 * 24 },
+	{ "h",		GUC_UNIT_MS,	1000 * 60 * 60 },
+	{ "min",	GUC_UNIT_MS,	1000 * 60 },
+	{ "s",		GUC_UNIT_MS,	1000 },
+	{ "ms",		GUC_UNIT_MS,	1 },
+
+	{ "d",		GUC_UNIT_S,		60 * 60 * 24 },
+	{ "h",		GUC_UNIT_S,		60 * 60 },
+	{ "min",	GUC_UNIT_S,		60 },
+	{ "s",		GUC_UNIT_S,		1 },
+	{ "ms",		GUC_UNIT_S,	 	-1000 },
+
+	{ "d",		GUC_UNIT_MIN,	60 * 24 },
+	{ "h",		GUC_UNIT_MIN,	60 },
+	{ "min",	GUC_UNIT_MIN,	1 },
+	{ "s",		GUC_UNIT_MIN,	-60 },
+	{ "ms",		GUC_UNIT_MIN,	-1000 * 60 },
+
+	{ "" }		/* end of table marker */
+};
 
 /*
  * Contents of GUC tables
@@ -5018,6 +5086,85 @@ ReportGUCOption(struct config_generic * record)
 }
 
 /*
+ * Convert a value from one of the human-friendly units (kB, min etc.)
+ * to a given base unit. 'value' and 'unit' 

Re: [HACKERS] Refactoring GUC unit conversions

2015-02-13 Thread Jim Nasby

On 2/13/15 7:26 AM, Heikki Linnakangas wrote:

In the redesign checkpoint_segments patch, Robert suggested keeping
the settings' base unit as number of segments, but allow conversions
from MB, GB etc. I started looking into that and found that adding a new
unit to guc.c is quite messy. The conversions are done with complicated
if-switch-case constructs.

Attached is a patch to refactor that, making the conversions
table-driven. This doesn't change functionality, it just makes the code
nicer.


Looks good, but shouldn't there be a check for a unit that's neither 
memory or time?

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Refactoring GUC unit conversions

2015-02-13 Thread Heikki Linnakangas

On 02/13/2015 07:34 PM, Jim Nasby wrote:

On 2/13/15 7:26 AM, Heikki Linnakangas wrote:

In the redesign checkpoint_segments patch, Robert suggested keeping
the settings' base unit as number of segments, but allow conversions
from MB, GB etc. I started looking into that and found that adding a new
unit to guc.c is quite messy. The conversions are done with complicated
if-switch-case constructs.

Attached is a patch to refactor that, making the conversions
table-driven. This doesn't change functionality, it just makes the code
nicer.


Looks good, but shouldn't there be a check for a unit that's neither
memory or time?


Can you elaborate? We currently only support units for memory and time 
settings.


- Heikki





Re: [HACKERS] Refactoring GUC unit conversions

2015-02-13 Thread Jim Nasby

On 2/13/15 11:44 AM, Heikki Linnakangas wrote:

On 02/13/2015 07:34 PM, Jim Nasby wrote:

On 2/13/15 7:26 AM, Heikki Linnakangas wrote:

In the redesign checkpoint_segments patch, Robert suggested keeping
the settings' base unit as number of segments, but allow conversions
from MB, GB etc. I started looking into that and found that adding a new
unit to guc.c is quite messy. The conversions are done with complicated
if-switch-case constructs.

Attached is a patch to refactor that, making the conversions
table-driven. This doesn't change functionality, it just makes the code
nicer.


Looks good, but shouldn't there be a check for a unit that's neither
memory or time?


Can you elaborate? We currently only support units for memory and time
settings.


I'm thinking an Assert in case someone screws up the function call. But 
perhaps I'm just being paranoid.
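
For illustration, here is a standalone model of the table-driven conversion with the kind of guard being discussed (simplified to memory units with a kB base unit; all names are illustrative, not the patch's actual API):

```c
#include <stdbool.h>
#include <string.h>

/* Simplified model of one conversion-table entry. */
typedef struct
{
	const char *unit;			/* human-readable unit, e.g. "MB" */
	int			multiplier;		/* positive: multiply; negative: divide */
} unit_conversion;

static const unit_conversion kb_table[] =
{
	{"TB", 1024 * 1024 * 1024},
	{"GB", 1024 * 1024},
	{"MB", 1024},
	{"kB", 1},
	{NULL, 0}					/* end-of-table marker */
};

/*
 * Convert 'value' expressed in 'unit' into kB.  Returns false for an
 * unknown unit instead of guessing, so the caller can emit a hint.
 */
static bool
convert_to_kb(long value, const char *unit, long *result)
{
	const unit_conversion *t;

	for (t = kb_table; t->unit != NULL; t++)
	{
		if (strcmp(unit, t->unit) == 0)
		{
			if (t->multiplier > 0)
				*result = value * t->multiplier;
			else
				*result = value / -t->multiplier;
			return true;
		}
	}
	return false;
}
```

Returning false for an unrecognized unit lets the caller report the units hint instead of silently misparsing; an Assert on the base_unit would play the same role for pure programmer error.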

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




[HACKERS] Refactoring code for sync node detection (was: Support for N synchronous standby servers)

2014-09-23 Thread Michael Paquier
On Sat, Sep 20, 2014 at 1:16 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Fri, Sep 19, 2014 at 12:18 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Sep 16, 2014 at 2:19 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 - A patch refactoring code for pg_stat_get_wal_senders and
 SyncRepReleaseWaiters as there is in either case duplicated code in
 this area to select the synchronous node as the one connected with
 lowest priority

 A strong +1 for this idea.  I have never liked that, and cleaning it
 up seems eminently sensible.

 Interestingly, the syncrep code has in some of its code paths the idea
 that a synchronous node is unique, while other code paths assume that
 there can be multiple synchronous nodes. If that's fine I think that
 it would be better to just make the code multiple-sync node aware, by
 having a single function call in walsender.c and syncrep.c that
 returns an integer array of WAL sender positions (WalSndCtl), as that
 seems more extensible long-term. Well for now the array would have a
 single element, being the WAL sender with lowest priority > 0. Feel
 free to protest about that approach though :)
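The single-element-array idea above could be sketched like this (hypothetical stand-in types, not the actual WalSnd structures):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a WAL sender slot: only the fields relevant to selection. */
typedef struct
{
	bool		active;			/* pid != 0 and in streaming state */
	int			sync_priority;	/* 0 = async, lower positive = preferred */
} SenderSlot;

/*
 * Fill positions[] with the synchronous senders, lowest priority first.
 * With a single sync standby this writes at most one element, but the
 * array interface leaves room for multiple sync nodes later.
 * Returns the number of positions written.
 */
static int
get_synchronous_nodes(const SenderSlot *slots, int nslots,
					  int *positions, int max_positions)
{
	int			best = -1;
	int			best_priority = 0;
	int			i;

	for (i = 0; i < nslots; i++)
	{
		if (!slots[i].active || slots[i].sync_priority == 0)
			continue;
		if (best == -1 || slots[i].sync_priority < best_priority)
		{
			best = i;
			best_priority = slots[i].sync_priority;
		}
	}
	if (best == -1 || max_positions < 1)
		return 0;
	positions[0] = best;
	return 1;
}
```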
Please find attached a patch refactoring this code. Looking once again
at it, I have taken the approach that minimizes the footprint of the
refactoring on current code, by simply adding a function called
SyncRepGetSynchronousNode in syncrep.c that returns to the caller the
position in the WAL sender array of the node considered as synchronous,
or -1 if no synchronous node is found.

I'll add it to the next commit fest.

Regards,
-- 
Michael
From 6f66020b3c5fc0866b63bb12cf12c582be14f7d0 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Tue, 23 Sep 2014 22:57:00 +0900
Subject: [PATCH] Refactor code to detect synchronous node in WAL sender array

This patch is made to remove code duplication in walsender.c and syncrep.c
in order to detect what is the node with the lowest strictly-positive
priority, facilitating maintenance of this code.
---
 src/backend/replication/syncrep.c   | 89 ++---
 src/backend/replication/walsender.c | 34 +-
 src/include/replication/syncrep.h   |  1 +
 3 files changed, 74 insertions(+), 50 deletions(-)

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index aa54bfb..e0b1034 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -5,7 +5,7 @@
  * Synchronous replication is new as of PostgreSQL 9.1.
  *
  * If requested, transaction commits wait until their commit LSN is
- * acknowledged by the sync standby.
+ * acknowledged by the synchronous standby.
  *
  * This module contains the code for waiting and release of backends.
  * All code in this module executes on the primary. The core streaming
@@ -357,6 +357,60 @@ SyncRepInitConfig(void)
 	}
 }
 
+
+/*
+ * Obtain position of synchronous standby in the array referencing all
+ * the WAL senders, or -1 if no synchronous node can be found. The caller
+ * of this function should take a lock on SyncRepLock.
+ */
+int
+SyncRepGetSynchronousNode(void)
+{
+	int		sync_node = -1;
+	int		priority = 0;
+	int		i;
+
+	/* Scan WAL senders and find synchronous node if any */
+	for (i = 0; i < max_wal_senders; i++)
+	{
+		/* Use volatile pointer to prevent code rearrangement */
+		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
+
+		/* Process to next if not active */
+		if (walsnd->pid == 0)
+			continue;
+
+		/* Process to next if not streaming */
+		if (walsnd->state != WALSNDSTATE_STREAMING)
+			continue;
+
+		/* Process to next one if asynchronous */
+		if (walsnd->sync_standby_priority == 0)
+			continue;
+
+		/* Process to next one if priority conditions not satisfied */
+		if (priority != 0 &&
+			priority <= walsnd->sync_standby_priority)
+			continue;
+
+		/* Process to next one if flush position is invalid */
+		if (XLogRecPtrIsInvalid(walsnd->flush))
+			continue;
+
+		/*
+		 * We have a potential synchronous candidate, choose it as the
+		 * synchronous node for the time being before going through the
+		 * other nodes listed in the WAL sender array.
+		 */
+		sync_node = i;
+
+		/* Update priority to current value of WAL sender */
+		priority = walsnd->sync_standby_priority;
+	}
+
+	return sync_node;
+}
+
 /*
  * Update the LSNs on each queue based upon our latest state. This
  * implements a simple policy of first-valid-standby-releases-waiter.
@@ -369,10 +423,9 @@ SyncRepReleaseWaiters(void)
 {
 	volatile WalSndCtlData *walsndctl = WalSndCtl;
 	volatile WalSnd *syncWalSnd = NULL;
+	int			sync_node;
 	int			numwrite = 0;
 	int			numflush = 0;
-	int			priority = 0;
-	int			i;
 
 	/*
 	 * If this WALSender is serving a standby that is not on the list of
@@ -388,32 +441,14 @@ SyncRepReleaseWaiters(void)
 	/*
 	 * We're a potential sync standby. Release waiters if we are the highest
 	 * priority standby. If there are multiple standbys with same priorities
-	 * then we use the first 

Re: [HACKERS] Refactoring standby mode logic

2012-12-03 Thread Simon Riggs
On 29 November 2012 09:06, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 The code that reads the WAL from the archive, from pg_xlog, and from a
 master server via walreceiver, is quite complicated. I'm talking about the
 WaitForWALToBecomeAvailable() function in xlog.c. I got frustrated with that
 while working on the switching timeline over streaming replication patch.

 Attached is a patch to refactor that logic into a more straightforward state
 machine. It's always been a kind of a state machine, but it's been hard to
 see, as the code wasn't explicitly written that way. Any objections?

 The only user-visible effect is that this slightly changes the order that
 recovery tries to read files from the archive, and pg_xlog, in the presence
 of multiple timelines. At the moment, if recovery fails to find a file on
 current timeline in the archive, it then tries to find it in pg_xlog. If
 it's not found there either, it checks if the file on next timeline exists
 in the archive, and then checks if exists in pg_xlog. For example, if we're
 currently recovering timeline 2, and target timeline is 4, and we're looking
 for WAL file A, the files are searched for in this order:

 1. File 00000004000000000000000A in archive
 2. File 00000004000000000000000A in pg_xlog
 3. File 00000003000000000000000A in archive
 4. File 00000003000000000000000A in pg_xlog
 5. File 00000002000000000000000A in archive
 6. File 00000002000000000000000A in pg_xlog

 With this patch, the order is:

 1. File 00000004000000000000000A in archive
 2. File 00000003000000000000000A in archive
 3. File 00000002000000000000000A in archive
 4. File 00000004000000000000000A in pg_xlog
 5. File 00000003000000000000000A in pg_xlog
 6. File 00000002000000000000000A in pg_xlog

 This change should have no effect in normal restore scenarios. It'd only
 make a difference if some files in the middle of the sequence of WAL files
 are missing from the archive, but have been copied to pg_xlog manually, and
 only if that file contains a timeline switch. Even then, I think I like the
 new order better; it's easier to explain if nothing else.

Sorry, forgot to say fine by me.

This probably helps the avoidance of shutdown checkpoints, since for
that, we need to skip retrieving from archive once we're up.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Refactoring standby mode logic

2012-12-03 Thread Heikki Linnakangas

On 30.11.2012 13:11, Dimitri Fontaine wrote:

Hi,

Heikki Linnakangas hlinnakan...@vmware.com writes:

Attached is a patch to refactor that logic into a more straightforward state
machine. It's always been a kind of a state machine, but it's been hard to
see, as the code wasn't explicitly written that way. Any objections?


On a quick glance over, looks good to me. Making that code simpler to
read and reason about seems a good goal.


Thanks.


This change should have no effect in normal restore scenarios. It'd only
make a difference if some files in the middle of the sequence of WAL files
are missing from the archive, but have been copied to pg_xlog manually, and
only if that file contains a timeline switch. Even then, I think I like the
new order better; it's easier to explain if nothing else.


I'm not understanding the sequence difference well enough to comment
here, but I think some people are currently playing tricks in their
failover scripts with moving files directly to the pg_xlog of the server
to be promoted.


That's still perfectly ok. It's only if you have a diverged timeline 
history, and you have files from one timeline in the archive and files 
from another in pg_xlog that you'll see a difference. But in such a 
split situation, it's quite arbitrary which timeline recovery will 
follow anyway, I don't think anyone can sanely rely on either behavior.



Is it possible for your refactoring to keep the old sequence?


Hmm, perhaps, but I think it would complicate the logic a bit. Doesn't 
seem worth it.


Committed.

- Heikki




Re: [HACKERS] Refactoring standby mode logic

2012-12-03 Thread Dimitri Fontaine
Heikki Linnakangas hlinnakan...@vmware.com writes:
 I'm not understanding the sequence difference well enough to comment
 here, but I think some people are currently playing tricks in their
 failover scripts with moving files directly to the pg_xlog of the server
 to be promoted.

 That's still perfectly ok. It's only if you have a diverged timeline
 history, and you have files from one timeline in the archive and files from
 another in pg_xlog that you'll see a difference. But in such a split
 situation, it's quite arbitrary which timeline recovery will follow anyway,
 I don't think anyone can sanely rely on either behavior.

Fair enough.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Refactoring standby mode logic

2012-11-30 Thread Dimitri Fontaine
Hi,

Heikki Linnakangas hlinnakan...@vmware.com writes:
 Attached is a patch to refactor that logic into a more straightforward state
 machine. It's always been a kind of a state machine, but it's been hard to
 see, as the code wasn't explicitly written that way. Any objections?

On a quick glance over, looks good to me. Making that code simpler to
read and reason about seems a good goal.

 This change should have no effect in normal restore scenarios. It'd only
 make a difference if some files in the middle of the sequence of WAL files
 are missing from the archive, but have been copied to pg_xlog manually, and
 only if that file contains a timeline switch. Even then, I think I like the
 new order better; it's easier to explain if nothing else.

I'm not understanding the sequence difference well enough to comment
here, but I think some people are currently playing tricks in their
failover scripts with moving files directly to the pg_xlog of the server
to be promoted.

Is it possible for your refactoring to keep the old sequence?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support





[HACKERS] Refactoring standby mode logic

2012-11-29 Thread Heikki Linnakangas
The code that reads the WAL from the archive, from pg_xlog, and from a 
master server via walreceiver, is quite complicated. I'm talking about 
the WaitForWALToBecomeAvailable() function in xlog.c. I got frustrated 
with that while working on the switching timeline over streaming 
replication patch.


Attached is a patch to refactor that logic into a more straightforward 
state machine. It's always been a kind of a state machine, but it's been 
hard to see, as the code wasn't explicitly written that way. Any objections?


The only user-visible effect is that this slightly changes the order 
that recovery tries to read files from the archive, and pg_xlog, in the 
presence of multiple timelines. At the moment, if recovery fails to find 
a file on current timeline in the archive, it then tries to find it in 
pg_xlog. If it's not found there either, it checks if the file on next 
timeline exists in the archive, and then checks if exists in pg_xlog. 
For example, if we're currently recovering timeline 2, and target 
timeline is 4, and we're looking for WAL file A, the files are searched 
for in this order:


1. File 00000004000000000000000A in archive
2. File 00000004000000000000000A in pg_xlog
3. File 00000003000000000000000A in archive
4. File 00000003000000000000000A in pg_xlog
5. File 00000002000000000000000A in archive
6. File 00000002000000000000000A in pg_xlog

With this patch, the order is:

1. File 00000004000000000000000A in archive
2. File 00000003000000000000000A in archive
3. File 00000002000000000000000A in archive
4. File 00000004000000000000000A in pg_xlog
5. File 00000003000000000000000A in pg_xlog
6. File 00000002000000000000000A in pg_xlog

This change should have no effect in normal restore scenarios. It'd only 
make a difference if some files in the middle of the sequence of WAL 
files are missing from the archive, but have been copied to pg_xlog 
manually, and only if that file contains a timeline switch. Even then, I 
think I like the new order better; it's easier to explain if nothing else.
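The two search orders above differ only in which loop is outermost. The new order (all candidate timelines from the archive first, then all from pg_xlog) can be sketched as a small enumeration helper (hypothetical code, not part of the patch):

```c
#include <assert.h>

typedef enum
{
	SRC_ARCHIVE,				/* restored using restore_command */
	SRC_PG_XLOG					/* existing file in pg_xlog */
} WalSource;

/*
 * Enumerate (timeline, source) attempts in the new order: every candidate
 * timeline from the archive first, then every timeline from pg_xlog.
 * Timelines are tried from the target down to the current one.  The old
 * order is simply the transpose (timelines outermost, sources innermost).
 * Returns the number of pairs written.
 */
static int
search_order(int current_tli, int target_tli, int *tlis, WalSource *srcs)
{
	int			n = 0;
	int			src;
	int			tli;

	for (src = SRC_ARCHIVE; src <= SRC_PG_XLOG; src++)
	{
		for (tli = target_tli; tli >= current_tli; tli--)
		{
			tlis[n] = tli;
			srcs[n] = (WalSource) src;
			n++;
		}
	}
	return n;
}
```

With current timeline 2 and target timeline 4, this reproduces the six-step sequence listed above.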


- Heikki
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 512,523  static XLogwrtResult LogwrtResult = {0, 0};
  
  /*
   * Codes indicating where we got a WAL file from during recovery, or where
!  * to attempt to get one.  These are chosen so that they can be OR'd together
!  * in a bitmask state variable.
   */
! #define XLOG_FROM_ARCHIVE		(1<<0)	/* Restored using restore_command */
! #define XLOG_FROM_PG_XLOG		(1<<1)	/* Existing file in pg_xlog */
! #define XLOG_FROM_STREAM		(1<<2)	/* Streamed from master */
  
  /*
   * openLogFile is -1 or a kernel FD for an open log file segment.
--- 512,529 
  
  /*
   * Codes indicating where we got a WAL file from during recovery, or where
!  * to attempt to get one.
   */
! typedef enum
! {
! 	XLOG_FROM_ANY = 0,		/* request to read WAL from any source */
! 	XLOG_FROM_ARCHIVE,		/* restored using restore_command */
! 	XLOG_FROM_PG_XLOG,		/* existing file in pg_xlog */
! 	XLOG_FROM_STREAM,		/* streamed from master */
! } XLogSource;
! 
! /* human-readable names for XLogSources, for debugging output */
! static const char *xlogSourceNames[] = { "any", "archive", "pg_xlog", "stream" };
  
  /*
   * openLogFile is -1 or a kernel FD for an open log file segment.
***
*** 542,563  static XLogSegNo readSegNo = 0;
  static uint32 readOff = 0;
  static uint32 readLen = 0;
  static bool	readFileHeaderValidated = false;
! static int	readSource = 0;		/* XLOG_FROM_* code */
  
  /*
!  * Keeps track of which sources we've tried to read the current WAL
!  * record from and failed.
   */
! static int	failedSources = 0;	/* OR of XLOG_FROM_* codes */
  
  /*
   * These variables track when we last obtained some WAL data to process,
   * and where we got it from.  (XLogReceiptSource is initially the same as
   * readSource, but readSource gets reset to zero when we don't have data
!  * to process right now.)
   */
  static TimestampTz XLogReceiptTime = 0;
! static int	XLogReceiptSource = 0;		/* XLOG_FROM_* code */
  
  /* Buffer for currently read page (XLOG_BLCKSZ bytes) */
  static char *readBuf = NULL;
--- 548,575 
  static uint32 readOff = 0;
  static uint32 readLen = 0;
  static bool	readFileHeaderValidated = false;
! static XLogSource readSource = 0;		/* XLOG_FROM_* code */
  
  /*
!  * Keeps track of which source we're currently reading from. This is
!  * different from readSource in that this is always set, even when we don't
!  * currently have a WAL file open. If lastSourceFailed is set, our last
!  * attempt to read from currentSource failed, and we should try another source
!  * next.
   */
! static XLogSource currentSource = 0;	/* XLOG_FROM_* code */
! static bool	lastSourceFailed = false;
  
  /*
   * These variables track when we last obtained some WAL data to process,
   * and where we got it from.  (XLogReceiptSource is initially the same as
   * readSource, but readSource gets reset to zero 

Re: [HACKERS] Refactoring simplify_function (was: Caching constant stable expressions)

2012-04-05 Thread Robert Haas
On Fri, Mar 23, 2012 at 7:41 PM, Marti Raudsepp ma...@juffo.org wrote:
 Yeah, I'm still working on addressing the comments from your last email.

 Haven't had much time to work on it for the last 2 weeks, but I hope
 to finish most of it this weekend.

Since the updated patch seems never to have been posted, I think this
one is dead for 9.2.  I'll mark it Returned with Feedback.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Refactoring simplify_function (was: Caching constant stable expressions)

2012-03-23 Thread Tom Lane
Marti Raudsepp ma...@juffo.org writes:
 Per Tom's request, I split out this refactoring from my CacheExpr patch.

 Basically I'm just centralizing the eval_const_expressions_mutator()
 call on function arguments, from multiple different places to a single
 location. Without this, it would be a lot harder to implement argument
 caching logic in the CacheExpr patch.

I've applied a slightly-modified version of this after reconciling it
with the protransform fixes.  I assume you are going to submit a rebased
version of the main CacheExpr patch?

regards, tom lane



Re: [HACKERS] Refactoring simplify_function (was: Caching constant stable expressions)

2012-03-23 Thread Marti Raudsepp
On Sat, Mar 24, 2012 at 01:17, Tom Lane t...@sss.pgh.pa.us wrote:
 I've applied a slightly-modified version of this after reconciling it
 with the protransform fixes.

Cool, thanks!

 I assume you are going to submit a rebased
 version of the main CacheExpr patch?

Yeah, I'm still working on addressing the comments from your last email.

Haven't had much time to work on it for the last 2 weeks, but I hope
to finish most of it this weekend.

Regards,
Marti



[HACKERS] Refactoring simplify_function (was: Caching constant stable expressions)

2012-03-10 Thread Marti Raudsepp
Hi list,

Per Tom's request, I split out this refactoring from my CacheExpr patch.

Basically I'm just centralizing the eval_const_expressions_mutator()
call on function arguments, from multiple different places to a single
location. Without this, it would be a lot harder to implement argument
caching logic in the CacheExpr patch.

The old call tree goes like:
case T_FuncExpr:
-> eval_const_expressions_mutator(args)
-> simplify_function
   -> reorder_function_arguments
   -> eval_const_expressions_mutator(args)
   -> add_function_defaults
   -> eval_const_expressions_mutator(args)

New call tree:
case T_FuncExpr:
-> simplify_function
   -> simplify_copy_function_arguments
      -> reorder_function_arguments
      -> add_function_defaults
   -> eval_const_expressions_mutator(args)

Assumptions being made:
* The code didn't depend on expanding existing arguments before adding defaults
* operator calls don't need to expand default arguments -- it's
currently impossible to CREATE OPERATOR with a function that has
unspecified arguments

Other changes:
* simplify_function no longer needs a 'bool has_named_args' argument
(it finds out itself).
* I added 'bool mutate_args' in its place, since some callers need to
mutate args beforehand.
* reorder_function_arguments no longer needs to keep track of which
default args were added.
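The new shape can be illustrated with a toy model (ints standing in for expression nodes, negative values for "foldable" expressions; none of this is the actual clauses.c code):

```c
#include <assert.h>

#define MAX_ARGS 8

typedef struct
{
	int			vals[MAX_ARGS];	/* stand-in for the List of argument nodes */
	int			n;
} ArgList;

/* stand-in for reorder_function_arguments (a no-op in this toy) */
static void
reorder_args(ArgList *args)
{
	(void) args;
}

/* stand-in for add_function_defaults: pad to nexpected with a default */
static void
add_defaults(ArgList *args, int nexpected, int default_val)
{
	while (args->n < nexpected)
		args->vals[args->n++] = default_val;
}

/* stand-in for the single eval_const_expressions pass over all args */
static void
fold_args(ArgList *args)
{
	int			i;

	for (i = 0; i < args->n; i++)
		if (args->vals[i] < 0)
			args->vals[i] = -args->vals[i];		/* "evaluate" it */
}

/*
 * Centralized entry point, mirroring simplify_copy_function_arguments
 * followed by one mutate pass: reorder and default-injection happen
 * first, then every argument (defaults included) is folded exactly once.
 */
static void
simplify_arguments(ArgList *args, int nexpected, int default_val)
{
	reorder_args(args);
	add_defaults(args, nexpected, default_val);
	fold_args(args);
}
```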

Regards,
Marti
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
new file mode 100644
index cd3da46..9485e95
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*** static List *simplify_and_arguments(List
*** 109,124 
  static Node *simplify_boolean_equality(Oid opno, List *args);
  static Expr *simplify_function(Expr *oldexpr, Oid funcid,
    Oid result_type, int32 result_typmod, Oid result_collid,
!   Oid input_collid, List **args,
!   bool has_named_args,
!   bool allow_inline,
    eval_const_expressions_context *context);
  static List *reorder_function_arguments(List *args, Oid result_type,
! 		   HeapTuple func_tuple,
! 		   eval_const_expressions_context *context);
  static List *add_function_defaults(List *args, Oid result_type,
! 	  HeapTuple func_tuple,
! 	  eval_const_expressions_context *context);
  static List *fetch_function_defaults(HeapTuple func_tuple);
  static void recheck_cast_function_args(List *args, Oid result_type,
  		   HeapTuple func_tuple);
--- 109,123 
  static Node *simplify_boolean_equality(Oid opno, List *args);
  static Expr *simplify_function(Expr *oldexpr, Oid funcid,
    Oid result_type, int32 result_typmod, Oid result_collid,
!   Oid input_collid, List **args_p,
!   bool mutate_args, bool allow_inline,
    eval_const_expressions_context *context);
+ static List *simplify_copy_function_arguments(List *old_args, Oid result_type,
+ 			HeapTuple func_tuple);
  static List *reorder_function_arguments(List *args, Oid result_type,
! 		   HeapTuple func_tuple);
  static List *add_function_defaults(List *args, Oid result_type,
! 	  HeapTuple func_tuple);
  static List *fetch_function_defaults(HeapTuple func_tuple);
  static void recheck_cast_function_args(List *args, Oid result_type,
  		   HeapTuple func_tuple);
*** eval_const_expressions_mutator(Node *nod
*** 2303,2336 
  		case T_FuncExpr:
  			{
  FuncExpr   *expr = (FuncExpr *) node;
! List	   *args;
! bool		has_named_args;
  Expr	   *simple;
  FuncExpr   *newexpr;
- ListCell   *lc;
- 
- /*
-  * Reduce constants in the FuncExpr's arguments, and check to
-  * see if there are any named args.
-  */
- args = NIL;
- has_named_args = false;
- foreach(lc, expr->args)
- {
- 	Node	   *arg = (Node *) lfirst(lc);
- 
- 	arg = eval_const_expressions_mutator(arg, context);
- 	if (IsA(arg, NamedArgExpr))
- 		has_named_args = true;
- 	args = lappend(args, arg);
- }
  
  /*
   * Code for op/func reduction is pretty bulky, so split it out
   * as a separate function.	Note: exprTypmod normally returns
   * -1 for a FuncExpr, but not when the node is recognizably a
   * length coercion; we want to preserve the typmod in the
!  * eventual Const if so.
   */
  simple = simplify_function((Expr *) expr,
  		   expr->funcid,
--- 2302,2318 
  		case T_FuncExpr:
  			{
  FuncExpr   *expr = (FuncExpr *) node;
! List	   *args = expr->args;
  Expr	   *simple;
  FuncExpr   *newexpr;
  
  /*
   * Code for op/func reduction is pretty bulky, so split it out
   * as a separate function.	Note: exprTypmod normally returns
   * -1 for a FuncExpr, but not when the node is recognizably a
   * length coercion; we want to preserve the typmod in the
!  * eventual Const if so. This function also mutates the
!  * argument list.
   */
  simple = simplify_function((Expr *) expr,
  		   expr->funcid,

Re: [HACKERS] Refactoring log_newpage

2012-02-02 Thread Simon Riggs
On Thu, Feb 2, 2012 at 7:26 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:

 Well, you can obviously check the catalogs for that, but you must be
 assuming that you don't have access to the catalogs or this would be a
 non-issue.

 You can also identify the kind of page by looking at the special area of the
 stored page. See:
 http://archives.postgresql.org/pgsql-hackers/2007-04/msg00392.php

How does that work with different forks?

I think its very ugly to mark all sorts of different pages as if they
were heap pages when they clearly aren't. I don't recall anything so
ugly being allowed anywhere else in the system. Why is it *needed*
here?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Refactoring log_newpage

2012-02-02 Thread Heikki Linnakangas

On 02.02.2012 11:35, Simon Riggs wrote:

On Thu, Feb 2, 2012 at 7:26 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:


Well, you can obviously check the catalogs for that, but you must be
assuming that you don't have access to the catalogs or this would be a
non-issue.

You can also identify the kind of page by looking at the special area of the
stored page. See:
http://archives.postgresql.org/pgsql-hackers/2007-04/msg00392.php


How does that work with different forks?


You have the fork number explicitly in the newpage record already.


I think its very ugly to mark all sorts of different pages as if they
were heap pages when they clearly aren't. I don't recall anything so
ugly being allowed anywhere else in the system. Why is it *needed*
here?


It's not needed. Beauty is in the eye of the beholder, but I don't find 
it all that ugly, and the comment in log_newpage explains it well.


I don't see much value in adding a new field to the record. Any tools 
that read the WAL format will need to be taught about that change. Not a 
huge issue, but I also don't see much gain. On the whole I'd be inclined 
to just leave it all alone, but whatever.


I don't think it's a good idea to rename XLOG_HEAP_NEWPAGE to 
XLOG_NEWPAGE. The WAL record is still part of the heapam rmgr, even if 
it's used by other access methods, too.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


