Re: [HACKERS] WIP: Covering + unique indexes.

2017-11-12 Thread Andrey Borodin
Hi!
+1 for pushing this. I'm really looking forward to seeing this in 11.

> On 31 Oct 2017, at 13:21, Anastasia Lubennikova wrote:
> 
> Updated version is attached. It applies to the commit 
> e4fbf22831c2bbcf032ee60a327b871d2364b3f5.
> The first patch contains changes in general index routines
> and the second one contains btree specific changes.
> 
> This version contains fixes for the issues mentioned in the thread above and 
> passes all existing tests.
> But it still requires review and testing, because the merge was quite tricky.
> I especially worry about integration with partitioning. I'll add some more 
> tests in the next message.

I did benchmark tests a year ago, so I'll skip that part in this review.

I've done some stress tests with pgbench, replication, etc. Everything was fine 
until I plugged in amcheck.
If I create a cluster with this [0] and then 
./pgbench -i -s 50

create index on pgbench_accounts (abalance) include (bid,aid,filler); 
create extension amcheck;

--and finally 
SELECT bt_index_check(c.oid), c.relname, c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree' AND n.nspname = 'public'
AND c.relpersistence != 't'
AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC LIMIT 100;
--just copy-pasted from the amcheck docs with minor corrections

Postgres crashes: 
TRAP: FailedAssertion("!(((const void*)() != ((void*)0)) && 
(scankey->sk_attno) > 0)", File: "nbtsearch.c", Line: 466)

Maybe I'm doing something wrong, or will amcheck support come in a different 
patch?

A few minor nitpicks:
0. PgAdmin fails to understand what is going on [1]. It is clearly a problem 
with PgAdmin; pg_dump works as expected.
1. ISTM index_truncate_tuple() can be optimized: we only need to reset the 
tuple length and infomask. But this should not be a hot path anyway, so I 
propose ignoring this for the current version.
2. I've run a Grammarly check :) This comma seems redundant [2]
I don't think any of these items requires fixing.

Thanks for working on this, I believe it is important.

Best regards, Andrey Borodin.

[0] https://github.com/x4m/pgscripts/blob/master/install.sh
[1] https://yadi.sk/i/ro9YKFqo3PcwFT
[2] 
https://github.com/x4m/postgres_g/commit/657c28952d923d8c150e6cabb3bdcbbc44a641b6?diff=unified#diff-640baf2937029728a8d51cccd554c2eeR1291

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2017-04-08 Thread David Steele
On 4/4/17 2:47 PM, Peter Geoghegan wrote:
> 
> At the very least, you should change comments to note the issue. I
> think it's highly unlikely that this could ever result in a failure to
> find a split point, which there are many defenses against already, but
> I think I would find that difficult to prove. The intent of the code
> is almost as important as the code, at least in my opinion.

This submission has been Returned with Feedback.  Please feel free to
resubmit to a future commitfest.

-- 
-David
da...@pgmasters.net




Re: [HACKERS] WIP: Covering + unique indexes.

2017-04-04 Thread Peter Geoghegan
On Tue, Apr 4, 2017 at 3:07 AM, Anastasia Lubennikova
 wrote:
>> * I think that we should store this (the number of attributes), and
>> use it directly when comparing, per my remarks to Tom over on that
>> other thread. We should also use the free bit within
>> IndexTupleData.t_info, to indicate that the IndexTuple was truncated,
>> just to make it clear to everyone that might care that that's how
>> these truncated IndexTuples need to be represented.
>>
>> Doing this would have no real impact on your patch, because for you
>> this will be 100% redundant. It will help external tools, and perhaps
>> another, more general suffix truncation patch that comes in the
>> future. We should try very hard to have a future-proof on-disk
>> representation. I think that this is quite possible.
>
> To be honest, I think that it'll make the patch overcomplicated, because this
> exact patch has nothing to do with suffix truncation.
> However, we can add any necessary flags if this work is continued in
> the future.

Yes, doing things that way would mean adding a bit more complexity to
your patch, but IMV would be worth it to have the on-disk format be
compatible with what a full suffix truncation patch will eventually
require.

Obviously I disagree with what you say here -- I think that your patch
*does* have plenty in common with suffix truncation. But, you don't
have to even agree with me on that to see why what I propose is still
a good idea. Tom Lane had a specific objection to this patch --
catalog metadata is currently necessary to interpret internal page
IndexTuples [1]. However, by storing the true number of columns in the
case of truncated tuples, we can make the situation with IndexTuples
similar enough to the existing situation with heap tuples, where the
number of attributes is available right in the header as "natts". We
don't have to rely on something like catalog metadata from a great
distance, where some caller may forget to pass through the metadata to
a lower level.

So, presumably doing things this way addresses Tom's exact objection
to the truncation aspect of this patch [2]. We have the capacity to
store something like natts "for free" -- let's use it. The lack of any
direct source of metadata was called "dangerous". As much as anything
else, I want to remove any danger.

> There is already an assertion.
> Assert(IndexTupleSize(newitup) <= IndexTupleSize(olditup));
> Do you think it is not enough?

I think that a "can't happen" check will work better in the future,
when user defined code could be involved in truncation. Any extra
overhead will be paid relatively infrequently, and will be very low.

>> * I see a small bug. You forgot to teach _bt_findsplitloc() about
>> truncation. It does this currently, which you did not update:
>>
>>  /*
>>   * The first item on the right page becomes the high key of the left
>> page;
>>   * therefore it counts against left space as well as right space.
>>   */
>>  leftfree -= firstrightitemsz;
>>
>> I think that this accounting needs to be fixed.
>
> Could you explain what's wrong with this accounting? We may expect to reserve
> more space on the left page than will actually be taken after high-key
> truncation. But I don't see any problem here.

Obviously it would at least be slightly better to have the actual
truncated high key size where that's expected -- not the would-be
untruncated high key size. The code as it stands might lead to a bad
choice of split point in edge-cases.

At the very least, you should change comments to note the issue. I
think it's highly unlikely that this could ever result in a failure to
find a split point, which there are many defenses against already, but
I think I would find that difficult to prove. The intent of the code
is almost as important as the code, at least in my opinion.

[1] 
postgr.es/m/CAH2-Wz=vmdh8pfazx9wah9bn5ast5vrna0xsz+gsfrs12bp...@mail.gmail.com
[2] postgr.es/m/11895.1490983884%40sss.pgh.pa.us
-- 
Peter Geoghegan




Re: [HACKERS] WIP: Covering + unique indexes.

2017-04-04 Thread Anastasia Lubennikova

01.04.2017 02:31, Peter Geoghegan:


* index_truncate_tuple() should have as an argument the number of
attributes. No need to "#include utils/rel.h" that way.

Will fix.


* I think that we should store this (the number of attributes), and
use it directly when comparing, per my remarks to Tom over on that
other thread. We should also use the free bit within
IndexTupleData.t_info, to indicate that the IndexTuple was truncated,
just to make it clear to everyone that might care that that's how
these truncated IndexTuples need to be represented.

Doing this would have no real impact on your patch, because for you
this will be 100% redundant. It will help external tools, and perhaps
another, more general suffix truncation patch that comes in the
future. We should try very hard to have a future-proof on-disk
representation. I think that this is quite possible.
To be honest, I think that it'll make the patch overcomplicated, because 
this exact patch has nothing to do with suffix truncation.
However, we can add any necessary flags if this work is continued 
in the future.

* I suggest adding a "can't happen" defensive check + error that
checks that the tuple returned by index_truncate_tuple() is sized <=
the original. This cannot be allowed to ever happen. (Not that I think
it will.)

There is already an assertion.
Assert(IndexTupleSize(newitup) <= IndexTupleSize(olditup));
Do you think it is not enough?

* I see a small bug. You forgot to teach _bt_findsplitloc() about
truncation. It does this currently, which you did not update:

 /*
  * The first item on the right page becomes the high key of the left page;
  * therefore it counts against left space as well as right space.
  */
 leftfree -= firstrightitemsz;

I think that this accounting needs to be fixed.
Could you explain what's wrong with this accounting? We may expect to 
reserve more space on the left page than will actually be taken after 
high-key truncation. But I don't see any problem here.



* Not sure about one thing. What's the reason for this change?


-   /* Log left page */
-   if (!isleaf)
-   {
-   /*
-* We must also log the left page's high key, because the right
-* page's leftmost key is suppressed on non-leaf levels.  Show it
-* as belonging to the left page buffer, so that it is not stored
-* if XLogInsert decides it needs a full-page image of the left
-* page.
-*/
-   itemid = PageGetItemId(origpage, P_HIKEY);
-   item = (IndexTuple) PageGetItem(origpage, itemid);
-   XLogRegisterBufData(0, (char *) item, 
MAXALIGN(IndexTupleSize(item)));
-   }
+   /*
+* We must also log the left page's high key, because the right
+* page's leftmost key is suppressed on non-leaf levels.  Show it
+* as belonging to the left page buffer, so that it is not stored
+* if XLogInsert decides it needs a full-page image of the left
+* page.
+*/
+   itemid = PageGetItemId(origpage, P_HIKEY);
+   item = (IndexTuple) PageGetItem(origpage, itemid);
+   XLogRegisterBufData(0, (char *) item, MAXALIGN(IndexTupleSize(item)));

Is this related to the problem that you mentioned to me that you'd
fixed when we spoke in person earlier today? You said something about
WAL logging, but I don't recall any details. I don't remember seeing
this change in prior versions.

Anyway, whatever the reason for doing this on the leaf level now, the
comments should be updated to explain it.

This change is related to the bug described in this message:
https://www.postgresql.org/message-id/20170330192706.GA2565%40e733.localdomain
After the fix it is no longer reproducible. I will update the comments in the next patch.

* Speaking of WAL-logging, I think that there is another bug in
btree_xlog_split(). You didn't change this existing code at all:

 /*
  * On leaf level, the high key of the left page is equal to the first key
  * on the right page.
  */
 if (isleaf)
 {
 ItemId  hiItemId = PageGetItemId(rpage, P_FIRSTDATAKEY(ropaque));

 left_hikey = PageGetItem(rpage, hiItemId);
 left_hikeysz = ItemIdGetLength(hiItemId);
 }

It seems like this was missed when you changed WAL-logging, since you
do something for this on the logging side, but not here, on the replay
side. No?


I changed it. Now we always use the high key saved in the xlog.
This code isn't needed anymore and can be deleted. Thank you for the 
notice. I will send an updated patch today.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] WIP: Covering + unique indexes.

2017-04-02 Thread Peter Geoghegan
On Thu, Mar 30, 2017 at 8:26 AM, Teodor Sigaev  wrote:
> Any objection from reviewers to push both patches?

I object.

Unfortunately, it seems very unlikely that we'll be able to get the
patch into shape in the allotted time before feature-freeze, even with
the 1 week extension.

-- 
Peter Geoghegan




Re: [HACKERS] WIP: Covering + unique indexes.

2017-04-02 Thread Peter Geoghegan
On Fri, Mar 31, 2017 at 4:31 PM, Peter Geoghegan  wrote:
> That's all I have for now. Maybe I can look again later, or tomorrow.

I took another look, this time at code used during CREATE INDEX. More feedback:

* I see no reason to expose _bt_pgaddtup() (to modify it to not be
static, so it can be called during CREATE INDEX for truncated high
key). You could call PageAddItem() directly, just as _bt_pgaddtup()
itself does, and lose nothing. This is the case because the special
steps within _bt_pgaddtup() are only when inserting the first real
item (and only on an internal page). You're only ever using
_bt_pgaddtup() for the high key offset. Would a raw PageAddItem() call
lose anything?

I think I see why you've done this -- the existing CREATE INDEX
_bt_sortaddtup() routine (which is very similar to _bt_pgaddtup(), a
routine used for *insertion*) doesn't do the correct thing were you to
use it, because it assumes that the page is always rightmost (i.e.,
always has no high key yet).

The reason _bt_sortaddtup() exists is explained here:

 * This is almost like nbtinsert.c's _bt_pgaddtup(), but we can't use
 * that because it assumes that P_RIGHTMOST() will return the correct
 * answer for the page.  Here, we don't know yet if the page will be
 * rightmost.  Offset P_FIRSTKEY is always the first data key.
 */
static void
_bt_sortaddtup(Page page,
   Size itemsize,
   IndexTuple itup,
   OffsetNumber itup_off)
{
...
}

(...thinks some more...)

So, this difference only matters when you have a non-leaf item, which
is never subject to truncation in your patch. So, in fact, it doesn't
matter at all. I guess you should just use _bt_pgaddtup() after all,
rather than bothering with a raw PageAddItem(), even. But, don't
forget to note why this is okay above _bt_sortaddtup().

* Calling PageIndexTupleDelete() within _bt_buildadd(), which
memmove()s all other items on the leaf page, seems wasteful in the
context of CREATE INDEX. Can we do better?

* I also think that calling PageIndexTupleDelete() has a page space
accounting bug, because the following thing happens a second time for
highkey ItemId when new code does this call:

phdr->pd_lower -= sizeof(ItemIdData);

(The first time this happens is within _bt_buildadd() itself, just
before your patch calls PageIndexTupleDelete().)

* I don't think it's okay to let index_truncate_tuple() leak memory
within _bt_buildadd(). It's probably okay for nbtinsert.c callers to
index_truncate_tuple() to not be too careful, though, since those
calls occur in a per-tuple memory context. The same cannot be said for
_bt_buildadd()/CREATE INDEX calls.

* Speaking of memory management: is this really needed?

> @@ -554,7 +580,11 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, 
> IndexTuple itup)
>  * Save a copy of the minimum key for the new page.  We have to copy
>  * it off the old page, not the new one, in case we are not at leaf
>  * level.
> +* Although oitup is already initialized, it's important to get the high
> +* key from the page, since we could have replaced it with a truncated
> +* copy. See comment above.
>  */
> +   oitup = (IndexTuple) PageGetItem(opage,PageGetItemId(opage, P_HIKEY));
> state->btps_minkey = CopyIndexTuple(oitup);

You didn't modify/truncate oitup in-place -- you effectively made a
(truncated) copy by calling index_truncate_tuple(). Maybe you can
manage the memory by assigning keytup to state->btps_minkey, in place
of a CopyIndexTuple(), just for the truncation case?

I haven't studied this in enough detail to be sure that that would be
correct, but it seems clear that a better strategy is needed for
managing memory within _bt_buildadd().

-- 
Peter Geoghegan




Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-31 Thread Peter Geoghegan
I had a quick look at this on the flight back from PGConf.US.

On Fri, Mar 31, 2017 at 10:40 AM, Anastasia Lubennikova
 wrote:
> But I haven't considered the possibility of index_form_tuple() failure.
> Fixed in this version of the patch. Now it creates a copy of tupledesc to
> pass it to index_form_tuple.

I think that we need to be 100% sure that index_truncate_tuple() will
not generate an IndexTuple that is larger than the original.
Otherwise, you could violate the "1/3 of page size exceeded" thing. We
need to catch that when the user actually inserts an oversized value.
After that, it's always too late. (See my remarks to Tom on other
thread about this, too.)

> We'd discussed it with other reviewers; they suggested index_truncate_tuple()
> instead of index_reform_tuple().
> I think that this name reflects the essence of the function clearly enough and
> I don't feel like renaming it again.

+1.

Feedback so far:

* index_truncate_tuple() should have as an argument the number of
attributes. No need to "#include utils/rel.h" that way.

* I think that we should store this (the number of attributes), and
use it directly when comparing, per my remarks to Tom over on that
other thread. We should also use the free bit within
IndexTupleData.t_info, to indicate that the IndexTuple was truncated,
just to make it clear to everyone that might care that that's how
these truncated IndexTuples need to be represented.

Doing this would have no real impact on your patch, because for you
this will be 100% redundant. It will help external tools, and perhaps
another, more general suffix truncation patch that comes in the
future. We should try very hard to have a future-proof on-disk
representation. I think that this is quite possible.

* I suggest adding a "can't happen" defensive check + error that
checks that the tuple returned by index_truncate_tuple() is sized <=
the original. This cannot be allowed to ever happen. (Not that I think
it will.)

* I see a small bug. You forgot to teach _bt_findsplitloc() about
truncation. It does this currently, which you did not update:

/*
 * The first item on the right page becomes the high key of the left page;
 * therefore it counts against left space as well as right space.
 */
leftfree -= firstrightitemsz;

I think that this accounting needs to be fixed.

* Not sure about one thing. What's the reason for this change?

> -   /* Log left page */
> -   if (!isleaf)
> -   {
> -   /*
> -* We must also log the left page's high key, because the right
> -* page's leftmost key is suppressed on non-leaf levels.  Show it
> -* as belonging to the left page buffer, so that it is not stored
> -* if XLogInsert decides it needs a full-page image of the left
> -* page.
> -*/
> -   itemid = PageGetItemId(origpage, P_HIKEY);
> -   item = (IndexTuple) PageGetItem(origpage, itemid);
> -   XLogRegisterBufData(0, (char *) item, 
> MAXALIGN(IndexTupleSize(item)));
> -   }
> +   /*
> +* We must also log the left page's high key, because the right
> +* page's leftmost key is suppressed on non-leaf levels.  Show it
> +* as belonging to the left page buffer, so that it is not stored
> +* if XLogInsert decides it needs a full-page image of the left
> +* page.
> +*/
> +   itemid = PageGetItemId(origpage, P_HIKEY);
> +   item = (IndexTuple) PageGetItem(origpage, itemid);
> +   XLogRegisterBufData(0, (char *) item, MAXALIGN(IndexTupleSize(item)));

Is this related to the problem that you mentioned to me that you'd
fixed when we spoke in person earlier today? You said something about
WAL logging, but I don't recall any details. I don't remember seeing
this change in prior versions.

Anyway, whatever the reason for doing this on the leaf level now, the
comments should be updated to explain it.

* Speaking of WAL-logging, I think that there is another bug in
btree_xlog_split(). You didn't change this existing code at all:

/*
 * On leaf level, the high key of the left page is equal to the first key
 * on the right page.
 */
if (isleaf)
{
ItemId  hiItemId = PageGetItemId(rpage, P_FIRSTDATAKEY(ropaque));

left_hikey = PageGetItem(rpage, hiItemId);
left_hikeysz = ItemIdGetLength(hiItemId);
}

It seems like this was missed when you changed WAL-logging, since you
do something for this on the logging side, but not here, on the replay
side. No?

That's all I have for now. Maybe I can look again later, or tomorrow.

-- 
Peter Geoghegan




Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-31 Thread Anastasia Lubennikova

31.03.2017 20:57, Robert Haas:

On Fri, Mar 31, 2017 at 1:40 PM, Anastasia Lubennikova
 wrote:

First of all, I want to thank you and Robert for reviewing this patch.
Your expertise in postgres subsystems is really necessary for features like
this.
I just wonder why you don't share your thoughts and doubts until the "last
call".

I haven't done any significant technical work other than review
patches in 14 months, and in the last several months I've often worked
10 and 12 hour days to get more review done.

I think at one level you've got a fair complaint here - it's hard to
get things committed, and this patch probably didn't get as much
attention as it deserved.  It's not so easy to know how to fix that.
I'm pretty sure "tell Andres and Robert to work harder" isn't it.



*off-topic*
No complaints from me. I understand how difficult reviewing is, and I 
highly appreciate your work.

The problem is that not all developers are qualified enough to do a review.
I've tried to create a course about postgres internals, something like 
"Deep dive into the postgres codebase for hackers",
and it turned out to be really helpful for new developers. So I wonder 
whether we could write some tips for new reviewers and testers as well.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-31 Thread Anastasia Lubennikova

31.03.2017 20:47, Andres Freund:

Maybe it would be better to modify index_form_tuple() to accept a new
argument with a number of attributes, then you can just Assert that
this number is never higher than the number of attributes in the
TupleDesc.

Good point.
I agree that this function is a bit strange. I have to set
tupdesc->nattrs to support compatibility with index_form_tuple().
I didn't want to add either a new field to tupledesc or a new
parameter to index_form_tuple(), because they are used widely.


But I haven't considered the possibility of index_form_tuple() failure.
Fixed in this version of the patch. Now it creates a copy of tupledesc to
pass it to index_form_tuple.

That seems like it'd actually be a noticeable increase in memory
allocator overhead.  I think we should just add (as just proposed in a
separate thread) an _extended version of it that allows specifying the
number of columns.


The function is not called that often: only once per page split for 
indexes with included columns.
That doesn't look like dramatic overhead, so I decided that a wrapper 
function would be more appropriate than refactoring all 
index_form_tuple() calls.

But index_form_tuple_extended() looks like a better solution.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-31 Thread Robert Haas
On Fri, Mar 31, 2017 at 1:40 PM, Anastasia Lubennikova
 wrote:
> First of all, I want to thank you and Robert for reviewing this patch.
> Your expertise in postgres subsystems is really necessary for features like
> this.
> I just wonder why you don't share your thoughts and doubts until the "last
> call".

I haven't done any significant technical work other than review
patches in 14 months, and in the last several months I've often worked
10 and 12 hour days to get more review done.

I think at one level you've got a fair complaint here - it's hard to
get things committed, and this patch probably didn't get as much
attention as it deserved.  It's not so easy to know how to fix that.
I'm pretty sure "tell Andres and Robert to work harder" isn't it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-31 Thread Andres Freund
On 2017-03-31 20:40:59 +0300, Anastasia Lubennikova wrote:
> 30.03.2017 22:11, Andres Freund
> > Any objection from reviewers to push both patches?
> 
> First of all, I want to thank you and Robert for reviewing this patch.
> Your expertise in postgres subsystems is really necessary for features like
> this.
> I just wonder why you don't share your thoughts and doubts until the "last
> call".

Because there are a lot of other patches?  I only looked because Teodor
announced he was thinking about committing - I just don't have the
energy to look at all patches before they're ready to commit.
Unfortunately "ready-for-committer" is very frequently not actually that
:(


> > Maybe it would be better to modify index_form_tuple() to accept a new
> > argument with a number of attributes, then you can just Assert that
> > this number is never higher than the number of attributes in the
> > TupleDesc.
> Good point.
> I agree that this function is a bit strange. I have to set
> tupdesc->nattrs to support compatibility with index_form_tuple().
> I didn't want to add either a new field to tupledesc or a new
> parameter to index_form_tuple(), because they are used widely.
> 
> 
> But I haven't considered the possibility of index_form_tuple() failure.
> Fixed in this version of the patch. Now it creates a copy of tupledesc to
> pass it to index_form_tuple.

That seems like it'd actually be a noticeable increase in memory
allocator overhead.  I think we should just add (as just proposed in a
separate thread) an _extended version of it that allows specifying the
number of columns.

- Andres




Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-31 Thread Robert Haas
On Thu, Mar 30, 2017 at 5:22 PM, Anastasia Lubennikova
 wrote:
> Well,
> I don't know how we can estimate the quality of the review or testing.
> The patch was reviewed by many people.
> Here are those who marked themselves as reviewers on this and previous
> commitfests: Stephen Frost (sfrost), Andrew Dunstan (adunstan), Aleksander
> Alekseev (a.alekseev), Amit Kapila (amitkapila), Andrey Borodin (x4m), Peter
> Geoghegan (pgeoghegan), David Rowley (davidrowley).

Sure, but the amount of in-depth review seems to have been limited.
Just because somebody put their name down in the CommitFest
application doesn't mean that they did a detailed review of all the
code.

>> It seems highly surprising to me that CheckIndexCompatible() only gets
>> a one line change in this patch.  That seems unlikely to be correct.
>
> What makes you think so? CheckIndexCompatible() only cares about possible
> opclass changes.
> For covering indexes, opclasses are only applicable to indnkeyatts, and that
> is exactly what was changed in this patch.
> Do you think it needs some other changes?

Probably.  I mean, for an INCLUDE column, it wouldn't matter if a
collation or opclass change happened, but if the base data type had
changed, you'd still need to rebuild the index.  So presumably
CheckIndexCompatible() ought to be comparing some things, but not
everything, for INCLUDE columns.

>> Has anybody tested this patch with amcheck?  Does it break amcheck?
>
> Yes, it breaks amcheck. Amcheck should be patched in order to work with
> covering indexes.
> We've discussed it with Peter before, and I even wrote a small patch.
> I'll attach it in a follow-up message.

Great.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Anastasia Lubennikova

30.03.2017 19:49, Robert Haas:

On Thu, Mar 30, 2017 at 11:26 AM, Teodor Sigaev  wrote:

I had a look at the patch and played with it; it seems fine. I split it
into two patches: core changes (+ bloom index fix) and btree itself. All
docs are left in the first patch - I'm too lazy to rewrite documentation which
is changed in the second patch.
Any objection from reviewers to push both patches?

Has this really had enough review and testing?  The last time it was
pushed, it didn't go too well.  And laziness is not a very good excuse
for not dividing up patches properly.


Well,
I don't know how we can estimate the quality of the review or testing.
The patch was reviewed by many people.
Here are those who marked themselves as reviewers on this and previous 
commitfests: Stephen Frost (sfrost), Andrew Dunstan (adunstan), 
Aleksander Alekseev (a.alekseev), Amit Kapila (amitkapila), Andrey 
Borodin (x4m), Peter Geoghegan (pgeoghegan), David Rowley (davidrowley).


For me it looks serious enough. These people, as well as many others, 
shared their thoughts on this topic and pointed out various mistakes.
I fixed all the issues as soon as I could. And I'm not going to 
disappear when it will be committed. Personally, I always thought that 
we have Alpha and Beta releases for integration testing.


Speaking of the feature itself, it has been included in our fork of 
PostgreSQL 9.6 since it was released.
And as far as I know, there have been no complaints from users, which 
makes me believe that there are no critical bugs there.

There may be conflicts with some other features of v10.0, though.


It seems highly surprising to me that CheckIndexCompatible() only gets
a one line change in this patch.  That seems unlikely to be correct.
What makes you think so? CheckIndexCompatible() only cares about 
possible opclass changes.
For covering indexes, opclasses are only applicable to indnkeyatts, and 
that is exactly what was changed in this patch.

Do you think it needs some other changes?


Has anybody done some testing of this patch with the WAL consistency
checker?  Like, create some tables with indexes that have INCLUDE
columns, set up a standby, enable consistency checking, pound the
master, and see if the standby bails?
Good point. I missed this feature; I wish someone had mentioned this issue 
a bit earlier.

And as Aleksander's test shows, there is indeed a problem with my patch.
I'll fix it and send an updated patch.


Has anybody tested this patch with amcheck?  Does it break amcheck?
Yes, it breaks amcheck. Amcheck has to be patched in order to work with 
covering indexes.

Peter and I discussed this before, and I even wrote a small patch.
I'll attach it in the following message.


A few minor comments:

-foreach(lc, constraint->keys)
+else foreach(lc, constraint->keys)

That doesn't look like a reasonable way of formatting the code.

+/* Here is some code duplication. But we do need it. */

That is not a very informative comment.

+* NOTE It is not crutial for reliability in present,

Spelling, punctuation.



Will be fixed as well.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Aleksander Alekseev
Hi Robert,

> Has anybody done some testing of this patch with the WAL consistency
> checker?  Like, create some tables with indexes that have INCLUDE
> columns, set up a standby, enable consistency checking, pound the
> master, and see if the standby bails?

I've decided to run such a test. It looks like there is a bug indeed.

Steps to reproduce:

0. Apply a patch.
1. Build PostgreSQL using quick-build.sh [1]
2. Install master and replica using install.sh [2]
3. Download test.sql [3]
4. Run: `cat test.sql | psql`
5. In replica's logfile:

```
FATAL:  inconsistent page found, rel 1663/16384/16396, forknum 0, blkno 1
```
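For reference, the consistency checker used here is enabled through the
wal_consistency_checking GUC on the primary. A minimal postgresql.conf
fragment (the 'btree' value targets just the resource manager this patch
exercises; 'all' also works but is slower):

```
# postgresql.conf on the primary (PostgreSQL 10+, reloadable via SIGHUP)
wal_consistency_checking = 'btree'
```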

> Has anybody tested this patch with amcheck?  Does it break amcheck?

Amcheck doesn't complain.

[1] https://github.com/afiskon/pgscripts/blob/master/quick-build.sh
[2] https://github.com/afiskon/pgscripts/blob/master/install.sh
[3] http://afiskon.ru/s/88/93c544e6cf_test.sql

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Andres Freund
On 2017-03-30 18:26:05 +0300, Teodor Sigaev wrote:
> Any objection from reviewers to push both patches?


> diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
> index f2eda67..59029b9 100644
> --- a/contrib/bloom/blutils.c
> +++ b/contrib/bloom/blutils.c
> @@ -120,6 +120,7 @@ blhandler(PG_FUNCTION_ARGS)
>   amroutine->amclusterable = false;
>   amroutine->ampredlocks = false;
>   amroutine->amcanparallel = false;
> + amroutine->amcaninclude = false;

That name doesn't strike me as very descriptive.


> +  INCLUDE
> +  
> +   
> +An optional INCLUDE clause allows a list of columns to be
> +specified which will be included in the non-key portion of the index.
> +Columns which are part of this clause cannot also exist in the
> +key columns portion of the index, and vice versa. The
> +INCLUDE columns exist solely to allow more queries to 
> benefit
> +from index-only scans by including certain columns in 
> the
> +index, the value of which would otherwise have to be obtained by 
> reading
> +the table's heap. Having these columns in the INCLUDE 
> clause
> +in some cases allows PostgreSQL to skip the heap read
> +completely. This also allows UNIQUE indexes to be 
> defined on
> +one set of columns, which can include another set of columns in the
> +   INCLUDE clause, on which the uniqueness is not enforced.
> +It's the same with other constraints (PRIMARY KEY and EXCLUDE). This 
> can
> +also can be used for non-unique indexes as any columns which are not 
> required
> +for the searching or ordering of records can be used in the
> +INCLUDE clause, which can slightly reduce the size of 
> the index.
> +Currently, only the B-tree access method supports this feature.
> +Expressions as included columns are not supported since they cannot 
> be used
> +in index-only scans.
> +   
> +  
> + 

This could use some polishing.


> +/*
> + * Reform index tuple. Truncate nonkey (INCLUDE) attributes.
> + */
> +IndexTuple
> +index_truncate_tuple(Relation idxrel, IndexTuple olditup)
> +{
> + TupleDesc   itupdesc = RelationGetDescr(idxrel);
> + Datum   values[INDEX_MAX_KEYS];
> + boolisnull[INDEX_MAX_KEYS];
> + IndexTuple  newitup;
> + int indnatts = IndexRelationGetNumberOfAttributes(idxrel);
> + int indnkeyatts = IndexRelationGetNumberOfKeyAttributes(idxrel);
> +
> + Assert(indnatts <= INDEX_MAX_KEYS);
> + Assert(indnkeyatts > 0);
> + Assert(indnkeyatts < indnatts);
> +
> + index_deform_tuple(olditup, itupdesc, values, isnull);
> +
> + /* form new tuple that will contain only key attributes */
> + itupdesc->natts = indnkeyatts;
> + newitup = index_form_tuple(itupdesc, values, isnull);
> + newitup->t_tid = olditup->t_tid;
> +
> + itupdesc->natts = indnatts;

Uh, isn't this a *seriously* bad idea?  If index_form_tuple errors out,
this'll corrupt the tuple descriptor.


Maybe also rename the function to index_build_key_tuple()?

>   * Construct a string describing the contents of an index entry, in the
>   * form "(key_name, ...)=(key_value, ...)".  This is currently used
> - * for building unique-constraint and exclusion-constraint error messages.
> + * for building unique-constraint and exclusion-constraint error messages,
> + * so only key columns of index are checked and printed.

s/index/the index/


> @@ -368,7 +370,7 @@ systable_beginscan(Relation heapRelation,
>   {
>   int j;
>  
> - for (j = 0; j < irel->rd_index->indnatts; j++)
> + for (j = 0; j < 
> IndexRelationGetNumberOfAttributes(irel); j++)

>   {
>   if (key[i].sk_attno == 
> irel->rd_index->indkey.values[j])
>   {
> @@ -376,7 +378,7 @@ systable_beginscan(Relation heapRelation,
>   break;
>   }
>   }
> - if (j == irel->rd_index->indnatts)
> + if (j == IndexRelationGetNumberOfAttributes(irel))
>   elog(ERROR, "column is not in index");
>   }

Not that it matters overly much, but why are we doing this for all
attributes, rather than just key attributes?


> --- a/src/backend/bootstrap/bootstrap.c
> +++ b/src/backend/bootstrap/bootstrap.c
> @@ -600,7 +600,7 @@ boot_openrel(char *relname)
>relname, (int) ATTRIBUTE_FIXED_PART_SIZE);
>  
>   boot_reldesc = heap_openrv(makeRangeVar(NULL, relname, -1), NoLock);
> - numattr = boot_reldesc->rd_rel->relnatts;
> + numattr = RelationGetNumberOfAttributes(boot_reldesc);
>   for (i = 0; i < numattr; i++)
>   {
>   if (attrtypes[i] == NULL)

That seems a bit unrelated.


> @@ 

Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Robert Haas
On Thu, Mar 30, 2017 at 11:26 AM, Teodor Sigaev  wrote:
> I had a look at the patch and played with it; it seems fine. I split it into
> two patches: core changes (+bloom index fix) and the btree itself. All
> docs are left in the first patch - I'm too lazy to rewrite documentation that
> is changed in the second patch.
> Any objection from reviewers to push both patches?

Has this really had enough review and testing?  The last time it was
pushed, it didn't go too well.  And laziness is not a very good excuse
for not dividing up patches properly.

It seems highly surprising to me that CheckIndexCompatible() only gets
a one line change in this patch.  That seems unlikely to be correct.

Has anybody done some testing of this patch with the WAL consistency
checker?  Like, create some tables with indexes that have INCLUDE
columns, set up a standby, enable consistency checking, pound the
master, and see if the standby bails?

Has anybody tested this patch with amcheck?  Does it break amcheck?

A few minor comments:

-foreach(lc, constraint->keys)
+else foreach(lc, constraint->keys)

That doesn't look like a reasonable way of formatting the code.

+/* Here is some code duplication. But we do need it. */

That is not a very informative comment.

+* NOTE It is not crutial for reliability in present,

Spelling, punctuation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Aleksander Alekseev
Hi Teodor,

> I had a look at the patch and played with it; it seems fine. I split it into
> two patches: core changes (+bloom index fix) and the btree itself. All
> docs are left in the first patch - I'm too lazy to rewrite documentation that
> is changed in the second patch.
> Any objection from reviewers to push both patches?

These patches look OK. Definitely no objections from me.

-- 
Best regards,
Aleksander Alekseev




Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Teodor Sigaev
I had a look at the patch and played with it; it seems fine. I split it into 
two patches: core changes (+bloom index fix) and the btree itself. All docs are 
left in the first patch - I'm too lazy to rewrite documentation that is changed 
in the second patch.

Any objection from reviewers to push both patches?


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


0001-covering-core.patch
Description: binary/octet-stream


0002-covering-btree.patch
Description: binary/octet-stream



Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-30 Thread Teodor Sigaev

-   IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
+   IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P 
INCLUDE

I think your syntax would read no worse, possibly even better, if you
just used the existing INCLUDING keyword.
There was a discussion in this thread about naming, and both databases that 
support covering indexes use the INCLUDE keyword.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-24 Thread Aleksander Alekseev
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

This patch looks good to me. As I understand, we have both a complete feature 
and a consensus in the thread here. If there are no objections, I'm marking this 
patch as "Ready for Committer".

The new status of this patch is: Ready for Committer



Re: [HACKERS] WIP: Covering + unique indexes.

2017-03-21 Thread Anastasia Lubennikova

Patch rebased to the current master is in attachments.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 497d52b713dd8f926b465ddf22f21db7229b12e3
Author: Anastasia 
Date:   Tue Mar 21 12:58:13 2017 +0300

include_columns_10.0_v4.patch

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index c1e9089..5c80e3b 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -100,7 +100,7 @@ static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
 static void deleteConnection(const char *name);
-static char **get_pkey_attnames(Relation rel, int16 *numatts);
+static char **get_pkey_attnames(Relation rel, int16 *indnkeyatts);
 static char **get_text_array_contents(ArrayType *array, int *numitems);
 static char *get_sql_insert(Relation rel, int *pkattnums, int pknumatts, char **src_pkattvals, char **tgt_pkattvals);
 static char *get_sql_delete(Relation rel, int *pkattnums, int pknumatts, char **tgt_pkattvals);
@@ -1480,7 +1480,7 @@ PG_FUNCTION_INFO_V1(dblink_get_pkey);
 Datum
 dblink_get_pkey(PG_FUNCTION_ARGS)
 {
-	int16		numatts;
+	int16		indnkeyatts;
 	char	  **results;
 	FuncCallContext *funcctx;
 	int32		call_cntr;
@@ -1506,7 +1506,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		rel = get_rel_from_relname(PG_GETARG_TEXT_PP(0), AccessShareLock, ACL_SELECT);
 
 		/* get the array of attnums */
-		results = get_pkey_attnames(rel, &numatts);
+		results = get_pkey_attnames(rel, &indnkeyatts);
 
 		relation_close(rel, AccessShareLock);
 
@@ -1526,9 +1526,9 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
-		if ((results != NULL) && (numatts > 0))
+		if ((results != NULL) && (indnkeyatts > 0))
 		{
-			funcctx->max_calls = numatts;
+			funcctx->max_calls = indnkeyatts;
 
 			/* got results, keep track of them */
 			funcctx->user_fctx = results;
@@ -2016,10 +2016,10 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
  * get_pkey_attnames
  *
  * Get the primary key attnames for the given relation.
- * Return NULL, and set numatts = 0, if no primary key exists.
+ * Return NULL, and set indnkeyatts = 0, if no primary key exists.
  */
 static char **
-get_pkey_attnames(Relation rel, int16 *numatts)
+get_pkey_attnames(Relation rel, int16 *indnkeyatts)
 {
 	Relation	indexRelation;
 	ScanKeyData skey;
@@ -2029,8 +2029,8 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 	char	  **result = NULL;
 	TupleDesc	tupdesc;
 
-	/* initialize numatts to 0 in case no primary key exists */
-	*numatts = 0;
+	/* initialize indnkeyatts to 0 in case no primary key exists */
+	*indnkeyatts = 0;
 
 	tupdesc = rel->rd_att;
 
@@ -2051,12 +2051,12 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 		/* we're only interested if it is the primary key */
 		if (index->indisprimary)
 		{
-			*numatts = index->indnatts;
-			if (*numatts > 0)
+			*indnkeyatts = index->indnkeyatts;
+			if (*indnkeyatts > 0)
 			{
-result = (char **) palloc(*numatts * sizeof(char *));
+result = (char **) palloc(*indnkeyatts * sizeof(char *));
 
-for (i = 0; i < *numatts; i++)
+for (i = 0; i < *indnkeyatts; i++)
 	result[i] = SPI_fname(tupdesc, index->indkey.values[i]);
 			}
 			break;
diff --git a/contrib/tcn/tcn.c b/contrib/tcn/tcn.c
index 1241108..943e410 100644
--- a/contrib/tcn/tcn.c
+++ b/contrib/tcn/tcn.c
@@ -138,9 +138,9 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 		/* we're only interested if it is the primary key and valid */
 		if (index->indisprimary && IndexIsValid(index))
 		{
-			int			numatts = index->indnatts;
+			int indnkeyatts = index->indnkeyatts;
 
-			if (numatts > 0)
+			if (indnkeyatts > 0)
 			{
 int			i;
 
@@ -150,7 +150,7 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 appendStringInfoCharMacro(payload, ',');
 appendStringInfoCharMacro(payload, operation);
 
-for (i = 0; i < numatts; i++)
+for (i = 0; i < indnkeyatts; i++)
 {
 	int			colno = index->indkey.values[i];
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index df0435c..e196e20 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3620,6 +3620,14 @@
   pg_class.relnatts)
  
 
+  
+  indnkeyatts
+  int2
+  
+  The number of key columns in the index. "Key columns" are ordinary
+  index columns (as opposed to "included" columns).
+ 
+
  
   indisunique
   bool
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index ac51258..f9539e9 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -114,6 +114,8 @@ typedef struct IndexAmRoutine
 bool        amcanparallel;
 /* type of data stored in index, or InvalidOid if variable */
 Oid         amkeytype;
+/* does AM support columns included with clause INCLUDE? */
+bool        amcaninclude;

Re: [HACKERS] WIP: Covering + unique indexes.

2017-02-27 Thread Peter Eisentraut
On 2/16/17 08:13, Anastasia Lubennikova wrote:
> @@ -629,7 +630,7 @@ static Node *makeRecursiveViewSelect(char *relname, List 
> *aliases, Node *query);
>  
>   HANDLER HAVING HEADER_P HOLD HOUR_P
>  
> - IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
> + IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P 
> INCLUDE
>   INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
>   INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
>   INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION

I think your syntax would read no worse, possibly even better, if you
just used the existing INCLUDING keyword.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] WIP: Covering + unique indexes.

2017-02-25 Thread Amit Kapila
On Thu, Feb 16, 2017 at 6:43 PM, Anastasia Lubennikova
 wrote:
> 14.02.2017 15:46, Amit Kapila:
>
>
>> 4.
>> + IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
>> INCLUDE
>>INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
>>INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
>>INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
>> @@ -3431,17 +3433,18 @@ ConstraintElem:
>>n->initially_valid = !n->skip_validation;
>>$$ = (Node *)n;
>>}
>> - | UNIQUE '(' columnList ')' opt_definition OptConsTableSpace
>> + | UNIQUE '(' columnList ')' opt_c_including opt_definition
>> OptConsTableSpace
>>
>> If we want to use INCLUDE in syntax, then it might be better to keep
>> the naming reflect the same.  For ex. instead of opt_c_including we
>> should use opt_c_include.
>>
>
>
> Thank you for this suggestion. I just wrote the code looking at the examples
> around it, but the optincluding and optcincluding clauses seem to be
> redundant.
> I've cleaned up the code.
>

I think you have cleaned it up only in gram.y, as I can still see references
to 'including' in other parts of the code.  For example, see the code below:
@@ -2667,6 +2667,7 @@ _copyConstraint(const Constraint *from)
  COPY_NODE_FIELD(raw_expr);
  COPY_STRING_FIELD(cooked_expr);
  COPY_NODE_FIELD(keys);
+ COPY_NODE_FIELD(including);
  COPY_NODE_FIELD(exclusions);
  COPY_NODE_FIELD(options);
  COPY_STRING_FIELD(indexname);
@@ -3187,6 +3188,7 @@ _copyIndexStmt(const IndexStmt *from)
  COPY_STRING_FIELD(accessMethod);
  COPY_STRING_FIELD(tableSpace);
  COPY_NODE_FIELD(indexParams);
+ COPY_NODE_FIELD(indexIncludingParams);


@@ -425,6 +425,13 @@ ConstructTupleDescriptor(Relation heapRelation,
  /*
+ * Code below is concerned to the opclasses which are not used
+ * with the included columns.
+ */
+ if (i >= indexInfo->ii_NumIndexKeyAttrs)
+ continue;
+

There seems to be code below the above check which is not directly
related to opclasses, so I am not sure whether you have missed that or
there is some other reason to ignore it.  I am referring to the following
code in the same function after the above check:
/*
* If a key type different from the heap value is specified, update
*
the type-related fields in the index tupdesc.
*/
if (OidIsValid(keyType) &&
keyType != to->atttypid)


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] WIP: Covering + unique indexes.

2017-02-14 Thread Amit Kapila
On Mon, Jan 9, 2017 at 8:32 PM, Anastasia Lubennikova
 wrote:
> Updated version of the patch is attached. Besides code itself, it contains
> new regression test,
> documentation updates and a paragraph in nbtree/README.
>

The latest patch doesn't apply cleanly.

Few assorted comments:
1.
@@ -4806,16 +4810,25 @@ RelationGetIndexAttrBitmap(Relation relation,
IndexAttrBitmapKind attrKind)
{
..
+ /*
+ * Since we have covering indexes with non-key columns,
+ * we must handle them accurately here. non-key columns
+ * must be added into indexattrs, since they are in index,
+ * and HOT-update shouldn't miss them.
+ * Obviously, non-key columns couldn't be referenced by
+ * foreign key or identity key. Hence we do not include
+ * them into uindexattrs and idindexattrs bitmaps.
+ */
  if (attrnum != 0)
  {
  indexattrs = bms_add_member(indexattrs,
attrnum -
FirstLowInvalidHeapAttributeNumber);

- if (isKey)
+ if (isKey && i < indexInfo->ii_NumIndexKeyAttrs)
  uindexattrs = bms_add_member(uindexattrs,
attrnum -
FirstLowInvalidHeapAttributeNumber);

- if (isIDKey)
+ if (isIDKey && i < indexInfo->ii_NumIndexKeyAttrs)
  idindexattrs = bms_add_member(idindexattrs,
attrnum -
FirstLowInvalidHeapAttributeNumber);
..
}

Can included columns be part of a primary key?  If not, then won't you
need a check similar to the above for primary keys?


2.
+ int indnkeyattrs; /* number of index key attributes*/
+ int indnattrs; /* total number of index attributes*/
+ Oid   *indkeys; /* In spite of the name 'indkeys' this field
+ * contains both key and nonkey
attributes*/

Before the end of the comment, one space is needed.

3.
}
-
  /*
  * For UNIQUE and PR
IMARY KEY, we just have a list of column names.
  *

Looks like spurious line removal.

4.
+ IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P INCLUDE
  INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
  INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
  INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -3431,17 +3433,18 @@ ConstraintElem:
  n->initially_valid = !n->skip_validation;
  $$ = (Node *)n;
  }
- | UNIQUE '(' columnList ')' opt_definition OptConsTableSpace
+ | UNIQUE '(' columnList ')' opt_c_including opt_definition OptConsTableSpace

If we want to use INCLUDE in syntax, then it might be better to keep
the naming reflect the same.  For ex. instead of opt_c_including we
should use opt_c_include.

5.
+opt_c_including: INCLUDE optcincluding { $$ = $2; }
+ | /* EMPTY */ { $$
= NIL; }
+ ;
+
+optcincluding : '(' columnList ')' { $$ = $2; }
+ ;
+

It seems optcincluding is redundant; why can't we specify the column list
directly along with INCLUDE?  If there were some other use of optcincluding,
or if it had a complicated definition, it would have made sense to define it
separately.  We have a lot of similar usage in gram.y; refer to
opt_in_database.

6.
+optincluding : '(' index_including_params ')' { $$ = $2; }
+ ;
+opt_including: INCLUDE optincluding { $$ = $2; }
+ | /* EMPTY */ { $$ = NIL; }
+ ;

Here the ordering of the above clauses seems to be the other way around.
Also, the naming of both seems confusing. I think we can either eliminate
*optincluding* by following a suggestion similar to the previous point,
or name them more clearly (like opt_include_clause and
opt_include_params/opt_include_list).

7. Can you include doc fixes suggested by Erik Rijkers [1]?  I have
checked them and they seem to be better than what is there in the
patch.


[1] - 
https://www.postgresql.org/message-id/3863bca17face15c6acd507e0173a6dc%40xs4all.nl
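Regarding point 1: the intent of those ii_NumIndexKeyAttrs checks can be
modeled in a few lines. Included columns must go into the all-indexed-columns
bitmap (so HOT-update checks account for them) but must stay out of the
unique-key bitmap. A toy Python sketch of that decision (illustrative names,
not the backend's actual API):

```python
def build_bitmaps(indexes):
    """Toy model of RelationGetIndexAttrBitmap for covering indexes.

    Each index is (key_cols, included_cols, is_unique).
    Returns (indexattrs, uindexattrs): all indexed columns vs. columns
    that carry a uniqueness guarantee (key columns of unique indexes).
    """
    indexattrs, uindexattrs = set(), set()
    for key_cols, included_cols, is_unique in indexes:
        # Included columns are stored in the index, so a HOT update
        # must not change them -> they belong in indexattrs ...
        indexattrs |= set(key_cols) | set(included_cols)
        # ... but uniqueness is enforced on key columns only.
        if is_unique:
            uindexattrs |= set(key_cols)
    return indexattrs, uindexattrs


def update_can_be_hot(modified_cols, indexattrs):
    # HOT is possible only if no indexed column (key OR included) changed.
    return not (set(modified_cols) & indexattrs)


# CREATE UNIQUE INDEX ON t (a, b) INCLUDE (c)
indexattrs, uindexattrs = build_bitmaps([(("a", "b"), ("c",), True)])
assert indexattrs == {"a", "b", "c"}
assert uindexattrs == {"a", "b"}                 # 'c' is not unique-enforced
assert update_can_be_hot(["d"], indexattrs)      # 'd' is not indexed at all
assert not update_can_be_hot(["c"], indexattrs)  # included column blocks HOT
```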

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] WIP: Covering + unique indexes.

2017-01-09 Thread Erik Rijkers

On 2017-01-09 16:02, Anastasia Lubennikova wrote:

 include_columns_10.0_v1.patch


The patch applies, compiles, and make check is OK.

It yields nice performance gains and I haven't been able to break 
anything (yet).


Some edits of the sgml-changes are attached.

Thank you for this very useful improvement.

Erik Rijkers





--- doc/src/sgml/catalogs.sgml.orig	2017-01-10 03:40:52.649761572 +0100
+++ doc/src/sgml/catalogs.sgml	2017-01-10 03:53:13.408298695 +0100
@@ -3598,7 +3598,7 @@
   int2
   
   The number of key columns in the index. "Key columns" are ordinary
-  index columns in contrast with "included" columns.
+  index columns (as opposed to "included" columns).
  
 
  
--- doc/src/sgml/ref/create_index.sgml.orig	2017-01-10 03:14:25.603940872 +0100
+++ doc/src/sgml/ref/create_index.sgml	2017-01-10 03:22:20.013526245 +0100
@@ -153,16 +153,15 @@
 the table's heap. Having these columns in the INCLUDE clause
 in some cases allows PostgreSQL to skip the heap read
 completely. This also allows UNIQUE indexes to be defined on
-one set of columns, which can include another set of column in the
-INCLUDE clause, on which the uniqueness is not enforced upon.
+one set of columns, which can include another set of columns in the
+INCLUDE clause, on which the uniqueness is not enforced.
 It's the same with other constraints (PRIMARY KEY and EXCLUDE). This can
 also can be used for non-unique indexes as any columns which are not required
-for the searching or ordering of records can be included in the
-INCLUDE clause, which can slightly reduce the size of the index,
-due to storing included attributes only in leaf index pages.
+for the searching or ordering of records can be used in the
+INCLUDE clause, which can slightly reduce the size of the index.
 Currently, only the B-tree access method supports this feature.
 Expressions as included columns are not supported since they cannot be used
-in index-only scan.
+in index-only scans.

   
  
--- doc/src/sgml/ref/create_table.sgml.orig	2017-01-10 03:15:17.033377433 +0100
+++ doc/src/sgml/ref/create_table.sgml	2017-01-10 03:34:57.541537576 +0100
@@ -615,9 +615,9 @@
  
   Adding a unique constraint will automatically create a unique btree
   index on the column or group of columns used in the constraint.
-  Optional clause INCLUDE allows to add into the index
-  a portion of columns on which the uniqueness is not enforced upon.
-  Note, that althogh constraint is not enforced upon included columns, it still
+  The optional clause INCLUDE adds to that index
+  one or more columns on which the uniqueness is not enforced.
+  Note that although the constraint is not enforced on the included columns, it still
   depends on them. Consequently, some operations on these columns (e.g. DROP COLUMN)
   can cause cascade constraint and index deletion.
   See paragraph about INCLUDE in
@@ -659,7 +659,7 @@
   index on the column or group of columns used in the constraint.
   An optional INCLUDE clause allows a list of columns
   to be specified which will be included in the non-key portion of the index.
-  Althogh uniqueness is not enforced on the included columns, the constraint
+  Although uniqueness is not enforced on the included columns, the constraint
   still depends on them. Consequently, some operations on the included columns
   (e.g. DROP COLUMN) can cause cascade constraint and index deletion.
   See paragraph about INCLUDE in



Re: [HACKERS] WIP: Covering + unique indexes.

2016-12-01 Thread Haribabu Kommi
On Sat, Nov 19, 2016 at 8:38 AM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 10/4/16 10:47 AM, Anastasia Lubennikova wrote:
> > 03.10.2016 15:29, Anastasia Lubennikova:
> >> 03.10.2016 05:22, Michael Paquier:
> >>> On Tue, Sep 27, 2016 at 12:17 AM, Anastasia Lubennikova
> >>>  wrote:
>  Ok, I'll write it in a few days.
> >>> Marked as returned with feedback per last emails exchanged.
> >>
> >> The only complaint about this patch was a lack of README,
> >> which is fixed now (see the attachment). So, I added it to new CF,
> >> marked as ready for committer.
> >
> > One more fix for pg_upgrade.
>
> Latest patch doesn't apply.  See also review by Brad DeJong.  I'm
> setting it back to Waiting.


Closed in 2016-11 commitfest with "returned with feedback" status.
Please feel free to update the status once you submit the updated patch.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] WIP: Covering + unique indexes.

2016-11-18 Thread Peter Eisentraut
On 10/4/16 10:47 AM, Anastasia Lubennikova wrote:
> 03.10.2016 15:29, Anastasia Lubennikova:
>> 03.10.2016 05:22, Michael Paquier:
>>> On Tue, Sep 27, 2016 at 12:17 AM, Anastasia Lubennikova
>>>  wrote:
 Ok, I'll write it in a few days.
>>> Marked as returned with feedback per last emails exchanged.
>>
>> The only complaint about this patch was a lack of README,
>> which is fixed now (see the attachment). So, I added it to new CF,
>> marked as ready for committer.
> 
> One more fix for pg_upgrade.

Latest patch doesn't apply.  See also review by Brad DeJong.  I'm
setting it back to Waiting.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] WIP: Covering + unique indexes

2016-11-14 Thread Brad DeJong
Anastasia, et al,

This is a review of including_columns_9.7_v5.patch.

I looked through the commitfest list; this patch looked interesting and I 
wanted to try it.

I have used include columns on Microsoft SQL Server; DB2 for Linux, Unix, 
Windows; and DB2 for z/OS.

After reading the e-mail discussions, there are still points that I am not 
clear on.

Given "create table foo (a int, b int, c int, d int)" and "create unique index 
foo_a_b on foo (a, b) including (c)".

                                               index only?   heap tuple needed?
select a, b, c from foo where a = 1            yes           no
select a, b, d from foo where a = 1            no            yes
select a, b    from foo where a = 1 and c = 1  ?             ?

It was clear from the discussion that the index scan/access method would not 
resolve the "c = 1" predicate but it was not clear if "c = 1" could be resolved 
without accessing the heap tuple.
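As a hedged illustration of the open question: the included column c cannot
drive the B-tree descent, but because it is stored in the leaf entry, a
filter like c = 1 can in principle be evaluated without fetching the heap
tuple (visibility map permitting). A toy Python model of the lookup path
(this shows the concept only, not PostgreSQL's actual executor):

```python
import bisect

# Toy covering index on (a, b) INCLUDE (c): leaf entries sorted by key.
# Each entry is (key_tuple, included_tuple, tid) where tid points at heap.
entries = sorted([
    ((1, 10), (100,), 0),
    ((1, 20), (200,), 1),
    ((2, 30), (100,), 2),
])

def index_only_scan(a_value, c_filter):
    """Search on the key prefix a = a_value, then filter on included c.

    The filter on c never guides the descent, but it can be checked
    without touching the heap because c is stored in the leaf entry.
    """
    keys = [e[0] for e in entries]
    lo = bisect.bisect_left(keys, (a_value,))   # descend by key columns only
    results = []
    for key, include, tid in entries[lo:]:
        if key[0] != a_value:
            break
        if include[0] == c_filter:              # evaluated from the index
            results.append((key[0], key[1]))
    return results

# select a, b from foo where a = 1 and c = 100 -- no heap access in this model
assert index_only_scan(1, 100) == [(1, 10)]
```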

Are included columns counted against the 32 column and 2712 byte index limits? 
I did not see either explicitly mentioned in the discussion or the 
documentation. I only ask because in SQL Server the limits are different for 
include columns.


1. syntax - on 2016-08-14, Andrey Borodin wrote "I think MS SQL syntax INCLUDE 
instead of INCLUDING would be better". I would go further than that. This 
feature is already supported by 2 of the top 5 SQL databases and they both use 
INCLUDE. Using different syntax because of an internal implementation detail 
seems short-sighted.

2. The patch does not seem to apply cleanly anymore.

> git checkout master
Already on 'master'
> git pull
From http://git.postgresql.org/git/postgresql
   d49cc58..06f5fd2  master -> origin/master
Updating d49cc58..06f5fd2
Fast-forward
 src/include/port.h | 2 +-
 src/port/dirmod.c  | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)
> patch -p1 < including_columns_9.7_v5.patch
patching file contrib/dblink/dblink.c
...
patching file src/backend/parser/gram.y
...
Hunk #2 FAILED at 375.
...
1 out of 12 hunks FAILED -- saving rejects to file 
src/backend/parser/gram.y.rej
...
patching file src/bin/pg_dump/pg_dump.c
...
Hunk #8 FAILED at 6399.
Hunk #9 FAILED at 6426.
...
2 out of 13 hunks FAILED -- saving rejects to file 
src/bin/pg_dump/pg_dump.c.rej
...

3. After "fixing" compilation errors (best guess based on similar change in 
other location), "make check" failed.

> make check
...
parallel group (3 tests):  index_including create_view create_index
 create_index ... ok
 create_view  ... ok
 index_including  ... FAILED
...

Looking at regression.diffs, it looks like the output format of \d tbl 
changed (lines 288,300) from the expected "Column | Type | Modifiers" to 
"Column | Type | Collation | Nullable | Default".

4. documentation - minor items (these are not actual diffs)
  create_index.sgml
  -[ INCLUDING ( column_name )]
  +[ INCLUDING ( column_name [, ...] )]

   An optional INCLUDING clause allows a list of 
columns to be
  -specified which will be included in the index, in the non-key 
portion of the index.
  +specified which will be included in the non-key portion of the 
index.

  The whole paragraph on INCLUDING does not include many of the points 
raised in the feature discussions.

  create_table.sgml (for both UNIQUE and PRIMARY KEY constraints) (similar 
change could be made to nbtree/README)
  -Optional clause INCLUDING allows to add into 
the index
  -a portion of columns on which the uniqueness is not enforced 
upon.
  -Note, that althogh constraint is not enforced on included 
columns, it still
  -depends on them. Consequently, some operations on these columns 
(e.g. DROP COLUMN)
  -can cause cascade constraint and index deletion.

  +An optional INCLUDING clause allows a list of 
columns
  +to be specified which will be included in the non-key portion of 
the index.
  +Although uniqueness is not enforced on the included columns, the 
constraint
  +still depends on them. Consequently, some operations on the 
included columns
  +(e.g. DROP COLUMN) can cause cascading 
constraint and index deletion.

  indexcmds.c
  -* are key columns, and which are just part of the INCLUDING list 
by check
  +* are key columns, and which are just part of the INCLUDING list 
by checking

ruleutils.c
  -   * meaningless there, so do not include them into the 
message.
  +   * meaningless there, so do not include them in the 
message.

pg_dump.c (does "if (fout->remoteVersion >= 90600)" also need to change?)
  -   * In 9.6 we add 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-10-05 Thread Tom Lane
Robert Haas  writes:
> On Wed, Oct 5, 2016 at 9:04 AM, Amit Kapila  wrote:
>> Okay, but in that case I think we don't need to store including
>> columns in non-leaf pages to get the exact ordering.  As mentioned
>> upthread, we can use the truncated scan key to reach the leaf level and
>> then use the complete key to find the exact location to store the key.
>> This is only possible if there exists an opclass for the columns that are
>> covered as part of the including clause.  So, we can allow "order by" to
>> use an index scan only if the columns covered in the included clause have
>> a btree opclass.

> But what if there are many pages full of keys that have the same
> values for the non-INCLUDING columns?

I concur with Robert that INCLUDING columns should be just dead weight
as far as the index is concerned.  Even if opclass information is
available for them, it's overcomplication for too little return.  We do
not need three classes of columns in an index.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-10-05 Thread Robert Haas
On Wed, Oct 5, 2016 at 9:04 AM, Amit Kapila  wrote:
> Okay, but in that case I think we don't need to store including
> columns in non-leaf pages to get the exact ordering.  As mentioned
> upthread, we can use the truncated scan key to reach the leaf level and
> then use the complete key to find the exact location to store the key.
> This is only possible if there exists an opclass for the columns that are
> covered as part of the including clause.  So, we can allow "order by" to
> use an index scan only if the columns covered in the included clause have
> a btree opclass.

But what if there are many pages full of keys that have the same
values for the non-INCLUDING columns?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-10-05 Thread Amit Kapila
On Tue, Oct 4, 2016 at 7:50 PM, Robert Haas  wrote:
> On Tue, Oct 4, 2016 at 9:20 AM, Amit Kapila  wrote:
 I'd say that the reason for not using included columns in any
 operations which require comparisons is that they don't have an opclass.
 Let's go back to the example of points.
 This data type doesn't have any opclass for B-tree, for fundamental
 reasons.
 And we cannot apply _bt_compare() and others to this attribute, so
 we don't include it in the scan key.

 create table t (i int, i2 int, p point);
 create index idx1 on t (i) including (i2);
 create index idx2 on t (i) including (p);
 create index idx3 on t (i) including (i2, p);
 create index idx4 on t (i) including (p, i2);

 You can keep tuples ordered in idx1, but not in idx2; partially ordered in
 idx3, but not in idx4.
>>>
>>> Yeah, I think we shouldn't go there.  I mean, once you start ordering
>>> by INCLUDING columns, then you're going to need to include them in
>>> leaf pages because otherwise you can't actually guarantee that they
>>> are in the right order.
>>
>> I am not sure what you mean by above, because patch already stores
>> INCLUDING columns in leaf pages.
>
> Sorry, I meant non-leaf pages.
>

Okay, but in that case I think we don't need to store including
columns in non-leaf pages to get the exact ordering.  As mentioned
upthread, we can use the truncated scan key to reach the leaf level and
then use the complete key to find the exact location to store the key.
This is only possible if there exists an opclass for the columns that are
covered as part of the including clause.  So, we can allow "order by" to
use an index scan only if the columns covered in the included clause have
a btree opclass.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-10-04 Thread Anastasia Lubennikova

03.10.2016 15:29, Anastasia Lubennikova:

03.10.2016 05:22, Michael Paquier:

On Tue, Sep 27, 2016 at 12:17 AM, Anastasia Lubennikova
 wrote:

Ok, I'll write it in a few days.

Marked as returned with feedback per last emails exchanged.


The only complaint about this patch was the lack of a README,
which is fixed now (see the attachment). So I added it to the new CF,
marked as ready for committer.


One more fix for pg_upgrade.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index d4f9090..99735ce 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -100,7 +100,7 @@ static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
 static void deleteConnection(const char *name);
-static char **get_pkey_attnames(Relation rel, int16 *numatts);
+static char **get_pkey_attnames(Relation rel, int16 *indnkeyatts);
 static char **get_text_array_contents(ArrayType *array, int *numitems);
 static char *get_sql_insert(Relation rel, int *pkattnums, int pknumatts, char **src_pkattvals, char **tgt_pkattvals);
 static char *get_sql_delete(Relation rel, int *pkattnums, int pknumatts, char **tgt_pkattvals);
@@ -1483,7 +1483,7 @@ PG_FUNCTION_INFO_V1(dblink_get_pkey);
 Datum
 dblink_get_pkey(PG_FUNCTION_ARGS)
 {
-	int16		numatts;
+	int16		indnkeyatts;
 	char	  **results;
 	FuncCallContext *funcctx;
 	int32		call_cntr;
@@ -1509,7 +1509,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		rel = get_rel_from_relname(PG_GETARG_TEXT_P(0), AccessShareLock, ACL_SELECT);
 
 		/* get the array of attnums */
-		results = get_pkey_attnames(rel, &numatts);
+		results = get_pkey_attnames(rel, &indnkeyatts);
 
 		relation_close(rel, AccessShareLock);
 
@@ -1529,9 +1529,9 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
-		if ((results != NULL) && (numatts > 0))
+		if ((results != NULL) && (indnkeyatts > 0))
 		{
-			funcctx->max_calls = numatts;
+			funcctx->max_calls = indnkeyatts;
 
 			/* got results, keep track of them */
 			funcctx->user_fctx = results;
@@ -2021,10 +2021,10 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
  * get_pkey_attnames
  *
  * Get the primary key attnames for the given relation.
- * Return NULL, and set numatts = 0, if no primary key exists.
+ * Return NULL, and set indnkeyatts = 0, if no primary key exists.
  */
 static char **
-get_pkey_attnames(Relation rel, int16 *numatts)
+get_pkey_attnames(Relation rel, int16 *indnkeyatts)
 {
 	Relation	indexRelation;
 	ScanKeyData skey;
@@ -2034,8 +2034,8 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 	char	  **result = NULL;
 	TupleDesc	tupdesc;
 
-	/* initialize numatts to 0 in case no primary key exists */
-	*numatts = 0;
+	/* initialize indnkeyatts to 0 in case no primary key exists */
+	*indnkeyatts = 0;
 
 	tupdesc = rel->rd_att;
 
@@ -2056,12 +2056,12 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 		/* we're only interested if it is the primary key */
 		if (index->indisprimary)
 		{
-			*numatts = index->indnatts;
-			if (*numatts > 0)
+			*indnkeyatts = index->indnkeyatts;
+			if (*indnkeyatts > 0)
 			{
-				result = (char **) palloc(*numatts * sizeof(char *));
+				result = (char **) palloc(*indnkeyatts * sizeof(char *));
 
-				for (i = 0; i < *numatts; i++)
+				for (i = 0; i < *indnkeyatts; i++)
 	result[i] = SPI_fname(tupdesc, index->indkey.values[i]);
 			}
 			break;
diff --git a/contrib/tcn/tcn.c b/contrib/tcn/tcn.c
index 7352b29..c024cf3 100644
--- a/contrib/tcn/tcn.c
+++ b/contrib/tcn/tcn.c
@@ -138,9 +138,9 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 		/* we're only interested if it is the primary key and valid */
 		if (index->indisprimary && IndexIsValid(index))
 		{
-			int			numatts = index->indnatts;
+			int			indnkeyatts = index->indnkeyatts;
 
-			if (numatts > 0)
+			if (indnkeyatts > 0)
 			{
 int			i;
 
@@ -150,7 +150,7 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 appendStringInfoCharMacro(payload, ',');
 appendStringInfoCharMacro(payload, operation);
 
-				for (i = 0; i < numatts; i++)
+				for (i = 0; i < indnkeyatts; i++)
 {
 	int			colno = index->indkey.values[i];
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 29738b0..9c7f9e5 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3564,6 +3564,14 @@
   pg_class.relnatts)
  
 
+  
+  indnkeyatts
+  int2
+  
+  The number of key columns in the index. "Key columns" are ordinary
+  index columns in contrast with "included" columns.
+ 
+
  
   indisunique
   bool
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 40f201b..92bef4a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -110,6 +110,8 @@ 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-10-04 Thread Robert Haas
On Tue, Oct 4, 2016 at 9:20 AM, Amit Kapila  wrote:
>>> I'd say that the reason for not using included columns in any
>>> operations which require comparisons is that they don't have an opclass.
>>> Let's go back to the example of points.
>>> This data type doesn't have any opclass for B-tree, for fundamental
>>> reasons.
>>> And we cannot apply _bt_compare() and others to this attribute, so
>>> we don't include it in the scan key.
>>>
>>> create table t (i int, i2 int, p point);
>>> create index idx1 on t (i) including (i2);
>>> create index idx2 on t (i) including (p);
>>> create index idx3 on t (i) including (i2, p);
>>> create index idx4 on t (i) including (p, i2);
>>>
>>> You can keep tuples ordered in idx1, but not in idx2; partially ordered in
>>> idx3, but not in idx4.
>>
>> Yeah, I think we shouldn't go there.  I mean, once you start ordering
>> by INCLUDING columns, then you're going to need to include them in
>> leaf pages because otherwise you can't actually guarantee that they
>> are in the right order.
>
> I am not sure what you mean by above, because patch already stores
> INCLUDING columns in leaf pages.

Sorry, I meant non-leaf pages.

>>  And then you have to wonder why an INCLUDING
>> column is any different from a non-INCLUDING column.  It seems best to
>> make a firm rule that INCLUDING columns are there only for the values,
>> not for ordering.  That rule is simple and clear, which is a good
>> thing.
>
> Okay, we can make that firm rule, but I think the reasoning behind it
> should be clear.  As far as I can tell from reading some of the mails in
> this thread, it is because some of the other databases don't seem to
> support ordering for included columns, or because supporting it could
> complicate the code.  One point we should keep in mind is that the
> suggestion from other databases to put many extra columns in the INCLUDING
> clause to get index-only scans might not hold equally well for
> PostgreSQL, because it can turn many HOT updates into non-HOT updates.

Right.  Looking back, the originally articulated rationale for this
patch was that you might want a single index that is UNIQUE ON (a) but
also INCLUDING (b) rather than two indexes, a unique index on (a) and
a non-unique index on (a, b).  In that case, the patch is a
straight-up win: you get the same number of HOT updates either way,
but you don't use as much disk space, or spend as much CPU time and
WAL updating your indexes.
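The trade-off described above can be sketched as follows (names are illustrative):

```sql
-- Today: two indexes, one for uniqueness and one for coverage.
CREATE UNIQUE INDEX t_a_key ON t (a);
CREATE INDEX t_a_b_idx ON t (a, b);

-- With the patch: one covering index.  HOT applicability is the same
-- (b is indexed either way), but disk space, CPU, and WAL are saved.
CREATE UNIQUE INDEX t_a_incl_b_key ON t (a) INCLUDING (b);
```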

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-10-04 Thread Amit Kapila
On Tue, Sep 27, 2016 at 7:51 PM, Robert Haas  wrote:
> On Mon, Sep 26, 2016 at 11:17 AM, Anastasia Lubennikova
>  wrote:
>>> Is there any fundamental problem in storing them in an ordered way?  I
>>> mean to say, you need to store all the column values on the leaf
>>> page anyway, so why can't we find the exact location for the complete key?
>>> Basically, use the truncated key to reach the leaf level and then use the
>>> complete key to find the exact location to store the key.  I might be
>>> missing something here, but if we can store them in an ordered fashion,
>>> we can use them even for queries containing ORDER BY (where ORDER BY
>>> contains included columns).
>>
>> I'd say that the reason for not using included columns in any
>> operations which require comparisons is that they don't have an opclass.
>> Let's go back to the example of points.
>> This data type doesn't have any opclass for B-tree, for fundamental
>> reasons.
>> And we cannot apply _bt_compare() and others to this attribute, so
>> we don't include it in the scan key.
>>
>> create table t (i int, i2 int, p point);
>> create index idx1 on t (i) including (i2);
>> create index idx2 on t (i) including (p);
>> create index idx3 on t (i) including (i2, p);
>> create index idx4 on t (i) including (p, i2);
>>
>> You can keep tuples ordered in idx1, but not in idx2; partially ordered in
>> idx3, but not in idx4.
>
> Yeah, I think we shouldn't go there.  I mean, once you start ordering
> by INCLUDING columns, then you're going to need to include them in
> leaf pages because otherwise you can't actually guarantee that they
> are in the right order.
>

I am not sure what you mean by above, because patch already stores
INCLUDING columns in leaf pages.

>  And then you have to wonder why an INCLUDING
> column is any different from a non-INCLUDING column.  It seems best to
> make a firm rule that INCLUDING columns are there only for the values,
> not for ordering.  That rule is simple and clear, which is a good
> thing.
>

Okay, we can make that firm rule, but I think the reasoning behind it
should be clear.  As far as I can tell from reading some of the mails in
this thread, it is because some of the other databases don't seem to
support ordering for included columns, or because supporting it could
complicate the code.  One point we should keep in mind is that the
suggestion from other databases to put many extra columns in the INCLUDING
clause to get index-only scans might not hold equally well for
PostgreSQL, because it can turn many HOT updates into non-HOT updates.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-10-03 Thread Anastasia Lubennikova

03.10.2016 05:22, Michael Paquier:

On Tue, Sep 27, 2016 at 12:17 AM, Anastasia Lubennikova
 wrote:

Ok, I'll write it in a few days.

Marked as returned with feedback per last emails exchanged.


The only complaint about this patch was the lack of a README,
which is fixed now (see the attachment). So I added it to the new CF,
marked as ready for committer.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index d4f9090..99735ce 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -100,7 +100,7 @@ static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
 static void deleteConnection(const char *name);
-static char **get_pkey_attnames(Relation rel, int16 *numatts);
+static char **get_pkey_attnames(Relation rel, int16 *indnkeyatts);
 static char **get_text_array_contents(ArrayType *array, int *numitems);
 static char *get_sql_insert(Relation rel, int *pkattnums, int pknumatts, char **src_pkattvals, char **tgt_pkattvals);
 static char *get_sql_delete(Relation rel, int *pkattnums, int pknumatts, char **tgt_pkattvals);
@@ -1483,7 +1483,7 @@ PG_FUNCTION_INFO_V1(dblink_get_pkey);
 Datum
 dblink_get_pkey(PG_FUNCTION_ARGS)
 {
-	int16		numatts;
+	int16		indnkeyatts;
 	char	  **results;
 	FuncCallContext *funcctx;
 	int32		call_cntr;
@@ -1509,7 +1509,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		rel = get_rel_from_relname(PG_GETARG_TEXT_P(0), AccessShareLock, ACL_SELECT);
 
 		/* get the array of attnums */
-		results = get_pkey_attnames(rel, &numatts);
+		results = get_pkey_attnames(rel, &indnkeyatts);
 
 		relation_close(rel, AccessShareLock);
 
@@ -1529,9 +1529,9 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
-		if ((results != NULL) && (numatts > 0))
+		if ((results != NULL) && (indnkeyatts > 0))
 		{
-			funcctx->max_calls = numatts;
+			funcctx->max_calls = indnkeyatts;
 
 			/* got results, keep track of them */
 			funcctx->user_fctx = results;
@@ -2021,10 +2021,10 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
  * get_pkey_attnames
  *
  * Get the primary key attnames for the given relation.
- * Return NULL, and set numatts = 0, if no primary key exists.
+ * Return NULL, and set indnkeyatts = 0, if no primary key exists.
  */
 static char **
-get_pkey_attnames(Relation rel, int16 *numatts)
+get_pkey_attnames(Relation rel, int16 *indnkeyatts)
 {
 	Relation	indexRelation;
 	ScanKeyData skey;
@@ -2034,8 +2034,8 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 	char	  **result = NULL;
 	TupleDesc	tupdesc;
 
-	/* initialize numatts to 0 in case no primary key exists */
-	*numatts = 0;
+	/* initialize indnkeyatts to 0 in case no primary key exists */
+	*indnkeyatts = 0;
 
 	tupdesc = rel->rd_att;
 
@@ -2056,12 +2056,12 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 		/* we're only interested if it is the primary key */
 		if (index->indisprimary)
 		{
-			*numatts = index->indnatts;
-			if (*numatts > 0)
+			*indnkeyatts = index->indnkeyatts;
+			if (*indnkeyatts > 0)
 			{
-				result = (char **) palloc(*numatts * sizeof(char *));
+				result = (char **) palloc(*indnkeyatts * sizeof(char *));
 
-				for (i = 0; i < *numatts; i++)
+				for (i = 0; i < *indnkeyatts; i++)
 	result[i] = SPI_fname(tupdesc, index->indkey.values[i]);
 			}
 			break;
diff --git a/contrib/tcn/tcn.c b/contrib/tcn/tcn.c
index 7352b29..c024cf3 100644
--- a/contrib/tcn/tcn.c
+++ b/contrib/tcn/tcn.c
@@ -138,9 +138,9 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 		/* we're only interested if it is the primary key and valid */
 		if (index->indisprimary && IndexIsValid(index))
 		{
-			int			numatts = index->indnatts;
+			int			indnkeyatts = index->indnkeyatts;
 
-			if (numatts > 0)
+			if (indnkeyatts > 0)
 			{
 int			i;
 
@@ -150,7 +150,7 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 appendStringInfoCharMacro(payload, ',');
 appendStringInfoCharMacro(payload, operation);
 
-				for (i = 0; i < numatts; i++)
+				for (i = 0; i < indnkeyatts; i++)
 {
 	int			colno = index->indkey.values[i];
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 322d8d6..0983c93 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3564,6 +3564,14 @@
   pg_class.relnatts)
  
 
+  
+  indnkeyatts
+  int2
+  
+  The number of key columns in the index. "Key columns" are ordinary
+  index columns in contrast with "included" columns.
+ 
+
  
   indisunique
   bool
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 40f201b..92bef4a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -110,6 +110,8 @@ typedef struct IndexAmRoutine
 boolamclusterable;
 /* does 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-10-02 Thread Michael Paquier
On Tue, Sep 27, 2016 at 12:17 AM, Anastasia Lubennikova
 wrote:
> Ok, I'll write it in a few days.

Marked as returned with feedback per last emails exchanged.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-09-27 Thread Robert Haas
On Mon, Sep 26, 2016 at 11:17 AM, Anastasia Lubennikova
 wrote:
>> Is there any fundamental problem in storing them in an ordered way?  I
>> mean to say, you need to store all the column values on the leaf
>> page anyway, so why can't we find the exact location for the complete key?
>> Basically, use the truncated key to reach the leaf level and then use the
>> complete key to find the exact location to store the key.  I might be
>> missing something here, but if we can store them in an ordered fashion,
>> we can use them even for queries containing ORDER BY (where ORDER BY
>> contains included columns).
>
> I'd say that the reason for not using included columns in any
> operations which require comparisons is that they don't have an opclass.
> Let's go back to the example of points.
> This data type doesn't have any opclass for B-tree, for fundamental
> reasons.
> And we cannot apply _bt_compare() and others to this attribute, so
> we don't include it in the scan key.
>
> create table t (i int, i2 int, p point);
> create index idx1 on t (i) including (i2);
> create index idx2 on t (i) including (p);
> create index idx3 on t (i) including (i2, p);
> create index idx4 on t (i) including (p, i2);
>
> You can keep tuples ordered in idx1, but not in idx2; partially ordered in
> idx3, but not in idx4.

Yeah, I think we shouldn't go there.  I mean, once you start ordering
by INCLUDING columns, then you're going to need to include them in
leaf pages because otherwise you can't actually guarantee that they
are in the right order.  And then you have to wonder why an INCLUDING
column is any different from a non-INCLUDING column.  It seems best to
make a firm rule that INCLUDING columns are there only for the values,
not for ordering.  That rule is simple and clear, which is a good
thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-09-26 Thread Anastasia Lubennikova


24.09.2016 15:36, Amit Kapila:

On Wed, Sep 21, 2016 at 6:51 PM, Anastasia Lubennikova
 wrote:

20.09.2016 08:21, Amit Kapila:

On Tue, Sep 6, 2016 at 10:18 PM, Anastasia Lubennikova
 wrote:

28.08.2016 09:13, Amit Kapila:


The problem seems really tricky, but the answer is simple.
We store included columns unordered. It was mentioned somewhere in
this thread.


Is there any fundamental problem in storing them in an ordered way?  I
mean to say, you need to store all the column values on the leaf
page anyway, so why can't we find the exact location for the complete key?
Basically, use the truncated key to reach the leaf level and then use the
complete key to find the exact location to store the key.  I might be
missing something here, but if we can store them in an ordered fashion,
we can use them even for queries containing ORDER BY (where ORDER BY
contains included columns).



I'd say that the reason for not using included columns in any
operations which require comparisons is that they don't have an opclass.
Let's go back to the example of points.
This data type doesn't have any opclass for B-tree, for fundamental
reasons.

And we cannot apply _bt_compare() and others to this attribute, so
we don't include it in the scan key.

create table t (i int, i2 int, p point);
create index idx1 on t (i) including (i2);
create index idx2 on t (i) including (p);
create index idx3 on t (i) including (i2, p);
create index idx4 on t (i) including (p, i2);

You can keep tuples ordered in idx1, but not in idx2; partially ordered in
idx3, but not in idx4.


At the very beginning of this thread [1], I suggested using opclasses
where possible. Exactly the same idea you're thinking about. But after a
short discussion, we came to the conclusion that it would require many
additional checks and would be too complicated,
at least for the initial patch.


Let me give you an example:

create table t (i int, p point);
create index on t (i) including (p);
"point" data type doesn't have any opclass for btree.
Should we insert (0, '(0,2)') before (0, '(1,1)') or after?
We have no idea what the "correct order" is for this attribute.
So the answer is "it doesn't matter". When searching the index,
we know that only key attrs are ordered, so only they can be used
in the scan key. Other columns are filtered after retrieving the data.

explain select i,p from t where i =0 and p <@ circle '((0,0),2)';
 QUERY PLAN
---
  Index Only Scan using idx on t  (cost=0.14..4.20 rows=1 width=20)
Index Cond: (i = 0)
Filter: (p <@ '<(0,0),2>'::circle)
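Since included attributes never enter the scan key, predicates on them show up as Filter rather than Index Cond, as in the plan above. With the pg_index.indnkeyatts column the patch adds, covering indexes can also be spotted in the catalog; a sketch:

```sql
-- Indexes whose total attribute count exceeds the key attribute count
-- are the ones carrying INCLUDING columns:
SELECT c.relname, i.indnatts, i.indnkeyatts
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE i.indnatts > i.indnkeyatts;
```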


I think the reason for using Filter here is that we don't keep
included columns in scan keys.  Can't we think of having them in scan
keys, but using only the key columns to reach the leaf level,
and then the complete scan key at the leaf level?



What should I add to README (or to documentation),
to make it more understandable?


Maybe add the data representation details, like the fact that only leaf
pages contain all the columns, and how the scan works.  I think you can
see if you can extend the "Notes About Data Representation" and/or "Other
Things That Are Handy to Know" sections in the existing README.


Ok, I'll write it in a few days.


[1] https://www.postgresql.org/message-id/55f84df4.5030...@postgrespro.ru

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-09-24 Thread Amit Kapila
On Wed, Sep 21, 2016 at 6:51 PM, Anastasia Lubennikova
 wrote:
> 20.09.2016 08:21, Amit Kapila:
>
> On Tue, Sep 6, 2016 at 10:18 PM, Anastasia Lubennikova
>  wrote:
>
> 28.08.2016 09:13, Amit Kapila:
>
>
> The problem seems really tricky, but the answer is simple.
> We store included columns unordered. It was mentioned somewhere in
> this thread.
>

Is there any fundamental problem in storing them in an ordered way?  I
mean to say, you need to store all the column values on the leaf
page anyway, so why can't we find the exact location for the complete key?
Basically, use the truncated key to reach the leaf level and then use the
complete key to find the exact location to store the key.  I might be
missing something here, but if we can store them in an ordered fashion,
we can use them even for queries containing ORDER BY (where ORDER BY
contains included columns).

> Let me give you an example:
>
> create table t (i int, p point);
> create index on t (i) including (p);
> "point" data type doesn't have any opclass for btree.
> Should we insert (0, '(0,2)') before (0, '(1,1)') or after?
> We have no idea what the "correct order" is for this attribute.
> So the answer is "it doesn't matter". When searching the index,
> we know that only key attrs are ordered, so only they can be used
> in the scan key. Other columns are filtered after retrieving the data.
>
> explain select i,p from t where i =0 and p <@ circle '((0,0),2)';
> QUERY PLAN
> ---
>  Index Only Scan using idx on t  (cost=0.14..4.20 rows=1 width=20)
>Index Cond: (i = 0)
>Filter: (p <@ '<(0,0),2>'::circle)
>

I think the reason for using Filter here is that we don't keep
included columns in scan keys.  Can't we think of having them in scan
keys, but using only the key columns to reach the leaf level,
and then the complete scan key at the leaf level?

>
> The same approach is used for included columns of any type, even if
> their data types have opclass.
>
> Is this high-key truncation concept needed for the correctness of the patch,
> or is it just to save space in the index?  If you need this, then I think
> nbtree/README needs to be updated.
>
>
> Now it's done only for space saving. We never check included attributes
> in non-leaf pages, so why store them? Especially if we assume that included
> attributes can be quite long.
> There is already a note in documentation:
>
> +It's the same with other constraints (PRIMARY KEY and EXCLUDE). This can
> +also be used for non-unique indexes, as any columns which are not required
> +for the searching or ordering of records can be included in the
> +INCLUDING clause, which can slightly reduce the size of the index,
> +due to storing included attributes only in leaf index pages.
>

Okay, thanks for clarification.

> What should I add to README (or to documentation),
> to make it more understandable?
>

May be add the data representation like only leaf pages contains all
the columns and how the scan works.  I think you can see if you can
extend "Notes About Data Representation" and or "Other Things That Are
Handy to Know" sections in existing README.

> -- I am getting Assertion failure when I use this patch with database
> created with a build before this patch.  However, if I create a fresh
> database it works fine.  Assertion failure details are as below:
>
> LOG:  database system is ready to accept connections
> LOG:  autovacuum launcher started
> TRAP: unrecognized TOAST vartag("((bool) 1)", File:
> "src/backend/access/common/heaptuple.c", Line: 532)
> LOG:  server process (PID 1404) was terminated by exception 0x8003
> HINT:  See C include file "ntstatus.h" for a description of the hexadecimal
> value.
> LOG:  terminating any other active server processes
>
>
> That is expected behavior, because the catalog versions are not compatible.
> But I wonder why there was no message about that?
> I suppose that's because CATALOG_VERSION_NO was outdated in my
> patch. As far as I know, the committer will change it before the commit.
> Try the new patch with the updated value. It should fail with a message about
> incompatible versions.
>

Yeah, that must be the reason, but let's not change it now, otherwise we
will face conflicts while applying the patch.



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-09-20 Thread Amit Kapila
On Tue, Sep 20, 2016 at 10:51 AM, Amit Kapila  wrote:
> On Tue, Sep 6, 2016 at 10:18 PM, Anastasia Lubennikova
>  wrote:
>> 28.08.2016 09:13, Amit Kapila:
>>
>> On Mon, Aug 15, 2016 at 8:15 PM, Anastasia Lubennikova
>>  wrote:
>>
>>
>> So the patch is correct.
>> We can go further and remove this index_truncate_tuple() call, because
>> the first key of any inner (or root) page doesn't need any key at all.
>>
>
> Anyway, I think truncation happens only if the page is at leaf level, and
> that is ensured by the check, so I think we can't remove this:
> + if (indnkeyatts != indnatts && P_ISLEAF(pageop))
>
>
> -- I have one more question regarding this truncate-high-key concept.
> I think if the high key is truncated, then during insertion, for cases
> like the one below, it moves to the next page, whereas the current page
> needs to be split.
>
> Assume index on c1,c2,c3 and c2,c3 are including columns.
>
> Actual high key on leaf Page X -
> 3, 2 , 2
> Truncated high key on leaf Page X
> 3
>
> New insertion key
> 3, 1, 2
>
> Now, I think for such cases during insertion, if page X doesn't
> have enough space, it will move to the next page, whereas ideally it
> should split the current page.  Refer to function _bt_findinsertloc() for
> this logic.
>

Basically, what I wanted to know here is: do we maintain ordering for
keys with respect to the included columns while storing them (in the above
example, do we ensure that 3,1,2 is always stored before 3,2,2)?

>
>
> -- I am getting Assertion failure when I use this patch with database
> created with a build before this patch.  However, if I create a fresh
> database it works fine.  Assertion failure details are as below:
>

I have tried this test on my Windows machine only.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-09-19 Thread Amit Kapila
On Tue, Sep 6, 2016 at 10:18 PM, Anastasia Lubennikova
 wrote:
> 28.08.2016 09:13, Amit Kapila:
>
> On Mon, Aug 15, 2016 at 8:15 PM, Anastasia Lubennikova
>  wrote:
>
>
> So the patch is correct.
> We can go further and remove this index_truncate_tuple() call, because
> the first key of any inner (or root) page doesn't need any key at all.
>

Anyway, I think truncation happens only if the page is at leaf level, and
that is ensured by the check, so I think we can't remove this:
+ if (indnkeyatts != indnatts && P_ISLEAF(pageop))


-- I have one more question regarding this truncate-high-key concept.
I think if the high key is truncated, then during insertion, for cases
like the one below, it moves to the next page, whereas the current page
needs to be split.

Assume index on c1,c2,c3 and c2,c3 are including columns.

Actual high key on leaf Page X -
3, 2 , 2
Truncated high key on leaf Page X
3

New insertion key
3, 1, 2

Now, I think for such cases during insertion, if page X doesn't
have enough space, it will move to the next page, whereas ideally it
should split the current page.  Refer to function _bt_findinsertloc() for
this logic.

Is this truncation concept for the high key needed for the correctness of
the patch, or is it just to save space in the index?  If you need this, then
I think nbtree/README needs to be updated.


-- I am getting an Assertion failure when I use this patch with a database
created by a build from before this patch.  However, if I create a fresh
database, it works fine.  Assertion failure details are below:

LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
TRAP: unrecognized TOAST vartag("((bool) 1)", File: "src/backend/access/common/heaptuple.c", Line: 532)
LOG:  server process (PID 1404) was terminated by exception 0x8003
HINT:  See C include file "ntstatus.h" for a description of the hexadecimal value.
LOG:  terminating any other active server processes

--
@@ -1260,14 +1262,14 @@ RelationInitIndexAccessInfo(Relation relation)
  * Allocate arrays to hold data
  */
  relation->rd_opfamily = (Oid *)
- MemoryContextAllocZero(indexcxt, natts * sizeof(Oid));
+ MemoryContextAllocZero(indexcxt, indnkeyatts * sizeof(Oid));
  relation->rd_opcintype = (Oid *)
- MemoryContextAllocZero(indexcxt, natts * sizeof(Oid));
+ MemoryContextAllocZero(indexcxt, indnkeyatts * sizeof(Oid));

  amsupport = relation->rd_amroutine->amsupport;
  if (amsupport > 0)
  {
- int nsupport = natts * amsupport;
+ int nsupport = indnatts * amsupport;

  relation->rd_support = (RegProcedure *)
  MemoryContextAllocZero(indexcxt, nsupport * sizeof(RegProcedure));
@@ -1281,10 +1283,10 @@ RelationInitIndexAccessInfo(Relation relation)
  }

  relation->rd_indcollation = (Oid *)
- MemoryContextAllocZero(indexcxt, natts * sizeof(Oid));
+ MemoryContextAllocZero(indexcxt, indnatts * sizeof(Oid));

Can you add a comment in the above code, or in some other related place, as
to why some arrays in the relcache entry need to be of size indnkeyatts and
others of size indnatts?

--
@@ -63,17 +63,26 @@ _bt_mkscankey(Relation rel, IndexTuple itup)
 {
  ScanKey skey;
  TupleDesc itupdesc;
- int natts;
+ int indnatts,
+ indnkeyatts;
  int16   *indoption;
  int i;

  itupdesc = RelationGetDescr(rel);
- natts = RelationGetNumberOfAttributes(rel);
+ indnatts = IndexRelationGetNumberOfAttributes(rel);
+ indnkeyatts = IndexRelationGetNumberOfKeyAttributes(rel);
  indoption = rel->rd_indoption;

- skey = (ScanKey) palloc(natts * sizeof(ScanKeyData));
+ Assert(indnkeyatts != 0);
+ Assert(indnkeyatts <= indnatts);

Here I think you need to declare indnatts as PG_USED_FOR_ASSERTS_ONLY;
otherwise it will give a warning on some platforms.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] WIP: Covering + unique indexes.

2016-09-14 Thread Anastasia Lubennikova

One more update.

I added an ORDER BY clause to the regression tests.
It was done as a separate bugfix patch by Tom Lane some time ago,
but it definitely should be included in this patch.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index d4f9090..99735ce 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -100,7 +100,7 @@ static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
 static void deleteConnection(const char *name);
-static char **get_pkey_attnames(Relation rel, int16 *numatts);
+static char **get_pkey_attnames(Relation rel, int16 *indnkeyatts);
 static char **get_text_array_contents(ArrayType *array, int *numitems);
 static char *get_sql_insert(Relation rel, int *pkattnums, int pknumatts, char **src_pkattvals, char **tgt_pkattvals);
 static char *get_sql_delete(Relation rel, int *pkattnums, int pknumatts, char **tgt_pkattvals);
@@ -1483,7 +1483,7 @@ PG_FUNCTION_INFO_V1(dblink_get_pkey);
 Datum
 dblink_get_pkey(PG_FUNCTION_ARGS)
 {
-	int16		numatts;
+	int16		indnkeyatts;
 	char	  **results;
 	FuncCallContext *funcctx;
 	int32		call_cntr;
@@ -1509,7 +1509,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		rel = get_rel_from_relname(PG_GETARG_TEXT_P(0), AccessShareLock, ACL_SELECT);
 
 		/* get the array of attnums */
-		results = get_pkey_attnames(rel, &numatts);
+		results = get_pkey_attnames(rel, &indnkeyatts);
 
 		relation_close(rel, AccessShareLock);
 
@@ -1529,9 +1529,9 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
-		if ((results != NULL) && (numatts > 0))
+		if ((results != NULL) && (indnkeyatts > 0))
 		{
-			funcctx->max_calls = numatts;
+			funcctx->max_calls = indnkeyatts;
 
 			/* got results, keep track of them */
 			funcctx->user_fctx = results;
@@ -2021,10 +2021,10 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
  * get_pkey_attnames
  *
  * Get the primary key attnames for the given relation.
- * Return NULL, and set numatts = 0, if no primary key exists.
+ * Return NULL, and set indnkeyatts = 0, if no primary key exists.
  */
 static char **
-get_pkey_attnames(Relation rel, int16 *numatts)
+get_pkey_attnames(Relation rel, int16 *indnkeyatts)
 {
 	Relation	indexRelation;
 	ScanKeyData skey;
@@ -2034,8 +2034,8 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 	char	  **result = NULL;
 	TupleDesc	tupdesc;
 
-	/* initialize numatts to 0 in case no primary key exists */
-	*numatts = 0;
+	/* initialize indnkeyatts to 0 in case no primary key exists */
+	*indnkeyatts = 0;
 
 	tupdesc = rel->rd_att;
 
@@ -2056,12 +2056,12 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 		/* we're only interested if it is the primary key */
 		if (index->indisprimary)
 		{
-			*numatts = index->indnatts;
-			if (*numatts > 0)
+			*indnkeyatts = index->indnkeyatts;
+			if (*indnkeyatts > 0)
 			{
-result = (char **) palloc(*numatts * sizeof(char *));
+result = (char **) palloc(*indnkeyatts * sizeof(char *));
 
-for (i = 0; i < *numatts; i++)
+for (i = 0; i < *indnkeyatts; i++)
 	result[i] = SPI_fname(tupdesc, index->indkey.values[i]);
 			}
 			break;
diff --git a/contrib/tcn/tcn.c b/contrib/tcn/tcn.c
index 7352b29..c024cf3 100644
--- a/contrib/tcn/tcn.c
+++ b/contrib/tcn/tcn.c
@@ -138,9 +138,9 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 		/* we're only interested if it is the primary key and valid */
 		if (index->indisprimary && IndexIsValid(index))
 		{
-			int			numatts = index->indnatts;
+			int indnkeyatts = index->indnkeyatts;
 
-			if (numatts > 0)
+			if (indnkeyatts > 0)
 			{
 int			i;
 
@@ -150,7 +150,7 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 appendStringInfoCharMacro(payload, ',');
 appendStringInfoCharMacro(payload, operation);
 
-for (i = 0; i < numatts; i++)
+for (i = 0; i < indnkeyatts; i++)
 {
 	int			colno = index->indkey.values[i];
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 322d8d6..0983c93 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3564,6 +3564,14 @@
   pg_class.relnatts)
  
 
+  
+  indnkeyatts
+  int2
+  
+  The number of key columns in the index. "Key columns" are ordinary
+  index columns in contrast with "included" columns.
+ 
+
  
   indisunique
   bool
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 40f201b..92bef4a 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -110,6 +110,8 @@ typedef struct IndexAmRoutine
 boolamclusterable;
 /* does AM handle predicate locks? */
 boolampredlocks;
+/* does AM support columns included with clause INCLUDING? */
+boolamcaninclude;
 /* type of data stored in index, or 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-09-06 Thread Anastasia Lubennikova

28.08.2016 09:13, Amit Kapila:

On Mon, Aug 15, 2016 at 8:15 PM, Anastasia Lubennikova
  wrote:
@@ -590,7 +622,14 @@ _bt_buildadd(BTWriteState *wstate, BTPageState
*state, IndexTuple itup)
   if (last_off == P_HIKEY)
   {
   Assert(state->btps_minkey == NULL);
- state->btps_minkey = CopyIndexTuple(itup);
+ /*
+ * Truncate the tuple that we're going to insert
+ * into the parent page as a downlink
+ */

+ if (indnkeyatts != indnatts && P_ISLEAF(pageop))
+ state->btps_minkey = index_truncate_tuple(wstate->index, itup);
+ else
+ state->btps_minkey = CopyIndexTuple(itup);

It seems that above code always ensure that for leaf pages, high key
is a truncated tuple.  What is less clear is if that is true, why you
need to re-ensure it again for the old  page in below code:


Thank you for the question. Investigation took a long time)

As far as I understand, the code above only applies to
the first tuple of each level. While the code you have quoted below
truncates high keys for all other pages.

There is a comment that clarifies the situation:
/*
 * If the new item is the first for its page, stash a copy for later. Note
 * this will only happen for the first item on a level; on later pages,
 * the first item for a page is copied from the prior page in the code
 * above.
 */


So the patch is correct.
We can go further and remove this index_truncate_tuple() call, because
the first tuple of any inner (or root) page doesn't need a key at all.
It simply points to the leftmost page of the level below.
But it's not a bug, because truncation of one tuple per level doesn't
add any considerable overhead. So I want to leave the patch in its 
current state.



@@ -510,6 +513,8 @@ _bt_buildadd(BTWriteState *wstate, BTPageState
*state, IndexTuple itup)
{
..
+ if (indnkeyatts != indnatts && P_ISLEAF(opageop))
+ {
+ /*
+ * It's essential to truncate High key here.
+ * The purpose is not just to save more space on this particular page,
+ * but to keep whole b-tree structure consistent. Subsequent insertions
+ * assume that hikey is already truncated, and so they should not
+ * worry about it, when copying the high key into the parent page
+ * as a downlink.
+ * NOTE: It is not crucial for reliability at present,
+ * but maybe it will be in the future.
+ */
+ keytup = index_truncate_tuple(wstate->index, oitup);
+
+ /*  delete "wrong" high key, insert keytup as P_HIKEY. */
+ PageIndexTupleDelete(opage, P_HIKEY);
+
+ if (!_bt_pgaddtup(opage, IndexTupleSize(keytup), keytup, P_HIKEY))
+ elog(ERROR, "failed to rewrite compressed item in index \"%s\"",
+ RelationGetRelationName(wstate->index));
+ }
+
..
..



--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] WIP: Covering + unique indexes.

2016-08-28 Thread Amit Kapila
On Mon, Aug 15, 2016 at 8:15 PM, Anastasia Lubennikova
 wrote:
@@ -590,7 +622,14 @@ _bt_buildadd(BTWriteState *wstate, BTPageState
*state, IndexTuple itup)
  if (last_off == P_HIKEY)
  {
  Assert(state->btps_minkey == NULL);
- state->btps_minkey = CopyIndexTuple(itup);
+ /*
+ * Truncate the tuple that we're going to insert
+ * into the parent page as a downlink
+ */

+ if (indnkeyatts != indnatts && P_ISLEAF(pageop))
+ state->btps_minkey = index_truncate_tuple(wstate->index, itup);
+ else
+ state->btps_minkey = CopyIndexTuple(itup);

It seems that above code always ensure that for leaf pages, high key
is a truncated tuple.  What is less clear is if that is true, why you
need to re-ensure it again for the old  page in below code:

@@ -510,6 +513,8 @@ _bt_buildadd(BTWriteState *wstate, BTPageState
*state, IndexTuple itup)
{
..
+ if (indnkeyatts != indnatts && P_ISLEAF(opageop))
+ {
+ /*
+ * It's essential to truncate High key here.
+ * The purpose is not just to save more space on this particular page,
+ * but to keep whole b-tree structure consistent. Subsequent insertions
+ * assume that hikey is already truncated, and so they should not
+ * worry about it, when copying the high key into the parent page
+ * as a downlink.
+ * NOTE: It is not crucial for reliability at present,
+ * but maybe it will be in the future.
+ */
+ keytup = index_truncate_tuple(wstate->index, oitup);
+
+ /*  delete "wrong" high key, insert keytup as P_HIKEY. */
+ PageIndexTupleDelete(opage, P_HIKEY);
+
+ if (!_bt_pgaddtup(opage, IndexTupleSize(keytup), keytup, P_HIKEY))
+ elog(ERROR, "failed to rewrite compressed item in index \"%s\"",
+ RelationGetRelationName(wstate->index));
+ }
+
..
..

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] WIP: Covering + unique indexes.

2016-08-17 Thread Andrew Borodin
> That was a bug caused by high key truncation, which occurs when the index
> has more than 3 levels. Fixed.
Affirmative. I've tested index construction with 100M rows and
subsequent execution of select queries using the index; it works fine.

Best regards, Andrey Borodin, Octonica & Ural Federal University.




Re: [HACKERS] WIP: Covering + unique indexes.

2016-08-15 Thread Anastasia Lubennikova

14.08.2016 20:11, Andrey Borodin:

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, failed
Spec compliant:   tested, passed
Documentation:tested, passed

Hi hackers!

I've read the patch and here is my code review.

==PURPOSE
I've used this feature from time to time with MS SQL. From my experience,
INCLUDE is a 'sugar on top' feature.
Some MS SQL classes do not even mention INCLUDE despite its being there since
2005 (though classes do not mention lots of important things, so that's not a
particularly valuable indicator).
But those who use it, use it whenever possible. For example, the system view
with recommended indices rarely lists one without INCLUDE columns.
So, this feature is very important from the perspective of converting MS SQL
DBAs to PostgreSQL. This is how I see it.


Thank you for the review, I hope this feature will be useful for many 
people.



SUGGESTIONS==
0. Index build is broken. This script
https://github.com/x4m/pggistopt/blob/8ad65d2e305e98c836388a07909af5983dba9c73/test.sql
SEGFAULTs and may cause a situation where you cannot insert anything into the
table (I think dropping the index would help, but I didn't test this)


Thank you for reporting. That was a bug caused by high key truncation,
which occurs when the index has more than 3 levels.
Fixed. See attached file.


1. I think the MS SQL syntax INCLUDE instead of INCLUDING would be better (for
the purpose listed above)


I've chosen this particular name to avoid adding a new keyword. We already
have INCLUDING in Postgres in the context of inheritance, which will never
intersect with covering indexes.

I'm sure it won't be a big problem for migration from MS SQL.


2. An empty line was added in ruleutils.c. Is it there for a reason?


No, just a missed line.
Fixed.


3. Now we have both indnatts and indnkeyatts instead of just indnatts. I think
it is worth considering renaming indnatts to something different from the old
name. Someone somewhere could still suppose it's the number of keys.


I agree that the naming became vague after this patch.
I've already suggested replacing "indkeys[]" with a more specific name, and
AFAIR there was no reaction. So I didn't do that.
But I'm not sure about your suggestion regarding indnatts. Old queries
(and old indexes) can still use it correctly. I don't see a reason to break
compatibility for all users.
Those who will use this new feature should ensure that their queries against
pg_index behave as expected.


PERFORMANCE==
Due to suggestion number 0 I could not measure the performance of index build.
The index crashes when there are more than 1.1 million rows in a table.
Performance test script is here 
https://github.com/x4m/pggistopt/blob/f206b4395baa15a2fa42897eeb27bd555619119a/test.sql
The test scenario is as follows:
1. Create table, then create index, then add data.
2. Make a query touching data in INCLUDING columns.
This scenario is tested against a table with:
A. An index that does not contain the touched columns, just the PK.
B. An index with all columns in the index.
C. An index with the PK in keys and INCLUDING all other columns.

Tests were executed 5 times on an Ubuntu VM under Hyper-V (i5 2500 CPU, 16 GB
of RAM, SSD disk).
Time to insert 10M rows:
A. AVG 110 seconds STD 4.8
B. AVG 121 seconds STD 2.0
C. AVG 111 seconds STD 5.7
Inserts to an INCLUDING index are almost as fast as inserts to an index
without extra columns.

Time to run SELECT query:
A. AVG 2864 ms STD 794
B. AVG 2329 ms STD 84
C. AVG 2293 ms STD 58
Selects with INCLUDING columns are almost as fast as with the full index.

Index size (deterministic measure, STD = 0)
A. 317 MB
B. 509 MB
C. 399 MB
Index size is in the middle between the full index and the minimal index.

I think these numbers agree with expectations for the feature.

CONCLUSION==
This patch brings a useful and important feature. The build must be repaired;
my other remarks are only suggestions.



Best regards, Andrey Borodin, Octonica & Ural Federal University.

The new status of this patch is: Waiting on Author



--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 9c8e308..891325d 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -100,7 +100,7 @@ static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
 static void deleteConnection(const char *name);
-static char **get_pkey_attnames(Relation rel, int16 *numatts);
+static char **get_pkey_attnames(Relation rel, int16 *indnkeyatts);
 static char **get_text_array_contents(ArrayType *array, int *numitems);
 static char *get_sql_insert(Relation rel, int *pkattnums, int pknumatts, char **src_pkattvals, char **tgt_pkattvals);
 static char *get_sql_delete(Relation rel, int *pkattnums, int pknumatts, 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-08-14 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, failed
Spec compliant:   tested, passed
Documentation:tested, passed

Hi hackers!

I've read the patch and here is my code review.

==PURPOSE
I've used this feature from time to time with MS SQL. From my experience,
INCLUDE is a 'sugar on top' feature.
Some MS SQL classes do not even mention INCLUDE despite its being there since
2005 (though classes do not mention lots of important things, so that's not a
particularly valuable indicator).
But those who use it, use it whenever possible. For example, the system view
with recommended indices rarely lists one without INCLUDE columns.
So, this feature is very important from the perspective of converting MS SQL
DBAs to PostgreSQL. This is how I see it.

SUGGESTIONS==
0. Index build is broken. This script
https://github.com/x4m/pggistopt/blob/8ad65d2e305e98c836388a07909af5983dba9c73/test.sql
SEGFAULTs and may cause a situation where you cannot insert anything into the
table (I think dropping the index would help, but I didn't test this)
1. I think the MS SQL syntax INCLUDE instead of INCLUDING would be better (for
the purpose listed above)
2. An empty line was added in ruleutils.c. Is it there for a reason?
3. Now we have both indnatts and indnkeyatts instead of just indnatts. I think
it is worth considering renaming indnatts to something different from the old
name. Someone somewhere could still suppose it's the number of keys.

PERFORMANCE==
Due to suggestion number 0 I could not measure the performance of index build.
The index crashes when there are more than 1.1 million rows in a table.
Performance test script is here 
https://github.com/x4m/pggistopt/blob/f206b4395baa15a2fa42897eeb27bd555619119a/test.sql
The test scenario is as follows:
1. Create table, then create index, then add data.
2. Make a query touching data in INCLUDING columns.
This scenario is tested against a table with:
A. An index that does not contain the touched columns, just the PK.
B. An index with all columns in the index.
C. An index with the PK in keys and INCLUDING all other columns.

Tests were executed 5 times on an Ubuntu VM under Hyper-V (i5 2500 CPU, 16 GB
of RAM, SSD disk).
Time to insert 10M rows:
A. AVG 110 seconds STD 4.8
B. AVG 121 seconds STD 2.0
C. AVG 111 seconds STD 5.7
Inserts to an INCLUDING index are almost as fast as inserts to an index
without extra columns.

Time to run SELECT query:
A. AVG 2864 ms STD 794
B. AVG 2329 ms STD 84
C. AVG 2293 ms STD 58
Selects with INCLUDING columns are almost as fast as with the full index.

Index size (deterministic measure, STD = 0)
A. 317 MB
B. 509 MB
C. 399 MB
Index size is in the middle between the full index and the minimal index.

I think these numbers agree with expectations for the feature.

CONCLUSION==
This patch brings a useful and important feature. The build must be repaired;
my other remarks are only suggestions.



Best regards, Andrey Borodin, Octonica & Ural Federal University.

The new status of this patch is: Waiting on Author



Re: [HACKERS] WIP: Covering + unique indexes.

2016-05-02 Thread Robert Haas
On Wed, Apr 27, 2016 at 5:47 PM, David Steele  wrote:
> On 4/27/16 5:08 PM, Peter Geoghegan wrote:
>> So, in case it needs to be
>> said, I'll say it: You've chosen a very ambitious set of projects to
>> work on, by any standard. I think it's a good thing that you've been
>> ambitious, and I don't suggest changing that, since I think that you
>> have commensurate skill. But, in order to be successful in these
>> projects, patience and resolve are very important.
>
> +1.
>
> This is very exciting work and I look forward to seeing it continue.
> The patch was perhaps not a good fit for the last CF of 9.6 but that
> doesn't mean it can't have a bright future.

+1.  Totally agreed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-27 Thread David Steele
On 4/27/16 5:08 PM, Peter Geoghegan wrote:

> So, in case it needs to be
> said, I'll say it: You've chosen a very ambitious set of projects to
> work on, by any standard. I think it's a good thing that you've been
> ambitious, and I don't suggest changing that, since I think that you
> have commensurate skill. But, in order to be successful in these
> projects, patience and resolve are very important.

+1.

This is very exciting work and I look forward to seeing it continue.
The patch was perhaps not a good fit for the last CF of 9.6 but that
doesn't mean it can't have a bright future.

-- 
-David
da...@pgmasters.net




Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-27 Thread Peter Geoghegan
On Tue, Apr 12, 2016 at 9:14 AM, Anastasia Lubennikova
 wrote:
> Sooner or later, I'd like to see this patch finished.

Me, too.

> For now, it has two complaints:
> - support of expressions as included columns.
> Frankly, I don't understand why it's a problem for this patch.
> The patch is already big enough, and it will be much easier to add
> expression support in a follow-up patch, after the first one is
> stable.
> I wonder if someone has objections to that?

Probably. If we limit the scope of something, it's always in a way
that limits the functionality available to users, rather than limits
how generalized the new functionality is, and so cutting scope
sometimes isn't possible. There is a very high value placed on
features working well together. A user ought to be able to rely on the
intuition that features work well together. Preserving that general
ability for users to guess correctly what will work based on what they
already know is seen as important.

For example, notice that the INSERT documentation allows UPSERT unique
index inference to optionally accept an opclass or collation. So far,
the need for this functionality is totally theoretical (in practice
all B-Tree opclasses have the same idea about equality across a given
type, and we have no case insensitive collations), but it's still
there. Making that work was not a small effort (there was a follow-up
bugfix commit just for that, too). This approach is mostly about
making the implementation theoretically sound (or demonstrating that
it is) by considering edge-cases up-front. Often, there will be
benefits to a maximally generalized approach that were not initially
anticipated by the patch author, or anyone else.

I agree that it is difficult to uphold this standard at all times, but
there is something to be said for it. Postgres development must have a
very long term outlook, and this approach tends to make things easier
for future patch authors by making the code more maintainable. Even if
this is the wrong thing in specific cases, it's sometimes easier to
just do it than to convince others that their concern is misplaced in
this one instance.

> Yes, it's a kind of delayed feature. But should we wait on every patch
> until it is entirely complete?

I think that knowing where and how to cut scope is an important skill.
If this question is asked as a general question, then the answer must
be "yes". I suggest asking a more specific question. :-)

> - lack of review and testing
> Obviously, I did as much testing as I could.
> So, if reviewers have any concerns about the patch, I'm looking forward to
> seeing them.

For what it's worth, I agree that you put a great deal of effort into
this patch, and it did not get in to 9.6 because of a collective
failure to focus minds on the patch. Your patch was a credible
attempt, which is impressive when you consider that the B-Tree code is
so complicated. There is also the fact that there is now a very small
list of credible reviewers for B-Tree patches; you must have noticed
that not even amcheck was committed, even though I was asked to
produce a polished version in February during the FOSDEM dev meeting,
and even though it's just a contrib module that is totally orientated
around finding bugs and so on. I'm not happy about that either, but
that's just something I have to swallow.

I fancy myself as an expert on the B-Tree code, but I've never managed
to make an impact in improving its performance at all (I've never made
a serious effort, but have had many ideas). So, in case it needs to be
said, I'll say it: You've chosen a very ambitious set of projects to
work on, by any standard. I think it's a good thing that you've been
ambitious, and I don't suggest changing that, since I think that you
have commensurate skill. But, in order to be successful in these
projects, patience and resolve are very important.

-- 
Peter Geoghegan




Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-12 Thread Anastasia Lubennikova
The attached version has the pg_dump fix suggested by Stephen Frost
in the -committers thread:

http://postgresql.nabble.com/pgsql-CREATE-INDEX-INCLUDING-column-td5897653.html
Sooner or later, I'd like to see this patch finished.

For now, it has two complaints:
- support of expressions as included columns.
Frankly, I don't understand why it's a problem for this patch.
The patch is already big enough, and it will be much easier to add
expression support in a follow-up patch, after the first one is
stable.

I wonder if someone has objections to that?
Yes, it's a kind of delayed feature. But should we wait on every patch
until it is entirely complete?


- lack of review and testing
Obviously, I did as much testing as I could.
So, if reviewers have any concerns about the patch, I'm looking forward
to seeing them.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 9c8e308..891325d 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -100,7 +100,7 @@ static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
 static void deleteConnection(const char *name);
-static char **get_pkey_attnames(Relation rel, int16 *numatts);
+static char **get_pkey_attnames(Relation rel, int16 *indnkeyatts);
 static char **get_text_array_contents(ArrayType *array, int *numitems);
 static char *get_sql_insert(Relation rel, int *pkattnums, int pknumatts, char **src_pkattvals, char **tgt_pkattvals);
 static char *get_sql_delete(Relation rel, int *pkattnums, int pknumatts, char **tgt_pkattvals);
@@ -1485,7 +1485,7 @@ PG_FUNCTION_INFO_V1(dblink_get_pkey);
 Datum
 dblink_get_pkey(PG_FUNCTION_ARGS)
 {
-	int16		numatts;
+	int16		indnkeyatts;
 	char	  **results;
 	FuncCallContext *funcctx;
 	int32		call_cntr;
@@ -1511,7 +1511,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		rel = get_rel_from_relname(PG_GETARG_TEXT_P(0), AccessShareLock, ACL_SELECT);
 
 		/* get the array of attnums */
-		results = get_pkey_attnames(rel, &numatts);
+		results = get_pkey_attnames(rel, &indnkeyatts);
 
 		relation_close(rel, AccessShareLock);
 
@@ -1531,9 +1531,9 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
-		if ((results != NULL) && (numatts > 0))
+		if ((results != NULL) && (indnkeyatts > 0))
 		{
-			funcctx->max_calls = numatts;
+			funcctx->max_calls = indnkeyatts;
 
 			/* got results, keep track of them */
 			funcctx->user_fctx = results;
@@ -2023,10 +2023,10 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
  * get_pkey_attnames
  *
  * Get the primary key attnames for the given relation.
- * Return NULL, and set numatts = 0, if no primary key exists.
+ * Return NULL, and set indnkeyatts = 0, if no primary key exists.
  */
 static char **
-get_pkey_attnames(Relation rel, int16 *numatts)
+get_pkey_attnames(Relation rel, int16 *indnkeyatts)
 {
 	Relation	indexRelation;
 	ScanKeyData skey;
@@ -2036,8 +2036,8 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 	char	  **result = NULL;
 	TupleDesc	tupdesc;
 
-	/* initialize numatts to 0 in case no primary key exists */
-	*numatts = 0;
+	/* initialize indnkeyatts to 0 in case no primary key exists */
+	*indnkeyatts = 0;
 
 	tupdesc = rel->rd_att;
 
@@ -2058,12 +2058,12 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 		/* we're only interested if it is the primary key */
 		if (index->indisprimary)
 		{
-			*numatts = index->indnatts;
-			if (*numatts > 0)
+			*indnkeyatts = index->indnkeyatts;
+			if (*indnkeyatts > 0)
 			{
-				result = (char **) palloc(*numatts * sizeof(char *));
+				result = (char **) palloc(*indnkeyatts * sizeof(char *));
 
-				for (i = 0; i < *numatts; i++)
+				for (i = 0; i < *indnkeyatts; i++)
 					result[i] = SPI_fname(tupdesc, index->indkey.values[i]);
 			}
 			break;
diff --git a/contrib/tcn/tcn.c b/contrib/tcn/tcn.c
index 7352b29..142730a 100644
--- a/contrib/tcn/tcn.c
+++ b/contrib/tcn/tcn.c
@@ -138,9 +138,9 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 		/* we're only interested if it is the primary key and valid */
 		if (index->indisprimary && IndexIsValid(index))
 		{
-			int			numatts = index->indnatts;
+			int			indnkeyatts = index->indnkeyatts;
 
-			if (numatts > 0)
+			if (indnkeyatts > 0)
 			{
 int			i;
 
@@ -150,7 +150,7 @@ triggered_change_notification(PG_FUNCTION_ARGS)
 appendStringInfoCharMacro(payload, ',');
 appendStringInfoCharMacro(payload, operation);
 
-				for (i = 0; i < numatts; i++)
+				for (i = 0; i < indnkeyatts; i++)
 				{
 					int			colno = index->indkey.values[i];
 
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index d6b60db..342d5ec 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3557,6 +3557,14 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-08 Thread Anastasia Lubennikova

08.04.2016 15:45, Anastasia Lubennikova:

08.04.2016 15:06, Teodor Sigaev:

On Wed, Apr 6, 2016 at 1:50 PM, Peter Geoghegan  wrote:

Personally, I like documenting assertions, and will sometimes write
assertions that the compiler could easily optimize away. Maybe going
*that* far is more a matter of personal style, but I think an
assertion about the new index tuple size being <= the old one is just
a good idea. It's not about a problem in your code at all.


You should make index_truncate_tuple()/index_reform_tuple() promise to
always do this in its comments/contract with caller as part of this,
IMV.


Some notices:
- index_truncate_tuple(Relation idxrel, IndexTuple olditup, int indnatts,
                       int indnkeyatts)
  Why do we need indnatts/indnkeyatts? They are already present in the
  idxrel struct.
- Following the code where index_truncate_tuple() is called, it should
  never be called in the case where indnatts == indnkeyatts. So
  indnkeyatts should be strictly less than indnatts; please change the
  assertion. If they are equal, this function becomes a complicated
  variant of CopyIndexTuple().


Good point. These attributes seem to have been carried over from
previous versions of the function.

But now they are definitely unnecessary. An updated patch is attached.


One more improvement: a note about expressions in the documentation.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index b5f67af..61a21a9 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -161,6 +161,8 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] INCLUDING clause, which can slightly reduce the size of the index,
 due to storing included attributes only in leaf index pages.
 Currently, only the B-tree access method supports this feature.
+Expressions as included columns are not supported, since they cannot be used
+in an index-only scan.

   
  



Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-08 Thread Anastasia Lubennikova

08.04.2016 15:06, Teodor Sigaev:

On Wed, Apr 6, 2016 at 1:50 PM, Peter Geoghegan  wrote:

Personally, I like documenting assertions, and will sometimes write
assertions that the compiler could easily optimize away. Maybe going
*that* far is more a matter of personal style, but I think an
assertion about the new index tuple size being <= the old one is just
a good idea. It's not about a problem in your code at all.


You should make index_truncate_tuple()/index_reform_tuple() promise to
always do this in its comments/contract with caller as part of this,
IMV.


Some notices:
- index_truncate_tuple(Relation idxrel, IndexTuple olditup, int indnatts,
                       int indnkeyatts)
  Why do we need indnatts/indnkeyatts? They are already present in the
  idxrel struct.
- Following the code where index_truncate_tuple() is called, it should
  never be called in the case where indnatts == indnkeyatts. So
  indnkeyatts should be strictly less than indnatts; please change the
  assertion. If they are equal, this function becomes a complicated
  variant of CopyIndexTuple().


Good point. These attributes seem to have been carried over from
previous versions of the function.

But now they are definitely unnecessary. An updated patch is attached.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-08 Thread Teodor Sigaev

On Wed, Apr 6, 2016 at 1:50 PM, Peter Geoghegan  wrote:

Personally, I like documenting assertions, and will sometimes write
assertions that the compiler could easily optimize away. Maybe going
*that* far is more a matter of personal style, but I think an
assertion about the new index tuple size being <= the old one is just
a good idea. It's not about a problem in your code at all.


You should make index_truncate_tuple()/index_reform_tuple() promise to
always do this in its comments/contract with caller as part of this,
IMV.


Some notices:
- index_truncate_tuple(Relation idxrel, IndexTuple olditup, int indnatts,
                       int indnkeyatts)
  Why do we need indnatts/indnkeyatts? They are already present in the
  idxrel struct.
- Following the code where index_truncate_tuple() is called, it should
  never be called in the case where indnatts == indnkeyatts. So
  indnkeyatts should be strictly less than indnatts; please change the
  assertion. If they are equal, this function becomes a complicated
  variant of CopyIndexTuple().
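For illustration, the contract under discussion can be sketched as a small standalone C model. Everything here is hypothetical (`MockIndexTuple`, `mock_truncate_tuple` are invented names, not PostgreSQL code): the caller only invokes truncation when indnkeyatts is strictly less than the attribute count, and the result must never be larger than the input — the documenting assertion Peter asked for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an index tuple (not PostgreSQL's
 * IndexTuple): fixed-width attributes plus an explicit byte size. */
typedef struct
{
    int     natts;      /* number of stored attributes */
    size_t  size;       /* total size in bytes */
    int     atts[8];    /* attribute values */
} MockIndexTuple;

/*
 * Sketch of the truncation contract: indnkeyatts must be strictly less
 * than the stored attribute count (otherwise this would just be a slower
 * CopyIndexTuple()), and the truncated tuple can never grow.
 */
static MockIndexTuple
mock_truncate_tuple(const MockIndexTuple *olditup, int indnkeyatts)
{
    MockIndexTuple newitup;

    assert(indnkeyatts > 0 && indnkeyatts < olditup->natts);

    newitup.natts = indnkeyatts;
    newitup.size = (size_t) indnkeyatts * sizeof(int);
    memcpy(newitup.atts, olditup->atts, newitup.size);

    /* documenting assertion: result is never larger than the input */
    assert(newitup.size <= olditup->size);
    return newitup;
}
```

In the real function the attribute counts would come from the Relation itself, which is the first point above.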
--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-07 Thread Anastasia Lubennikova

06.04.2016 23:52, Peter Geoghegan:

On Wed, Apr 6, 2016 at 1:50 PM, Peter Geoghegan  wrote:

Personally, I like documenting assertions, and will sometimes write
assertions that the compiler could easily optimize away. Maybe going
*that* far is more a matter of personal style, but I think an
assertion about the new index tuple size being <= the old one is just
a good idea. It's not about a problem in your code at all.

You should make index_truncate_tuple()/index_reform_tuple() promise to
always do this in its comments/contract with caller as part of this,
IMV.



The issues mentioned are fixed. A patch is attached.

I'd like to remind you that the commitfest will be closed very soon,
so I'd like to get your final decision about the patch.

Not having it in the 9.6 release would be very disappointing.

I agree that b-tree is a crucial subsystem. But it seems to me that we
lack improvements in this area not only because of the algorithm's
complexity, but also because of a lack of enthusiasts willing to work
on it and struggle through endless discussions.
But that's off-topic here. Attention to these development difficulties
will be one of the messages of my PGCon talk.


You know, we lost a lot of time discussing various b-tree problems.
Besides that, I am sure that the patch is really in good shape. It has
no open problems left to fix.

And any remaining subtle bugs can be found during the testing stage of
the release.

Looking forward to your reply.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-06 Thread Peter Geoghegan
On Wed, Apr 6, 2016 at 1:50 PM, Peter Geoghegan  wrote:
> Personally, I like documenting assertions, and will sometimes write
> assertions that the compiler could easily optimize away. Maybe going
> *that* far is more a matter of personal style, but I think an
> assertion about the new index tuple size being <= the old one is just
> a good idea. It's not about a problem in your code at all.

You should make index_truncate_tuple()/index_reform_tuple() promise to
always do this in its comments/contract with caller as part of this,
IMV.

-- 
Peter Geoghegan




Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-06 Thread Peter Geoghegan
On Wed, Apr 6, 2016 at 6:15 AM, Anastasia Lubennikova
 wrote:
>> * I would like to see index_reform_tuple() assert that the new,
>> truncated index tuple is definitely <= the original (I worry about the
>> 1/3 page restriction issue). Maybe you should also change the name of
>> index_reform_tuple(), per David.
>
> Is it possible that the new tuple, containing fewer attributes than the old
> one, will have a greater size?
> Maybe you can give an example?
> I think that Assert(indnkeyatts <= indnatts); covers this kind of error.

I don't think it is possible, because you aren't e.g. making an
attribute's value NULL where it wasn't NULL before (making the
IndexTuple contain a NULL bitmap where it didn't before). But that's
kind of subtle, and it certainly seems worth an assertion. It could
change tomorrow, when someone optimizes heap_deform_tuple(), which has
been proposed more than once.
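The size reasoning can be made concrete with a toy size model. The constants below are assumptions for illustration only; PostgreSQL's real formula involves alignment, varlena headers, and so on. Dropping attributes without introducing NULLs can only shrink the tuple, while a NULL bitmap appearing where there was none is the one way a tuple with fewer attributes could grow — which is exactly why the assertion is worth having.

```c
#include <assert.h>

/* Hypothetical size model, loosely inspired by index-tuple layout.
 * MOCK_HEADER and MOCK_ATT_WIDTH are invented constants. */
enum { MOCK_HEADER = 8, MOCK_ATT_WIDTH = 4 };

static unsigned
mock_tuple_size(unsigned natts, int has_nulls)
{
    unsigned size = MOCK_HEADER + natts * MOCK_ATT_WIDTH;

    if (has_nulls)
        size += (natts + 7) / 8;    /* one bitmap bit per attribute */
    return size;
}
```

Under this model, truncating from 4 to 2 non-NULL attributes shrinks the tuple, but the same 2-attribute tuple with a bitmap is larger than one without.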

Personally, I like documenting assertions, and will sometimes write
assertions that the compiler could easily optimize away. Maybe going
*that* far is more a matter of personal style, but I think an
assertion about the new index tuple size being <= the old one is just
a good idea. It's not about a problem in your code at all.

> I do not mind renaming this function, but what name would be better?
> index_truncate_tuple()?

That seems better, yes.

-- 
Peter Geoghegan




Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-06 Thread Anastasia Lubennikova

06.04.2016 16:15, Anastasia Lubennikova:

06.04.2016 03:05, Peter Geoghegan:

* There is some stray whitespace within RelationGetIndexAttrBitmap().
I think you should have updated it with code, though. I don't think
it's necessary for HOT updates to work, but I think it could be
necessary so that we don't need to get a row lock that blocks
non-conflict foreign key locking (see heap_update() callers). I think
you need to be careful for non-key columns within the loop in
RelationGetIndexAttrBitmap(), basically, because it seems to still go
through all columns. UPSERT also must call this code, FWIW.

* I think that a similar omission is also made for the replica
identity stuff in RelationGetIndexAttrBitmap(). Some thought is needed
on how this patch interacts with logical decoding, I guess.


Good point. Indexes are everywhere in the code.
I missed that RelationGetIndexAttrBitmap() is used not only for REINDEX.
I'll discuss it with Teodor and send an updated patch tomorrow.


As promised, the updated patch is attached.
But I'm not an expert in this area, so it needs a critical look.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-06 Thread Anastasia Lubennikova

06.04.2016 03:05, Peter Geoghegan:

On Tue, Apr 5, 2016 at 1:31 PM, Peter Geoghegan  wrote:

My new understanding: The extra "included" columns are stored in the
index, but do not affect its sort order at all. They are no more part
of the key than, say, the heap TID that the key points to. They are
just "payload".


It was a really long and complicated discussion. I'm glad that we are
finally in agreement about the patch.
Anyway, I think all the questions raised will be very helpful for
future work on b-tree.
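The agreed-upon semantics — included columns are stored in the index but never participate in comparisons — can be sketched with a toy comparator. The types and names here (`CoveringEntry`, `covering_cmp`) are invented for illustration and are not PostgreSQL code:

```c
#include <assert.h>

/* Hypothetical model of a covering-index entry: key columns define the
 * ordering, "included" columns are stored payload that is never compared. */
typedef struct
{
    int key[2];       /* indexed (ordering) attributes */
    int included[2];  /* INCLUDING attributes: stored, never compared */
} CoveringEntry;

/* Comparison uses only the key attributes, mirroring the rule that every
 * btree search (including choosing an insertion page) ignores included
 * columns. */
static int
covering_cmp(const CoveringEntry *a, const CoveringEntry *b)
{
    for (int i = 0; i < 2; i++)
    {
        if (a->key[i] != b->key[i])
            return (a->key[i] < b->key[i]) ? -1 : 1;
    }
    return 0;   /* equal keys: payload may differ arbitrarily */
}
```

Two entries with equal keys compare as equal even when their payload differs, which is what makes the extra columns "no more part of the key than the heap TID."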



Noticed a few issues following another pass:

* tuplesort.c should handle the CLUSTER case in the same way as the
btree case. No?

Yes, I just missed that CLUSTER uses the index sort path. Fixed.


* Why have a RelationGetNumberOfAttributes(indexRel) call in
tuplesort_begin_index_btree() at all now?

Fixed.

* This critical section is unnecessary, because this happens during
index builds:

+   if (indnkeyatts != indnatts && P_ISLEAF(opageop))
+   {
+   /*
+* It's essential to truncate High key here.
+* The purpose is not just to save more space
on this particular page,
+* but to keep whole b-tree structure
consistent. Subsequent insertions
+* assume that hikey is already truncated, and
so they should not
+* worry about it, when copying the high key
into the parent page
+* as a downlink.
+* NOTE It is not crucial for reliability at present,
+* but maybe it will be that in the future.
+* NOTE this code will be changed by the
"btree compression" patch,
+* which is in progress now.
+*/
+   keytup = index_reform_tuple(wstate->index, oitup,
+
  indnatts, indnkeyatts);
+
+   /*  delete "wrong" high key, insert keytup as
P_HIKEY. */
+   START_CRIT_SECTION();
+   PageIndexTupleDelete(opage, P_HIKEY);
+
+   if (!_bt_pgaddtup(opage,
IndexTupleSize(keytup), keytup, P_HIKEY))
+   elog(ERROR, "failed to rewrite
compressed item in index \"%s\"",
+   RelationGetRelationName(wstate->index));
+   END_CRIT_SECTION();
+   }

Note that START_CRIT_SECTION() promotes any ERROR to PANIC, which
isn't useful here, because we have no buffer lock held, and nothing
must be WAL-logged.

* Think you forgot to update spghandler(). (You did not add a test for
just that one AM, either)

Fixed.

* I wonder why this restriction needs to exist:

+   else
+   elog(ERROR, "Expressions are not supported in
included columns.");

What does not supporting it buy us? Was it just that the pg_index
representation is more complicated, and you wanted to put it off?

An error like this should use ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED ..., btw.
Yes, you got it right. It was a bit complicated to implement, and I
decided to defer it to the next patch.

The errmsg is fixed.


* I would like to see index_reform_tuple() assert that the new,
truncated index tuple is definitely <= the original (I worry about the
1/3 page restriction issue). Maybe you should also change the name of
index_reform_tuple(), per David.
Is it possible that the new tuple, containing fewer attributes than the
old one, will have a greater size?

Maybe you can give an example?
I think that Assert(indnkeyatts <= indnatts); covers this kind of error.
I do not mind renaming this function, but what name would be better?
index_truncate_tuple()?


* There is some stray whitespace within RelationGetIndexAttrBitmap().
I think you should have updated it with code, though. I don't think
it's necessary for HOT updates to work, but I think it could be
necessary so that we don't need to get a row lock that blocks
non-conflict foreign key locking (see heap_update() callers). I think
you need to be careful for non-key columns within the loop in
RelationGetIndexAttrBitmap(), basically, because it seems to still go
through all columns. UPSERT also must call this code, FWIW.

* I think that a similar omission is also made for the replica
identity stuff in RelationGetIndexAttrBitmap(). Some thought is needed
on how this patch interacts with logical decoding, I guess.


Good point. Indexes are everywhere in the code.
I missed that RelationGetIndexAttrBitmap() is used not only for REINDEX.
I'll discuss it with Teodor and send an updated patch tomorrow.


* Valgrind shows an error with an aggregate statement I tried:

2016-04-05 17:01:31.129 PDT 12310 LOG:  statement: explain analyze
select count(*) from ab  where b > 5 group by a, b;
==12310== Invalid read of size 4
==12310==at 0x656615: match_clause_to_indexcol 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-05 Thread Peter Geoghegan
On Tue, Apr 5, 2016 at 1:31 PM, Peter Geoghegan  wrote:
> My new understanding: The extra "included" columns are stored in the
> index, but do not affect its sort order at all. They are no more part
> of the key than, say, the heap TID that the key points to. They are
> just "payload".

Noticed a few issues following another pass:

* tuplesort.c should handle the CLUSTER case in the same way as the
btree case. No?

* Why have a RelationGetNumberOfAttributes(indexRel) call in
tuplesort_begin_index_btree() at all now?

* This critical section is unnecessary, because this happens during
index builds:

+   if (indnkeyatts != indnatts && P_ISLEAF(opageop))
+   {
+   /*
+* It's essential to truncate High key here.
+* The purpose is not just to save more space
on this particular page,
+* but to keep whole b-tree structure
consistent. Subsequent insertions
+* assume that hikey is already truncated, and
so they should not
+* worry about it, when copying the high key
into the parent page
+* as a downlink.
+* NOTE It is not crucial for reliability at present,
+* but maybe it will be that in the future.
+* NOTE this code will be changed by the
"btree compression" patch,
+* which is in progress now.
+*/
+   keytup = index_reform_tuple(wstate->index, oitup,
+
 indnatts, indnkeyatts);
+
+   /*  delete "wrong" high key, insert keytup as
P_HIKEY. */
+   START_CRIT_SECTION();
+   PageIndexTupleDelete(opage, P_HIKEY);
+
+   if (!_bt_pgaddtup(opage,
IndexTupleSize(keytup), keytup, P_HIKEY))
+   elog(ERROR, "failed to rewrite
compressed item in index \"%s\"",
+   RelationGetRelationName(wstate->index));
+   END_CRIT_SECTION();
+   }

Note that START_CRIT_SECTION() promotes any ERROR to PANIC, which
isn't useful here, because we have no buffer lock held, and nothing
must be WAL-logged.
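The rule behind this remark can be modeled with a toy error-severity promoter (invented names, not the actual elog machinery): inside a critical section, a recoverable ERROR must be escalated to PANIC, because the system may hold partially applied state — so wrapping code that neither holds a buffer lock nor WAL-logs anything only converts harmless errors into crashes.

```c
#include <assert.h>

/* Toy model of critical-section error promotion; SEV_ERROR/SEV_PANIC and
 * the functions below are invented for illustration. */
typedef enum { SEV_ERROR, SEV_PANIC } Severity;

static int crit_depth = 0;              /* nesting depth of critical sections */

static void start_crit(void) { crit_depth++; }

static void end_crit(void)
{
    assert(crit_depth > 0);
    crit_depth--;
}

/* Inside a critical section, any ERROR is escalated to PANIC. */
static Severity
promote(Severity s)
{
    return (crit_depth > 0 && s == SEV_ERROR) ? SEV_PANIC : s;
}
```

In this model, the quoted code's elog(ERROR, ...) inside START_CRIT_SECTION() would always surface as a PANIC, which is why the section is unnecessary during an index build.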

* Think you forgot to update spghandler(). (You did not add a test for
just that one AM, either)

* I wonder why this restriction needs to exist:

+   else
+   elog(ERROR, "Expressions are not supported in
included columns.");

What does not supporting it buy us? Was it just that the pg_index
representation is more complicated, and you wanted to put it off?

An error like this should use ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED ..., btw.

* I would like to see index_reform_tuple() assert that the new,
truncated index tuple is definitely <= the original (I worry about the
1/3 page restriction issue). Maybe you should also change the name of
index_reform_tuple(), per David.

* There is some stray whitespace within RelationGetIndexAttrBitmap().
I think you should have updated it with code, though. I don't think
it's necessary for HOT updates to work, but I think it could be
necessary so that we don't need to get a row lock that blocks
non-conflict foreign key locking (see heap_update() callers). I think
you need to be careful for non-key columns within the loop in
RelationGetIndexAttrBitmap(), basically, because it seems to still go
through all columns. UPSERT also must call this code, FWIW.

* I think that a similar omission is also made for the replica
identity stuff in RelationGetIndexAttrBitmap(). Some thought is needed
on how this patch interacts with logical decoding, I guess.

* Valgrind shows an error with an aggregate statement I tried:

2016-04-05 17:01:31.129 PDT 12310 LOG:  statement: explain analyze
select count(*) from ab  where b > 5 group by a, b;
==12310== Invalid read of size 4
==12310==at 0x656615: match_clause_to_indexcol (indxpath.c:2226)
==12310==by 0x656615: match_clause_to_index (indxpath.c:2144)
==12310==by 0x656DBC: match_clauses_to_index (indxpath.c:2115)
==12310==by 0x658054: match_restriction_clauses_to_index (indxpath.c:2026)
==12310==by 0x658054: create_index_paths (indxpath.c:269)
==12310==by 0x64D1DB: set_plain_rel_pathlist (allpaths.c:649)
==12310==by 0x64D1DB: set_rel_pathlist (allpaths.c:427)
==12310==by 0x64D93B: set_base_rel_pathlists (allpaths.c:299)
==12310==by 0x64D93B: make_one_rel (allpaths.c:170)
==12310==by 0x66876C: query_planner (planmain.c:246)
==12310==by 0x669FBA: grouping_planner (planner.c:1666)
==12310==by 0x66D0C9: subquery_planner (planner.c:751)
==12310==by 0x66D3DA: standard_planner (planner.c:300)
==12310==by 0x66D714: planner (planner.c:170)
==12310==by 0x6FD692: pg_plan_query (postgres.c:798)
==12310==by 0x59082D: ExplainOneQuery (explain.c:350)
==12310== 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-05 Thread Peter Geoghegan
On Tue, Apr 5, 2016 at 7:56 AM, Anastasia Lubennikova
 wrote:
> I would say, this is changed, but it doesn't matter.

Actually, I would now say that it hasn't really changed (see below),
based on my new understanding. The *choice* to go on one page or the
other still exists.

> Performing any search in btree (including choosing a suitable page for
> insertion), we use only key attributes.
> We assume that included columns are stored in the index unordered.

The patch assumes no ordering for the non-indexed columns in the
index? While I knew that the patch was primarily motivated by enabling
index-only scans, I didn't realize that at all. The patch is much much
less like a general suffix truncation patch than I thought. I may have
been confused in part by the high key issue that you only recently
fixed, but you should have corrected me about suffix truncation
earlier. Obviously, this was a significant misunderstanding; we have
been "talking at cross purposes" this whole time.

There seems to have been significant misunderstanding about this before now:

http://www.postgresql.org/message-id/cakjs1f9w0ab-g7h6yygnbq7hjsokf3uwhu7-q5jobbatyk9...@mail.gmail.com

My new understanding: The extra "included" columns are stored in the
index, but do not affect its sort order at all. They are no more part
of the key than, say, the heap TID that the key points to. They are
just "payload".
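That reading can be illustrated with a quick sketch (a hypothetical session; table and index names are made up, and it assumes the patched INCLUDING syntax):

```sql
-- Included columns never participate in the index sort order.
CREATE TABLE payload_demo (id int, data text);
CREATE UNIQUE INDEX payload_idx ON payload_demo (id) INCLUDING (data);

-- Ordering by the key column can use the index order; ordering by the
-- included column cannot -- "data" is carried along like the heap TID,
-- never compared during index descents.
EXPLAIN SELECT * FROM payload_demo ORDER BY id;    -- may use payload_idx
EXPLAIN SELECT * FROM payload_demo ORDER BY data;  -- needs an explicit sort
```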

> "just matching on constrained attributes" is the core idea of the whole
> patch. Included columns just provide us possibility to use index-only scan.
> Nothing more. We assume use case where index-only-scan is faster than
> index-scan + heap fetch. For example, in queries like "select data from tbl
> where id = 1;" we have no scan condition on data. Maybe you are afraid of a long
> linear scan when we have enormous index bloat even on a unique index. It will
> happen anyway, whether we have index-only scan on covering index or
> index-scan on unique index + heap fetch. The only difference is that the
> covering index is faster.

My concern about performance when that happens is very much secondary.
I really only mentioned it to help explain my primary concern.

> At the very beginning of the proposal discussion, I suggested to implement
> third kind of columns, which are not constrained, but used in scankey.
> They must have op class to do it, and they are not truncated. But it was
> decided to abandon this feature.

I must have missed that. Obviously, I wasn't paying enough attention
to earlier discussion. Earlier versions of the patch did fail to
recognize that the sort order was not the entire indexed order, but
that isn't the case with V8. That that was ever possible was only a
bug, it turns out.

>> The more I think about it, the more I doubt that it's okay to not
>> ensure downlinks are always distinct with their siblings, by sometimes
>> including non-constrained (truncatable) attributes within internal
>> pages, as needed to *distinguish* downlinks (also, we must
>> occasionally have *all* attributes including truncatable attributes in
>> internal pages -- we must truncate nothing to keep the key space sane
>> in the parent).

> Frankly, I still do not understand what you're worried about.
> If high key is greater than the scan key, we definitely cannot find any more
> tuples, because key attributes are ordered.
> If high key is equal to the scan key, we will continue searching and read
> next page.

I thought, because of the emphasis on unique indexes, that this patch
was mostly to offer a way of getting an index with uniqueness only
enforced on certain columns, but otherwise just the same as having a
non-unique index on those same columns. Plus, some suffix truncation,
because point-lookups involving later attributes are unlikely to be
useful when this is scoped to just unique indexes (which were
emphasized by you), because truncating key columns is not helpful
unless bloat is terrible.

I now understand that it was quite wrong to link this to suffix
truncation at all. The two are really not the same. That does make the
patch seem significantly simpler, at least as far as nbtree goes; a
tool like amcheck is not likely to detect problems in this patch that
a human tester could not catch. That was the kind of problem that I
feared.

> The code is not changed here, it is the same as processing of duplicates
> spreading over several pages. If you do not trust postgresql btree changes
> to the L&Y algorithm to make duplicates work, I don't know what to say, but it's
> definitely not related to my patch.

My point about the Postgres btree changes to L&Y to make duplicates
work is that I think it makes the patch work, but perhaps not
absolutely reliably. I don't have any specific misgivings about it on
its own. Again, my earlier remarks were based on a misguided
understanding of the patch, so it doesn't matter now.

Communication is hard. There may be a lesson here for both of us about that.

> Of course I do not mind if 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-05 Thread Anastasia Lubennikova

05.04.2016 01:48, Peter Geoghegan :

On Mon, Mar 21, 2016 at 9:53 AM, Anastasia Lubennikova
 wrote:

It's a bit more complicated to add it into the index creation algorithm.
There's a trick with a "high key".
 /*
  * We copy the last item on the page into the new page, and then
  * rearrange the old page so that the 'last item' becomes its high key
  * rather than a true data item.  There had better be at least two
  * items on the page already, else the page would be empty of useful
  * data.
  */
 /*
  * Move 'last' into the high key position on opage
  */

To be consistent with other steps of the algorithm (all high keys must be
truncated tuples), I had to update this high key in place:
delete the old one, and insert a truncated high key.

Hmm. But the high key comparing equal to the Scankey gives insertion
the choice of where to put its IndexTuple (it can go on the page with
the high key, or its right-sibling, according only to considerations
about fillfactor, etc). Is this changed? Does it not matter? Why not?
Is it just worth it?


I would say, this is changed, but it doesn't matter.
Performing any search in btree (including choosing a suitable page for
insertion), we use only key attributes.

We assume that included columns are stored in the index unordered.
A simple example:
create table tbl(id int, data int);
create index idx on tbl (id) including (data);

A SELECT query does not consider included columns in the scan key.
It selects all tuples satisfying the condition on the key column, and only
after that applies a filter to remove non-matching rows from the result.
If the key attribute doesn't satisfy the query condition, there are no
more tuples to return and we can stop the scan.
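Concretely, with the tbl/idx example above (a sketch; it assumes the patched INCLUDING syntax and that the planner picks the index, so the exact plan shape may differ):

```sql
-- Only "id" can appear in the index scan key; the condition on the
-- included column "data" is applied afterwards as a filter.
EXPLAIN SELECT data FROM tbl WHERE id = 1 AND data > 10;
-- Expected plan shape:
--   Index Only Scan using idx on tbl
--     Index Cond: (id = 1)
--     Filter: (data > 10)
```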


You can find more explanations in the attached sql script,
which contains queries to retrieve detailed information about the index
structure using pageinspect.



The right-most page on every level has no high-key. But you say those
pages have an "imaginary" *positive* infinity high key, just as
internal pages have (non-imaginary) minus infinity downlinks as their
first item/downlink. So tuples in a (say) leaf page are always bound
by the downlink lower bound in parent, while their own high key is an
upper bound. Either (and, rarely, both) could be (positive or
negative) infinity.

Maybe you now see why I talked about special _bt_compare() logic for
this. I proposed special logic that is similar to the existing minus
infinity thing _bt_compare() does (although _bt_binsrch(), an
important caller of _bt_compare() also does special things for
internal vs. leaf case, so I'm not sure any new special logic must go
in _bt_compare()).


In my view, it's the correct way to fix this problem, because the caller is
responsible for passing proper arguments to the function.
Of course I will add a check into bt_compare, but I'd rather make it an
assertion (see the patch attached).

I see what you mean, but I think we need to decide what to do about
the key space when leaf high keys are truncated. I do think that
truncating the high key was the right idea, though, and it nicely
illustrates that nothing special should happen in upper levels. Suffix
truncation should only happen when leaf pages are split, generally
speaking.
As I said, the high key is very similar to the downlinks, in that both
bound the things that go on each page. Together with downlinks they
represent *discrete* ranges *unambiguously*, so INCLUDING truncation
needs to make it clear which page new items go on. As I said,
_bt_binsrch() already takes special actions for internal pages, making
sure to return the item that is < the scankey, not <= the scankey
which is only allowed for leaf pages. (See README, from "Lehman and
Yao assume that the key range for a subtree S is described by Ki < v
<= Ki+1 where Ki and Ki+1 are the adjacent keys in the parent
page...").

To give a specific example, I worry about the case where two sibling
downlinks in a parent page are distinct, but per specific-to-Postgres
"Ki <= v <= Ki+1" thing (which differs from the classic L&Y
invariant), some tuples with all right downlink's attributes matching
end up in left child page, not right child page. I worry that since
_bt_findsplitloc() doesn't consider this (for example), the split
point doesn't *reliably* and unambiguously divide the key space
between the new halves of a page being split. I think the "Ki <= v <=
Ki+1"/_bt_binsrch() thing might save you in common cases where all
downlink attributes are distinct, so maybe that simpler case is okay.
But to be even more specific, what about the more complicated case
where the downlinks *are* fully _bt_compare()-wise equal? This could
happen even though they're constrained to be unique in leaf pages, due
to bloat. Unique indexes aren't special here; they just make it far
less likely that this would happen in practice, because it takes a lot
of bloat. Less importantly, when 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-04 Thread Peter Geoghegan
On Mon, Mar 21, 2016 at 9:53 AM, Anastasia Lubennikova
 wrote:
> Thanks to David,
> I've missed these letters at first.
> I'll answer here.

Sorry about using the wrong thread.

> I agree that _bt_insertonpg() is not right for truncation.

Cool.

> It's a bit more complicated to add it into the index creation algorithm.
> There's a trick with a "high key".
> /*
>  * We copy the last item on the page into the new page, and then
>  * rearrange the old page so that the 'last item' becomes its high key
>  * rather than a true data item.  There had better be at least two
>  * items on the page already, else the page would be empty of useful
>  * data.
>  */
> /*
>  * Move 'last' into the high key position on opage
>  */
>
> To be consistent with other steps of the algorithm (all high keys must be
> truncated tuples), I had to update this high key in place:
> delete the old one, and insert a truncated high key.

Hmm. But the high key comparing equal to the Scankey gives insertion
the choice of where to put its IndexTuple (it can go on the page with
the high key, or its right-sibling, according only to considerations
about fillfactor, etc). Is this changed? Does it not matter? Why not?
Is it just worth it?

The right-most page on every level has no high-key. But you say those
pages have an "imaginary" *positive* infinity high key, just as
internal pages have (non-imaginary) minus infinity downlinks as their
first item/downlink. So tuples in a (say) leaf page are always bound
by the downlink lower bound in parent, while their own high key is an
upper bound. Either (and, rarely, both) could be (positive or
negative) infinity.

Maybe you now see why I talked about special _bt_compare() logic for
this. I proposed special logic that is similar to the existing minus
infinity thing _bt_compare() does (although _bt_binsrch(), an
important caller of _bt_compare() also does special things for
internal vs. leaf case, so I'm not sure any new special logic must go
in _bt_compare()).

> It is a very important issue. But I don't think it's a bug there.
> I've read amcheck sources thoroughly and found that the problem appears at
> "invariant_key_less_than_equal_nontarget_offset()

> It uses scankey, made with _bt_mkscankey() which uses only key attributes,
> but calls _bt_compare with wrong keysz.
> If we will use nkeyatts = state->rel->rd_index->relnatts; instead of natts,
> all the checks would be passed successfully.

I probably shouldn't have brought amcheck into that particular
discussion. I thought amcheck might be a useful way to frame the
discussion, because amcheck always cares about specific invariants,
and notes a few special cases.

> In my view, it's the correct way to fix this problem, because the caller is
> responsible for passing proper arguments to the function.
> Of course I will add a check into bt_compare, but I'd rather make it an
> assertion (see the patch attached).

I see what you mean, but I think we need to decide what to do about
the key space when leaf high keys are truncated. I do think that
truncating the high key was the right idea, though, and it nicely
illustrates that nothing special should happen in upper levels. Suffix
truncation should only happen when leaf pages are split, generally
speaking.

As I said, the high key is very similar to the downlinks, in that both
bound the things that go on each page. Together with downlinks they
represent *discrete* ranges *unambiguously*, so INCLUDING truncation
needs to make it clear which page new items go on. As I said,
_bt_binsrch() already takes special actions for internal pages, making
sure to return the item that is < the scankey, not <= the scankey
which is only allowed for leaf pages. (See README, from "Lehman and
Yao assume that the key range for a subtree S is described by Ki < v
<= Ki+1 where Ki and Ki+1 are the adjacent keys in the parent
page...").

To give a specific example, I worry about the case where two sibling
downlinks in a parent page are distinct, but per specific-to-Postgres
"Ki <= v <= Ki+1" thing (which differs from the classic L&Y
invariant), some tuples with all right downlink's attributes matching
end up in left child page, not right child page. I worry that since
_bt_findsplitloc() doesn't consider this (for example), the split
point doesn't *reliably* and unambiguously divide the key space
between the new halves of a page being split. I think the "Ki <= v <=
Ki+1"/_bt_binsrch() thing might save you in common cases where all
downlink attributes are distinct, so maybe that simpler case is okay.
But to be even more specific, what about the more complicated case
where the downlinks *are* fully _bt_compare()-wise equal? This could
happen even though they're constrained to be unique in leaf pages, due
to bloat. Unique indexes aren't special here; they just make it far
less likely that this would happen in practice, because it 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-04 Thread Tom Lane
Peter Geoghegan  writes:
> On Mon, Apr 4, 2016 at 7:14 AM, Teodor Sigaev  wrote:
>> Are there any objections on it? I'm planning to look closely today or
>> tomorrow and commit it.

> I object to committing the patch in that time frame. I'm looking at it again.

Since it's a rather complex patch, pushing it in advance of the reviewers
signing off on it doesn't seem like a great idea ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-04 Thread Peter Geoghegan
On Mon, Apr 4, 2016 at 7:14 AM, Teodor Sigaev  wrote:
> Are there any objections on it? I'm planning to look closely today or
> tomorrow and commit it.

I object to committing the patch in that time frame. I'm looking at it again.


-- 
Peter Geoghegan




Re: [HACKERS] WIP: Covering + unique indexes.

2016-04-04 Thread Teodor Sigaev

It seems to me that the patch is completed.
Except, maybe, grammar check of comments and documentation.

Looking forward to your review.
Are there any objections on it? I'm planning to look closely today or tomorrow
and commit it.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] WIP: Covering + unique indexes.

2016-03-28 Thread Anastasia Lubennikova

21.03.2016 19:53, Anastasia Lubennikova:

19.03.2016 08:00, Peter Geoghegan:
On Fri, Mar 18, 2016 at 5:15 AM, David Steele  wrote:
It looks like this patch should be marked "needs review" and I have 
done so.

Uh, no it shouldn't. I've posted an extensive review on the original
design thread. See CF entry:

https://commitfest.postgresql.org/9/433/

Marked "Waiting on Author".

Thanks to David,
I've missed these letters at first.
I'll answer here.


* You truncate (remove suffix attributes -- the "included" attributes)
within _bt_insertonpg():

-   right_item = CopyIndexTuple(item);
+   indnatts = IndexRelationGetNumberOfAttributes(rel);
+   indnkeyatts = IndexRelationGetNumberOfKeyAttributes(rel);
+
+   if (indnatts != indnkeyatts)
+   {
+   right_item = index_reform_tuple(rel, item, indnatts, indnkeyatts);

+   right_item_sz = IndexTupleDSize(*right_item);
+   right_item_sz = MAXALIGN(right_item_sz);
+   }
+   else
+   right_item = CopyIndexTuple(item);
 ItemPointerSet(&(right_item->t_tid), rbkno, P_HIKEY);

I suggest that you do this within _bt_insert_parent(), instead, iff
the original target page is known to be a leaf page. That's where it
needs to happen for conventional suffix truncation, which has special
considerations when determining which attributes are safe to truncate
(or even which byte in the first distinguishing attribute it is okay
to truncate past)


I agree that _bt_insertonpg() is not right for truncation.
Furthermore, I've noticed that all internal keys are solely copies
of "high keys" from the leaf pages, which is pretty logical.
Therefore, if we have already truncated the tuple, when it became a 
High key, we do not need the same truncation within 
_bt_insert_parent() or any other function.
So the only thing to worry about is the HighKey truncation. I rewrote 
the code. Now only _bt_split cares about truncation.


It's a bit more complicated to add it into the index creation algorithm.
There's a trick with a "high key".
/*
 * We copy the last item on the page into the new page, and then
 * rearrange the old page so that the 'last item' becomes its high key
 * rather than a true data item.  There had better be at least two
 * items on the page already, else the page would be empty of useful
 * data.
 */
/*
 * Move 'last' into the high key position on opage
 */

To be consistent with other steps of the algorithm (all high keys must be
truncated tuples), I had to update this high key in place:
delete the old one, and insert a truncated high key.
The very same logic I use to truncate the posting list of a compressed
tuple in the "btree_compression" patch. [1]
I hope both patches will be accepted, and then I'll thoroughly merge them.



* I think the comparison logic may have a bug.

Does this work with amcheck? Maybe it works with bt_index_check(), but
not bt_index_parent_check()? I think that you need to make sure that
_bt_compare() knows about this, too. That's because it isn't good
enough to let a truncated internal IndexTuple compare equal to a
scankey when non-truncated attributes are equal.


It is a very important issue. But I don't think it's a bug there.
I've read amcheck sources thoroughly and found that the problem 
appears at

"invariant_key_less_than_equal_nontarget_offset()


static bool
invariant_key_less_than_equal_nontarget_offset(BtreeCheckState *state,
                                               Page nontarget, ScanKey key,
                                               OffsetNumber upperbound)
{
    int16       natts = state->rel->rd_rel->relnatts;
    int32       cmp;

cmp = _bt_compare(state->rel, natts, key, nontarget, upperbound);

return cmp <= 0;
}

It uses a scankey made with _bt_mkscankey(), which uses only key
attributes, but calls _bt_compare() with the wrong keysz.
If we will use nkeyatts = state->rel->rd_index->relnatts; instead of
natts, all the checks pass successfully.


Same for invariant_key_greater_than_equal_offset() and 
invariant_key_less_than_equal_nontarget_offset().


In my view, it's the correct way to fix this problem, because the 
caller is responsible for passing proper arguments to the function.
Of course I will add a check into bt_compare, but I'd rather make it 
an assertion (see the patch attached).


I'll add a flag to distinguish regular and truncated tuples, but it
will not be used in this patch. Please comment if I've missed something.
As you've already mentioned, neither high keys nor tuples on internal
pages use "itup->t_tid.ip_posid", so I'll take one bit of it.


It will definitely require changes in future work on suffix
truncation or something like that, but IMHO for now it's enough.


Do you have any objections or comments?

[1] https://commitfest.postgresql.org/9/494/



One more version of the patch is attached. I did more testing and fixed a
couple of bugs.

Re: [HACKERS] WIP: Covering + unique indexes.

2016-03-21 Thread Anastasia Lubennikova

19.03.2016 08:00, Peter Geoghegan:

On Fri, Mar 18, 2016 at 5:15 AM, David Steele  wrote:

It looks like this patch should be marked "needs review" and I have done so.

Uh, no it shouldn't. I've posted an extensive review on the original
design thread. See CF entry:

https://commitfest.postgresql.org/9/433/

Marked "Waiting on Author".

Thanks to David,
I've missed these letters at first.
I'll answer here.


* You truncate (remove suffix attributes -- the "included" attributes)
within _bt_insertonpg():

-   right_item = CopyIndexTuple(item);
+   indnatts = IndexRelationGetNumberOfAttributes(rel);
+   indnkeyatts = IndexRelationGetNumberOfKeyAttributes(rel);
+
+   if (indnatts != indnkeyatts)
+   {
+   right_item = index_reform_tuple(rel, item, indnatts, indnkeyatts);
+   right_item_sz = IndexTupleDSize(*right_item);
+   right_item_sz = MAXALIGN(right_item_sz);
+   }
+   else
+   right_item = CopyIndexTuple(item);
 ItemPointerSet(&(right_item->t_tid), rbkno, P_HIKEY);

I suggest that you do this within _bt_insert_parent(), instead, iff
the original target page is known to be a leaf page. That's where it
needs to happen for conventional suffix truncation, which has special
considerations when determining which attributes are safe to truncate
(or even which byte in the first distinguishing attribute it is okay
to truncate past)


I agree that _bt_insertonpg() is not right for truncation.
Furthermore, I've noticed that all internal keys are solely copies
of "high keys" from the leaf pages, which is pretty logical.
Therefore, if we have already truncated the tuple, when it became a High 
key, we do not need the same truncation within _bt_insert_parent() or 
any other function.
So the only thing to worry about is the HighKey truncation. I rewrote 
the code. Now only _bt_split cares about truncation.


It's a bit more complicated to add it into the index creation algorithm.
There's a trick with a "high key".
/*
 * We copy the last item on the page into the new page, and then
 * rearrange the old page so that the 'last item' becomes its high key
 * rather than a true data item.  There had better be at least two
 * items on the page already, else the page would be empty of useful
 * data.
 */
/*
 * Move 'last' into the high key position on opage
 */

To be consistent with other steps of the algorithm (all high keys must be
truncated tuples), I had to update this high key in place:
delete the old one, and insert a truncated high key.
The very same logic I use to truncate the posting list of a compressed tuple
in the "btree_compression" patch. [1]

I hope both patches will be accepted, and then I'll thoroughly merge them.


* I think the comparison logic may have a bug.

Does this work with amcheck? Maybe it works with bt_index_check(), but
not bt_index_parent_check()? I think that you need to make sure that
_bt_compare() knows about this, too. That's because it isn't good
enough to let a truncated internal IndexTuple compare equal to a
scankey when non-truncated attributes are equal.


It is a very important issue. But I don't think it's a bug there.
I've read amcheck sources thoroughly and found that the problem appears at
"invariant_key_less_than_equal_nontarget_offset()


static bool
invariant_key_less_than_equal_nontarget_offset(BtreeCheckState *state,
                                               Page nontarget, ScanKey key,
                                               OffsetNumber upperbound)
{
    int16       natts = state->rel->rd_rel->relnatts;
    int32       cmp;

cmp = _bt_compare(state->rel, natts, key, nontarget, upperbound);

return cmp <= 0;
}

It uses a scankey made with _bt_mkscankey(), which uses only key
attributes, but calls _bt_compare() with the wrong keysz.
If we will use nkeyatts = state->rel->rd_index->relnatts; instead of
natts, all the checks pass successfully.


Same for invariant_key_greater_than_equal_offset() and 
invariant_key_less_than_equal_nontarget_offset().


In my view, it's the correct way to fix this problem, because the caller 
is responsible for passing proper arguments to the function.
Of course I will add a check into bt_compare, but I'd rather make it an 
assertion (see the patch attached).


I'll add a flag to distinguish regular and truncated tuples, but it will
not be used in this patch. Please comment if I've missed something.
As you've already mentioned, neither high keys nor tuples on internal
pages use "itup->t_tid.ip_posid", so I'll take one bit of it.


It will definitely require changes in future work on suffix
truncation or something like that, but IMHO for now it's enough.


Do you have any objections or comments?

[1] https://commitfest.postgresql.org/9/494/

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/dblink/dblink.c 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-03-19 Thread David Steele
On 3/14/16 9:57 AM, Anastasia Lubennikova wrote:

> New version of the patch implements pg_dump well.
> Documentation related to constraints is updated.
> 
> I hope that the patch is in good shape now.

It looks like this patch should be marked "needs review" and I have done so.

-- 
-David
da...@pgmasters.net




Re: [HACKERS] WIP: Covering + unique indexes.

2016-03-18 Thread Peter Geoghegan
On Fri, Mar 18, 2016 at 5:15 AM, David Steele  wrote:
> It looks like this patch should be marked "needs review" and I have done so.

Uh, no it shouldn't. I've posted an extensive review on the original
design thread. See CF entry:

https://commitfest.postgresql.org/9/433/

Marked "Waiting on Author".

-- 
Peter Geoghegan




Re: [HACKERS] WIP: Covering + unique indexes.

2016-03-14 Thread Anastasia Lubennikova

02.03.2016 08:50, Michael Paquier:

On Wed, Mar 2, 2016 at 2:10 AM, Anastasia Lubennikova
 wrote:

01.03.2016 19:55, Anastasia Lubennikova:

It is not the final version, because it breaks pg_dump for previous
versions. I need some help from hackers here.
pg_dump.c, line 5466
if (fout->remoteVersion >= 90400)

What does 'remoteVersion' mean? And what is the right way to change it? Or
does it change between releases?
I guess that 90400 is for 9.4 and 80200 is for 8.2 but is it really so?
That is totally new to me.

Yes, you got it. That's basically PG_VERSION_NUM as compiled on the
server that has been queried, in this case the server from which a
dump is taken. If you are changing the system catalog layer, you would
need to provide a query at least equivalent to what has been done
until now for your patch, then modify pg_dump as follows:
if (fout->remoteVersion >= 90600)
{
 query = my_new_query;
}
else if (fout->remoteVersion >= 90400)
{
 query = the existing 9.4 query
}
etc.

In short you just need to add a new block so that remote servers newer
than 9.6 will be able to dump objects correctly. pg_upgrade is a good
way to check the validity of pg_dump actually, this explains why some
objects are not dropped in the regression tests. Perhaps you'd want to
do the same with your patch if the current test coverage of pg_dump is
not enough. I have not looked at your patch so I cannot say for sure.


Thank you for the explanation.
New version of the patch implements pg_dump well.
Documentation related to constraints is updated.

I hope that the patch is in good shape now. Brief overview for reviewers:

This patch allows unique indexes to be defined on one set of columns
and to include another set of columns in the INCLUDING clause, on which
uniqueness is not enforced. It allows more queries to benefit
from using index-only scans. Currently, only the B-tree access method
supports this feature.

Syntax example:
CREATE TABLE tbl (c1 int, c2 int, c3 box);
CREATE INDEX idx ON tbl (c1) INCLUDING (c2, c3);

Unlike key columns (c1), included columns (c2, c3) are not used
in index scankeys, neither in "search" scankeys nor in "insertion" scankeys.
Included columns are stored only in leaf pages, which can slightly
reduce index size. Hence, included columns do not require any opclass
for the btree access method. As you can see from the example above, it's
possible to add columns of "box" type to the index.

The most common use case for this feature is the combination of a UNIQUE or
PRIMARY KEY constraint on columns (a,b) with a covering index on columns
(a,b,c).
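A sketch of that combination (hypothetical table and index names; assumes the patched INCLUDING syntax):

```sql
-- Without the patch, two indexes are needed: one to enforce
-- uniqueness on (a, b), another to cover queries returning c.
CREATE TABLE t (a int, b int, c text);
CREATE UNIQUE INDEX t_a_b_key ON t (a, b);
CREATE INDEX t_a_b_c_idx ON t (a, b, c);

-- With the patch, a single covering unique index serves both purposes:
DROP INDEX t_a_b_key, t_a_b_c_idx;
CREATE UNIQUE INDEX t_a_b_incl_c ON t (a, b) INCLUDING (c);
```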

So, there is a new syntax for constraints.

CREATE TABLE tblu (c1 int, c2 int, c3 box, UNIQUE (c1,c2) INCLUDING (c3));
The index created for this constraint contains three columns.
"tblu_c1_c2_c3_key" UNIQUE CONSTRAINT, btree (c1, c2) INCLUDING (c3)

CREATE TABLE tblpk (c1 int, c2 int, c3 box, PRIMARY KEY (c1) INCLUDING (c3));

The index created for this constraint contains two columns. Note that, like
the unique constraint, the NOT NULL constraint is applied only to the key
column(s).

postgres=# \d tblpk
     Table "public.tblpk"
 Column |  Type   | Modifiers
--------+---------+-----------
 c1     | integer | not null
 c2     | integer |
 c3     | box     |
Indexes:
    "tblpk_pkey" PRIMARY KEY, btree (c1) INCLUDING (c3)

Same for ALTER TABLE statements:
CREATE TABLE tblpka (c1 int, c2 int, c3 box);
ALTER TABLE tblpka ADD PRIMARY KEY (c1) INCLUDING (c3);

pg_dump is updated and seems to work fine with this kind of indexes.

I see only one problem left (maybe I've mentioned it before).
Queries like this [1] must be rewritten, because after the catalog changes,
i.indkey contains both key and included attributes.
One more thing to do is some refactoring of names, since "indkey"
looks really confusing to me now. But that could be done as a separate patch [2].
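A rough sketch of what such client queries now have to do (Python stand-in; the values are invented, not a real catalog row):

```python
# After the catalog change, pg_index.indkey lists key columns first and
# included columns after them; pg_index.indnkeyatts marks the boundary.
# A client that treated every indkey element as a key column must slice:
indkey = [1, 2, 3]       # attnums of key columns (c1, c2) + included (c3)
indnkeyatts = 2

key_attnums = indkey[:indnkeyatts]        # enforce uniqueness/ordering
included_attnums = indkey[indnkeyatts:]   # payload-only columns

print(key_attnums, included_attnums)
```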


[1] https://wiki.postgresql.org/wiki/Retrieve_primary_key_columns
[2] http://www.postgresql.org/message-id/56bb7788.30...@postgrespro.ru

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 9c8e308..891325d 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -100,7 +100,7 @@ static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
 static void deleteConnection(const char *name);
-static char **get_pkey_attnames(Relation rel, int16 *numatts);
+static char **get_pkey_attnames(Relation rel, int16 *indnkeyatts);
 static char **get_text_array_contents(ArrayType *array, int *numitems);
 static char *get_sql_insert(Relation rel, int *pkattnums, int pknumatts, char **src_pkattvals, char **tgt_pkattvals);
 static char *get_sql_delete(Relation rel, int *pkattnums, int pknumatts, char **tgt_pkattvals);
@@ -1485,7 +1485,7 @@ PG_FUNCTION_INFO_V1(dblink_get_pkey);
 Datum
 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-03-01 Thread Michael Paquier
On Wed, Mar 2, 2016 at 2:10 AM, Anastasia Lubennikova wrote:
> 01.03.2016 19:55, Anastasia Lubennikova:
>> It is not the final version, because it breaks pg_dump for previous
>> versions. I need some help from hackers here.
>> pgdump. line 5466
>> if (fout->remoteVersion >= 90400)
>>
>> What does 'remoteVersion' mean? And what is the right way to change it? Or
>> it changes between releases?
>> I guess that 90400 is for 9.4 and 80200 is for 8.2 but is it really so?
>> That is totally new to me.

Yes, you got it. That's basically PG_VERSION_NUM as compiled on the
server that has been queried, in this case the server from which a
dump is taken. If you are changing the system catalog layer, you would
need to provide a query at least equivalent to what has been done
until now for your patch, then modify pg_dump as follows:
if (fout->remoteVersion >= 90600)
{
query = my_new_query;
}
else if (fout->remoteVersion >= 90400)
{
query = the existing 9.4 query
}
etc.

In short you just need to add a new block so that remote servers running
9.6 or newer can be dumped correctly. pg_upgrade is actually a good
way to check the validity of pg_dump; this explains why some
objects are not dropped in the regression tests. Perhaps you'd want to
do the same with your patch if the current test coverage of pg_dump is
not enough. I have not looked at your patch, so I cannot say for sure.
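The dispatch Michael describes can be sketched like this (a Python stand-in for the C code in pg_dump; the query strings are placeholders, not the real pg_dump SQL):

```python
# remoteVersion is PG_VERSION_NUM as compiled on the dumped server:
# major*10000 + minor*100, so 90400 is 9.4 and 90600 is 9.6.
def pick_query(remote_version):
    """Choose the catalog query pg_dump should send, newest first."""
    if remote_version >= 90600:
        return "new covering-index-aware query"
    elif remote_version >= 90400:
        return "existing 9.4 query"
    else:
        return "older fallback query"

assert pick_query(90601) == "new covering-index-aware query"
assert pick_query(90500) == "existing 9.4 query"
assert pick_query(80200) == "older fallback query"
```

Newest versions are tested first, so each branch only has to describe the oldest server it applies to.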
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-03-01 Thread Anastasia Lubennikova

01.03.2016 19:55, Anastasia Lubennikova:


29.02.2016 18:17, Anastasia Lubennikova:

25.02.2016 21:39, Jeff Janes:
As promised, here's the new version of the patch 
"including_columns_4.0".

I fixed all issues except some points mentioned below.

Thanks for the updated patch.  I get a compiler warning:

genam.c: In function 'BuildIndexValueDescription':
genam.c:259: warning: unused variable 'tupdesc'


Thank you for the notice, I'll fix it in the next update.

Also, I can't create a primary key INCLUDING columns directly:

jjanes=# create table foobar (a int, b int, c int);
jjanes=# alter table foobar add constraint foobar_pkey primary key
(a,b) including (c);
ERROR:  syntax error at or near "including"

But I can get there using a circuitous route:

jjanes=# create unique index on foobar (a,b) including (c);
jjanes=# alter table foobar add constraint foobar_pkey primary key
using index foobar_a_b_c_idx;

The description of the table's index knows to include the including 
column:


jjanes=# \d foobar
     Table "public.foobar"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
 c      | integer |
Indexes:
    "foobar_pkey" PRIMARY KEY, btree (a, b) INCLUDING (c)


Since the machinery appears to all be in place to have primary keys
with INCLUDING columns, it would be nice if the syntax for adding
primary keys allowed one to implement them directly.

Is this something for future expansion, or could it be added at the
same time as the main patch?


Good point.
At a quick glance, this looks easy to implement. The only problem is
that there are many places in the code that must be updated.
I'll try to do it, and if there are difficulties, I'm fine with
deferring this feature to future work.


I found one more thing to do: pg_dump does not handle included columns
yet. I will fix it in the next version of the patch.




As promised, the fixed patch is attached. It allows the following
statements to be performed:


create table utbl (a int, b box);
alter table utbl add unique (a) including(b);
create table ptbl (a int, b box);
alter table ptbl add primary key (a) including(b);

And now they can be dumped/restored successfully.
I used the following commands:
pg_dump --verbose -Fc postgres -f pg.dump
pg_restore -d newdb pg.dump

It is not the final version, because it breaks pg_dump for previous 
versions. I need some help from hackers here.

pg_dump.c, line 5466:
if (fout->remoteVersion >= 90400)

What does 'remoteVersion' mean, and what is the right way to change
it? Does it change between releases?
I guess that 90400 is for 9.4 and 80200 is for 8.2, but is that really
so? This is totally new to me.
BTW, while we are on the subject, maybe it's worth replacing these
magic numbers with a set of macros?


P.S. I'll update documentation for ALTER TABLE in the next patch.


Sorry for the missed attachment. Now it's here.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 9c8e308..891325d 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -100,7 +100,7 @@ static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
 static void deleteConnection(const char *name);
-static char **get_pkey_attnames(Relation rel, int16 *numatts);
+static char **get_pkey_attnames(Relation rel, int16 *indnkeyatts);
 static char **get_text_array_contents(ArrayType *array, int *numitems);
 static char *get_sql_insert(Relation rel, int *pkattnums, int pknumatts, char **src_pkattvals, char **tgt_pkattvals);
 static char *get_sql_delete(Relation rel, int *pkattnums, int pknumatts, char **tgt_pkattvals);
@@ -1485,7 +1485,7 @@ PG_FUNCTION_INFO_V1(dblink_get_pkey);
 Datum
 dblink_get_pkey(PG_FUNCTION_ARGS)
 {
-	int16		numatts;
+	int16		indnkeyatts;
 	char	  **results;
 	FuncCallContext *funcctx;
 	int32		call_cntr;
@@ -1511,7 +1511,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		rel = get_rel_from_relname(PG_GETARG_TEXT_P(0), AccessShareLock, ACL_SELECT);
 
 		/* get the array of attnums */
-		results = get_pkey_attnames(rel, &numatts);
+		results = get_pkey_attnames(rel, &indnkeyatts);
 
 		relation_close(rel, AccessShareLock);
 
@@ -1531,9 +1531,9 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
-		if ((results != NULL) && (numatts > 0))
+		if ((results != NULL) && (indnkeyatts > 0))
 		{
-			funcctx->max_calls = numatts;
+			funcctx->max_calls = indnkeyatts;
 
 			/* got results, keep track of them */
 			funcctx->user_fctx = results;
@@ -2023,10 +2023,10 @@ dblink_fdw_validator(PG_FUNCTION_ARGS)
  * get_pkey_attnames
  *
  * Get the primary key attnames for the given relation.
- * Return NULL, and set numatts = 0, if no primary key exists.
+ * Return NULL, and set indnkeyatts = 0, 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-03-01 Thread Anastasia Lubennikova


29.02.2016 18:17, Anastasia Lubennikova:

25.02.2016 21:39, Jeff Janes:
As promised, here's the new version of the patch 
"including_columns_4.0".

I fixed all issues except some points mentioned below.

Thanks for the updated patch.  I get a compiler warning:

genam.c: In function 'BuildIndexValueDescription':
genam.c:259: warning: unused variable 'tupdesc'


Thank you for the notice, I'll fix it in the next update.

Also, I can't create a primary key INCLUDING columns directly:

jjanes=# create table foobar (a int, b int, c int);
jjanes=# alter table foobar add constraint foobar_pkey primary key
(a,b) including (c);
ERROR:  syntax error at or near "including"

But I can get there using a circuitous route:

jjanes=# create unique index on foobar (a,b) including (c);
jjanes=# alter table foobar add constraint foobar_pkey primary key
using index foobar_a_b_c_idx;

The description of the table's index knows to include the including 
column:


jjanes=# \d foobar
     Table "public.foobar"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
 c      | integer |
Indexes:
    "foobar_pkey" PRIMARY KEY, btree (a, b) INCLUDING (c)


Since the machinery appears to all be in place to have primary keys
with INCLUDING columns, it would be nice if the syntax for adding
primary keys allowed one to implement them directly.

Is this something for future expansion, or could it be added at the
same time as the main patch?


Good point.
At a quick glance, this looks easy to implement. The only problem is
that there are many places in the code that must be updated.
I'll try to do it, and if there are difficulties, I'm fine with
deferring this feature to future work.


I found one more thing to do: pg_dump does not handle included columns
yet. I will fix it in the next version of the patch.




As promised, the fixed patch is attached. It allows the following
statements to be performed:


create table utbl (a int, b box);
alter table utbl add unique (a) including(b);
create table ptbl (a int, b box);
alter table ptbl add primary key (a) including(b);

And now they can be dumped/restored successfully.
I used the following commands:
pg_dump --verbose -Fc postgres -f pg.dump
pg_restore -d newdb pg.dump

It is not the final version, because it breaks pg_dump for previous 
versions. I need some help from hackers here.

pg_dump.c, line 5466:
if (fout->remoteVersion >= 90400)

What does 'remoteVersion' mean, and what is the right way to change it?
Does it change between releases?
I guess that 90400 is for 9.4 and 80200 is for 8.2, but is that really so?
This is totally new to me.
BTW, while we are on the subject, maybe it's worth replacing these
magic numbers with a set of macros?


P.S. I'll update documentation for ALTER TABLE in the next patch.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] WIP: Covering + unique indexes.

2016-02-29 Thread Anastasia Lubennikova

25.02.2016 21:39, Jeff Janes:

As promised, here's the new version of the patch "including_columns_4.0".
I fixed all issues except some points mentioned below.

Thanks for the updated patch.  I get a compiler warning:

genam.c: In function 'BuildIndexValueDescription':
genam.c:259: warning: unused variable 'tupdesc'


Thank you for the notice, I'll fix it in the next update.

Also, I can't create a primary key INCLUDING columns directly:

jjanes=# create table foobar (a int, b int, c int);
jjanes=# alter table foobar add constraint foobar_pkey primary key
(a,b) including (c);
ERROR:  syntax error at or near "including"

But I can get there using a circuitous route:

jjanes=# create unique index on foobar (a,b) including (c);
jjanes=# alter table foobar add constraint foobar_pkey primary key
using index foobar_a_b_c_idx;

The description of the table's index knows to include the including column:

jjanes=# \d foobar
     Table "public.foobar"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
 c      | integer |
Indexes:
    "foobar_pkey" PRIMARY KEY, btree (a, b) INCLUDING (c)


Since the machinery appears to all be in place to have primary keys
with INCLUDING columns, it would be nice if the syntax for adding
primary keys allowed one to implement them directly.

Is this something for future expansion, or could it be added at the
same time as the main patch?


Good point.
At a quick glance, this looks easy to implement. The only problem is
that there are many places in the code that must be updated.
I'll try to do it, and if there are difficulties, I'm fine with
deferring this feature to future work.


I found one more thing to do: pg_dump does not handle included columns
yet. I will fix it in the next version of the patch.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] WIP: Covering + unique indexes.

2016-02-25 Thread Jeff Janes
On Thu, Feb 11, 2016 at 8:46 AM, Anastasia Lubennikova wrote:
> 02.02.2016 15:50, Anastasia Lubennikova:

>
> As promised, here's the new version of the patch "including_columns_4.0".
> I fixed all issues except some points mentioned below.

Thanks for the updated patch.  I get a compiler warning:

genam.c: In function 'BuildIndexValueDescription':
genam.c:259: warning: unused variable 'tupdesc'

Also, I can't create a primary key INCLUDING columns directly:

jjanes=# create table foobar (a int, b int, c int);
jjanes=# alter table foobar add constraint foobar_pkey primary key
(a,b) including (c);
ERROR:  syntax error at or near "including"

But I can get there using a circuitous route:

jjanes=# create unique index on foobar (a,b) including (c);
jjanes=# alter table foobar add constraint foobar_pkey primary key
using index foobar_a_b_c_idx;

The description of the table's index knows to include the including column:

jjanes=# \d foobar
     Table "public.foobar"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
 c      | integer |
Indexes:
    "foobar_pkey" PRIMARY KEY, btree (a, b) INCLUDING (c)


Since the machinery appears to all be in place to have primary keys
with INCLUDING columns, it would be nice if the syntax for adding
primary keys allowed one to implement them directly.

Is this something for future expansion, or could it be added at the
same time as the main patch?

I think this is something the user would find pretty frustrating
to be unable to do right from the start.

Cheers,

Jeff




Re: [HACKERS] WIP: Covering + unique indexes.

2016-02-11 Thread Anastasia Lubennikova

02.02.2016 15:50, Anastasia Lubennikova:

31.01.2016 11:04, David Rowley:

On 27 January 2016 at 03:35, Anastasia Lubennikova wrote:

including_columns_3.0 is the latest version of patch.
And changes regarding the previous version are attached in a 
separate patch.

Just to ease the review and debug.

Hi,

I've made another pass over the patch. There's still a couple of
things that I think need to be looked at.

Thank you again.
I'm just writing to say that I haven't disappeared and I do remember
the issue.
But I'm very busy this week; I'll send an updated patch next week,
as soon as possible.




As promised, here's the new version of the patch "including_columns_4.0".
I fixed all issues except some points mentioned below.
Besides, I did some refactoring:
- use macros IndexRelationGetNumberOfAttributes, 
IndexRelationGetNumberOfKeyAttributes where possible. Use macro 
RelationGetNumberOfAttributes. Maybe that's a bit unrelated changes, but 
it'll make development much easier in future.

- rename related variables to indnatts,  indnkeyatts.

I'm also not that keen on index_reform_tuple() in general. I wonder if
there's a way we can just keep the Datum/isnull arrays a bit longer,
and only form the tuple when needed. I've not looked into this in
detail, but it does look like reforming the tuple is not going to be
cheap.
It is used in splits, for example. There is no Datum array there; we
just move a tuple's key from a child page to a parent page, or something
like that. And according to the INCLUDING algorithm, we need to truncate
the non-key attributes.

If we do need to keep this function, I think a better name might be
index_trim_tuple() and I don't think you need to pass the original
length. It might make sense to Assert() that the trim length is
smaller than the tuple size


As regards performance, I don't think it's a big problem here.
Do you suggest doing it the following way: memcpy(oldtup, newtup,
newtuplength)?


I've tested it some more, and still didn't find any performance issues.
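For readers following along, the truncation being discussed can be sketched in a few lines (a toy Python analogue, not the patch's index_reform_tuple() C code):

```python
def truncate_index_tuple(itup, nkeyatts):
    """Toy analogue of trimming an index tuple to its key attributes
    before its key is moved up into a non-leaf page. A 'tuple' here is
    just a Python tuple of datums."""
    assert 0 < nkeyatts <= len(itup)   # trim length must fit the tuple
    return itup[:nkeyatts]

leaf_tuple = ("key-a", "key-b", "included-c")
internal_tuple = truncate_index_tuple(leaf_tuple, 2)
print(internal_tuple)
```

The assertion mirrors David's suggestion that the trim length be checked against the tuple size.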


in gistrescan() there is some code:

for (attno = 1; attno <= natts; attno++)
{
    TupleDescInitEntry(so->giststate->fetchTupdesc, attno, NULL,
                       scan->indexRelation->rd_opcintype[attno - 1],
                       -1, 0);
}

Going by RelationInitIndexAccessInfo() rd_opcintype[] is allocated to
be sized by the number of key columns, but this loop goes over the
number of attribute columns.
Perhaps this is not a big problem since GIST does not support
INCLUDING columns, but it does seem wrong still.


GiST doesn't support the INCLUDING clause, so natts and nkeyatts are
always equal; I don't see any problem here.
And I think that it's extra work for this patch. Maybe I or someone
else will add this feature to other access methods later.


Still the same.

Which brings me to the fact that I've spent a bit of time trying to
look for places where you've forgotten to change natts to nkeyatts. I
did find this one, but I don't have much confidence that there's not
lots more places that have been forgotten. Apart from this one, how
confident are you that you've found all the places? I'm getting
towards being happy with the code that I see that's been changed, but
I'm hesitant to mark as "Ready for committer" due to not being all
that comfortable that all the code that needs to be updated has been
updated. I'm not quite sure of a good way to find all these places.
I found all mentions of natts and other related variables with grep,
and replaced (or expanded) them with nkeyatts where necessary.

As mentioned before, I didn't change other AMs.
I strongly agree that any changes related to btree require thorough 
inspection, so I'll recheck it again. But I'm almost sure that it's okay.


I rechecked everything again and fixed a couple of omissions. Thank you
for being an exacting reviewer :)
I don't know how else to ensure that everything is OK; I have no idea
what else I can do.



I wondering if hacking the code so that each btree index which is
created with > 1 column puts all but the first column into the
INCLUDING columns, then run the regression tests to see if there are
any crashes. I'm really not that sure of how else to increase the
confidence levels on this. Do you have ideas?


Do I understand correctly that you suggest replacing all multicolumn
indexes with one key column plus included columns?
I don't think it's a good idea: the INCLUDING clause brings some
disadvantages. For example, included columns must be filtered after
the search, while key columns can be used in the scan key directly. I
already mentioned this in a test example:


explain analyze select c1, c2 from tbl where c1<1 and c3<20;
If the columns' opclasses are used, the new query plan uses them in
  Index Cond: ((c1 < 1) AND (c3 < 20))
Otherwise, the new query cannot use the included column in the Index
Cond and has to use a filter instead:

  Index Cond: (c1 < 1)
  Filter: (c3 < 20)
  Rows Removed by Filter: 9993
It slows down the query significantly.
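The cost difference can be illustrated with invented numbers (a Python sketch; the real figures come from the EXPLAIN output above):

```python
# Why a Filter on an included column is worse than an Index Cond on a
# key column: the filter cannot shrink the set of tuples the scan must
# visit, it only discards them after the fact.
rows = [(c1, c1) for c1 in range(10000)]   # (c1, c3) pairs, sorted by c1

def scan(c1_bound, c3_bound):
    touched = [r for r in rows if r[0] < c1_bound]   # Index Cond: c1 < bound
    kept = [r for r in touched if r[1] < c3_bound]   # Filter: c3 < bound
    return len(touched), len(kept)

touched, kept = scan(10000, 20)
print(touched - kept)   # tuples visited only to be thrown away
```

Had c3 been a key column with an opclass, both bounds could have narrowed the scan itself.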

And besides that, we 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-02-02 Thread Anastasia Lubennikova

31.01.2016 11:04, David Rowley:

On 27 January 2016 at 03:35, Anastasia Lubennikova wrote:

including_columns_3.0 is the latest version of patch.
And changes regarding the previous version are attached in a separate patch.
Just to ease the review and debug.

Hi,

I've made another pass over the patch. There's still a couple of
things that I think need to be looked at.

Thank you again.
I'm just writing to say that I haven't disappeared and I do remember
the issue.
But I'm very busy this week; I'll send an updated patch next week,
as soon as possible.



Do we need the "b (included)" here? The key is (a) = (1). Having
irrelevant details might be confusing.

postgres=# create table a (a int not null, b int not null);
CREATE TABLE
postgres=# create unique index on a (a) including(b);
CREATE INDEX
postgres=# insert into a values(1,1);
INSERT 0 1
postgres=# insert into a values(1,1);
ERROR:  duplicate key value violates unique constraint "a_a_b_idx"
DETAIL:  Key (a, b (included))=(1, 1) already exists.
I thought that it could be strange if a user inserts two values and then
sees only one of them in the error message.
But now I see that you're right. I'll also look at the same functionality
in other DBs and fix it.



In index_reform_tuple() I find it a bit scary that you change the
TupleDesc's number of attributes then set it back again once you're
finished reforming the shortened tuple.
Maybe it would be better to modify index_form_tuple() to accept a new
argument with a number of attributes, then you can just Assert that
this number is never higher than the number of attributes in the
TupleDesc.

Good point.
I agree that this function is a bit strange. I have to set
tupdesc->natts to maintain compatibility with index_form_tuple().
I didn't want to add either a new field to TupleDesc or a new
parameter to index_form_tuple(), because they are used widely.

I'm also not that keen on index_reform_tuple() in general. I wonder if
there's a way we can just keep the Datum/isnull arrays a bit longer,
and only form the tuple when needed. I've not looked into this in
detail, but it does look like reforming the tuple is not going to be
cheap.
It is used in splits, for example. There is no Datum array there; we
just move a tuple's key from a child page to a parent page, or something
like that. And according to the INCLUDING algorithm, we need to truncate
the non-key attributes.

If we do need to keep this function, I think a better name might be
index_trim_tuple() and I don't think you need to pass the original
length. It might make sense to Assert() that the trim length is
smaller than the tuple size


As regards performance, I don't think it's a big problem here.
Do you suggest doing it the following way: memcpy(oldtup, newtup,
newtuplength)?

I will

in gistrescan() there is some code:

for (attno = 1; attno <= natts; attno++)
{
    TupleDescInitEntry(so->giststate->fetchTupdesc, attno, NULL,
                       scan->indexRelation->rd_opcintype[attno - 1],
                       -1, 0);
}

Going by RelationInitIndexAccessInfo() rd_opcintype[] is allocated to
be sized by the number of key columns, but this loop goes over the
number of attribute columns.
Perhaps this is not a big problem since GIST does not support
INCLUDING columns, but it does seem wrong still.


GiST doesn't support the INCLUDING clause, so natts and nkeyatts are
always equal; I don't see any problem here.
And I think that it's extra work for this patch. Maybe I or someone
else will add this feature to other access methods later.

Which brings me to the fact that I've spent a bit of time trying to
look for places where you've forgotten to change natts to nkeyatts. I
did find this one, but I don't have much confidence that there's not
lots more places that have been forgotten. Apart from this one, how
confident are you that you've found all the places? I'm getting
towards being happy with the code that I see that's been changed, but
I'm hesitant to mark as "Ready for committer" due to not being all
that comfortable that all the code that needs to be updated has been
updated. I'm not quite sure of a good way to find all these places.
I found all mentions of natts and other related variables with grep,
and replaced (or expanded) them with nkeyatts where necessary.

As mentioned before, I didn't change other AMs.
I strongly agree that any changes related to btree require thorough 
inspection, so I'll recheck it again. But I'm almost sure that it's okay.



I wondering if hacking the code so that each btree index which is
created with > 1 column puts all but the first column into the
INCLUDING columns, then run the regression tests to see if there are
any crashes. I'm really not that sure of how else to increase the
confidence levels on this. Do you have ideas?


Do I understand correctly that you suggest replacing all multicolumn
indexes with one key column plus included columns?
I don't think it's a good idea: the INCLUDING clause brings some
disadvantages. For 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-02-02 Thread Alvaro Herrera
Anastasia Lubennikova wrote:

> I just write here to say that I do not disappear and I do remember about the
> issue.
> But I'm very very busy this week. I'll send an updated patch next week as
> soon as possible.

That's great to know, thanks.  I moved your patch to the next
commitfest.  Please do submit a new version before it starts!

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-31 Thread David Rowley
On 27 January 2016 at 03:35, Anastasia Lubennikova wrote:
> including_columns_3.0 is the latest version of patch.
> And changes regarding the previous version are attached in a separate patch.
> Just to ease the review and debug.

Hi,

I've made another pass over the patch. There's still a couple of
things that I think need to be looked at.

Do we need the "b (included)" here? The key is (a) = (1). Having
irrelevant details might be confusing.

postgres=# create table a (a int not null, b int not null);
CREATE TABLE
postgres=# create unique index on a (a) including(b);
CREATE INDEX
postgres=# insert into a values(1,1);
INSERT 0 1
postgres=# insert into a values(1,1);
ERROR:  duplicate key value violates unique constraint "a_a_b_idx"
DETAIL:  Key (a, b (included))=(1, 1) already exists.

Extra tabs:
/* Truncate nonkey attributes when inserting on nonleaf pages. */
if (rel->rd_index->indnatts != rel->rd_index->indnkeyatts
&& !P_ISLEAF(lpageop))
{
itup = index_reform_tuple(rel, itup,
rel->rd_index->indnatts, rel->rd_index->indnkeyatts);
}

In index_reform_tuple() I find it a bit scary that you change the
TupleDesc's number of attributes then set it back again once you're
finished reforming the shortened tuple.
Maybe it would be better to modify index_form_tuple() to accept a new
argument with a number of attributes, then you can just Assert that
this number is never higher than the number of attributes in the
TupleDesc.

I'm also not that keen on index_reform_tuple() in general. I wonder if
there's a way we can just keep the Datum/isnull arrays a bit longer,
and only form the tuple when needed. I've not looked into this in
detail, but it does look like reforming the tuple is not going to be
cheap.

If we do need to keep this function, I think a better name might be
index_trim_tuple() and I don't think you need to pass the original
length. It might make sense to Assert() that the trim length is
smaller than the tuple size

What statement will cause this:

numberOfKeyAttributes = list_length(stmt->indexParams);
if (numberOfKeyAttributes <= 0)
    ereport(ERROR,
            (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
             errmsg("must specify at least one key column")));

I seem to just get errors from the parser when trying.


Much of this goes over 80 chars:

/*
* We append any INCLUDING columns onto the indexParams list so that
* we have one list with all columns. Later we can determine which of these
* are key columns, and which are just part of the INCLUDING list by
check the list
* position. A list item in a position less than ii_NumIndexKeyAttrs is part of
* the key columns, and anything equal to and over is part of the
* INCLUDING columns.
*/
stmt->indexParams = list_concat(stmt->indexParams, stmt->indexIncludingParams);

in gistrescan() there is some code:

for (attno = 1; attno <= natts; attno++)
{
    TupleDescInitEntry(so->giststate->fetchTupdesc, attno, NULL,
                       scan->indexRelation->rd_opcintype[attno - 1],
                       -1, 0);
}

Going by RelationInitIndexAccessInfo() rd_opcintype[] is allocated to
be sized by the number of key columns, but this loop goes over the
number of attribute columns.
Perhaps this is not a big problem since GIST does not support
INCLUDING columns, but it does seem wrong still.

Which brings me to the fact that I've spent a bit of time trying to
look for places where you've forgotten to change natts to nkeyatts. I
did find this one, but I don't have much confidence that there's not
lots more places that have been forgotten. Apart from this one, how
confident are you that you've found all the places? I'm getting
towards being happy with the code that I see that's been changed, but
I'm hesitant to mark as "Ready for committer" due to not being all
that comfortable that all the code that needs to be updated has been
updated. I'm not quite sure of a good way to find all these places. I
wondering if hacking the code so that each btree index which is
created with > 1 column puts all but the first column into the
INCLUDING columns, then run the regression tests to see if there are
any crashes. I'm really not that sure of how else to increase the
confidence levels on this. Do you have ideas?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-26 Thread Anastasia Lubennikova

25.01.2016 03:32, Jeff Janes:

On Fri, Jan 22, 2016 at 7:19 AM, Anastasia Lubennikova wrote:

Done. I hope that my patch is close to the commit too.


Thanks for the update.

I've run into this problem:

create table foobar (x text, w text);
create unique index foobar_pkey on foobar (x) including (w);
alter table foobar add constraint foobar_pkey primary key using index
foobar_pkey;

ERROR:  index "foobar_pkey" does not have default sorting behavior
LINE 1: alter table foobar add constraint foobar_pkey primary key us...
^
DETAIL:  Cannot create a primary key or unique constraint using such an index.
Time: 1.577 ms


If I instead define the table as
create table foobar (x int, w xml);

Then I can create the index and then the primary key the first time I
do this in a session.  But then if I drop the table and repeat the
process, I get "does not have default sorting behavior" error even for
this index that previously succeeded, so I think there is some kind of
problem with the backend syscache or catcache.

create table foobar (x int, w xml);
create unique index foobar_pkey on foobar (x) including (w);
alter table foobar add constraint foobar_pkey primary key using index
foobar_pkey;
drop table foobar ;
create table foobar (x int, w xml);
create unique index foobar_pkey on foobar (x) including (w);
alter table foobar add constraint foobar_pkey primary key using index
foobar_pkey;
ERROR:  index "foobar_pkey" does not have default sorting behavior
LINE 1: alter table foobar add constraint foobar_pkey primary key us...
^
DETAIL:  Cannot create a primary key or unique constraint using such an index.


Great, I've fixed that. Thank you for the tip about cache.

I've also found and fixed a related bug in copying tables with indexes:
create table tbl2 (like tbl including all);
And there's one more tiny fix in get_pkey_attnames in dblink module.

including_columns_3.0 is the latest version of the patch.
Changes relative to the previous version are attached in a separate
patch, just to ease review and debugging.


I've changed the size of the pg_index.indclass array. It now contains
indnkeyatts elements, while pg_index.indkey still contains all attributes.
As a result, the query "Retrieve primary key columns" produces a
pretty non-obvious result. Is that normal behavior here, or are some
changes required? Do you know of any similar queries?
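For anyone checking the same thing, here is a sketch of a catalog query that tells key columns apart from included ones under this patch's layout. It assumes pg_index.indnkeyatts as described above; the index name is illustrative:

```sql
-- List an index's columns, flagging which ones are key columns.
-- Assumes the patched catalog: indnkeyatts counts only key columns,
-- while indkey still lists all attributes (key columns first).
SELECT a.attname,
       k.ord <= i.indnkeyatts AS is_key_column
FROM pg_index i
JOIN LATERAL unnest(i.indkey::int2[]) WITH ORDINALITY AS k(attnum, ord)
     ON true
JOIN pg_attribute a
     ON a.attrelid = i.indrelid AND a.attnum = k.attnum
WHERE i.indexrelid = 'tbl_idx'::regclass;  -- illustrative index name
```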


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 9c8e308..ef6aee5 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2058,7 +2058,7 @@ get_pkey_attnames(Relation rel, int16 *numatts)
 		/* we're only interested if it is the primary key */
 		if (index->indisprimary)
 		{
-			*numatts = index->indnatts;
+			*numatts = index->indnkeyatts;
 			if (*numatts > 0)
 			{
 result = (char **) palloc(*numatts * sizeof(char *));
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 412c845..c201f8b 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3515,6 +3515,14 @@
   pg_class.relnatts)
  
 
+  
+  indnkeyatts
+  int2
+  
+  The number of key columns in the index. "Key columns" are ordinary
+  index columns in contrast with "included" columns.
+ 
+
  
   indisunique
   bool
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 5f7befb..aaed5a7 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -111,6 +111,8 @@ typedef struct IndexAmRoutine
 boolamclusterable;
 /* does AM handle predicate locks? */
 boolampredlocks;
+/* does AM support columns included with clause INCLUDING? */
+boolamcaninclude;
 /* type of data stored in index, or InvalidOid if variable */
 Oid amkeytype;
 
@@ -852,7 +854,8 @@ amrestrpos (IndexScanDesc scan);
using unique indexes, which are indexes that disallow
multiple entries with identical keys.  An access method that supports this
feature sets amcanunique true.
-   (At present, only b-tree supports it.)
+   (At present, only b-tree supports it.) Columns which are present in the
+   INCLUDING clause are not used to enforce uniqueness.
   
 
   
diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml
index 23bbec6..09d4e6b 100644
--- a/doc/src/sgml/indices.sgml
+++ b/doc/src/sgml/indices.sgml
@@ -633,7 +633,8 @@ CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST);
Indexes can also be used to enforce uniqueness of a column's value,
or the uniqueness of the combined values of more than one column.
 
-CREATE UNIQUE INDEX name ON table (column , ...);
+CREATE UNIQUE INDEX name ON table (column , ...)
+INCLUDING (column , ...);
 
Currently, only B-tree 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-24 Thread Jeff Janes
On Fri, Jan 22, 2016 at 7:19 AM, Anastasia Lubennikova
 wrote:
>
> Done. I hope that my patch is close to the commit too.
>

Thanks for the update.

I've run into this problem:

create table foobar (x text, w text);
create unique index foobar_pkey on foobar (x) including (w);
alter table foobar add constraint foobar_pkey primary key using index
foobar_pkey;

ERROR:  index "foobar_pkey" does not have default sorting behavior
LINE 1: alter table foobar add constraint foobar_pkey primary key us...
   ^
DETAIL:  Cannot create a primary key or unique constraint using such an index.
Time: 1.577 ms


If I instead define the table as
create table foobar (x int, w xml);

Then I can create the index and then the primary key the first time I
do this in a session.  But then if I drop the table and repeat the
process, I get "does not have default sorting behavior" error even for
this index that previously succeeded, so I think there is some kind of
problem with the backend syscache or catcache.

create table foobar (x int, w xml);
create unique index foobar_pkey on foobar (x) including (w);
alter table foobar add constraint foobar_pkey primary key using index
foobar_pkey;
drop table foobar ;
create table foobar (x int, w xml);
create unique index foobar_pkey on foobar (x) including (w);
alter table foobar add constraint foobar_pkey primary key using index
foobar_pkey;
ERROR:  index "foobar_pkey" does not have default sorting behavior
LINE 1: alter table foobar add constraint foobar_pkey primary key us...
   ^
DETAIL:  Cannot create a primary key or unique constraint using such an index.

Cheers,

Jeff




Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-22 Thread Anastasia Lubennikova

22.01.2016 01:47, David Rowley:

On 20 January 2016 at 06:08, Anastasia Lubennikova
 wrote:



18.01.2016 01:02, David Rowley wrote:

On 14 January 2016 at 08:24, David Rowley  wrote:

I will try to review the omit_opclass_4.0.patch soon.


Hi, as promised, here's my review of the omit_opclass_4.0.patch patch.

Thank you again. All mentioned points are fixed and patches are merged.
I hope it's all right now. Please check comments one more time. I rather doubt 
that I wrote everything correctly.


Thanks for updating.

+for the searching or ordering of records can defined in the

should be:

+for the searching or ordering of records can be defined in the

but perhaps "defined" should be "included".

The following is still quite wasteful. CopyIndexTuple() does a
palloc() and memcpy(), and then you throw that away if
rel->rd_index->indnatts != rel->rd_index->indnkeyatts. I think you
just need to add an "else" and move the CopyIndexTuple() below the if.

item = (IndexTuple) PageGetItem(lpage, itemid);
   right_item = CopyIndexTuple(item);
+ if (rel->rd_index->indnatts != rel->rd_index->indnkeyatts)
+ right_item = index_reform_tuple(rel, right_item,
rel->rd_index->indnatts, rel->rd_index->indnkeyatts);
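If I've understood the suggestion, the restructured code would look roughly like this (a sketch only, not compiled; it reuses the names from the quoted fragment and assumes index_reform_tuple builds a new tuple rather than modifying its argument in place):

```c
item = (IndexTuple) PageGetItem(lpage, itemid);
if (rel->rd_index->indnatts != rel->rd_index->indnkeyatts)
    /* build the truncated tuple directly; no throwaway copy */
    right_item = index_reform_tuple(rel, item,
                                    rel->rd_index->indnatts,
                                    rel->rd_index->indnkeyatts);
else
    right_item = CopyIndexTuple(item);
```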

Fixed. Thank you for reminding me.

Tom also commited
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=65c5fcd353a859da9e61bfb2b92a99f12937de3b
So it looks like you'll need to update your pg_am.h changes. Looks
like you'll need a new struct member in IndexAmRoutine and just
populate that new member in each of the *handler functions listed in
pg_am.h

-#define Natts_pg_am 30
+#define Natts_pg_am 31

Done. I hope that my patch is close to the commit too.

Thank you again for review.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 412c845..c201f8b 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3515,6 +3515,14 @@
   pg_class.relnatts)
  
 
+  
+  indnkeyatts
+  int2
+  
+  The number of key columns in the index. "Key columns" are ordinary
+  index columns in contrast with "included" columns.
+ 
+
  
   indisunique
   bool
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 5f7befb..ee40309 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -111,6 +111,8 @@ typedef struct IndexAmRoutine
 boolamclusterable;
 /* does AM handle predicate locks? */
 boolampredlocks;
+/* does AM support columns included with clause INCLUDING? */
+boolamcaninclude;
 /* type of data stored in index, or InvalidOid if variable */
 Oid amkeytype;
 
@@ -852,7 +854,8 @@ amrestrpos (IndexScanDesc scan);
using unique indexes, which are indexes that disallow
multiple entries with identical keys.  An access method that supports this
feature sets amcanunique true.
-   (At present, only b-tree supports it.)
+   (At present, only b-tree supports it.) Columns which are present in the
+   INCLUDING clause are not used to enforce uniqueness.
   
 
   
diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml
index 23bbec6..09d4e6b 100644
--- a/doc/src/sgml/indices.sgml
+++ b/doc/src/sgml/indices.sgml
@@ -633,7 +633,8 @@ CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST);
Indexes can also be used to enforce uniqueness of a column's value,
or the uniqueness of the combined values of more than one column.
 
-CREATE UNIQUE INDEX name ON table (column , ...);
+CREATE UNIQUE INDEX name ON table (column , ...)
+INCLUDING (column , ...);
 
Currently, only B-tree indexes can be declared unique.
   
@@ -642,7 +643,9 @@ CREATE UNIQUE INDEX name ON table
+   Columns included with clause INCLUDING aren't used to enforce constraints (UNIQUE,
+   PRIMARY KEY, etc).
   
 
   
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index ce36a1b..8011d6c 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -23,6 +23,7 @@ PostgreSQL documentation
 
 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON table_name [ USING method ]
 ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
+[ INCLUDING ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
 [ WITH ( storage_parameter = value [, ... ] ) ]
 [ TABLESPACE tablespace_name ]
 [ WHERE predicate ]
@@ -139,6 +140,32 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 
  
+  INCLUDING
+  
+   
+An optional INCLUDING clause allows a list of columns to be
+specified which will be included in the index, in the non-key portion of
+the index.

Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-21 Thread Jeff Janes
On Tue, Jan 19, 2016 at 9:08 AM, Anastasia Lubennikova
 wrote:
>
>
> 18.01.2016 01:02, David Rowley wrote:
>
> On 14 January 2016 at 08:24, David Rowley 
> wrote:
>>
>> I will try to review the omit_opclass_4.0.patch soon.
>
>
> Hi, as promised, here's my review of the omit_opclass_4.0.patch patch.
>
> Thank you again. All mentioned points are fixed and patches are merged.
> I hope it's all right now. Please check comments one more time. I rather
> doubt that I wrote everything correctly.

Unfortunately there are several merge conflicts between your patch and
this commit:

commit 65c5fcd353a859da9e61bfb2b92a99f12937de3b
Author: Tom Lane 
Date:   Sun Jan 17 19:36:59 2016 -0500

Restructure index access method API to hide most of it at the C level.


Can you rebase past that commit?

Thanks,

Jeff




Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-21 Thread David Rowley
On 20 January 2016 at 06:08, Anastasia Lubennikova
 wrote:
>
>
>
> 18.01.2016 01:02, David Rowley wrote:
>
> On 14 January 2016 at 08:24, David Rowley  
> wrote:
>>
>> I will try to review the omit_opclass_4.0.patch soon.
>
>
> Hi, as promised, here's my review of the omit_opclass_4.0.patch patch.
>
> Thank you again. All mentioned points are fixed and patches are merged.
> I hope it's all right now. Please check comments one more time. I rather 
> doubt that I wrote everything correctly.


Thanks for updating.

+for the searching or ordering of records can defined in the

should be:

+for the searching or ordering of records can be defined in the

but perhaps "defined" should be "included".

The following is still quite wasteful. CopyIndexTuple() does a
palloc() and memcpy(), and then you throw that away if
rel->rd_index->indnatts != rel->rd_index->indnkeyatts. I think you
just need to add an "else" and move the CopyIndexTuple() below the if.

item = (IndexTuple) PageGetItem(lpage, itemid);
  right_item = CopyIndexTuple(item);
+ if (rel->rd_index->indnatts != rel->rd_index->indnkeyatts)
+ right_item = index_reform_tuple(rel, right_item,
rel->rd_index->indnatts, rel->rd_index->indnkeyatts);

Tom also commited
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=65c5fcd353a859da9e61bfb2b92a99f12937de3b
So it looks like you'll need to update your pg_am.h changes. Looks
like you'll need a new struct member in IndexAmRoutine and just
populate that new member in each of the *handler functions listed in
pg_am.h

-#define Natts_pg_am 30
+#define Natts_pg_am 31

Can the following be changed to

-   (At present, only b-tree supports it.)
+   (At present, only b-tree supports it.) Columns included with clause
+   INCLUDING  aren't used to enforce uniqueness.

-   (At present, only b-tree supports it.)
+   (At present, only b-tree supports it.) Columns which are present in the
 INCLUDING clause are not used to enforce uniqueness.

> Also this makes me think that the name ii_KeyAttrNumbers is now out-of-date, 
> as it contains
> the including columns too by the looks of it. Maybe it just needs to drop the 
> "Key" and become
> "ii_AttrNumbers". It would be interesting to hear what others think of that.
>
> I'm also wondering if indexkeys is still a good name for the IndexOptInfo 
> struct member.
> Including columns are not really keys, but I feel renaming that might cause a 
> fair bit of code churn, so I'd be interested to hear what other's have to say.
>
>
> I agree that KeyAttrNumbers and indexkeys are a bit confusing names, but I'd 
> like to keep them at least in this patch.
> It may be worth doing "index structures refactoring" as a separate patch.


I agree. A separate patch sounds like the best course of action, but
authoring that can wait until after this is committed (I think).

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-19 Thread Anastasia Lubennikova



18.01.2016 01:02, David Rowley wrote:
On 14 January 2016 at 08:24, David Rowley wrote:


I will try to review the omit_opclass_4.0.patch soon.


Hi, as promised, here's my review of the omit_opclass_4.0.patch patch.


Thank you again. All mentioned points are fixed and patches are merged.
I hope it's all right now. Please check comments one more time. I rather 
doubt that I wrote everything correctly.
Also this makes me think that the name ii_KeyAttrNumbers is now 
out-of-date, as it contains
the including columns too by the looks of it. Maybe it just needs to 
drop the "Key" and become
"ii_AttrNumbers". It would be interesting to hear what others think of 
that.


I'm also wondering if indexkeys is still a good name for the 
IndexOptInfo struct member.
Including columns are not really keys, but I feel renaming that might 
cause a fair bit of code churn, so I'd be interested to hear what 
other's have to say.


I agree that KeyAttrNumbers and indexkeys are a bit confusing names, but 
I'd like to keep them at least in this patch.

It may be worth doing "index structures refactoring" as a separate patch.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 97ef618..d17a06c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -644,6 +644,13 @@
   Does an index of this type manage fine-grained predicate locks?
  
 
+  
+  amcaninclude
+  bool
+  
+  Does the access method support included columns?
+ 
+
  
   amkeytype
   oid
@@ -3714,6 +3721,14 @@
   pg_class.relnatts)
  
 
+  
+  indnkeyatts
+  int2
+  
+  The number of key columns in the index. "Key columns" are ordinary
+  index columns in contrast with "included" columns.
+ 
+
  
   indisunique
   bool
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 1c09bae..d01af17 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -767,7 +767,8 @@ amrestrpos (IndexScanDesc scan);
using unique indexes, which are indexes that disallow
multiple entries with identical keys.  An access method that supports this
feature sets pg_am.amcanunique true.
-   (At present, only b-tree supports it.)
+   (At present, only b-tree supports it.) Columns included with clause
+   INCLUDING  aren't used to enforce uniqueness.
   
 
   
diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml
index 23bbec6..09d4e6b 100644
--- a/doc/src/sgml/indices.sgml
+++ b/doc/src/sgml/indices.sgml
@@ -633,7 +633,8 @@ CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST);
Indexes can also be used to enforce uniqueness of a column's value,
or the uniqueness of the combined values of more than one column.
 
-CREATE UNIQUE INDEX name ON table (column , ...);
+CREATE UNIQUE INDEX name ON table (column , ...)
+INCLUDING (column , ...);
 
Currently, only B-tree indexes can be declared unique.
   
@@ -642,7 +643,9 @@ CREATE UNIQUE INDEX name ON table
+   Columns included with clause INCLUDING aren't used to enforce constraints (UNIQUE,
+   PRIMARY KEY, etc).
   
 
   
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index ce36a1b..8360bb6 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -23,6 +23,7 @@ PostgreSQL documentation
 
 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON table_name [ USING method ]
 ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
+[ INCLUDING ( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
 [ WITH ( storage_parameter = value [, ... ] ) ]
 [ TABLESPACE tablespace_name ]
 [ WHERE predicate ]
@@ -139,6 +140,32 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 
  
+  INCLUDING
+  
+   
+An optional INCLUDING clause allows a list of columns to be
+specified which will be included in the index, in the non-key portion of
+the index. Columns which are part of this clause cannot also exist in the
+key columns portion of the index, and vice versa. The
+INCLUDING columns exist solely to allow more queries to benefit
+from index-only scans by including certain columns in the
+index, the value of which would otherwise have to be obtained by reading
+the table's heap. Having these columns in the INCLUDING clause
+in some cases allows PostgreSQL to skip the heap read
+completely. This also allows UNIQUE indexes to be defined on
+one set of columns, which can include another set of column in the
+INCLUDING clause, on which the uniqueness is not enforced.

Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-19 Thread Anastasia Lubennikova



12.01.2016 20:47, Jeff Janes:

It looks like the "covering" patch, with or without the "omit_opclass"
patch, does not support expressions as included columns:

create table foobar (x text, y xml);
create index on foobar (x) including  (md5(x));
ERROR:  unrecognized node type: 904
create index on foobar (x) including  ((y::text));
ERROR:  unrecognized node type: 911

I think we would probably want it to work with those (or at least to
throw a better error message).
Thank you for the notice. I couldn't fix it quickly and added a stub in 
the latest patch.

But I'll try to fix it and add expressions support a bit later.
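In the meantime, here is a workaround sketch for covering an expression's value (my own suggestion, not part of the patch; the x_md5 column is made up, and its value would have to be kept up to date by the application or a trigger):

```sql
-- Materialize the expression into a real column, then include that
-- column instead of the expression itself.
ALTER TABLE foobar ADD COLUMN x_md5 text;
UPDATE foobar SET x_md5 = md5(x);
CREATE INDEX ON foobar (x) INCLUDING (x_md5);
```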

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-17 Thread David Rowley
On 14 January 2016 at 08:24, David Rowley 
wrote:

> I will try to review the omit_opclass_4.0.patch soon.
>

Hi, as promised, here's my review of the omit_opclass_4.0.patch patch.

The following comment needs to be updated:

 * indexkeys[], indexcollations[], opfamily[], and opcintype[]
 * each have ncolumns entries.

I think you'll need to do quite a bit of refactoring in this comment to
explain how it all works now, and which arrays we expect to be which length.

The omit_opclass_4.0.patch patch should remove the following comment which
you added in the other patch:

/* TODO
* All these arrays below still have length = ncolumns.
* Fix, when optional opclass functionality will be added.
*
* Generally, any column could be returned by IndexOnlyScan.
* Even if it doesn't have opclass for that type of index.
*
* For example,
* we have an index "create index on tbl(c1) including c2".
* If there's no suitable oplass on c2
* query "select c2 from tbl where c2 < 10" can't use index-only scan
* and query "select c2 from tbl where c1 < 10" can.
* But now it doesn't because of requirement that
* each indexed column must have an opclass.
*/

The following comment should be updated to mention that this is only the
case for
key attributes, and we just take the type from the index for including
attributes.
Perhaps the comment is better outside of the if (i < nkeyatts) block too,
and just
explain both at once.

/*
* The provided data is not necessarily of the type stored in the
* index; rather it is of the index opclass's input type. So look
* at rd_opcintype not the index tupdesc.
*
* Note: this is a bit shaky for opclasses that have pseudotype
* input types such as ANYARRAY or RECORD.  Currently, the
* typoutput functions associated with the pseudotypes will work
* okay, but we might have to try harder in future.
*/

In BuildIndexInfo() numKeys is a bit confusing. Perhaps that needs to be
renamed to numAtts?
Also this makes me think that the name ii_KeyAttrNumbers is now
out-of-date, as it contains
the including columns too by the looks of it. Maybe it just needs to drop
the "Key" and become
"ii_AttrNumbers". It would be interesting to hear what others think of that.

IndexInfo *
BuildIndexInfo(Relation index)
{
IndexInfo  *ii = makeNode(IndexInfo);
Form_pg_index indexStruct = index->rd_index;
int i;
int numKeys;

/* check the number of keys, and copy attr numbers into the IndexInfo */
numKeys = indexStruct->indnatts;
if (numKeys < 1 || numKeys > INDEX_MAX_KEYS)
elog(ERROR, "invalid indnatts %d for index %u",
numKeys, RelationGetRelid(index));
ii->ii_NumIndexAttrs = numKeys;
ii->ii_NumIndexKeyAttrs = indexStruct->indnkeyatts;
Assert(ii->ii_NumIndexKeyAttrs != 0);
Assert(ii->ii_NumIndexKeyAttrs <= ii->ii_NumIndexAttrs);


Here you've pushed a chunk of code over one tab, but you don't have to do
that. Just add:

+ if (i >= indexInfo->ii_NumIndexKeyAttrs)
+ continue;

This'll make the patch a bit smaller. Also, maybe it's time to get rid of
your debug stuff that you've commented out?

for (i = 0; i < numKeys; i++)
ii->ii_KeyAttrNumbers[i] = indexStruct->indkey.values[i];
- if (OidIsValid(keyType) && keyType != to->atttypid)
+ if (i < indexInfo->ii_NumIndexKeyAttrs)
  {
- /* index value and heap value have different types */
- tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(keyType));
+ /*
+ * Check the opclass and index AM to see if either provides a keytype


Same for this part:

- /*
- * Identify the exclusion operator, if any.
- */
- if (nextExclOp)
+ if (attn < nkeycols)

Could become:

+ if (attn >= nkeycols)
+ continue;


I'm also wondering if indexkeys is still a good name for the IndexOptInfo
struct member.
Including columns are not really keys, but I feel renaming that might cause
a fair bit of code churn, so I'd be interested to hear what other's have to
say.

  info->indexkeys = (int *) palloc(sizeof(int) * ncolumns);
- info->indexcollations = (Oid *) palloc(sizeof(Oid) * ncolumns);
- info->opfamily = (Oid *) palloc(sizeof(Oid) * ncolumns);
- info->opcintype = (Oid *) palloc(sizeof(Oid) * ncolumns);
+ info->indexcollations = (Oid *) palloc(sizeof(Oid) * nkeycolumns);
+ info->opfamily = (Oid *) palloc(sizeof(Oid) * nkeycolumns);
+ info->opcintype = (Oid *) palloc(sizeof(Oid) * nkeycolumns);


In quite a few places you do: int natts, nkeyatts;
but the areas where you've done this don't seem to ever declare multiple
variables per type. Maybe it's best to follow what's there and just write
"int" again on the next line.

If you submit an updated patch I can start looking over the change fairly
soon.

Many thanks

David

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-13 Thread Anastasia Lubennikova



13.01.2016 04:47, David Rowley:
On 13 January 2016 at 06:47, Jeff Janes wrote:



Why is omit_opclass a separate patch?  If the included columns now
never participate in the index ordering, shouldn't it be an inherent
property of the main patch that you can "cover" things without btree
opclasses?


I don't personally think the covering_unique_4.0.patch is that close
to being too big to review; I think things would make more sense if
the omit_opclass_4.0.patch was included together with this.




I agree that these patches should be merged. It'll be fixed in the next
updates.
I kept them separate only for historical reasons; it was more convenient
for me to debug them. Furthermore, I wanted to show some performance
degradation caused by "omit_opclass" and give a way to reproduce it by
running the test with and without the patch.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-13 Thread Anastasia Lubennikova

13.01.2016 04:27, David Rowley:

I've also done some testing:

create table ab (a int, b int);
insert into ab select a,b from generate_Series(1,10) a(a), 
generate_series(1,1) b(b);

set enable_bitmapscan=off;
set enable_indexscan=off;

select * from ab where a = 1 and b=1;
 a | b
---+---
 1 | 1
(1 row)

set enable_indexscan = on;
select * from ab where a = 1 and b=1;
 a | b
---+---
(0 rows)

This is broken. I've not looked into why yet, but from looking at the 
EXPLAIN output I was a bit surprised to see b=1 as an index condition. 
I'd have expected a Filter maybe, but I've not looked at the EXPLAIN 
code to see how those are determined yet.


Hmm... Do you use both patches?
And could you provide the index definition? I can't reproduce the problem
assuming that the index is created by the statement

CREATE INDEX idx ON ab (a) INCLUDING (b);

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-13 Thread David Rowley
On 14 January 2016 at 02:58, Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

> 13.01.2016 04:27, David Rowley:
>
>> I've also done some testing:
>>
>> create table ab (a int, b int);
>> insert into ab select a,b from generate_Series(1,10) a(a),
>> generate_series(1,1) b(b);
>> set enable_bitmapscan=off;
>> set enable_indexscan=off;
>>
>> select * from ab where a = 1 and b=1;
>>  a | b
>> ---+---
>>  1 | 1
>> (1 row)
>>
>> set enable_indexscan = on;
>> select * from ab where a = 1 and b=1;
>>  a | b
>> ---+---
>> (0 rows)
>>
>> This is broken. I've not looked into why yet, but from looking at the
>> EXPLAIN output I was a bit surprised to see b=1 as an index condition. I'd
>> have expected a Filter maybe, but I've not looked at the EXPLAIN code to
>> see how those are determined yet.
>>
>
> Hmm... Do you use both patches?
> And could you provide index definition, I can't reproduce the problem
> assuming that index is created by the statement
> CREATE INDEX idx ON ab (a) INCLUDING (b);


Sorry, I forgot the index, and yes you guessed correctly about that.

The problem only exists without the omit_opclass_4.0.patch and with the
covering_unique_4.0.patch, so please ignore.

I will try to review the omit_opclass_4.0.patch soon.

David

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-12 Thread Anastasia Lubennikova


04.01.2016 11:49, David Rowley:
On 2 December 2015 at 01:53, Anastasia Lubennikova wrote:


Finally, completed patch "covering_unique_3.0.patch" is here.
It includes the functionality discussed above in the thread,
regression tests and docs update.
I think it's quite ready for review.


Hi Anastasia,

I've maybe mentioned before that I think this is a great feature and I 
think it will be very useful to have, so I've signed up to review the 
patch, and below are the results of my first pass from reading the
code. Apologies if some of the things seem like nitpicks; I've
basically just listed everything I've noticed during review, no matter how small.


First of all, I would like to thank you for writing such a detailed review.
All mentioned style problems, comments and typos are fixed in the patch 
v4.0.
+   An access method that supports this feature sets 
pg_am.amcanincluding true.


I don't think this belongs under the "Index Uniqueness Checks" title. 
I think the "Columns included with clause INCLUDING  aren't used to 
enforce uniqueness." that you've added before it is a good idea, but 
perhaps the details of amcanincluding are best explained elsewhere.

agree


+This clause specifies additional columns to be appended to 
the set of index columns.
+Included columns don't support any constraints 
(UNIQUE, PRMARY KEY, EXCLUSION CONSTRAINT).
+These columns can improve the performance of some queries 
 through using advantages of index-only scan
+(Or so called covering indexes. 
Covering index is the index that
+covers all columns required in the query and prevents a table 
access).
+Besides that, included attributes are not stored in index 
inner pages.
+It allows to decrease index size and furthermore it provides 
a way to extend included
+columns to store atttributes without suitable opclass (not 
implemented yet).
+This clause could be applied to both unique and nonunique 
indexes.
+It's possible to have non-unique covering index, which 
behaves as a regular

+multi-column index with a bit smaller index-size.
+Currently, only the B-tree access method supports this feature.

"PRMARY KEY" should be "PRIMARY KEY". I ended up rewriting this 
paragraph as follows.


"An optional INCLUDING clause allows a list of columns to 
be specified which will be included in the index, in the non-key 
portion of the index. Columns which are part of this clause cannot 
also exist in the indexed columns portion of the index, and vice 
versa. The INCLUDING columns exist solely to allow more 
queries to benefit from index only scans by including 
certain columns in the index, the value of which would otherwise have 
to be obtained by reading the table's heap. Having these columns in 
the INCLUDING clause in some cases allows 
PostgreSQL to skip the heap read completely. This also 
allows UNIQUE indexes to be defined on one set of columns, 
which can include another set of column in the INCLUDING 
clause, on which the uniqueness is not enforced upon. This can also be 
useful for non-unique indexes as any columns which are not required 
for the searching or ordering of records can defined in the 
INCLUDING clause, which can often reduce the size of the 
index."


Maybe not perfect, but maybe it's an improvement?
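Concretely, that paragraph describes usage like the following (a made-up example using the INCLUDING spelling from this patch; table and column names are mine):

```sql
CREATE TABLE orders (order_id int, customer_id int, note text);

-- Uniqueness is enforced on order_id only; note is carried in the
-- index purely so that queries reading it can use index-only scans.
CREATE UNIQUE INDEX orders_covering_idx
    ON orders (order_id) INCLUDING (note);

-- Can be answered from the index alone (once the visibility map
-- allows it), without touching the heap:
SELECT order_id, note FROM orders WHERE order_id = 42;
```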



Yes, this explanation is much better. I've just added a couple of notes.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 97ef618..d17a06c 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -644,6 +644,13 @@
   Does an index of this type manage fine-grained predicate locks?
  
 
+  
+  amcaninclude
+  bool
+  
+  Does the access method support included columns?
+ 
+
  
   amkeytype
   oid
@@ -3714,6 +3721,14 @@
   pg_class.relnatts)
  
 
+  
+  indnkeyatts
+  int2
+  
+  The number of key columns in the index. "Key columns" are ordinary
+  index columns in contrast with "included" columns.
+ 
+
  
   indisunique
   bool
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index 1c09bae..a102391 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -765,9 +765,10 @@ amrestrpos (IndexScanDesc scan);
   
PostgreSQL enforces SQL uniqueness constraints
using unique indexes, which are indexes that disallow
-   multiple entries with identical keys.  An access method that supports this
+   multiple entries with identical keys. An access method that supports this
feature sets pg_am.amcanunique true.
-   (At present, only b-tree supports it.)
+   Columns included with clause INCLUDING  aren't used to enforce uniqueness.
+   (At present, only b-tree supports them.)

Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-12 Thread Jeff Janes
On Tue, Jan 12, 2016 at 8:59 AM, Anastasia Lubennikova
 wrote:
> 08.01.2016 00:12, David Rowley:
>
> On 7 January 2016 at 06:36, Jeff Janes  wrote:
>>

> But now I see the reason to create non-unique index with included columns -
> lack of suitable opclass on column "b".
> It's impossible to add it into the index as a key column, but that's not a
> problem with INCLUDING clause.
> Look at example.
>
> create table t1 (a int, b box);
> create index t1_a_inc_b_idx on t1 (a) including (b);
> create index on t1 (a,b);
> ERROR:  data type box has no default operator class for access method
> "btree"
> HINT:  You must specify an operator class for the index or define a default
> operator class for the data type.
> create index on t1 (a) including (b);
> CREATE INDEX
>
> This functionality is provided by the attached patch "omit_opclass_4.0",
> which must be applied over covering_unique_4.0.patch.

Thanks for the updates.

Why is omit_opclass a separate patch?  If the included columns now
never participate in the index ordering, shouldn't it be an inherent
property of the main patch that you can "cover" things without btree
opclasses?

Are you keeping them separate just to make review easier?  Or do you
think there might be a reason to commit one but not the other?  I
think that if we decide not to use the omit_opclass patch, then we
should also not allow covering columns to be specified on non-unique
indexes.

It looks like the "covering" patch, with or without the "omit_opclass"
patch, does not support expressions as included columns:

create table foobar (x text, y xml);
create index on foobar (x) including  (md5(x));
ERROR:  unrecognized node type: 904
create index on foobar (x) including  ((y::text));
ERROR:  unrecognized node type: 911

I think we would probably want it to work with those (or at least to
throw a better error message).

Thanks,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-12 Thread David Rowley
On 13 January 2016 at 05:59, Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

> 08.01.2016 00:12, David Rowley:
>
> On 7 January 2016 at 06:36, Jeff Janes  wrote:
>
>> On Tue, Jan 5, 2016 at 11:55 PM, David Rowley
>>  wrote:
>> > create table ab (a int,b int);
>> > insert into ab select x,y from generate_series(1,20) x(x),
>> > generate_series(10,1,-1) y(y);
>> > create index on ab (a) including (b);
>> > explain select * from ab order by a,b;
>> > QUERY PLAN
>> > --
>> >  Sort  (cost=10.64..11.14 rows=200 width=8)
>> >Sort Key: a, b
>> >->  Seq Scan on ab  (cost=0.00..3.00 rows=200 width=8)
>> > (3 rows)
>>
>> If you set enable_sort=off, then you get the index-only scan with no
>> sort.  So it believes the index can be used for ordering (correctly, I
>> think), just sometimes it thinks it is not faster to do it that way.
>>
>> I'm not sure why this would be a correctness problem.  The covered
>> column does not participate in uniqueness checks, but it still usually
>> participates in index ordering.  (That is why dummy op-classes are
>> needed if you want to include non-sortable-type columns as being
>> covered.)
>>
>
> If that's the case, then it appears that I've misunderstood INCLUDING.
> From reading _bt_doinsert() it appeared that it'll ignore the INCLUDING
> columns and just find the insert position based on the key columns. Yet
> that's not the way that it appears to work. I was also a bit confused, as
> from working with another database which has very similar syntax to this,
> that one only includes the columns to allow index only scans, and the
> included columns are not indexed, therefore can't be part of index quals
> and the index only provides a sorted path for the indexed columns, and not
> the included columns.
>
>
> Thank you for properly testing. The ORDER BY clause in this case definitely
> doesn't work as expected.
> The problem is fixed by patching the planner function
> build_index_pathkeys(). It disables use of the index if sorting by included
> columns is required.
> Test example works correctly now - it always performs seq scan and sort.
>
>
Thank you for updating the patch.
That's cleared up my confusion. All the code I read seemed to indicate that
INCLUDING columns were leaf only; it just confused me as to why the index
appeared to search and order on all columns, including the INCLUDING
columns. Thanks for clearing up my confusion and fixing the patch.


> Saying that, I'm now a bit confused to why the following does not produce
> 2 indexes which are the same size:
>
> create table t1 (a int, b text);
> insert into t1 select x,md5(random()::text) from
> generate_series(1,100) x(x);
> create index t1_a_inc_b_idx on t1 (a) including (b);
> create index t1_a_b_idx on t1 (a,b);
> select pg_relation_Size('t1_a_b_idx'),pg_relation_size('t1_a_inc_b_idx');
>  pg_relation_size | pg_relation_size
> --+--
>  59064320 | 58744832
> (1 row)
>
>
> I suppose you've already found that in discussion above. Included columns
> are stored only in leaf index pages. The difference is the size of
> attributes 'b' which are situated in inner pages of index "t1_a_b_idx".
>

Yeah, I saw that from the code too. I just was confused as they appeared to
work like normal indexes.

I've made another pass of the covering_unique_4.0.patch. Again somethings
are nit picky (sorry), but it made sense to write them down as I noticed
them.

-   multiple entries with identical keys.  An access method that supports
this
+   multiple entries with identical keys. An access method that supports
this

Space removed by mistake.

feature sets pg_am.amcanunique true.
-   (At present, only b-tree supports it.)
+   Columns included with clause INCLUDING  aren't used to enforce
uniqueness.
+   (At present, only b-tree supports them.)

Maybe

+   (At present amcanunique is only supported by b-tree
+   indexes.)

The extra line you've added confuses what "it" or "them" refers to, so it's
maybe best to clarify that.


+   INCLUDING aren't used to enforce constraints
(UNIQUE, PRIMARY KEY, etc).

Goes beyond 80 chars.


  right_item = CopyIndexTuple(item);
+ right_item = index_reform_tuple(rel, right_item, rel->rd_index->indnatts,
rel->rd_index->indnkeyatts);

Duplicate assignment. Should this perhaps be:

+ if (rel->rd_index->indnatts == rel->rd_index->indnkeyatts)
+   right_item = CopyIndexTuple(item);
+ else
+ right_item = index_reform_tuple(rel, right_item, rel->rd_index->indnatts,
rel->rd_index->indnkeyatts);

?

- natts = RelationGetNumberOfAttributes(rel);
- indoption = rel->rd_indoption;

- skey = (ScanKey) palloc(natts * sizeof(ScanKeyData));
+ Assert(rel->rd_index->indnkeyatts != 0);
+ Assert(rel->rd_index->indnkeyatts <= rel->rd_index->indnatts);

- for (i = 0; i < natts; i++)
+ nkeyatts = rel->rd_index->indnkeyatts;

Since 

Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-12 Thread David Rowley
On 13 January 2016 at 06:47, Jeff Janes  wrote:

>
> Why is omit_opclass a separate patch?  If the included columns now
> never participate in the index ordering, shouldn't it be an inherent
> property of the main patch that you can "cover" things without btree
> opclasses?
>
>
I also wondered this. We can't have covering indexes without fixing the
problem with the following arrays:

  info->indexkeys = (int *) palloc(sizeof(int) * ncolumns);
  info->indexcollations = (Oid *) palloc(sizeof(Oid) * ncolumns);
  info->opfamily = (Oid *) palloc(sizeof(Oid) * ncolumns);

These need to be sized according to the number of key columns, not the
total number of columns. Of course, the TODO item in the patch states this
too.

I don't personally think the covering_unique_4.0.patch is that close to
being too big to review; I think things would make more sense if the
omit_opclass_4.0.patch was included together with this.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-12 Thread Anastasia Lubennikova

08.01.2016 00:12, David Rowley:
On 7 January 2016 at 06:36, Jeff Janes wrote:


On Tue, Jan 5, 2016 at 11:55 PM, David Rowley wrote:
> create table ab (a int,b int);
> insert into ab select x,y from generate_series(1,20) x(x),
> generate_series(10,1,-1) y(y);
> create index on ab (a) including (b);
> explain select * from ab order by a,b;
> QUERY PLAN
> --
>  Sort  (cost=10.64..11.14 rows=200 width=8)
>Sort Key: a, b
>->  Seq Scan on ab  (cost=0.00..3.00 rows=200 width=8)
> (3 rows)

If you set enable_sort=off, then you get the index-only scan with no
sort.  So it believes the index can be used for ordering (correctly, I
think), just sometimes it thinks it is not faster to do it that way.

I'm not sure why this would be a correctness problem. The covered
column does not participate in uniqueness checks, but it still usually
participates in index ordering.  (That is why dummy op-classes are
needed if you want to include non-sortable-type columns as being
covered.)


If that's the case, then it appears that I've misunderstood INCLUDING. 
From reading _bt_doinsert() it appeared that it'll ignore the 
INCLUDING columns and just find the insert position based on the key 
columns. Yet that's not the way that it appears to work. I was also a 
bit confused, as from working with another database which has very 
similar syntax to this, that one only includes the columns to allow 
index only scans, and the included columns are not indexed, therefore 
can't be part of index quals and the index only provides a sorted path 
for the indexed columns, and not the included columns.


Thank you for properly testing. The ORDER BY clause in this case definitely 
doesn't work as expected.
The problem is fixed by patching the planner function 
build_index_pathkeys(). It disables use of the index if sorting by 
included columns is required.

Test example works correctly now - it always performs seq scan and sort.

Saying that, I'm now a bit confused to why the following does not 
produce 2 indexes which are the same size:


create table t1 (a int, b text);
insert into t1 select x,md5(random()::text) from 
generate_series(1,100) x(x);

create index t1_a_inc_b_idx on t1 (a) including (b);
create index t1_a_b_idx on t1 (a,b);
select pg_relation_Size('t1_a_b_idx'),pg_relation_size('t1_a_inc_b_idx');
 pg_relation_size | pg_relation_size
--+--
 59064320 | 58744832
(1 row)


I suppose you've already found that in the discussion above. Included 
columns are stored only in leaf index pages. The difference is the size 
of the 'b' attributes which are situated in the inner pages of index "t1_a_b_idx".


Also, if we want INCLUDING() to mean "uniqueness is not enforced on 
these columns, but they're still in the index", then I don't really 
think allowing types without a btree opclass is a good idea. It's 
likely too surprise-filled and might not be what the user actually 
wants. I'd suggest that these non-indexed columns would be better 
defined by further expanding the syntax, the first (perhaps not very 
good) thing that comes to mind is:


create unique index idx_name on table (unique_col) also index 
(other,idx,cols) including (leaf,onlycols);


Looking up thread, I don't think I was the first to be confused by this.


Included columns are still in the index physically - they are stored in 
the index relation. But they are not indexed in the true sense of the 
word. It's impossible to use them for index scans or ordering. At the 
beginning, I had the idea that included columns were supposed to be 
used for a combination of a unique index on some columns and covering on 
others. In very rare instances one could prefer a non-unique index 
with included columns, "t1_a_inc_b_idx", to a regular multicolumn index 
"t1_a_b_idx". Frankly, I didn't see such use cases at all. The index size 
reduction is not considerable, while we lose some useful index 
functionality on the included columns. I think that it should be mentioned as 
a note in the documentation, but I need help to phrase it clearly.
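As a starting point for such a note, a sketch of the limitation (patch syntax against the patched server, illustrative queries):

```sql
create table t1 (a int, b text);
create index t1_a_inc_b_idx on t1 (a) including (b);

-- b can be returned by an index-only scan, but it cannot drive one:
-- a predicate on b shows up as a Filter rather than an Index Cond,
-- and the index provides no sorted path for (a, b).
explain select a, b from t1 where a = 1 and b = 'x';
explain select a, b from t1 order by a, b;
```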


But now I see a reason to create a non-unique index with included 
columns - the lack of a suitable opclass on column "b".
It's impossible to add it to the index as a key column, but that's not 
a problem with the INCLUDING clause.

Look at example.

create table t1 (a int, b box);
create index t1_a_inc_b_idx on t1 (a) including (b);
create index on t1 (a,b);
ERROR:  data type box has no default operator class for access method 
"btree"
HINT:  You must specify an operator class for the index or define a 
default operator class for the data type.

create index on t1 (a) including (b);
CREATE INDEX

This functionality is provided by the attached patch "omit_opclass_4.0", 
which must be applied over covering_unique_4.0.patch.

Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-07 Thread David Rowley
On 7 January 2016 at 06:36, Jeff Janes  wrote:

> On Tue, Jan 5, 2016 at 11:55 PM, David Rowley
>  wrote:
> > create table ab (a int,b int);
> > insert into ab select x,y from generate_series(1,20) x(x),
> > generate_series(10,1,-1) y(y);
> > create index on ab (a) including (b);
> > explain select * from ab order by a,b;
> > QUERY PLAN
> > --
> >  Sort  (cost=10.64..11.14 rows=200 width=8)
> >Sort Key: a, b
> >->  Seq Scan on ab  (cost=0.00..3.00 rows=200 width=8)
> > (3 rows)
>
> If you set enable_sort=off, then you get the index-only scan with no
> sort.  So it believes the index can be used for ordering (correctly, I
> think), just sometimes it thinks it is not faster to do it that way.
>
> I'm not sure why this would be a correctness problem.  The covered
> column does not participate in uniqueness checks, but it still usually
> participates in index ordering.  (That is why dummy op-classes are
> needed if you want to include non-sortable-type columns as being
> covered.)
>

If that's the case, then it appears that I've misunderstood INCLUDING. From
reading _bt_doinsert() it appeared that it would ignore the INCLUDING columns
and just find the insert position based on the key columns, yet that's not
the way that it appears to work. I was also a bit confused because, from working
with another database which has very similar syntax to this, that one only
includes the columns to allow index only scans; the included columns
are not indexed, therefore can't be part of index quals, and the index only
provides a sorted path for the indexed columns, and not the included
columns.

Saying that, I'm now a bit confused to why the following does not produce 2
indexes which are the same size:

create table t1 (a int, b text);
insert into t1 select x,md5(random()::text) from generate_series(1,100)
x(x);
create index t1_a_inc_b_idx on t1 (a) including (b);
create index t1_a_b_idx on t1 (a,b);
select pg_relation_Size('t1_a_b_idx'),pg_relation_size('t1_a_inc_b_idx');
 pg_relation_size | pg_relation_size
--+--
 59064320 | 58744832
(1 row)

Also, if we want INCLUDING() to mean "uniqueness is not enforced on these
columns, but they're still in the index", then I don't really think
allowing types without a btree opclass is a good idea. It's likely too
surprise-filled and might not be what the user actually wants. I'd suggest
that these non-indexed columns would be better defined by further expanding
the syntax, the first (perhaps not very good) thing that comes to mind is:

create unique index idx_name on table (unique_col) also index
(other,idx,cols) including (leaf,onlycols);

Looking up thread, I don't think I was the first to be confused by this.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-06 Thread Jeff Janes
On Tue, Jan 5, 2016 at 11:55 PM, David Rowley
 wrote:
> On 4 January 2016 at 21:49, David Rowley 
> wrote:
>>
>> I've not tested the patch yet. I will send another email soon with the
>> results of that.
>
>
> Hi,
>
> As promised I've done some testing on this, and I've found something which
> is not quite right:
>
> create table ab (a int,b int);
> insert into ab select x,y from generate_series(1,20) x(x),
> generate_series(10,1,-1) y(y);
> create index on ab (a) including (b);
> explain select * from ab order by a,b;
> QUERY PLAN
> --
>  Sort  (cost=10.64..11.14 rows=200 width=8)
>Sort Key: a, b
>->  Seq Scan on ab  (cost=0.00..3.00 rows=200 width=8)
> (3 rows)

If you set enable_sort=off, then you get the index-only scan with no
sort.  So it believes the index can be used for ordering (correctly, I
think), just sometimes it thinks it is not faster to do it that way.

I'm not sure why this would be a correctness problem.  The covered
column does not participate in uniqueness checks, but it still usually
participates in index ordering.  (That is why dummy op-classes are
needed if you want to include non-sortable-type columns as being
covered.)

>
> This is what I'd expect
>
> truncate table ab;
> insert into ab select x,y from generate_series(1,20) x(x),
> generate_series(10,1,-1) y(y);
> explain select * from ab order by a,b;
>   QUERY PLAN
> --
>  Index Only Scan using ab_a_b_idx on ab  (cost=0.15..66.87 rows=2260
> width=8)
> (1 row)
>
> This index, as we've defined it should not be able to satisfy the query's
> order by, although it does give correct results, that's because the index
> seems to be built wrongly in cases where the rows are added after the index
> exists.

I think this just causes differences in planner statistics leading to
different plans.  ANALYZE the table and it goes back to doing the
sort, because it thinks the sort is faster.
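That is easy to confirm interactively (a sketch, reusing the ab table defined upthread):

```sql
-- Refresh planner statistics; with accurate row counts the cost
-- comparison flips back in favour of the explicit sort.
analyze ab;
explain select * from ab order by a,b;
```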

Cheers,

Jeff




Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-05 Thread David Rowley
On 4 January 2016 at 21:49, David Rowley 
wrote:

> I've not tested the patch yet. I will send another email soon with the
> results of that.
>

Hi,

As promised I've done some testing on this, and I've found something which
is not quite right:

create table ab (a int,b int);
insert into ab select x,y from generate_series(1,20) x(x),
generate_series(10,1,-1) y(y);
create index on ab (a) including (b);
explain select * from ab order by a,b;
QUERY PLAN
--
 Sort  (cost=10.64..11.14 rows=200 width=8)
   Sort Key: a, b
   ->  Seq Scan on ab  (cost=0.00..3.00 rows=200 width=8)
(3 rows)

This is what I'd expect

truncate table ab;
insert into ab select x,y from generate_series(1,20) x(x),
generate_series(10,1,-1) y(y);
explain select * from ab order by a,b;
  QUERY PLAN
--
 Index Only Scan using ab_a_b_idx on ab  (cost=0.15..66.87 rows=2260
width=8)
(1 row)

This index, as we've defined it, should not be able to satisfy the query's
ORDER BY. Although it does give correct results, that's because the index
seems to be built wrongly in cases where the rows are added after the index
exists.

If we then do:

reindex table ab;
explain select * from ab order by a,b;
QUERY PLAN
--
 Sort  (cost=10.64..11.14 rows=200 width=8)
   Sort Key: a, b
   ->  Seq Scan on ab  (cost=0.00..3.00 rows=200 width=8)
(3 rows)

It looks normal again.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] WIP: Covering + unique indexes.

2016-01-04 Thread David Rowley
On 2 December 2015 at 01:53, Anastasia Lubennikova <
a.lubennik...@postgrespro.ru> wrote:

> Finally, completed patch "covering_unique_3.0.patch" is here.
> It includes the functionality discussed above in the thread, regression
> tests and docs update.
> I think it's quite ready for review.
>

Hi Anastasia,

I've maybe mentioned before that I think this is a great feature and I
think it will be very useful to have, so I've signed up to review the
patch, and below is the results of my first pass from reading the code.
Apologies if some of the things seem like nitpicks; I've basically just
listed everything I've noticed during review, no matter how small.

- int natts = rel->rd_rel->relnatts;
+ int nkeyatts = rel->rd_index->indnkeyatts;
+
+ Assert (rel->rd_index != NULL);
+ Assert(rel->rd_index->indnatts != 0);
+ Assert(rel->rd_index->indnkeyatts != 0);
+
  SnapshotData SnapshotDirty;

There's a couple of problems here. According to [1] the C code must follow
the C89 standard, but this appears not to. You have some statements before
the final variable declaration, and also there's a problem as you're
Asserting that rel->rd_index != NULL after already trying to dereference it
in the assignment to nkeyatts, which makes the Assert() useless.

+   An access method that supports this feature sets
pg_am.amcanincluding true.

I don't think this belongs under the "Index Uniqueness Checks" title. I
think the "Columns included with clause INCLUDING  aren't used to enforce
uniqueness." that you've added before it is a good idea, but perhaps the
details of amcanincluding are best explained elsewhere.

-   indexed columns are equal in multiple rows.
+   indexed columns are equal in multiple rows. Columns included with clause
+   INCLUDING  aren't used to enforce constraints (UNIQUE, PRIMARY KEY,
etc).

 is missing around "INCLUDING" here. Perhaps this part needs more
explanation in a new paragraph. It's likely a good idea to also inform the
reader that the columns which are part of the INCLUDING clause exist only
to allow the query planner to skip having to perform a lookup to the heap
when all of the columns required for the relation are present in the
indexed columns, or in the INCLUDING columns. I think you should explain
that the index can also only be used as pre-sorted input for columns which
are in the "indexed columns" part of the index, and the INCLUDING columns
are not searchable as index quals.

--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -137,7 +137,6 @@ CheckIndexCompatible(Oid oldId,
  Relation irel;
  int i;
  Datum d;
-
  /* Caller should already have the relation locked in some way. */

You've accidentally removed an empty line here.

+ /*
+ * All information about key and included cols is in numberOfKeyAttributes
number.
+ * So we can concat all index params into one list.
+ */
+ stmt->indexParams = list_concat(stmt->indexParams,
stmt->indexIncludingParams);

I think this should be explained with a better comment, perhaps:

/*
 * We append any INCLUDING columns onto the indexParams list so that
 * we have one list with all columns. Later we can determine which of these
 * are indexed, and which are just part of the INCLUDING list, by checking
 * the list position. A list item in a position less than ii_NumIndexKeyAttrs
 * is part of the indexed columns, and anything equal to and over is part of the
 * INCLUDING columns.
 */

+ stack = _bt_search(rel, IndexRelationGetNumberOfKeyAttributes(rel),
itup_scankey,

This line is longer than 80 chars.

+ /* Truncate nonkey attributes when inserting on nonleaf pages */
+ if (wstate->index->rd_index->indnatts !=
wstate->index->rd_index->indnkeyatts)
+ {
+ BTPageOpaque pageop = (BTPageOpaque) PageGetSpecialPointer(npage);
+
+ if (!P_ISLEAF(pageop))
+ {
+ itup = index_reform_tuple(wstate->index, itup,
wstate->index->rd_index->indnatts, wstate->index->rd_index->indnkeyatts);
+ itupsz = IndexTupleDSize(*itup);
+ itupsz = MAXALIGN(itupsz);
+ }
+ }

A few of the lines here are over 80 chars.


+This clause specifies additional columns to be appended to the set of index columns.
+Included columns don't support any constraints (UNIQUE, PRMARY KEY, EXCLUSION CONSTRAINT).
+These columns can improve the performance of some queries through using advantages of index-only scan
+(Or so called covering indexes. Covering index is the index that
+covers all columns required in the query and prevents a table access).
+Besides that, included attributes are not stored in index inner pages.
+It allows to decrease index size and furthermore it provides a way to extend included
+columns to store atttributes without suitable opclass (not implemented yet).
+This clause could be applied to both unique and nonunique indexes.
+It's possible to have non-unique covering index, which behaves as a regular
+multi-column index with a bit smaller index-size.
+Currently, only the B-tree access method supports this feature.
