Re: Proposal: Global Index

2021-01-18 Thread 曾文旌


> 2021年1月12日 02:37,Robert Haas  写道:
> 
> On Mon, Jan 11, 2021 at 12:46 PM Bruce Momjian  wrote:
>>> For 1) The DETACH old child table can be finished immediately, global index 
>>> can be kept valid after DETACH is completed, and the cleanup of garbage 
>>> data in global index can be deferred to VACUUM.
>>> This is similar to the global index optimization done by Oracle12c.
>>> For 2) ATTACH new empty child table can also be completed immediately.
>>> If this is the case, many of the advantages of partitioned tables will be 
>>> retained, while the advantages of global indexes will be gained.
>> 
>> Yes, we can keep the index rows for the deleted partition and clean them
>> up later, but what is the advantage of partitioning then?  Just heap
>> deletion quickly?  Is that enough of a value?
> 
> I actually think the idea of lazily deleting the index entries is
> pretty good, but it won't work if the way the global index is
> implemented is by adding a tableoid column. Because then, I might
> detach a partition and later reattach it and the old index entries are
> still there but the table contents might have changed. Worse yet, the
> table might be dropped and the table OID reused for a completely
> unrelated table with completely unrelated contents, which could then
> be attached as a new partition.
> 
> One of the big selling points of global indexes is that they allow you
> to enforce uniqueness on a column unrelated to the partitioning
> column. Another is that you can look up a value by doing a single
> index scan on the global index rather than an index scan per
> partition. Those things are no less valuable for performing index
> deletion lazily.
> 
> However, there is a VACUUM amplification effect to worry about here
> which Wenjing seems not to be considering. Suppose I have a table
> which is not partitioned and it is 1TB in size with an index that is
> 128GB in size. To vacuum the table, I need to do 1TB + 128GB of I/O.
> Now, suppose I now partition the table into 1024 partitions each with
> its own local index. Each partition is 1GB in size and the index on
> each partition is 128MB in size. To vacuum an individual partition
> requires 1GB + 128MB of I/O, so to vacuum all the partitions requires
> the same amount of total I/O as before. But, now suppose that I have a
> single global index instead of a local index per partition. First, how
> big will that index be? It will not be 128GB, but somewhat bigger,
> because it needs extra space for every indexed tuple. Let's say 140GB.
> Furthermore, it will need to be vacuumed whenever any child is
> vacuumed, because it contains some index entries from every child. So
> the total I/O to vacuum all partitions is now 1GB * 1024 + 140GB *
> 1024 = 141TB, which is a heck of a lot worse than the 1.125TB I
> required with the unpartitioned table or the locally partitioned
> table.
Thank you for pointing this out.
It seems that some optimization can be done, but there is no good way
to completely eliminate the vacuum amplification effect of the global index.
Maybe we can only count on Zheap, which doesn't need to do Vaccum.



> 
> That's not necessarily a death sentence for every use case, but it's
> going to be pretty bad for tables that are big and heavily updated.
> 
> -- 
> Robert Haas
> EDB: http://www.enterprisedb.com



smime.p7s
Description: S/MIME cryptographic signature


Re: Proposal: Global Index

2021-01-11 Thread Bruce Momjian
On Mon, Jan 11, 2021 at 12:05:43PM -0800, Peter Geoghegan wrote:
> On Mon, Jan 11, 2021 at 11:25 AM Bruce Momjian  wrote:
> > Once you layer on all the places a global index will be worse than just
> > creating a single large table, or a partitioned table with an index per
> > child, there might not be much usefulness left.  A POC patch might tell
> > us that, and might allow us to mark it as "not wanted".
> 
> I'm confused. Of course it's true to some degree that having a global
> index "defeats the purpose" of having a partitioned table. But only to
> a degree. And for some users it will make the difference between using
> partitioning and not using partitioning -- they simply won't be able
> to tolerate not having it available (e.g. because of a requirement for
> a unique constraint that does not cover the partitioning key).

Yes, that is a good point.  For those cases, I think we need to look at
the code complexity/overhead of supporting that feature.  There are
going to be a few cases it is a win, but will the code complexity be
worth it?

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposal: Global Index

2021-01-11 Thread Peter Geoghegan
On Mon, Jan 11, 2021 at 11:25 AM Bruce Momjian  wrote:
> Once you layer on all the places a global index will be worse than just
> creating a single large table, or a partitioned table with an index per
> child, there might not be much usefulness left.  A POC patch might tell
> us that, and might allow us to mark it as "not wanted".

I'm confused. Of course it's true to some degree that having a global
index "defeats the purpose" of having a partitioned table. But only to
a degree. And for some users it will make the difference between using
partitioning and not using partitioning -- they simply won't be able
to tolerate not having it available (e.g. because of a requirement for
a unique constraint that does not cover the partitioning key).

-- 
Peter Geoghegan




Re: Proposal: Global Index

2021-01-11 Thread Bruce Momjian
On Mon, Jan 11, 2021 at 11:01:20AM -0800, Peter Geoghegan wrote:
> However, it probably would be okay if a global index feature performed
> poorly in scenarios where partitions get lots of UPDATEs that produce
> lots of index bloat and cause lots of LP_DEAD line pointers to
> accumulate in heap pages. It is probably reasonable to just expect
> users to not do that if they want to get acceptable performance while
> using a global index. Especially since it probably is not so bad if
> the index bloat situation gets out of hand for just one of the
> partitions (say the most recent one) every once in a while. You at
> least don't have the same crazy I/O multiplier effect that you
> described.

Once you layer on all the places a global index will be worse than just
creating a single large table, or a partitioned table with an index per
child, there might not be much usefulness left.  A POC patch might tell
us that, and might allow us to mark it as "not wanted".

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposal: Global Index

2021-01-11 Thread Bruce Momjian
On Mon, Jan 11, 2021 at 01:37:02PM -0500, Robert Haas wrote:
> However, there is a VACUUM amplification effect to worry about here
...
> That's not necessarily a death sentence for every use case, but it's
> going to be pretty bad for tables that are big and heavily updated.

Yeah, I had not really gotten that far in my thinking, but someone is
going to need to create a POC and then we need to test it to see if it
offers a reasonably valuable feature.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposal: Global Index

2021-01-11 Thread Peter Geoghegan
On Mon, Jan 11, 2021 at 10:37 AM Robert Haas  wrote:
> I actually think the idea of lazily deleting the index entries is
> pretty good, but it won't work if the way the global index is
> implemented is by adding a tableoid column.

Perhaps there is an opportunity to apply some of the infrastructure
that Masahiko Sawada has been working on, that makes VACUUM more
incremental in certain specific scenarios:

https://postgr.es/m/cad21aod0ske11fmw4jd4renawbmcw1wasvnwpjvw3tvqpoq...@mail.gmail.com

I think that VACUUM can be taught to skip the ambulkdelete() step for
indexes in many common scenarios. Global indexes might be one place in
which that's almost essential.

> However, there is a VACUUM amplification effect to worry about here
> which Wenjing seems not to be considering.

> That's not necessarily a death sentence for every use case, but it's
> going to be pretty bad for tables that are big and heavily updated.

The main way in which index vacuuming is currently a death sentence
for this design (as you put it) is that it's an all-or-nothing thing.
Presumably you'll need to VACUUM the entire global index for each
partition that receives even one UPDATE. That seems pretty extreme,
and probably not acceptable. In a way it's not really a new problem,
but the fact remains: it makes global indexes much less valuable.

However, it probably would be okay if a global index feature performed
poorly in scenarios where partitions get lots of UPDATEs that produce
lots of index bloat and cause lots of LP_DEAD line pointers to
accumulate in heap pages. It is probably reasonable to just expect
users to not do that if they want to get acceptable performance while
using a global index. Especially since it probably is not so bad if
the index bloat situation gets out of hand for just one of the
partitions (say the most recent one) every once in a while. You at
least don't have the same crazy I/O multiplier effect that you
described.

-- 
Peter Geoghegan




Re: Proposal: Global Index

2021-01-11 Thread Robert Haas
On Mon, Jan 11, 2021 at 12:46 PM Bruce Momjian  wrote:
> > For 1) The DETACH old child table can be finished immediately, global index 
> > can be kept valid after DETACH is completed, and the cleanup of garbage 
> > data in global index can be deferred to VACUUM.
> > This is similar to the global index optimization done by Oracle12c.
> > For 2) ATTACH new empty child table can also be completed immediately.
> > If this is the case, many of the advantages of partitioned tables will be 
> > retained, while the advantages of global indexes will be gained.
>
> Yes, we can keep the index rows for the deleted partition and clean them
> up later, but what is the advantage of partitioning then?  Just heap
> deletion quickly?  Is that enough of a value?

I actually think the idea of lazily deleting the index entries is
pretty good, but it won't work if the way the global index is
implemented is by adding a tableoid column. Because then, I might
detach a partition and later reattach it and the old index entries are
still there but the table contents might have changed. Worse yet, the
table might be dropped and the table OID reused for a completely
unrelated table with completely unrelated contents, which could then
be attached as a new partition.

One of the big selling points of global indexes is that they allow you
to enforce uniqueness on a column unrelated to the partitioning
column. Another is that you can look up a value by doing a single
index scan on the global index rather than an index scan per
partition. Those things are no less valuable for performing index
deletion lazily.

However, there is a VACUUM amplification effect to worry about here
which Wenjing seems not to be considering. Suppose I have a table
which is not partitioned and it is 1TB in size with an index that is
128GB in size. To vacuum the table, I need to do 1TB + 128GB of I/O.
Now, suppose I now partition the table into 1024 partitions each with
its own local index. Each partition is 1GB in size and the index on
each partition is 128MB in size. To vacuum an individual partition
requires 1GB + 128MB of I/O, so to vacuum all the partitions requires
the same amount of total I/O as before. But, now suppose that I have a
single global index instead of a local index per partition. First, how
big will that index be? It will not be 128GB, but somewhat bigger,
because it needs extra space for every indexed tuple. Let's say 140GB.
Furthermore, it will need to be vacuumed whenever any child is
vacuumed, because it contains some index entries from every child. So
the total I/O to vacuum all partitions is now 1GB * 1024 + 140GB *
1024 = 141TB, which is a heck of a lot worse than the 1.125TB I
required with the unpartitioned table or the locally partitioned
table.

That's not necessarily a death sentence for every use case, but it's
going to be pretty bad for tables that are big and heavily updated.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Proposal: Global Index

2021-01-11 Thread Tom Lane
Bruce Momjian  writes:
> On Mon, Jan 11, 2021 at 07:40:18PM +0800, 曾文旌 wrote:
>> This is indeed a typical scenario for a partitioned table.
>> there are two basic operations
>> 1) Monthly DETACH old child table
>> 2) Monthly ATTACH new child table
>> 
>> For 1) The DETACH old child table can be finished immediately, global index 
>> can be kept valid after DETACH is completed, and the cleanup of garbage data 
>> in global index can be deferred to VACUUM.

> Yes, we can keep the index rows for the deleted partition and clean them
> up later, but what is the advantage of partitioning then?  Just heap
> deletion quickly?  Is that enough of a value?

More to the point, you still have a massive index cleanup operation to do.
Deferred or not, that's going to take a lot of cycles, and it will leave
you with a bloated global index.  I find it hard to believe that this
approach will seem like an improvement over doing partitioning the way
we do it now.

regards, tom lane




Re: Proposal: Global Index

2021-01-11 Thread Bruce Momjian
On Mon, Jan 11, 2021 at 07:40:18PM +0800, 曾文旌 wrote:
> >> In addition you mentioned: "It is still unclear if these use-cases justify 
> >> the architectural changes needed to enable global indexes."
> >> Please also describe the problems you see, I will confirm each specific 
> >> issue one by one.
> > 
> > One example is date partitioning.  People frequently need to store
> > only the most recent data.  For instance doing a monthly partitioning
> > and dropping the oldest partition every month once you hit the wanted
> > retention is very efficient for that use case, as it should be almost
> > instant (provided that you can acquire the necessary locks
> > immediately).  But if you have a global index, you basically lose the
> > advantage of partitioning as it'll require heavy changes on that
> > index.
> If the global index removes all the major benefits of partitioned tables, 
> then it is not worth having it.
> 
> This is indeed a typical scenario for a partitioned table.
> there are two basic operations
> 1) Monthly DETACH old child table
> 2) Monthly ATTACH new child table
> 
> For 1) The DETACH old child table can be finished immediately, global index 
> can be kept valid after DETACH is completed, and the cleanup of garbage data 
> in global index can be deferred to VACUUM.
> This is similar to the global index optimization done by Oracle12c.
> For 2) ATTACH new empty child table can also be completed immediately.
> If this is the case, many of the advantages of partitioned tables will be 
> retained, while the advantages of global indexes will be gained.

Yes, we can keep the index rows for the deleted partition and clean them
up later, but what is the advantage of partitioning then?  Just heap
deletion quickly?  Is that enough of a value?

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposal: Global Index

2021-01-11 Thread 曾文旌


> 2021年1月7日 23:04,Robert Haas  写道:
> 
> On Thu, Jan 7, 2021 at 4:44 AM 曾文旌  wrote:
>> I've been following this topic for a long time. It's been a year since the 
>> last response.
>> It was clear that our customers wanted this feature as well, and a large 
>> number of them mentioned it.
>> 
>> So, I wish the whole feature to mature as soon as possible.
>> I summarized the scheme mentioned in the email and completed the POC 
>> patch(base on PG_13).
> 
> You need to summarize the basic design choices you've made here. Like,
> what did you do about the fact that TIDs have to be 6 bytes?

These are the basic choices, and most of them come from discussions in previous 
emails.

1 Definition of global index
Obviously, we need to expand Index address info(CTID) to include child table 
info in GlobalIndexTuple.

1.1 As mentioned in the previous email, Bruce suggested having the OID
instead of relfilenode as relfilenode can be duplicated across tablespaces. 
I agree with that.

1.2 And Heikki pointed me to include heap specific information using the 
INCLUDE keyword so that heap information
is stored with each index node as data.

So ,In POC stage, I choose use INCLUDE keyword to INCLUDE the tableoid of 
global index. This will add 4 bytes to each IndexTuple.

Considering that if a single partitioned table does not exceed 65535 child 
tables, perhaps two bytes for tracking which child table the data belongs to is 
sufficient.

2. Maintenance of global index by partition table DML.
The DML of each child table of the partitioned table needs to maintain the 
global index on the partitioned table.

3. Global index scan
Planner: 
Processes predicate on the primary partition, generating paths and plans for 
the global index.
The cost model of the global index may need to be considered. We need to make 
the global index or the local index selected in their respective advantageous 
scenarios.

Executer: 
The index scan get indextup, get the tableoid from indextup, and verify the 
visibility of the data in the child table.
If a child table is DETACH, then the index item of this table is ignored during 
the index scan until VACUUM finishes cleaning up the global index.

4. Vacuum partition table maintains global index.
Old data in the global index also needs to be cleaned up, and vaccum is 
suitable for it.
Each child table in VACUUM, while vacuuming its own index, also vacuums the 
global index on the partitioned table.

5. Other
The global index indexes all of the child tables, which makes the global index 
large and has many levels. 
Follow the technical route, The partitioned indexes are a further target.

This is my basic idea for implementing global index.
Looking forward to your feedback.

Thanks!

Wenjing

> 
> -- 
> Robert Haas
> EDB: http://www.enterprisedb.com



smime.p7s
Description: S/MIME cryptographic signature


Re: Proposal: Global Index

2021-01-11 Thread 曾文旌


> 2021年1月8日 16:26,Julien Rouhaud  写道:
> 
> On Fri, Jan 8, 2021 at 4:02 PM 曾文旌  wrote:
>> 
>>> 2021年1月7日 22:16,Bruce Momjian  写道:
>>> 
>>> On Thu, Jan  7, 2021 at 05:44:01PM +0800, 曾文旌 wrote:
 I've been following this topic for a long time. It's been a year since the 
 last response.
 It was clear that our customers wanted this feature as well, and a large 
 number of them mentioned it.
 
 So, I wish the whole feature to mature as soon as possible.
 I summarized the scheme mentioned in the email and completed the POC 
 patch(base on PG_13).
>>> 
>>> I think you need to address the items mentioned in this blog, and the
>>> email link it mentions:
>>> 
>>>  https://momjian.us/main/blogs/pgblog/2020.html#July_1_2020
>> 
>> Thank you for your reply.
>> I read your blog and it helped me a lot.
>> 
>> The blog mentions a specific problem: "A large global index might also 
>> reintroduce problems that prompted the creation of partitioning in the first 
>> place. "
>> I don't quite understand, could you give some specific information?
>> 
>> In addition you mentioned: "It is still unclear if these use-cases justify 
>> the architectural changes needed to enable global indexes."
>> Please also describe the problems you see, I will confirm each specific 
>> issue one by one.
> 
> One example is date partitioning.  People frequently need to store
> only the most recent data.  For instance doing a monthly partitioning
> and dropping the oldest partition every month once you hit the wanted
> retention is very efficient for that use case, as it should be almost
> instant (provided that you can acquire the necessary locks
> immediately).  But if you have a global index, you basically lose the
> advantage of partitioning as it'll require heavy changes on that
> index.
If the global index removes all the major benefits of partitioned tables, then 
it is not worth having it.

This is indeed a typical scenario for a partitioned table.
there are two basic operations
1) Monthly DETACH old child table
2) Monthly ATTACH new child table

For 1) The DETACH old child table can be finished immediately, global index can 
be kept valid after DETACH is completed, and the cleanup of garbage data in 
global index can be deferred to VACUUM.
This is similar to the global index optimization done by Oracle12c.
For 2) ATTACH new empty child table can also be completed immediately.
If this is the case, many of the advantages of partitioned tables will be 
retained, while the advantages of global indexes will be gained.



smime.p7s
Description: S/MIME cryptographic signature


Re: Proposal: Global Index

2021-01-08 Thread Bruce Momjian
On Fri, Jan  8, 2021 at 11:26:48AM +0800, 曾文旌 wrote:
> > On Thu, Jan  7, 2021 at 05:44:01PM +0800, 曾文旌 wrote:
> >> I've been following this topic for a long time. It's been a year since the 
> >> last response.
> >> It was clear that our customers wanted this feature as well, and a large 
> >> number of them mentioned it.
> >> 
> >> So, I wish the whole feature to mature as soon as possible.
> >> I summarized the scheme mentioned in the email and completed the POC 
> >> patch(base on PG_13).
> > 
> > I think you need to address the items mentioned in this blog, and the
> > email link it mentions:
> > 
> > https://momjian.us/main/blogs/pgblog/2020.html#July_1_2020
> 
> Thank you for your reply.
> I read your blog and it helped me a lot.
> 
> The blog mentions a specific problem: "A large global index might also 
> reintroduce problems that prompted the creation of partitioning in the first 
> place. "
> I don't quite understand, could you give some specific information?

Well, if you created partitions, you probably did so because:

1.  heap files are smaller, allowing for more targeted sequential scans
2.  smaller indexes
3.  the ability to easily drop tables/indexes that are too old

If you have global indexes, #1 is the same, but #2 is not longer true,
and for #3, you can drop the heap but the index entries still exist in
the global index and must be removed.

So, if you created partitions for one of the three reasons, once you
have global indexes, some of those advantage of partitioning are no
longer true.  I am sure there are some workloads where the advantages of
partitioning, minus the advantages lost when using global indexes, are
useful, but are there enough of them to make the feature useful?  I
don't know.

> In addition you mentioned: "It is still unclear if these use-cases justify 
> the architectural changes needed to enable global indexes."
> Please also describe the problems you see, I will confirm each specific issue 
> one by one.

Well, the email thread I linked to has a lot of them, but the
fundamental issue is that you have to break the logic that a single
index serves a single heap file.  Considering what I said above, is
there enough usefulness to warrant such an architectural change?

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposal: Global Index

2021-01-08 Thread Julien Rouhaud
On Fri, Jan 8, 2021 at 4:02 PM 曾文旌  wrote:
>
> > 2021年1月7日 22:16,Bruce Momjian  写道:
> >
> > On Thu, Jan  7, 2021 at 05:44:01PM +0800, 曾文旌 wrote:
> >> I've been following this topic for a long time. It's been a year since the 
> >> last response.
> >> It was clear that our customers wanted this feature as well, and a large 
> >> number of them mentioned it.
> >>
> >> So, I wish the whole feature to mature as soon as possible.
> >> I summarized the scheme mentioned in the email and completed the POC 
> >> patch(base on PG_13).
> >
> > I think you need to address the items mentioned in this blog, and the
> > email link it mentions:
> >
> >   https://momjian.us/main/blogs/pgblog/2020.html#July_1_2020
>
> Thank you for your reply.
> I read your blog and it helped me a lot.
>
> The blog mentions a specific problem: "A large global index might also 
> reintroduce problems that prompted the creation of partitioning in the first 
> place. "
> I don't quite understand, could you give some specific information?
>
> In addition you mentioned: "It is still unclear if these use-cases justify 
> the architectural changes needed to enable global indexes."
> Please also describe the problems you see, I will confirm each specific issue 
> one by one.

One example is date partitioning.  People frequently need to store
only the most recent data.  For instance doing a monthly partitioning
and dropping the oldest partition every month once you hit the wanted
retention is very efficient for that use case, as it should be almost
instant (provided that you can acquire the necessary locks
immediately).  But if you have a global index, you basically lose the
advantage of partitioning as it'll require heavy changes on that
index.




Re: Proposal: Global Index

2021-01-08 Thread 曾文旌


> 2021年1月7日 22:16,Bruce Momjian  写道:
> 
> On Thu, Jan  7, 2021 at 05:44:01PM +0800, 曾文旌 wrote:
>> I've been following this topic for a long time. It's been a year since the 
>> last response.
>> It was clear that our customers wanted this feature as well, and a large 
>> number of them mentioned it.
>> 
>> So, I wish the whole feature to mature as soon as possible.
>> I summarized the scheme mentioned in the email and completed the POC 
>> patch(base on PG_13).
> 
> I think you need to address the items mentioned in this blog, and the
> email link it mentions:
> 
>   https://momjian.us/main/blogs/pgblog/2020.html#July_1_2020

Thank you for your reply.
I read your blog and it helped me a lot.

The blog mentions a specific problem: "A large global index might also 
reintroduce problems that prompted the creation of partitioning in the first 
place. "
I don't quite understand, could you give some specific information?

In addition you mentioned: "It is still unclear if these use-cases justify the 
architectural changes needed to enable global indexes."
Please also describe the problems you see, I will confirm each specific issue 
one by one.


Thanks

Wenjing


> 
> I am not clear this is a feature we will want.  Yes, people ask for it,
> but if the experience will be bad for them and they will regret using
> it, I am not sure we want it.  Of course, if you code it up and we get
> a good user experience, we would want it --- I am just saying it is not
> clear right now.
> 
> -- 
>  Bruce Momjian  https://momjian.us
>  EnterpriseDB https://enterprisedb.com
> 
>  The usefulness of a cup is in its emptiness, Bruce Lee



smime.p7s
Description: S/MIME cryptographic signature


Re: Proposal: Global Index

2021-01-07 Thread Robert Haas
On Thu, Jan 7, 2021 at 4:44 AM 曾文旌  wrote:
> I've been following this topic for a long time. It's been a year since the 
> last response.
> It was clear that our customers wanted this feature as well, and a large 
> number of them mentioned it.
>
> So, I wish the whole feature to mature as soon as possible.
> I summarized the scheme mentioned in the email and completed the POC 
> patch(base on PG_13).

You need to summarize the basic design choices you've made here. Like,
what did you do about the fact that TIDs have to be 6 bytes?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Proposal: Global Index

2021-01-07 Thread Bruce Momjian
On Thu, Jan  7, 2021 at 05:44:01PM +0800, 曾文旌 wrote:
> I've been following this topic for a long time. It's been a year since the 
> last response.
> It was clear that our customers wanted this feature as well, and a large 
> number of them mentioned it.
> 
> So, I wish the whole feature to mature as soon as possible.
> I summarized the scheme mentioned in the email and completed the POC 
> patch(base on PG_13).

I think you need to address the items mentioned in this blog, and the
email link it mentions:

https://momjian.us/main/blogs/pgblog/2020.html#July_1_2020

I am not clear this is a feature we will want.  Yes, people ask for it,
but if the experience will be bad for them and they will regret using
it, I am not sure we want it.  Of course, if you code it up and we get
a good user experience, we would want it --- I am just saying it is not
clear right now.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposal: Global Index

2021-01-07 Thread 曾文旌
I've been following this topic for a long time. It's been a year since the last response.It was clear that our customers wanted this feature as well, and a large number of them mentioned it.So, I wish the whole feature to mature as soon as possible.I summarized the scheme mentioned in the email and completed the POC patch(base on PG_13).Next, I encountered some difficulties when implementing the DDL of the partition table with global index, and I hope to get some help from the communityHere are some details what has been implemented1 Definition of global indexUsing the INCLUDE keyword to include the tableoid of the partitioned table.2. Maintenance of global index by partition table DML.Both INSERT and UPDATE of a partitioned table maintain global index3. Global index scanPlanner: Processes predicate conditions on the primary partition, generating paths and plans for the global index.Executer: index scan get indextup, get the tableoid from indextup, and verify the visibility of the data in the partition.4. Vacuum partition table maintains global indexEach partitioned table VACUUM cleans its own garbage data in the global index.After the above function point is completed, the global index can be used without partition table DDL.Demo:--Use pgbench to create the test partition tablepgbench -i -s 1000 --partitions=6 --partition-method=range—- create global index on bid, bid is not partition keyCREATE INDEX  idx_pgbench_accounts_bid on pgbench_accounts(bid) global;— check global index statusselect * , sum(alivetup) over()as sum_alivetup, sum(deadtup) over() as sum_deadtup from bt_get_global_index_status('idx_pgbench_accounts_bid');      relname       | alivetup | deadtup | sum_alivetup | sum_deadtup +--+-+--+- pgbench_accounts_1 | 1667 |       0 |    1 |           0 pgbench_accounts_2 | 1667 |       0 |    1 |           0 pgbench_accounts_3 | 1667 |       0 |    1 |           0 pgbench_accounts_4 | 1667 |       0 |    1 |           0 pgbench_accounts_5 | 1667 |       0 |    1 |           0 pgbench_accounts_6 | 1665 |       0 |    1 |           0(6 rows)— run pgbench for for a whilepgbench -M prepared  -j 32 -c 32 -T 60 -P1—- check global index, The index has bloatedpostgres=# select * , sum(alivetup) over()as sum_alivetup, sum(deadtup) over() as sum_deadtup from bt_get_global_index_status('idx_pgbench_accounts_bid');      relname       | alivetup | deadtup | sum_alivetup | sum_deadtup +--+-+--+- pgbench_accounts_1 | 16717733 |       0 |    100306102 |           0 pgbench_accounts_2 | 16717409 |       0 |    100306102 |           0 pgbench_accounts_3 | 16717540 |       0 |    100306102 |           0 pgbench_accounts_4 | 16717972 |       0 |    100306102 |           0 pgbench_accounts_5 | 16717578 |       0 |    100306102 |           0 pgbench_accounts_6 | 16717870 |       0 |    100306102 |           0(6 rows)—- vacuum partition tablevacuum pgbench_accounts;—- Garbage is collected, global index still looks correct and valid.postgres=# select * , sum(alivetup) over()as sum_alivetup, sum(deadtup) over() as sum_deadtup from bt_get_global_index_status('idx_pgbench_accounts_bid');      relname       | alivetup | deadtup | sum_alivetup | sum_deadtup +--+-+--+- pgbench_accounts_1 | 1667 |       0 |    1 |           0 pgbench_accounts_2 | 1667 |       0 |    1 |           0 pgbench_accounts_3 | 1667 |       0 |    1 |           0 pgbench_accounts_4 | 1667 |       0 |    1 |           0 pgbench_accounts_5 | 1667 |       0 |    1 |           0 pgbench_accounts_6 | 1665 |       0 |    1 |           0(6 rows)—-—- global index scan works wellpostgres=# select tableoid ,count(*) from pgbench_accounts where bid = 834 group by tableoid; tableoid | count --+---    16455 | 5    16458 | 5(2 rows)postgres=# explain select tableoid ,count(*) from pgbench_accounts where bid = 834 group by tableoid;                                                     QUERY PLAN                                                      HashAggregate  (cost=2945.23..2945.24 rows=1 width=12)   Group Key: pgbench_accounts.tableoid   ->  Global Index Scan using idx_pgbench_accounts_bid on pgbench_accounts  (cost=0.50..10.18 rows=587011 width=4)         Index Cond: (bid = 834)(4 rows)The following is how to implement DDL of global index. How to maintain global index of DDL of partitioned table.This seems to be more difficult than the previous work.I understand there are four main parts1 Build global index or reindex, especially in concurrent mode2 Detach partitionWould it be a good idea to make a flag to global 

Re: Proposal: Global Index

2019-12-19 Thread Bruce Momjian
On Thu, Dec 19, 2019 at 11:28:55AM -0800, Jeremy Schneider wrote:
> On 12/19/19 08:12, Bruce Momjian wrote:
> > I don't see lossy BRIN indexes helping with the uniqueness use-case, and
> > I am not sure they would help with the rare case either.  They would
> > help for range-based partitions, but I thought our existing facilities
> > worked in that case.
> 
> Correlated data.  The existing facilities work if the filtering column
> is exactly the same as the partition column.  But it's not at all
> uncommon to have other columns with correlated data, perhaps the most
> obvious of which is timeseries tables with many date columns of various
> definitions (row first update, row latest update, invoice date, payment
> date, process date, ship date, etc).
> 
> What if you could use *two* indexes in a single execution plan?  Use the
> global BRIN to narrow down to 2 or 3 out of a hundred or more
> partitions, then use local indexes to find specific rows in the
> partitions of interest?  That might work, without being too overly
> complicated.

No, that is very interesting --- having secondary indexes for
partitioned tables that trim most partitions.  Would index lookups on
each partition index be very slow?  BRIN indexes?  I am assuming global
indexes would only avoid such lookups.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Proposal: Global Index

2019-12-19 Thread Bruce Momjian
On Thu, Dec 19, 2019 at 09:48:40AM +0100, Jose Luis Tallon wrote:
> On 19/12/19 4:03, Bruce Momjian wrote:
> > On Mon, Nov 25, 2019 at 03:44:39PM -0800, Jeremy Schneider wrote:
> > > On 11/25/19 15:05, Jeremy Schneider wrote:
> > > > ... the cost of doing the individual index lookups across 180
> > > > partitions (and 180 indexes) was very high, so they stored max and min
> > > > txn id per partition and would generate a query with all the dates that
> > > > a txn id could have been in so that only a small number of partition
> > > > indexes would be accessed.
> > > > 
> > > > .. If we are looking for higher concurrency, we can usually
> > > > add a hack/workaround that filters on a partition key to provide “pretty
> > > > good” pruning.  The net result is that you get 2-3x the IO due to the
> > > > lack of global index (same workaround as first story above).
> > > Is that basically like a global BRIN index with granularity at the
> > > partition level?
> > Exactly!  :-)
> 
> Actually, one "kind of" BRIN index *per partitioned table* mapping (key
> range) -> (partition oid)... and so concurrency doesn't need to be very
> affected.
> 
> (we don't need to do things just like other RDBMS do, ya know... ;)
> 
> 
> IIRC, this precise approach was suggested around 2016 when initially
> discussing the "declarative partitioning" which originated Postgres' current
> partitioning scheme, in order to optimize partition pruning.

Robert Haas identified two needs for global indexes:


https://www.postgresql.org/message-id/ca+tgmob_j2m2+qkwrhg2njqekmewzntfd7a6ubg34fjuzpk...@mail.gmail.com

One of the biggest reasons why people want it is to enforce uniqueness
for secondary keys - e.g. the employees table is partitioned by
employee ID, but SSN should also be unique, at least among employees
for whom it's not NULL.

But people also want it for faster data retrieval: if you're looking
for a commonly-occurring value, an index per partition is fine. But if
you're looking for values that occur only once or a few times across
the whole hierarchy, an index scan per partition is very costly.

I don't see lossy BRIN indexes helping with the uniqueness use-case, and
I am not sure they would help with the rare case either.  They would
help for range-based partitions, but I thought our existing facilities
worked in that case.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Proposal: Global Index

2019-12-19 Thread Jose Luis Tallon

On 19/12/19 4:03, Bruce Momjian wrote:

On Mon, Nov 25, 2019 at 03:44:39PM -0800, Jeremy Schneider wrote:

On 11/25/19 15:05, Jeremy Schneider wrote:

... the cost of doing the individual index lookups across 180
partitions (and 180 indexes) was very high, so they stored max and min
txn id per partition and would generate a query with all the dates that
a txn id could have been in so that only a small number of partition
indexes would be accessed.

.. If we are looking for higher concurrency, we can usually
add a hack/workaround that filters on a partition key to provide “pretty
good” pruning.  The net result is that you get 2-3x the IO due to the
lack of global index (same workaround as first story above).

Is that basically like a global BRIN index with granularity at the
partition level?

Exactly!  :-)


Actually, one "kind of" BRIN index *per partitioned table* mapping (key 
range) -> (partition oid)... and so concurrency doesn't need to be very 
affected.


(we don't need to do things just like other RDBMS do, ya know... ;)


IIRC, this precise approach was suggested around 2016 when initially 
discussing the "declarative partitioning" which originated Postgres' 
current partitioning scheme, in order to optimize partition pruning.



Just my .02€

    / J.L.






Re: Proposal: Global Index

2019-12-18 Thread Bruce Momjian
On Mon, Nov 25, 2019 at 03:44:39PM -0800, Jeremy Schneider wrote:
> On 11/25/19 15:05, Jeremy Schneider wrote:
> > ... the cost of doing the individual index lookups across 180
> > partitions (and 180 indexes) was very high, so they stored max and min
> > txn id per partition and would generate a query with all the dates that
> > a txn id could have been in so that only a small number of partition
> > indexes would be accessed. 
> > 
> > .. If we are looking for higher concurrency, we can usually
> > add a hack/workaround that filters on a partition key to provide “pretty
> > good” pruning.  The net result is that you get 2-3x the IO due to the
> > lack of global index (same workaround as first story above).
> 
> Is that basically like a global BRIN index with granularity at the
> partition level?

Exactly!  :-)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Proposal: Global Index

2019-10-31 Thread Tomas Vondra

On Thu, Oct 31, 2019 at 03:02:40PM -0400, Isaac Morland wrote:

On Thu, 31 Oct 2019 at 14:50, Stephen Frost  wrote:


Greetings,

* Peter Geoghegan (p...@bowt.ie) wrote:


[]



Absolutely- our lack of such is a common point of issue when folks are
considering using or migrating to PostgreSQL.



Not sure how similar my situation really is, but I find myself wanting to
have indices that cross non-partition members of an inheritance hierarchy:

create table t (
   id int,
   primary key (id)
);

create table t1 (
   a text
) inherits (t);

create table t2 (
   b int,
   c int
) inherits (t);

So "t"s are identified by an integer; and one kind of "t" has a single text
attribute while a different kind of "t" has 2 int attributes. The idea is
that there is a single primary key constraint on the whole hierarchy that
ensures only one record with a particular id can exist in all the tables
together. I can imagine wanting to do this with other unique constraints
also.



IMO the chances of us supporting global indexes with generic inheritance
hierarchies are about zero. We don't even support creating "partition"
indexes on those hierarchies ...


At present I don't actually use inheritance; instead I put triggers on the
child tables that do an insert on the parent table, which has the effect of
enforcing the uniqueness I want.


Does it? Are you sure it actually works in READ COMMITTED? What exactly
does the trigger do?

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Proposal: Global Index

2019-10-31 Thread Isaac Morland
On Thu, 31 Oct 2019 at 14:50, Stephen Frost  wrote:

> Greetings,
>
> * Peter Geoghegan (p...@bowt.ie) wrote:
>
[]

>
> Absolutely- our lack of such is a common point of issue when folks are
> considering using or migrating to PostgreSQL.
>

Not sure how similar my situation really is, but I find myself wanting to
have indices that cross non-partition members of an inheritance hierarchy:

create table t (
id int,
primary key (id)
);

create table t1 (
a text
) inherits (t);

create table t2 (
b int,
c int
) inherits (t);

So "t"s are identified by an integer; and one kind of "t" has a single text
attribute while a different kind of "t" has 2 int attributes. The idea is
that there is a single primary key constraint on the whole hierarchy that
ensures only one record with a particular id can exist in all the tables
together. I can imagine wanting to do this with other unique constraints
also.

At present I don't actually use inheritance; instead I put triggers on the
child tables that do an insert on the parent table, which has the effect of
enforcing the uniqueness I want.


Re: Proposal: Global Index

2019-10-31 Thread Stephen Frost
Greetings,

* Peter Geoghegan (p...@bowt.ie) wrote:
> On Wed, Oct 30, 2019 at 9:23 AM Tom Lane  wrote:
> > Well, the *effects* of the feature seem desirable, but that doesn't
> > mean that we want an implementation that actually has a shared index.
> > As soon as you do that, you've thrown away most of the benefits of
> > having a partitioned data structure in the first place.
> 
> Right, but that's only the case for the global index. Global indexes
> are useful when used judiciously. They enable the use of partitioning
> for use cases where not being able to enforce uniqueness across all
> partitions happens to be a deal breaker. I bet that this is fairly
> common.

Absolutely- our lack of such is a common point of issue when folks are
considering using or migrating to PostgreSQL.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposal: Global Index

2019-10-30 Thread Andres Freund
Hi,

On 2019-10-30 13:05:57 -0400, Tom Lane wrote:
> Peter Geoghegan  writes:
> > On Wed, Oct 30, 2019 at 9:23 AM Tom Lane  wrote:
> >> Well, the *effects* of the feature seem desirable, but that doesn't
> >> mean that we want an implementation that actually has a shared index.
> >> As soon as you do that, you've thrown away most of the benefits of
> >> having a partitioned data structure in the first place.
> 
> > Right, but that's only the case for the global index. Global indexes
> > are useful when used judiciously.
> 
> But ... why bother with partitioning then?  To me, the main reasons
> why you might want a partitioned table are

Quite commonly there's a lot of *other* indexes, often on a lot wider
data than the primary key, that don't need to be global. And whereas in
a lot of cases the primary key in a partitioned table has pretty good
locality (and thus will be mostly buffered IO), other indexes will often
not have that property (i.e. not have much correlation with table
position).


> * ability to cheaply add and remove partitions, primarily so that
> you can cheaply do things like "delete the oldest month's data".

You can still do that to some degree with a global index. Imagine
e.g. keeping a 'partition id' as a sort-of column in the global
index. That allows you to drop the partition, without having to
immediately rebuild the index, by checking the partition id against the
live partitions during lookup.  So sure, your'e wasting space for a bit
in the global index, but it'll also be space that is likely to be fairly
efficiently reclaimed the next time vacuum touches the index.  And if
not the global index can be rebuilt concurrently without blocking
writes.


> * ability to scale past our limits on the physical size of one table
> --- both the hard BlockNumber-based limit, and the performance
> constraints of e.g. vacuuming a very large table.

For that to be a problem for a global index the global index (which will
often be something like two int4 or int8 columns) itself would need to
be above the block number based limit - which doesn't seem that close.

WRT vacuuming - based on my observations the table itself isn't a
performance problem for vacuuming all that commonly anymore, it's the
associated index scans. So yea, that's a problem.



Greetings,

Andres Freund




Re: Proposal: Global Index

2019-10-30 Thread Tom Lane
Peter Geoghegan  writes:
> On Wed, Oct 30, 2019 at 9:23 AM Tom Lane  wrote:
>> Well, the *effects* of the feature seem desirable, but that doesn't
>> mean that we want an implementation that actually has a shared index.
>> As soon as you do that, you've thrown away most of the benefits of
>> having a partitioned data structure in the first place.

> Right, but that's only the case for the global index. Global indexes
> are useful when used judiciously.

But ... why bother with partitioning then?  To me, the main reasons
why you might want a partitioned table are

* ability to cheaply add and remove partitions, primarily so that
you can cheaply do things like "delete the oldest month's data".

* ability to scale past our limits on the physical size of one table
--- both the hard BlockNumber-based limit, and the performance
constraints of e.g. vacuuming a very large table.

Both of those go out the window with a global index.  So you might
as well just have one table and forget all the overhead.

regards, tom lane




Re: Proposal: Global Index

2019-10-30 Thread Peter Geoghegan
On Wed, Oct 30, 2019 at 9:23 AM Tom Lane  wrote:
> Well, the *effects* of the feature seem desirable, but that doesn't
> mean that we want an implementation that actually has a shared index.
> As soon as you do that, you've thrown away most of the benefits of
> having a partitioned data structure in the first place.

Right, but that's only the case for the global index. Global indexes
are useful when used judiciously. They enable the use of partitioning
for use cases where not being able to enforce uniqueness across all
partitions happens to be a deal breaker. I bet that this is fairly
common.

> No, I don't have an idea how we might support, eg, uniqueness of
> non-partition-key columns without that.  But we need to spend our
> effort on figuring that out, not on building a complicated mechanism
> whose performance is never going to not suck.

I don't think that there is a way to solve the problem that doesn't
look very much like a global index. Also, being able to push down a
partition number when scanning a global index seems like it would be
very compelling in some scenarios.

I'm a bit worried about the complexity that will need to be added to
nbtree to make global indexes work, but it's probably possible to come
up with something that isn't too bad. GIN already uses an
implementation level attribute number column for multi-column GIN
indexes, which is a little like what Ibrar has in mind. The really
complicated new code required for global indexes will be in places
like vacuumlazy.c.


--
Peter Geoghegan




Re: Proposal: Global Index

2019-10-30 Thread Tom Lane
Robert Haas  writes:
> On Wed, Oct 30, 2019 at 10:13 AM Tom Lane  wrote:
>> I believe that the current design of partitioning is explicitly intended
>> to avoid the need for such a construct.  It'd be absolutely disastrous
>> to have such a thing from many standpoints, including the breadth of
>> locking needed to work with the global index, the difficulty of vacuuming,
>> and the impossibility of cheaply attaching or detaching partitions.
>> In other words, this is a "feature" we do not want.

> I don't think that's true. Certainly, a lot of EnterpriseDB customers
> want this feature - it comes up regularly in discussions here. But
> that is not to say that the technical challenges are not formidable,
> and I don't think this proposal really grapples with any of them.
> However, that doesn't mean that the feature isn't desirable.

Well, the *effects* of the feature seem desirable, but that doesn't
mean that we want an implementation that actually has a shared index.
As soon as you do that, you've thrown away most of the benefits of
having a partitioned data structure in the first place.

No, I don't have an idea how we might support, eg, uniqueness of
non-partition-key columns without that.  But we need to spend our
effort on figuring that out, not on building a complicated mechanism
whose performance is never going to not suck.

regards, tom lane




Re: Proposal: Global Index

2019-10-30 Thread Robert Haas
On Wed, Oct 30, 2019 at 10:13 AM Tom Lane  wrote:
> I believe that the current design of partitioning is explicitly intended
> to avoid the need for such a construct.  It'd be absolutely disastrous
> to have such a thing from many standpoints, including the breadth of
> locking needed to work with the global index, the difficulty of vacuuming,
> and the impossibility of cheaply attaching or detaching partitions.
>
> In other words, this is a "feature" we do not want.

I don't think that's true. Certainly, a lot of EnterpriseDB customers
want this feature - it comes up regularly in discussions here. But
that is not to say that the technical challenges are not formidable,
and I don't think this proposal really grapples with any of them.
However, that doesn't mean that the feature isn't desirable.

One of the biggest reasons why people want it is to enforce uniqueness
for secondary keys - e.g. the employees table is partitioned by
employee ID, but SSN should also be unique, at least among employees
for whom it's not NULL.

But people also want it for faster data retrieval: if you're looking
for a commonly-occurring value, an index per partition is fine. But if
you're looking for values that occur only once or a few times across
the whole hierarchy, an index scan per partition is very costly.
Consider, e.g.:

Nested Loop
-> Seq Scan
-> Append
  -> Index Scan on each_partition

You don't have to have very many partitions for that to suck, and it's
a thing that people want to do. Runtime partition pruning helps with
this case a lot, but, once again, only for the primary key. Secondary
keys are a big problem for partitioning today, in many ways.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Proposal: Global Index

2019-10-30 Thread Tom Lane
Ibrar Ahmed  writes:
> A global index by very definition is a single index on the parent table
> that maps to many
> underlying table partitions.

I believe that the current design of partitioning is explicitly intended
to avoid the need for such a construct.  It'd be absolutely disastrous
to have such a thing from many standpoints, including the breadth of
locking needed to work with the global index, the difficulty of vacuuming,
and the impossibility of cheaply attaching or detaching partitions.

In other words, this is a "feature" we do not want.

regards, tom lane