Re: [HACKERS] Proposal: global index

2017-08-25 Thread Robert Haas
On Fri, Aug 25, 2017 at 6:52 AM, Ildar Musin  wrote:
> I agree with you that garbage collection after partitions drop could be a
> major downside of single index scheme. On the other hand not all
> partitioning use-cases imply dropping partitions. What worries me about
> global unique index built on multiple local indexes is the need to lookup
> (almost) every index for every insert/update/FK check. In some cases we can
> reduce the number of the indexes to be checked (e.g. by storing min/max
> values in metapage), but it will not be possible if key values are spread
> across indexes evenly. And it can get quite expensive as partition count
> grows.

+1.  I think that in the end we probably need both things for
different use cases.  Some people are going to want 1000 partitions
(or, if they can get away with it, 100,000 partitions) and be able to
do lookups on a secondary key without searching O(n) indexes.  Other
people are going to want partitioned indexes so that they can drop
them quickly, vacuum them quickly, etc.  I don't see anything wrong
with eventually offering both things.

I do think that it might be premature to work on this without solving
some of the other problems in this area first.  I think a good first
step would be to solve all the problems with declaring an index on a
parent table and having it cascade down to all children - i.e. a
partitioned index - cf.
https://www.postgresql.org/message-id/c8fe4f6b-ff46-aae0-89e3-e936a35f0...@postgrespro.ru
- and then work on the problems associated with defining foreign keys
reference such an index (e.g. in the case where the index matches the
partitioning key, or using the technique Andres describes) - and only
then do what you're proposing here, once all of those preliminaries
have been sorted out.  Otherwise, I fear that this patch will get
tangled up in a lot of issues that are really separate concerns.

JD is quite right that there are a lot of things about partitioning
that need to be improved from where we are today, but I think it's
important that we're a bit methodical about how we do that so that we
don't end up with a mess.  We're not going accept quick hacks in
related areas just to get global indexes; all of the issues about how
global indexes interact with the SQL syntax, foreign key constraints,
partitioned indexes, etc. need to be well-sorted out before we accept
a patch for global indexes.  It will be easiest, I think, to sort
those things out first and add this at the end.  That doesn't mean
that development can't be done concurrently, but I think what you're
likely to find is that getting the actual index machinery to do what
you want is a job and a half by itself without burdening the same
patch with anything additional.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-25 Thread Chris Travers
On Fri, Aug 25, 2017 at 12:15 PM, Petr Jelinek  wrote:

> On 25/08/17 10:28, Chris Travers wrote:
> >
> >
> > On Thu, Aug 24, 2017 at 9:44 PM, Andres Freund  > > wrote:
> >
> > Hi,
> >
> > On 2017-08-18 12:12:58 +0300, Ildar Musin wrote:
> > > While we've been developing pg_pathman extension one of the most
> frequent
> > > questions we got from our users was about global index support. We
> cannot
> > > provide it within an extension. And I couldn't find any recent
> discussion
> > > about someone implementing it. So I'm thinking about giving it a
> shot and
> > > start working on a patch for postgres.
> >
> > FWIW, I personally think for constraints the better approach is to
> make
> > the constraint checking code cope with having to check multiple
> > indexes. Initially by just checking all indexes, over the longer term
> > perhaps pruning the set of to-be-checked indexes based on the values
> in
> > the partition key if applicable.   The problem with creating huge
> global
> > indexes is that you give away some the major advantages of
> partitioning:
> > - dropping partitions now is slow / leaves a lof of garbage again
> > - there's no way you can do this with individual partitions being
> remote
> >   or such
> > - there's a good chunk of locality loss in global indexes
> >
> > The logic we have for exclusion constraints checking can essentially
> be
> > extended to do uniqueness checking over multiple partitions.
> Depending
> > on the desired deadlock behaviour one might end up doing speculative
> > insertions in addition.  The foreign key constraint checking is
> fairly
> > simple, essentially one "just" need to remove the ONLY from the
> > generated check query.
> >
>
> +1 (or +as much as I am allowed to get away with really ;) )
>
> >
> > To be clear, this would still require a high-level concept of a global
> > index and the only question is whether it gets stored as multiple
> > partitions against partitioned tables vs stored in one giant index,
> right?
> >
> No, just global constraints. For example, if you consider unique index
> to be implementation detail of a unique constraint, there is nothing
> stopping us to use multiple such indexes (one per partition) as
> implementation detail to single global unique constraint. No need for
> global index at all.
>

Ok so in this case a global constraint needs to track partitioned indexes,
right?

How does this differ, in practice from a "global but partitioned index?"
 This seems like splitting hairs but I am trying to see if there is
disagreement here that goes beyond language.

For example, could I have a global, partial unique constraint the way I can
do things with a single table currently (something like create global
unique index foo_id_idxuf on foo(Id) where id > 12345)?  Is this something
the discussion here would foreclose?

It seems to me that you can get both of these (and more) by adding the
concept of a global index which means:
1.  Index is on parent table
2.  Index is inherited to child tables and managed on parent.
3.  Writes to children that hit inherited unique index ALSO must check
(with exclusion constraints etc) ALL other tables in the inheritance tree
of the index.

That would also have a few important side benefits:
1.  Easier management of indexes where all partitions (or other children
since there are other uses for table inheritance than partitioning) must be
indexed
2.  Ability to have partial unique indexes enforced consistently across an
inheritance tree.

An alternative might be to generalise partial unique indexes into partial
unique constraints. (alter table foo add unique (bar) where id > 12345)


>
> --
>   Petr Jelinek  http://www.2ndQuadrant.com/
>   PostgreSQL Development, 24x7 Support, Training & Services
>



-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] Proposal: global index

2017-08-25 Thread Ildar Musin

Hi,

On 24.08.2017 22:44, Andres Freund wrote:

Hi,

On 2017-08-18 12:12:58 +0300, Ildar Musin wrote:

While we've been developing pg_pathman extension one of the most frequent
questions we got from our users was about global index support. We cannot
provide it within an extension. And I couldn't find any recent discussion
about someone implementing it. So I'm thinking about giving it a shot and
start working on a patch for postgres.


FWIW, I personally think for constraints the better approach is to make
the constraint checking code cope with having to check multiple
indexes. Initially by just checking all indexes, over the longer term
perhaps pruning the set of to-be-checked indexes based on the values in
the partition key if applicable.   The problem with creating huge global
indexes is that you give away some the major advantages of partitioning:
- dropping partitions now is slow / leaves a lof of garbage again
- there's no way you can do this with individual partitions being remote
  or such
- there's a good chunk of locality loss in global indexes



I agree with you that garbage collection after partitions drop could be 
a major downside of single index scheme. On the other hand not all 
partitioning use-cases imply dropping partitions. What worries me about 
global unique index built on multiple local indexes is the need to 
lookup (almost) every index for every insert/update/FK check. In some 
cases we can reduce the number of the indexes to be checked (e.g. by 
storing min/max values in metapage), but it will not be possible if key 
values are spread across indexes evenly. And it can get quite expensive 
as partition count grows.


The good thing about multiple indexes is that they are more compact and 
manageable.


--
Ildar Musin
i.mu...@postgrespro.ru


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-25 Thread Petr Jelinek
On 25/08/17 10:28, Chris Travers wrote:
> 
> 
> On Thu, Aug 24, 2017 at 9:44 PM, Andres Freund  > wrote:
> 
> Hi,
> 
> On 2017-08-18 12:12:58 +0300, Ildar Musin wrote:
> > While we've been developing pg_pathman extension one of the most 
> frequent
> > questions we got from our users was about global index support. We 
> cannot
> > provide it within an extension. And I couldn't find any recent 
> discussion
> > about someone implementing it. So I'm thinking about giving it a shot 
> and
> > start working on a patch for postgres.
> 
> FWIW, I personally think for constraints the better approach is to make
> the constraint checking code cope with having to check multiple
> indexes. Initially by just checking all indexes, over the longer term
> perhaps pruning the set of to-be-checked indexes based on the values in
> the partition key if applicable.   The problem with creating huge global
> indexes is that you give away some the major advantages of partitioning:
> - dropping partitions now is slow / leaves a lof of garbage again
> - there's no way you can do this with individual partitions being remote
>   or such
> - there's a good chunk of locality loss in global indexes
> 
> The logic we have for exclusion constraints checking can essentially be
> extended to do uniqueness checking over multiple partitions. Depending
> on the desired deadlock behaviour one might end up doing speculative
> insertions in addition.  The foreign key constraint checking is fairly
> simple, essentially one "just" need to remove the ONLY from the
> generated check query.
> 

+1 (or +as much as I am allowed to get away with really ;) )

> 
> To be clear, this would still require a high-level concept of a global
> index and the only question is whether it gets stored as multiple
> partitions against partitioned tables vs stored in one giant index, right?
> 
No, just global constraints. For example, if you consider unique index
to be implementation detail of a unique constraint, there is nothing
stopping us to use multiple such indexes (one per partition) as
implementation detail to single global unique constraint. No need for
global index at all.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-25 Thread Chris Travers
On Thu, Aug 24, 2017 at 9:44 PM, Andres Freund  wrote:

> Hi,
>
> On 2017-08-18 12:12:58 +0300, Ildar Musin wrote:
> > While we've been developing pg_pathman extension one of the most frequent
> > questions we got from our users was about global index support. We cannot
> > provide it within an extension. And I couldn't find any recent discussion
> > about someone implementing it. So I'm thinking about giving it a shot and
> > start working on a patch for postgres.
>
> FWIW, I personally think for constraints the better approach is to make
> the constraint checking code cope with having to check multiple
> indexes. Initially by just checking all indexes, over the longer term
> perhaps pruning the set of to-be-checked indexes based on the values in
> the partition key if applicable.   The problem with creating huge global
> indexes is that you give away some the major advantages of partitioning:
> - dropping partitions now is slow / leaves a lof of garbage again
> - there's no way you can do this with individual partitions being remote
>   or such
> - there's a good chunk of locality loss in global indexes
>
> The logic we have for exclusion constraints checking can essentially be
> extended to do uniqueness checking over multiple partitions. Depending
> on the desired deadlock behaviour one might end up doing speculative
> insertions in addition.  The foreign key constraint checking is fairly
> simple, essentially one "just" need to remove the ONLY from the
> generated check query.
>


To be clear, this would still require a high-level concept of a global
index and the only question is whether it gets stored as multiple
partitions against partitioned tables vs stored in one giant index, right?

Also for foreign key constraints, does it make sense, for
backwards-compatibility reasons to introduce a new syntax for checking all
child tables?



>
> Greetings,
>
> Andres Freund
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] Proposal: global index

2017-08-24 Thread Andres Freund
Hi,

On 2017-08-18 12:12:58 +0300, Ildar Musin wrote:
> While we've been developing pg_pathman extension one of the most frequent
> questions we got from our users was about global index support. We cannot
> provide it within an extension. And I couldn't find any recent discussion
> about someone implementing it. So I'm thinking about giving it a shot and
> start working on a patch for postgres.

FWIW, I personally think for constraints the better approach is to make
the constraint checking code cope with having to check multiple
indexes. Initially by just checking all indexes, over the longer term
perhaps pruning the set of to-be-checked indexes based on the values in
the partition key if applicable.   The problem with creating huge global
indexes is that you give away some the major advantages of partitioning:
- dropping partitions now is slow / leaves a lof of garbage again
- there's no way you can do this with individual partitions being remote
  or such
- there's a good chunk of locality loss in global indexes

The logic we have for exclusion constraints checking can essentially be
extended to do uniqueness checking over multiple partitions. Depending
on the desired deadlock behaviour one might end up doing speculative
insertions in addition.  The foreign key constraint checking is fairly
simple, essentially one "just" need to remove the ONLY from the
generated check query.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-24 Thread Joshua D. Drake

On 08/24/2017 10:52 AM, Adam Brusselback wrote:
My understanding is that global indexes allow foreign keys to work 
naturally with partitioned tables, or tables in an inheritance 
hierarchy.  That is pretty big IMO, as it allows you to partition a 
table without making a trade-off in your database integrity.


It is, in fact the reason that even with 10 we don't really have 
partitioning as much as syntactical sugar for partitioning. (Not trying 
to take away from that, having the syntactical sugar is a huge first step).


JD


--
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
* Unless otherwise stated, opinions are my own.   *


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-24 Thread Adam Brusselback
My understanding is that global indexes allow foreign keys to work
naturally with partitioned tables, or tables in an inheritance hierarchy.
That is pretty big IMO, as it allows you to partition a table without
making a trade-off in your database integrity.


Re: [HACKERS] Proposal: global index

2017-08-23 Thread Chris Travers
On Aug 21, 2017 07:47, "Simon Riggs"  wrote:

On 18 August 2017 at 15:40, Alvaro Herrera  wrote:
> Ildar Musin wrote:
>
>> While we've been developing pg_pathman extension one of the most frequent
>> questions we got from our users was about global index support. We cannot
>> provide it within an extension. And I couldn't find any recent discussion
>> about someone implementing it. So I'm thinking about giving it a shot and
>> start working on a patch for postgres.
>>
>> One possible solution is to create an extended version of item pointer
which
>> would store relation oid along with block number and position:
>
> I've been playing with the index code in order to allow indirect tuples,
> which are stored in a format different from IndexTupleData.
>
> I've been adding an "InMemoryIndexTuple" (maybe there's a better name)
> which internally has pointers to both IndexTupleData and
> IndirectIndexTupleData, which makes it easier to pass around the index
> tuple in either format.

> It's very easy to add an OID to that struct,
> which then allows to include the OID in either an indirect index tuple
> or a regular one.

If there is a unique index then there is no need for that. Additional
data to the index makes it even bigger and even less useful, so we
need to count that as a further disadvantage of global indexes.

I have a very clear statement from a customer recently that "We will
never use global indexes", based upon their absolute uselessness in
Oracle.


It is worth noting that the only use case I can see where global indexes
fill a functionality gap are with unique indexes which allow you to enforce
uniqueness across an inheritance tree where the uniqueness is orthogonal to
any partition key.

I could find large numbers of uses for that.  That could also allow
referential integrity to check against a root table rather than force
partition explosion.Will

Otherwise the following mostly works:

Create table (like foo including all) inherits (foo);

So the gap this addresses is very real even if it is narrow.


> Then, wherever we're using IndexTupleData in the index AM code, we would
> replace it with InMemoryIndexTuple.  This should satisfy both your use
> case and mine.

Global indexes are a subset of indirect indexes use case but luckily
not the only use.

--
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-20 Thread Simon Riggs
On 18 August 2017 at 15:40, Alvaro Herrera  wrote:
> Ildar Musin wrote:
>
>> While we've been developing pg_pathman extension one of the most frequent
>> questions we got from our users was about global index support. We cannot
>> provide it within an extension. And I couldn't find any recent discussion
>> about someone implementing it. So I'm thinking about giving it a shot and
>> start working on a patch for postgres.
>>
>> One possible solution is to create an extended version of item pointer which
>> would store relation oid along with block number and position:
>
> I've been playing with the index code in order to allow indirect tuples,
> which are stored in a format different from IndexTupleData.
>
> I've been adding an "InMemoryIndexTuple" (maybe there's a better name)
> which internally has pointers to both IndexTupleData and
> IndirectIndexTupleData, which makes it easier to pass around the index
> tuple in either format.

> It's very easy to add an OID to that struct,
> which then allows to include the OID in either an indirect index tuple
> or a regular one.

If there is a unique index then there is no need for that. Additional
data to the index makes it even bigger and even less useful, so we
need to count that as a further disadvantage of global indexes.

I have a very clear statement from a customer recently that "We will
never use global indexes", based upon their absolute uselessness in
Oracle.

> Then, wherever we're using IndexTupleData in the index AM code, we would
> replace it with InMemoryIndexTuple.  This should satisfy both your use
> case and mine.

Global indexes are a subset of indirect indexes use case but luckily
not the only use.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-19 Thread Alvaro Herrera
Ildar Musin wrote:

> I found the thread about indirect indexes
> (https://www.postgresql.org/message-id/20161018182843.xczrxsa2yd47pnru%40alvherre.pgsql),
> but it seems that it haven't been updated for some time and I couldn't
> find InMemoryIndexTuple in the latest patch. Is there a newer version?
> Generally I think this may be a good idea.

Nope, I haven't posted a version with the InMemoryIndexTuple stuff yet.
I'm working on it.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-19 Thread Ildar Musin


18/08/2017 17:40, Alvaro Herrera пишет:
> Ildar Musin wrote:
>
>> While we've been developing pg_pathman extension one of the most frequent
>> questions we got from our users was about global index support. We cannot
>> provide it within an extension. And I couldn't find any recent discussion
>> about someone implementing it. So I'm thinking about giving it a shot and
>> start working on a patch for postgres.
>>
>> One possible solution is to create an extended version of item pointer which
>> would store relation oid along with block number and position:
> I've been playing with the index code in order to allow indirect tuples,
> which are stored in a format different from IndexTupleData.
>
> I've been adding an "InMemoryIndexTuple" (maybe there's a better name)
> which internally has pointers to both IndexTupleData and
> IndirectIndexTupleData, which makes it easier to pass around the index
> tuple in either format.  It's very easy to add an OID to that struct,
> which then allows to include the OID in either an indirect index tuple
> or a regular one.
>
> Then, wherever we're using IndexTupleData in the index AM code, we would
> replace it with InMemoryIndexTuple.  This should satisfy both your use
> case and mine.
>
I found the thread about indirect indexes
(https://www.postgresql.org/message-id/20161018182843.xczrxsa2yd47pnru%40alvherre.pgsql),
but it seems that it haven't been updated for some time and I couldn't
find InMemoryIndexTuple in the latest patch. Is there a newer version?
Generally I think this may be a good idea.

-- 
Ildar Musin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-19 Thread Ildar Musin


18/08/2017 17:45, Alvaro Herrera пишет:
> Erikjan Rijkers wrote:
>> On 2017-08-18 11:12, Ildar Musin wrote:
>>> Hi hackers,
>>>
>>> While we've been developing pg_pathman extension one of the most
>>> frequent questions we got from our users was about global index
>>> support. We cannot provide it within an extension. And I couldn't find
>>> any recent discussion about someone implementing it. So I'm thinking
>>> about giving it a shot and start working on a patch for postgres.
>> Sorry to be dense but what exactly is a "Global Index"?
> A global index covers all partitions of a partitioned table.  It allows
> you to have unique indexes across the partitioned table.
>
> The main disadvantage of global indexes is that you need some kind of
> cleanup after you drop a partition.  Either make partition drop wait
> until all the index pointers are removed, or you need some kind of
> after-commit cleanup process that removes them afterwards (which
> requires some assurance that they are really all gone).  You can't let
> them linger forever, or you risk a new partition that reuses the same
> OID causing the whole index to become automatically corrupt.
>
Thanks for the notion, I haven't thought this through yet. We could
probably keep the list of removed partitions somewhere in the catalog or
in the index itself. Usually autovacuum (or the process we start right
after partition drop as you suggested) would clean the index up before
OID wraparound. But if it didn't and user is trying to add a new
partition with the same oid then the cleanup will be forced. I think the
latter is very unlikely.

-- 
Ildar Musin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-18 Thread Alvaro Herrera
Erikjan Rijkers wrote:
> On 2017-08-18 11:12, Ildar Musin wrote:
> > Hi hackers,
> > 
> > While we've been developing pg_pathman extension one of the most
> > frequent questions we got from our users was about global index
> > support. We cannot provide it within an extension. And I couldn't find
> > any recent discussion about someone implementing it. So I'm thinking
> > about giving it a shot and start working on a patch for postgres.
> 
> Sorry to be dense but what exactly is a "Global Index"?

A global index covers all partitions of a partitioned table.  It allows
you to have unique indexes across the partitioned table.

The main disadvantage of global indexes is that you need some kind of
cleanup after you drop a partition.  Either make partition drop wait
until all the index pointers are removed, or you need some kind of
after-commit cleanup process that removes them afterwards (which
requires some assurance that they are really all gone).  You can't let
them linger forever, or you risk a new partition that reuses the same
OID causing the whole index to become automatically corrupt.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-18 Thread Alvaro Herrera
Ildar Musin wrote:

> While we've been developing pg_pathman extension one of the most frequent
> questions we got from our users was about global index support. We cannot
> provide it within an extension. And I couldn't find any recent discussion
> about someone implementing it. So I'm thinking about giving it a shot and
> start working on a patch for postgres.
> 
> One possible solution is to create an extended version of item pointer which
> would store relation oid along with block number and position:

I've been playing with the index code in order to allow indirect tuples,
which are stored in a format different from IndexTupleData.

I've been adding an "InMemoryIndexTuple" (maybe there's a better name)
which internally has pointers to both IndexTupleData and
IndirectIndexTupleData, which makes it easier to pass around the index
tuple in either format.  It's very easy to add an OID to that struct,
which then allows to include the OID in either an indirect index tuple
or a regular one.

Then, wherever we're using IndexTupleData in the index AM code, we would
replace it with InMemoryIndexTuple.  This should satisfy both your use
case and mine.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-18 Thread Erikjan Rijkers

On 2017-08-18 11:12, Ildar Musin wrote:

Hi hackers,

While we've been developing pg_pathman extension one of the most
frequent questions we got from our users was about global index
support. We cannot provide it within an extension. And I couldn't find
any recent discussion about someone implementing it. So I'm thinking
about giving it a shot and start working on a patch for postgres.


Sorry to be dense but what exactly is a "Global Index"?

You mention pg_pathman; is a global index related to just partitions? Or 
is it a more generally applicable concept?


Could you (or someone) perhaps expand a little?

thanks,

Erik Rijkers



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-18 Thread Ildar Musin

Hi Chris,

On 18.08.2017 16:15, Chris Travers wrote:

I would really like to see global indexes.  It would make things a lot
easier for things like unique constraints across table inheritance trees.

On Fri, Aug 18, 2017 at 11:12 AM, Ildar Musin > wrote:

Hi hackers,

While we've been developing pg_pathman extension one of the most
frequent questions we got from our users was about global index
support. We cannot provide it within an extension. And I couldn't
find any recent discussion about someone implementing it. So I'm
thinking about giving it a shot and start working on a patch for
postgres.

One possible solution is to create an extended version of item
pointer which would store relation oid along with block number and
position:

struct ItemPointerExt
{
Oid ip_relid;
BlockIdData ip_blkid;
OffsetNumber ip_posid;
};

and use it in global index (regular index will still use old
version). This will require in-depth refactoring of existing index
nodes to make them support both versions. Accordingly, we could
replace ItemPointer with ItemPointerExt in index AM to make unified
API to access both regular and global indexes. TIDBitmap will
require refactoring as well to be able to deal with relation oids.


So, to be clear on-disk representations would be unchanged for old
indexes (ensuring that pg_upgrade would not be broken), right?


Yes, the idea is to keep old indexes untouched so that there would be no 
need in any further conversion. And global indexes in turn will have 
extended TID format.





It seems to be quite an invasive patch since it requires changes in
general index routines, existing index nodes, catalog, vacuum
routines and syntax. So I'm planning to implement it step by step.
As a first prototype it could be:

* refactoring of btree index to be able to store both regular and
extended item pointers;


Do you foresee any performance implementation of handling both?


It's hard to say until there is some working prototype. I think there 
can be (and most like will be) some overhead due to unified API (and 
hence addition conversion operations). It will require benchmarking to 
say how bad is it.



--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com

Saarbrücker Straße 37a, 10405 Berlin



--
Ildar Musin
i.mu...@postgrespro.ru


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-18 Thread Chris Travers
I would really like to see global indexes.  It would make things a lot
easier for things like unique constraints across table inheritance trees.

On Fri, Aug 18, 2017 at 11:12 AM, Ildar Musin 
wrote:

> Hi hackers,
>
> While we've been developing pg_pathman extension one of the most frequent
> questions we got from our users was about global index support. We cannot
> provide it within an extension. And I couldn't find any recent discussion
> about someone implementing it. So I'm thinking about giving it a shot and
> start working on a patch for postgres.
>
> One possible solution is to create an extended version of item pointer
> which would store relation oid along with block number and position:
>
> struct ItemPointerExt
> {
> Oid ip_relid;
> BlockIdData ip_blkid;
> OffsetNumber ip_posid;
> };
>
> and use it in global index (regular index will still use old version).
> This will require in-depth refactoring of existing index nodes to make them
> support both versions. Accordingly, we could replace ItemPointer with
> ItemPointerExt in index AM to make unified API to access both regular and
> global indexes. TIDBitmap will require refactoring as well to be able to
> deal with relation oids.
>

So, to be clear on-disk representations would be unchanged for old indexes
(ensuring that pg_upgrade would not be broken), right?

>
> It seems to be quite an invasive patch since it requires changes in
> general index routines, existing index nodes, catalog, vacuum routines and
> syntax. So I'm planning to implement it step by step. As a first prototype
> it could be:
>
> * refactoring of btree index to be able to store both regular and extended
> item pointers;


Do you foresee any performance implementation of handling both?


>

* refactoring of TIDBitmap;
> * refactoring of general index routines (index_insert, index_getnext, etc)
> and indexAM api;
> * catalog (add pg_index.indisglobal attribute and/or a specific relkind as
> discussed in [1] thread);
> * syntax for global index definition. E.g., it could be oracle-like syntax:
>
> CREATE INDEX my_idx ON my_tbl (key) GLOBAL;
>
> If it goes well, then I’ll do the rest of indexes and vacuuming. If you
> have any ideas or concerns I’ll be glad to hear it.
>
> [1] https://www.postgresql.org/message-id/c8fe4f6b-ff46-aae0-89e
> 3-e936a35f0cfd%40postgrespro.ru
>
> Thanks!
>
> --
> Ildar Musin
> i.mu...@postgrespro.ru
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


[HACKERS] Proposal: global index

2017-08-18 Thread Ildar Musin

Hi hackers,

While we've been developing pg_pathman extension one of the most 
frequent questions we got from our users was about global index support. 
We cannot provide it within an extension. And I couldn't find any recent 
discussion about someone implementing it. So I'm thinking about giving 
it a shot and start working on a patch for postgres.


One possible solution is to create an extended version of item pointer 
which would store relation oid along with block number and position:


struct ItemPointerExt
{
Oid ip_relid;
BlockIdData ip_blkid;
OffsetNumber ip_posid;
};

and use it in global index (regular index will still use old version). 
This will require in-depth refactoring of existing index nodes to make 
them support both versions. Accordingly, we could replace ItemPointer 
with ItemPointerExt in index AM to make unified API to access both 
regular and global indexes. TIDBitmap will require refactoring as well 
to be able to deal with relation oids.


It seems to be quite an invasive patch since it requires changes in 
general index routines, existing index nodes, catalog, vacuum routines 
and syntax. So I'm planning to implement it step by step. As a first 
prototype it could be:


* refactoring of btree index to be able to store both regular and 
extended item pointers;

* refactoring of TIDBitmap;
* refactoring of general index routines (index_insert, index_getnext, 
etc) and indexAM api;
* catalog (add pg_index.indisglobal attribute and/or a specific relkind 
as discussed in [1] thread);

* syntax for global index definition. E.g., it could be oracle-like syntax:

CREATE INDEX my_idx ON my_tbl (key) GLOBAL;

If it goes well, then I’ll do the rest of indexes and vacuuming. If you 
have any ideas or concerns I’ll be glad to hear it.


[1] 
https://www.postgresql.org/message-id/c8fe4f6b-ff46-aae0-89e3-e936a35f0cfd%40postgrespro.ru


Thanks!

--
Ildar Musin
i.mu...@postgrespro.ru


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers