On Wed, Jan 11, 2017 at 12:54 AM, Alvaro Herrera
<alvhe...@2ndquadrant.com> wrote:
> Two options are on the table to attack the problem of updates causing
> write amplification: WARM and indirect indexes.  They are completely
> different approaches but have overlapping effects on what scenarios are
> improved.  Here's a recap of both features, with the intent that we make
> a well-considered decision about each.
> The main effect of both features is that an updated tuple doesn't
> require updating indexes that are on unmodified columns.  Indirect
> indexes are a completely new server feature which may enable other
> refinements later on; WARM is a targeted optimization on top of the HOT
> optimization.
> The big advantage of WARM is that it works automatically, like HOT: the
> user doesn't need to do anything different than today to get the
> benefit.  With indirect indexes, the user needs to create the index as
> indirect explicitely.
> There are two big disadvantages to WARM (as to HOT): it cannot be
> applied when the heap page is full; and concurrent long-running
> transactions can spoil reclaimability of recently dead tuples in heap
> pages.  There's a further disadvantage: currently, there can be only one
> WARM update in an update chain.  (Pavan believes it is possible to allow
> multiple ones.)  All those cases can benefit from indirect indexes.
> Another interesting case is a table with a primary key and a JSON
> object, on which you have a GIN index (or an int[] array, or text
> search).  What happens if you modify the JSON?  With WARM, this is just
> a normal index update.  With indirect indexes, you may be able to skip
> inserting index entries for all the JSON elements except those which
> changed.  (However, this is not implemented yet.)

I think both are of use in somewhat different scenarios, probably WARM
will be more effective and can be used in more number of cases to
reduce write amplification.  It seems to me that indirect indexes are
generally useful with clustered index where data is stored in the leaf
node of the tree, that doesn't mean it can't be used in other cases as
mentioned by you.  If you see the work required by indirect indexes is
justified with respect to its usecase, then I think it will be okay to
get both features as the indirect index can be used in future if
somebody implements clustered index.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to