On Fri, May 24, 2019 at 10:25 AM Jeremy Finzel <finz...@gmail.com> wrote:

> I have been hoping for clearer direction from the community about
> specifically btree_gin indexes for low cardinality columns (as well as low
> cardinality multi-column indexes).  In general there is very little
> discussion about this both online and in the docs.  Rather, the emphasis
> for GIN indexes discussed is always on full text search of JSON indexing,
> not btree_gin indexes.
>
> However, I have never been happy with the options open to me for indexing
> low cardinality columns and was hoping this could be a big win.  Often I
> use partial indexes as a solution, but I really want to know how many use
> cases btree_gin could solve better than either a normal btree or a partial
> index.
>
> Here are my main questions:
>
> 1.
>
> "The docs say regarding *index only scans*: The index type must support
> index-only scans. B-tree indexes always do. GiST and SP-GiST indexes
> support index-only scans for some operator classes but not others. Other
> index types have no support. The underlying requirement is that the index
> must physically store, or else be able to reconstruct, the original data
> value for each index entry. As a counterexample, GIN indexes cannot support
> index-only scans because each index entry typically holds only part of the
> original data value."
>
> This is confusing to say "B-tree indexes always do" and "GIN indexes
> cannot support index-only scans", when we have a btree_gin index type.
> Explanation please ???
>
> Is it true that for a btree_gin index on a regular column, "each index
> entry typically holds only part of the original data value"?  Do these
> still not support index only scans?  Could they?  I can't see why they
> shouldn't be able to for a single indexed non-expression field?
>
> 2.
>
> Lack of index only scans is definitely a downside.  However, I see
> basically identical performance, but way less memory and space usage, for
> gin indexes.  In terms of read-only performance, if index only scans are
> not a factor, why not always recommend btree_gin indexes instead of regular
> btree for low cardinality fields, which will yield similar performance but
> use far, far less space and resources?
>
> 3.
>
> This relates to 2.  I understand the write overhead can be much greater
> for GIN indexes, which is why the fastupdate feature exists.  But again, in
> those discussions in the docs, it appears to me they are emphasizing that
> penalty more for full text or json GIN indexes.  Does the same overhead
> apply to a btree_gin index on a regular column with no expressions?
>
> Those are my questions.
>
> FYI, I can see an earlier thread about this topic (
> https://www.postgresql.org/message-id/flat/E260AEE7-95B3-4142-9A4B-A4416F1701F0%40aol.com#5def5ce1864298a3c0ba2d4881a660c2),
> but a few questions were left unanswered and unclear there.
>
> I first started seriously considering using btree_gin indexes for low
> cardinality columns, for example some text field with 30 unique values
> across 100 million rows, after reading a summary of index types from
> Bruce's article: https://momjian.us/main/writings/pgsql/indexing.pdf
>
> This article was also helpful but yet again I wonder it's broader
> viability:
> http://hlinnaka.iki.fi/2014/03/28/gin-as-a-substitute-for-bitmap-indexes/
>
>
> Thank you!
> Jeremy
>

Could anyone shed any light on these questions?  I appreciate it.

Thanks,
Jeremy

Reply via email to