Eh, I checked, you are right. Something terribly wrong with hash index in PostgreSQL. But there are another hash index gin(jsonb_path_ops), may be correctly say gin+hash index. Looked like it is the best for this purpose.
> 26 окт. 2018 г., в 19:27, Jeff Janes <jeff.ja...@gmail.com> написал(а): > > On Thu, Oct 25, 2018 at 9:36 AM Олег Самойлов <spl...@ya.ru > <mailto:spl...@ya.ru>> wrote: > >> 17 окт. 2018 г., в 13:46, Ravi Krishna <srkrish...@aol.com >> <mailto:srkrish...@aol.com>> написал(а): >> >> In >> https://www.cybertec-postgresql.com/en/ideas-for-scaling-postgresql-to-multi-terabyte-and-beyond/ >> >> <https://www.cybertec-postgresql.com/en/ideas-for-scaling-postgresql-to-multi-terabyte-and-beyond/> >> >> it is mentioned: >> >> "GIN, the most know non-default index type perhaps, has been actually around >> for ages (full-text search) and in short is perfect for indexing columns >> where there are lot of repeating values – think all kinds of statuses or >> good old Mr/Mrs/Miss. GIN only stores every unique column value only once as >> for the default B-tree you’ll have e.g. 1 millon leaf nodes with the integer >> “1” in it." >> >> >> Does it mean that GIN is a very good choice for low cardinality columns. > > Not necessary. There is other index which also don’t keep column value in an > every leaf. Hash, for instance. > > For smallish values (which low cardinality columns tend to be) the per-tuple > overhead and the pointer itself is probably much larger than the value, so > hash won't save you much if any space. The GIN index removes not just the > value, but the per-tuple overhead. And also compresses the point list to > further save space. > > Here is a real-world example from one of my databases where each value is > about 17 characters long, and is present about 20 times: > > gin: 411 MB > btree: 2167 MB > hash: 2159 MB > > Cheers, > > Jeff