Hello, This is very interesting. In my pipeline currently GiST index rebuild is the biggest time consuming step.
I believe introducing optional concept of order in the GiST opclass will be beneficial not only for fast build, but for other tasks later: - CLUSTER can order the table using that notion, in parallel way. - btree_gist can be even closer to btree by getting the tuples sorted inside page. - tree descend on insertion in future can traverse the list in more opportunistic way, calculating penalty for siblings-by-order first. I believe everywhere the idea of ordering is needed it's provided by giving a btree opclass. How about giving a link to btree opclass inside a gist opclass? On Mon, Aug 26, 2019 at 10:59 AM Andrey Borodin <x4...@yandex-team.ru> wrote: > Hi! > > In many cases GiST index can be build fast using z-order sorting. > > I've looked into proof of concept by Nikita Glukhov [0] and it looks very > interesting. > So, I've implemented yet another version of B-tree-like GiST build. > It's main use case and benefits can be summarized with small example: > > postgres=# create table x as select point (random(),random()) from > generate_series(1,3000000,1); > SELECT 3000000 > Time: 5061,967 ms (00:05,062) > postgres=# create index ON x using gist (point ) with > (fast_build_sort_function=gist_point_sortsupport); > CREATE INDEX > Time: 6140,227 ms (00:06,140) > postgres=# create index ON x using gist (point ); > CREATE INDEX > Time: 32061,200 ms (00:32,061) > > As you can see, Z-order build is on order of magnitude faster. Select > performance is roughly the same. Also, index is significantly smaller. > > Nikita's PoC is faster because it uses parallel build, but it intervenes > into B-tree code a lot (for reuse). This patchset is GiST-isolated. > My biggest concern is that passing function to relation option seems a bit > hacky. You can pass there any function matching sort support signature. > Embedding this function into opclass makes no sense: it does not affect > scan anyhow. > > In current version, docs and tests are not implemented. I want to discuss > overall design. Do we really want yet another GiST build, if it is 3-10 > times faster? > > Thanks! > > Best regards, Andrey Borodin. > > [0] > https://github.com/postgres/postgres/compare/master...glukhovn:gist_btree_build > > -- Darafei Praliaskouski Support me: http://patreon.com/komzpa