On Nov 21, 2017 00:05, "Matthew Hall" <mh...@mhcomputing.net> wrote:


> Are all indexes present at the time you insert?  It will probably be much
faster to insert without the gin index (at least) and build it after the
load.

There is some flexibility on the initial load, but the updates in the
future will require the de-duplication capability. I'm willing to accept
that might be somewhat slower on the load process, to get the accurate
updates, provided we could try meeting the read-side goal I wrote about, or
at least figure out why it's impossible, so I can understand what I need to
fix to make it possible.


As long as you don't let anyone use the table between the initial load and
when the index build finishes, you don't have to compromise on
correctness.  But yeah, makes sense to worry about query speed first.






> If you repeat the same query, is it then faster, or is it still slow?

If you keep the expression exactly the same, it still takes a few seconds
as could be expected for such a torture test query, but it's still WAY
faster than the first such query. If you change it out to a different
expression, it's longer again of course. There does seem to be a
low-to-medium correlation between the number of rows found and the query
completion time.


To make this quick, you will need to get most of the table and most of the
index cached into RAM.  A good way to do that is with pg_prewarm.  Of
course that only works if you have enough RAM in the first place.

What is the size of the table and the gin index?


Cheers,

Jeff

Reply via email to