Re: insert and query performance on big string table with pg_trgm

Gábor SZŰCS Sat, 25 Nov 2017 01:21:06 -0800

Don't know if it would make PostgreSQL happier but how about adding a hash
value column and creating the unique index on that one? May block some
false duplicates but the unique index would be way smaller, speeding up
inserts.


2017. nov. 25. 7:35 ezt írta ("Jeff Janes" <[email protected]>):

>
>
> On Nov 21, 2017 00:05, "Matthew Hall" <[email protected]> wrote:
>
>
> > Are all indexes present at the time you insert?  It will probably be
> much faster to insert without the gin index (at least) and build it after
> the load.
>
> There is some flexibility on the initial load, but the updates in the
> future will require the de-duplication capability. I'm willing to accept
> that might be somewhat slower on the load process, to get the accurate
> updates, provided we could try meeting the read-side goal I wrote about, or
> at least figure out why it's impossible, so I can understand what I need to
> fix to make it possible.
>
>
> As long as you don't let anyone use the table between the initial load and
> when the index build finishes, you don't have to compromise on
> correctness.  But yeah, makes sense to worry about query speed first.
>
>
>
>
>
>
> > If you repeat the same query, is it then faster, or is it still slow?
>
> If you keep the expression exactly the same, it still takes a few seconds
> as could be expected for such a torture test query, but it's still WAY
> faster than the first such query. If you change it out to a different
> expression, it's longer again of course. There does seem to be a
> low-to-medium correlation between the number of rows found and the query
> completion time.
>
>
> To make this quick, you will need to get most of the table and most of the
> index cached into RAM.  A good way to do that is with pg_prewarm.  Of
> course that only works if you have enough RAM in the first place.
>
> What is the size of the table and the gin index?
>
>
> Cheers,
>
> Jeff
>
>

Re: insert and query performance on big string table with pg_trgm

Reply via email to