Re: [GENERAL] plpgsql function with offset - Postgres 9.1

Peter J. Holzer Thu, 15 Jun 2017 22:30:30 -0700

On 2017-06-16 10:19:45 +1200, Patrick B wrote:
> 2017-05-29 19:27 GMT+12:00 Albe Laurenz <[email protected]>:
>     Patrick B wrote:
>     > I am running a background task on my DB, which will copy data from 
> tableA
>     to tableB. For
>     > that, I'm writing a PL/PGSQL function which basically needs to do the
>     following:
>     >
>     >
>     > 1.    Select the data from tableA
>     > 2.    The limit will be put when calling the function
>     > 3.    insert the selected data on Step 1 onto new table
[...]
>     >               FOR row IN EXECUTE '
>     >                           SELECT
>     >                                   id,
>     >                                   path,
>     >                                   name,
>     >                                   name_last,
>     >                                   created_at
>     >                           FROM
>     >                                   tablea
>     >                           WHERE
>     >                                   ready = true
>     >                           ORDER BY 1 LIMIT ' || rows || ' OFFSET ' ||
>     rows || ''
> 
>     '... LIMIT ' || p_limit || ' OFFSET ' || p_offset
> 
>     >               LOOP
> 
>     num_rows := num_rows + 1;
>    
>     >               INSERT INTO tableB (id,path,name,name_last,created_at)
>     >               VALUES (row.id,row.path,row.name,row.
>     name_last,row.created_at);
>     >
>     >               END LOOP;
[...]
> 
>     There are two problems with this approach:
> 
>     1. It will do the wrong thing if rows are added or deleted in "tablea"
>     while
>        you process it.
> 
> 
> 
> There will be actually records being inserted in tablea while processing the
> migration.... Any ideas here?


Is id monotonically increasing? You might be able to use that, as Albe
suggests:

>     The solution is to avoid OFFSET and to use "keyset pagination":
>     http://use-the-index-luke.com/no-offset

But it works only if rows cannot become ready after their id range has
already been processed. Otherwise you will miss them.

> I can add another column in tablea, like example: row_migrated boolean --> if
> that helps

Yes that's probably the best way. Instead of using an additional column
you could also make ready tristate: New -> ready_for_migration -> migrated.

>     2. Queries with hight OFFSET values have bad performance.
> 
> 
> No problem. The plan is to perform 2k rows at once, which is not much.

Are rows deleted from tablea after they are migrated? Otherwise you will
have a problem:

    select ... limit 2000 offset 1234000

will have to retrieve 1236000 rows and then discard 1234000 of them.

        hp

-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | [email protected]         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

signature.asc
Description: Digital signature

Re: [GENERAL] plpgsql function with offset - Postgres 9.1

Reply via email to