Hi,

Playing with new PG9.6rc1 / Pgis 2.3beta1, I found parallel query mode to
be really efficient to process big tables.
Unfortunately, it is not possible to directly create tables with parallel
plan (create table as select...) (see:
https://wiki.postgresql.org/wiki/Parallel_Query).
It is possible, though, to use copy mode with psql feed to create a table
with parallel plan enabled.

To allow creating tables directly in pure SQL script, I developed a small
hack function that takes a SQL query and creates a table from it, using
COPY command with psql PROGRAM executing the query.

Usage:

select * from create_table_parallel(
    'table_name',
    'select p.id as idparc, c.gid as idcarreau
        st_intersection(p.geom, c.geom) as geom
      from parcelle_sample2 p
      join carreau_sample2 c on st_intersects(p.geom, c.geom)',
    '/usr/local/pgsql-9.6/bin/psql -A -t -p 5439 -d nicolas -c',
    8, -- workers
    true);

Limitations:

• delimiter used for copy operation defaults to '|'
• a 'LIMIT 0' clause is inserted at the end of the passed query to create
table structure: query to run cannot contain a LIMIT clause.
• The function is not safe, as it injects user parameters to build psql
command, and it's not extensively tested.

Perfomance expected:

Depends on the number of workers configured and planned:
On a small dataset (~15 000 pg intersected with 360 000 pg), with 8 workers
configured and 3 choosen by the planner, table creation took *24s vs 1m25s*
with a traditionnal create table as select...

Nicolas

Attachment: fn_create_table_parallel.sql
Description: Binary data

_______________________________________________
postgis-users mailing list
postgis-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/postgis-users

Reply via email to