On Wed, Dec 9, 2015 at 12:14 AM, Robert Haas <robertmh...@gmail.com> wrote:

> On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia <rushabh.lat...@gmail.com>
> wrote:
> > Thanks Ashutosh.
> >
> > Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
> > looks good to me.
>
> This patch needs a rebase.
>

Done.


>
> It's not going to work to say this is a patch proposed for commit when
> it's still got a TODO comment in it that obviously needs to be
> changed.   And the formatting of that long comment is pretty weird,
> too, and not consistent with other functions in that same file (e.g.
> get_remote_estimate, ec_member_matches_foreign, create_cursor).
>
>
The TODO was present in v4 but not in v5, and it is not present in v6 attached
here. The comment is now formatted in the style of estimate_path_cost_size()
and convert_prep_stmt_params().


> Aside from that, I think before we commit this, somebody should do
> some testing that demonstrates that this is actually a good idea.  Not
> as part of the test case set for this patch, but just in general.
> Merge joins are typically going to be relevant for large tables, but
> the examples in the regression tests are necessarily tiny.  I'd like
> to see some sample data and some sample queries that get appreciably
> faster with this code.  If we can't find any, we don't need the code.
>
>
I tested the patch on my laptop with two types of queries: a join between
two foreign tables on different foreign servers (both pointing back to the
same local server), and a join between one foreign table and one local table.
The foreign tables and servers are created using the attached
sort_pd_setup.sql. The foreign tables point to a table that has an index
useful for the join clause. Both joining tables had 10M rows. The execution
time of each query was measured over 100 runs; the average and standard
deviation were calculated (using the function query_execution_stats() in the
attached sort_pd.sql) and are presented below.
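
For reference, each measurement was driven from psql roughly as follows (see
the attached sort_pd.sql; the query string below is just a placeholder for
the query under test):

\set num_samples 100
\set query 'SELECT ...'
EXPLAIN (VERBOSE, ANALYSE) :query ;
SELECT avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time
        FROM query_execution_stats(:'query', :num_samples);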

1. Query between foreign tables
SELECT ft1.val, ft2.val FROM ft1 join ft2 on (ft1.val = ft2.val)

Plan and timings without patch
EXPLAIN (VERBOSE, ANALYSE) :query ;
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=508510.02..1129945.94 rows=999995 width=8) (actual time=33803.826..82416.342 rows=10000000 loops=1)
   Output: ft1.val, ft2.val
   Hash Cond: (ft1.val = ft2.val)
   ->  Foreign Scan on public.ft1  (cost=100.00..344347.31 rows=9999977 width=4) (actual time=0.624..28531.803 rows=10000000 loops=1)
         Output: ft1.val
         Remote SQL: SELECT val FROM public.lt
   ->  Hash  (cost=344347.31..344347.31 rows=9999977 width=4) (actual time=33258.025..33258.025 rows=10000000 loops=1)
         Output: ft2.val
         Buckets: 131072  Batches: 256  Memory Usage: 2400kB
         ->  Foreign Scan on public.ft2  (cost=100.00..344347.31 rows=9999977 width=4) (actual time=22.171..28134.970 rows=10000000 loops=1)
               Output: ft2.val
               Remote SQL: SELECT val FROM public.lt
 Planning time: 33.155 ms
 Execution time: 82914.607 ms
(14 rows)

 avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
  78750.95487 | 2911.51825687913 |    74314.886 |    89358.464

Plan and timing with patch
EXPLAIN (VERBOSE, ANALYSE) :query ;
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=200.86..1183070.86 rows=10000000 width=8) (actual time=1.776..73140.219 rows=10000000 loops=1)
   Output: ft1.val, ft2.val
   Merge Cond: (ft1.val = ft2.val)
   ->  Foreign Scan on public.ft1  (cost=100.43..504035.43 rows=10000000 width=4) (actual time=0.937..30422.457 rows=10000000 loops=1)
         Output: ft1.val, ft1.val2
         Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
   ->  Materialize  (cost=100.43..529035.43 rows=10000000 width=4) (actual time=0.826..33448.822 rows=10000000 loops=1)
         Output: ft2.val, ft2.val2
         ->  Foreign Scan on public.ft2  (cost=100.43..504035.43 rows=10000000 width=4) (actual time=0.818..31035.362 rows=10000000 loops=1)
               Output: ft2.val, ft2.val2
               Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
 Planning time: 163.161 ms
 Execution time: 73654.106 ms
(13 rows)

 avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
  71881.15916 | 819.091605498189 |    70197.312 |    74653.314

It can be observed that with the patch, a merge join is used instead of a
hash join and the execution time reduces by approximately 9%. A desirable
side effect is that the deviation in the execution time also drops sharply
(by almost 75%).

2. Join between local and foreign table
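The query used, taken from the attached sort_pd.sql:
SELECT lt.val, ft1.val FROM lt join ft1 on (lt.val = ft1.val)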

Without patch the plan and timings are
EXPLAIN (VERBOSE, ANALYSE) :query ;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=308410.66..1019846.69 rows=9999970 width=8) (actual time=7674.681..47767.136 rows=10000000 loops=1)
   Output: lt.val, ft1.val
   Hash Cond: (ft1.val = lt.val)
   ->  Foreign Scan on public.ft1  (cost=100.00..344347.55 rows=9999985 width=4) (actual time=0.506..26679.980 rows=10000000 loops=1)
         Output: ft1.val
         Remote SQL: SELECT val FROM public.lt
   ->  Hash  (cost=144247.85..144247.85 rows=9999985 width=4) (actual time=7667.598..7667.598 rows=10000000 loops=1)
         Output: lt.val
         Buckets: 131072  Batches: 256  Memory Usage: 2400kB
         ->  Seq Scan on public.lt  (cost=0.00..144247.85 rows=9999985 width=4) (actual time=0.018..2959.111 rows=10000000 loops=1)
               Output: lt.val
 Planning time: 8.668 ms
 Execution time: 48209.365 ms
(13 rows)

SELECT avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time
        FROM query_execution_stats(:'query', :num_samples);
 avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
  47246.46956 | 2579.42041949119 |    43603.411 |    56096.759

With the patch the plan and timings are
EXPLAIN (VERBOSE, ANALYSE) :query ;
                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=155.01..957924.85 rows=9999970 width=8) (actual time=0.592..45125.356 rows=10000000 loops=1)
   Output: lt.val, ft1.val
   Merge Cond: (ft1.val = lt.val)
   ->  Foreign Scan on public.ft1  (cost=100.43..504038.91 rows=9999985 width=4) (actual time=0.551..30526.048 rows=10000000 loops=1)
         Output: ft1.val, ft1.val2
         Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
   ->  Index Only Scan using i_lt_val on public.lt  (cost=0.43..303939.21 rows=9999985 width=4) (actual time=0.032..6192.406 rows=10000000 loops=1)
         Output: lt.val
         Heap Fetches: 10000000
 Planning time: 9.043 ms
 Execution time: 45666.023 ms
(11 rows)


 avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--------------+------------------+--------------+--------------
  42803.36105 | 166.874491432755 |    42321.314 |    43316.902

Again, observe that with the patch a merge join is used instead of a hash
join and the timing reduces by approximately 9%. The deviation in execution
time again drops dramatically (from ~2579 ms to ~167 ms). There is an
increase in planning time with the patch, owing to firing EXPLAIN on the
foreign server.
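
For reference, with use_remote_estimate the extra planning time comes from
costing the ordered path by running an EXPLAIN of the sorted query on the
foreign server, along the lines of:

EXPLAIN SELECT val FROM public.lt ORDER BY val ASC;

so each additional sorted path considered can mean one more planner-time
round trip to the remote server.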

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
-- \set num_rows (1000*1000*10) -- patched server uses merge join from this
\set num_rows (1000*1000*10)
-- Create local tables (to be pointed by foreign tables)
DROP TABLE IF EXISTS lt CASCADE;
CREATE TABLE lt(val int, val2 int);
INSERT INTO lt SELECT i, i FROM (SELECT generate_series(1, :num_rows)) s(i);

CREATE INDEX i_lt_val ON lt(val);
DROP EXTENSION IF EXISTS postgres_fdw CASCADE;
CREATE EXTENSION postgres_fdw;
CREATE SERVER pg1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname 'postgres');
CREATE SERVER pg2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname 'postgres');
CREATE USER MAPPING FOR CURRENT_USER SERVER pg1;
CREATE USER MAPPING FOR CURRENT_USER SERVER pg2;
-- Foreign tables point back to public.lt on each server.
CREATE FOREIGN TABLE ft1 (val int, val2 int) SERVER pg1
	OPTIONS (table_name 'lt', use_remote_estimate 'true');
CREATE FOREIGN TABLE ft2 (val int, val2 int) SERVER pg2
	OPTIONS (table_name 'lt', use_remote_estimate 'true');

DROP FUNCTION IF EXISTS query_execution_stats(text, int);

CREATE FUNCTION query_execution_stats(query text, num_samples int,
													OUT avg_exe_time float,
													OUT std_dev_exe_time float,
													OUT min_exe_time float,
													OUT max_exe_time float)
RETURNS record LANGUAGE plpgsql AS $$
DECLARE
	plan json;
BEGIN
	CREATE TEMPORARY TABLE query_exe_times(exe_time float); 

	-- Execute query a few times (5% of user specified runs) to warm the cache
	FOR i IN 1 .. num_samples/20 LOOP
		EXECUTE query;
	END LOOP;

	FOR i IN 1 .. num_samples LOOP
		EXECUTE 'EXPLAIN (analyze, format json) ' || query INTO plan;
		INSERT INTO query_exe_times VALUES ((plan->0->'Execution Time')::text::float);
		RAISE NOTICE 'completed % samples', i;
	END LOOP;

	SELECT avg(exe_time), stddev(exe_time), min(exe_time), max(exe_time)
		INTO avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time
		FROM query_exe_times;

	DROP TABLE query_exe_times;
END;
$$;

ANALYZE ft1;
ANALYZE lt;

Attachment: pg_sort_all_pd_v6.patch

\set num_samples 100
\set query 'SELECT lt.val, ft1.val FROM lt join ft1 on (lt.val = ft1.val)'
EXPLAIN (VERBOSE, ANALYSE) :query ;
SELECT avg_exe_time, std_dev_exe_time, min_exe_time, max_exe_time
		FROM query_execution_stats(:'query', :num_samples);