Re: [HACKERS] Getting sorted data from foreign server for merge join

2016-03-08 Thread Robert Haas
On Thu, Jan 7, 2016 at 4:05 AM, Ashutosh Bapat
 wrote:
> In get_useful_ecs_for_relation(), while checking whether to use left or
> right argument of a mergejoinable operator, the arguments to bms_is_subset()
> are passed in reverse order. bms_is_subset() checks whether the first
> argument in subset of the second, but in this function the subset to be
> checked is passed as the second argument. Because of this following query
> when run in contrib_regression database after "make installcheck" in
> contrib/postgres_fdw trips assertion Assert(bms_is_subset(relids,
> restrictinfo->left_ec->ec_relids));
>
> EXPLAIN (COSTS false, VERBOSE)
> SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on
> (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10;
>
> PFA patch to fix it.

The test case failed for me, possibly because of Tom's upper planner
pathification, but the substantive part of the fix looks right to me,
so committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-22 Thread Ashutosh Bapat
Thanks.

On Wed, Dec 23, 2015 at 12:24 AM, Robert Haas  wrote:

> On Mon, Dec 21, 2015 at 6:34 AM, Ashutosh Bapat
>  wrote:
> >> I went over this patch in some detail today and did a lot of cosmetic
> >> cleanup.  The results are attached.  I'm fairly happy with this
> >> version, but let me know what you think.  Of course, feedback from
> >> others is more than welcome also.
> >>
> >
> > Attached patch with some cosmetic changes (listed here for your quick
> > reference)
> > 1. , was replaced with ; in comment "inner join, expressions in the " at
> one
> > place, which is correct, but missed other place.
> > 2. The comment "First, consider whether any each active EC is
> potentially"
> > should use either "any" or "each". I have reworded it as "First, consider
> > whether any of the active ECs is potentially ...". Or we can use "First,
> > find all of the active ECs which are potentially ".
> > 3. "having the remote side due the sort generally won't be any worse
> ..." -
> > instead of "due" we should use "do"?
> > 4. Added static prototype of function get_useful_ecs_for_relation().
> > 5. The comment "Extract unique EC for query, if any, so we don't
> consider it
> > again." is too crisp. Phrase "Unique EC for query" is confusing; EC can
> not
> > be associated with a query per say and EC's are always unique because of
> > canonicalisation. May be we should reword it as "Extract single EC for
> > ordering of query, if any, so we don't consider it again." Is that
> cryptic
> > as well?
>
> Thanks.  I committed this version with one small tweak.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-22 Thread Robert Haas
On Mon, Dec 21, 2015 at 6:34 AM, Ashutosh Bapat
 wrote:
>> I went over this patch in some detail today and did a lot of cosmetic
>> cleanup.  The results are attached.  I'm fairly happy with this
>> version, but let me know what you think.  Of course, feedback from
>> others is more than welcome also.
>>
>
> Attached patch with some cosmetic changes (listed here for your quick
> reference)
> 1. , was replaced with ; in comment "inner join, expressions in the " at one
> place, which is correct, but missed other place.
> 2. The comment "First, consider whether any each active EC is potentially"
> should use either "any" or "each". I have reworded it as "First, consider
> whether any of the active ECs is potentially ...". Or we can use "First,
> find all of the active ECs which are potentially ".
> 3. "having the remote side due the sort generally won't be any worse ..." -
> instead of "due" we should use "do"?
> 4. Added static prototype of function get_useful_ecs_for_relation().
> 5. The comment "Extract unique EC for query, if any, so we don't consider it
> again." is too crisp. Phrase "Unique EC for query" is confusing; EC can not
> be associated with a query per say and EC's are always unique because of
> canonicalisation. May be we should reword it as "Extract single EC for
> ordering of query, if any, so we don't consider it again." Is that cryptic
> as well?

Thanks.  I committed this version with one small tweak.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-21 Thread Ashutosh Bapat
>
> I went over this patch in some detail today and did a lot of cosmetic
> cleanup.  The results are attached.  I'm fairly happy with this
> version, but let me know what you think.  Of course, feedback from
> others is more than welcome also.
>
>
Attached patch with some cosmetic changes (listed here for your quick
reference)
1. , was replaced with ; in comment "inner join, expressions in the " at
one place, which is correct, but missed other place.
2. The comment "First, consider whether any each active EC is potentially"
should use either "any" or "each". I have reworded it as "First, consider
whether any of the active ECs is potentially ...". Or we can use "First,
find all of the active ECs which are potentially ".
3. "having the remote side due the sort generally won't be any worse ..." -
instead of "due" we should use "do"?
4. Added static prototype of function get_useful_ecs_for_relation().
5. The comment "Extract unique EC for query, if any, so we don't consider
it again." is too crisp. Phrase "Unique EC for query" is confusing; EC can
not be associated with a query per say and EC's are always unique because
of canonicalisation. May be we should reword it as "Extract single EC for
ordering of query, if any, so we don't consider it again." Is that cryptic
as well?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


order-for-merge-pushdown-rmh_v2.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-19 Thread Rushabh Lathia
On Sat, Dec 19, 2015 at 2:17 AM, Robert Haas  wrote:

> On Thu, Dec 17, 2015 at 3:32 AM, Ashutosh Bapat
>  wrote:
> > On Wed, Dec 9, 2015 at 12:14 AM, Robert Haas 
> wrote:
> >> On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia <
> rushabh.lat...@gmail.com>
> >> wrote:
> >> > Thanks Ashutosh.
> >> >
> >> > Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
> >> > looks good to me.
> >>
> >> This patch needs a rebase.
> >
> > Done.
>
> Thanks.
>
> >> It's not going to work to say this is a patch proposed for commit when
> >> it's still got a TODO comment in it that obviously needs to be
> >> changed.   And the formatting of that long comment is pretty weird,
> >> too, and not consistent with other functions in that same file (e.g.
> >> get_remote_estimate, ec_member_matches_foreign, create_cursor).
> >>
> >
> > The TODO was present in v4 but not in v5 and is not present in v6
> attached
> > here.. Formatted comment according estimate_path_cost_size(),
> > convert_prep_stmt_params().
>
> Hrm, I must have been looking at the wrong version somehow.  Sorry about
> that.
>
> >> Aside from that, I think before we commit this, somebody should do
> >> some testing that demonstrates that this is actually a good idea.  Not
> >> as part of the test case set for this patch, but just in general.
> >> Merge joins are typically going to be relevant for large tables, but
> >> the examples in the regression tests are necessarily tiny.  I'd like
> >> to see some sample data and some sample queries that get appreciably
> >> faster with this code.  If we can't find any, we don't need the code.
> >>
> >
> > I tested the patch on my laptop with two types of queries, a join between
> > two foreign tables on different foreign servers (pointing to the same
> self
> > server) and a join between one foreign and one local table. The foreign
> > tables and servers are created using sort_pd_setup.sql attached. Foreign
> > tables pointed to table with index useful for join clause. Both the
> joining
> > tables had 10M rows. The execution time of query was measured for 100
> runs
> > and average and standard deviation were calculated (using function
> > query_execution_stats() in script sort_pd.sql) and are presented below.
>
> OK, cool.
>
> I went over this patch in some detail today and did a lot of cosmetic
> cleanup.  The results are attached.  I'm fairly happy with this
> version, but let me know what you think.  Of course, feedback from
> others is more than welcome also.
>
>
Had a quick look at the patch changes and also performed basic
sanity check. Patch looks good to me.

Thanks.

--
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Rushabh Lathia
www.EnterpriseDB.com


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-18 Thread Robert Haas
On Thu, Dec 17, 2015 at 3:32 AM, Ashutosh Bapat
 wrote:
> On Wed, Dec 9, 2015 at 12:14 AM, Robert Haas  wrote:
>> On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia 
>> wrote:
>> > Thanks Ashutosh.
>> >
>> > Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
>> > looks good to me.
>>
>> This patch needs a rebase.
>
> Done.

Thanks.

>> It's not going to work to say this is a patch proposed for commit when
>> it's still got a TODO comment in it that obviously needs to be
>> changed.   And the formatting of that long comment is pretty weird,
>> too, and not consistent with other functions in that same file (e.g.
>> get_remote_estimate, ec_member_matches_foreign, create_cursor).
>>
>
> The TODO was present in v4 but not in v5 and is not present in v6 attached
> here.. Formatted comment according estimate_path_cost_size(),
> convert_prep_stmt_params().

Hrm, I must have been looking at the wrong version somehow.  Sorry about that.

>> Aside from that, I think before we commit this, somebody should do
>> some testing that demonstrates that this is actually a good idea.  Not
>> as part of the test case set for this patch, but just in general.
>> Merge joins are typically going to be relevant for large tables, but
>> the examples in the regression tests are necessarily tiny.  I'd like
>> to see some sample data and some sample queries that get appreciably
>> faster with this code.  If we can't find any, we don't need the code.
>>
>
> I tested the patch on my laptop with two types of queries, a join between
> two foreign tables on different foreign servers (pointing to the same self
> server) and a join between one foreign and one local table. The foreign
> tables and servers are created using sort_pd_setup.sql attached. Foreign
> tables pointed to table with index useful for join clause. Both the joining
> tables had 10M rows. The execution time of query was measured for 100 runs
> and average and standard deviation were calculated (using function
> query_execution_stats() in script sort_pd.sql) and are presented below.

OK, cool.

I went over this patch in some detail today and did a lot of cosmetic
cleanup.  The results are attached.  I'm fairly happy with this
version, but let me know what you think.  Of course, feedback from
others is more than welcome also.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 866a09b..f10752d 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -343,6 +343,76 @@ SELECT 'fixed', NULL FROM ft1 t1 WHERE c1 = 1;
  fixed| 
 (1 row)
 
+-- Test forcing the remote server to produce sorted data for a merge join.
+SET enable_hashjoin TO false;
+SET enable_nestloop TO false;
+-- inner join; expressions in the clauses appear in the equivalence class list
+EXPLAIN (VERBOSE, COSTS false)
+	SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+ QUERY PLAN 
+
+ Limit
+   Output: t1.c1, t2."C 1"
+   ->  Merge Join
+ Output: t1.c1, t2."C 1"
+ Merge Cond: (t1.c1 = t2."C 1")
+ ->  Foreign Scan on public.ft2 t1
+   Output: t1.c1
+   Remote SQL: SELECT "C 1" FROM "S 1"."T 1" ORDER BY "C 1" ASC
+ ->  Index Only Scan using t1_pkey on "S 1"."T 1" t2
+   Output: t2."C 1"
+(10 rows)
+
+SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+ c1  | C 1 
+-+-
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 106
+ 107 | 107
+ 108 | 108
+ 109 | 109
+ 110 | 110
+(10 rows)
+
+-- outer join; expressions in the clauses do not appear in equivalence class
+-- list but no output change as compared to the previous query
+EXPLAIN (VERBOSE, COSTS false)
+	SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+ QUERY PLAN 
+
+ Limit
+   Output: t1.c1, t2."C 1"
+   ->  Merge Left Join
+ Output: t1.c1, t2."C 1"
+ Merge Cond: (t1.c1 = t2."C 1")
+ ->  Foreign Scan on public.ft2 t1
+   Output: t1.c1
+   Remote SQL: SELECT "C 1" FROM "S 1"."T 1" ORDER BY "C 1" ASC
+ ->  Index Only Scan using t1_pkey on "S 1"."T 1" t2
+   Output: t2."C 1"
+(10 rows)
+
+SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1") OFFSET 100 LIMIT 10;
+ c1  | C 1 
+-+-
+ 101 | 101
+ 102 | 102
+ 103 | 103
+ 104 | 104
+ 105 | 105
+ 106 | 

Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-17 Thread Ashutosh Bapat
On Wed, Dec 9, 2015 at 12:14 AM, Robert Haas  wrote:

> On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia 
> wrote:
> > Thanks Ashutosh.
> >
> > Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
> > looks good to me.
>
> This patch needs a rebase.
>

Done.


>
> It's not going to work to say this is a patch proposed for commit when
> it's still got a TODO comment in it that obviously needs to be
> changed.   And the formatting of that long comment is pretty weird,
> too, and not consistent with other functions in that same file (e.g.
> get_remote_estimate, ec_member_matches_foreign, create_cursor).
>
>
The TODO was present in v4 but not in v5 and is not present in v6 attached
here.. Formatted comment according estimate_path_cost_size(),
convert_prep_stmt_params().


> Aside from that, I think before we commit this, somebody should do
> some testing that demonstrates that this is actually a good idea.  Not
> as part of the test case set for this patch, but just in general.
> Merge joins are typically going to be relevant for large tables, but
> the examples in the regression tests are necessarily tiny.  I'd like
> to see some sample data and some sample queries that get appreciably
> faster with this code.  If we can't find any, we don't need the code.
>
>
I tested the patch on my laptop with two types of queries, a join between
two foreign tables on different foreign servers (pointing to the same self
server) and a join between one foreign and one local table. The foreign
tables and servers are created using sort_pd_setup.sql attached. Foreign
tables pointed to table with index useful for join clause. Both the joining
tables had 10M rows. The execution time of query was measured for 100 runs
and average and standard deviation were calculated (using function
query_execution_stats() in script sort_pd.sql) and are presented below.

1. Query between foreign tables
SELECT ft1.val, ft2.val FROM ft1 join ft2 on (ft1.val = ft2.val)

Plan and timings without patch
EXPLAIN (VERBOSE, ANALYSE) :query ;
 QUERY
PLAN
-
 Hash Join  (cost=508510.02..1129945.94 rows=95 width=8) (actual
time=33803.826..82416.342 rows=1000 loops=1)
   Output: ft1.val, ft2.val
   Hash Cond: (ft1.val = ft2.val)
   ->  Foreign Scan on public.ft1  (cost=100.00..344347.31 rows=977
width=4) (actual time=0.624..28531.803 rows=1000 loops=1)
 Output: ft1.val
 Remote SQL: SELECT val FROM public.lt
   ->  Hash  (cost=344347.31..344347.31 rows=977 width=4) (actual
time=33258.025..33258.025 rows=1000 loops=1)
 Output: ft2.val
 Buckets: 131072  Batches: 256  Memory Usage: 2400kB
 ->  Foreign Scan on public.ft2  (cost=100.00..344347.31
rows=977 width=4) (actual time=22.171..28134.970 rows=1000 loops=1)
   Output: ft2.val
   Remote SQL: SELECT val FROM public.lt
 Planning time: 33.155 ms
 Execution time: 82914.607 ms
(14 rows)

 avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--+--+--+--
  78750.95487 | 2911.51825687913 |74314.886 |89358.464

Plan and timing with patch
EXPLAIN (VERBOSE, ANALYSE) :query ;
 QUERY
PLAN
-
 Merge Join  (cost=200.86..1183070.86 rows=1000 width=8) (actual
time=1.776..73140.219 rows=1000 loops=1)
   Output: ft1.val, ft2.val
   Merge Cond: (ft1.val = ft2.val)
   ->  Foreign Scan on public.ft1  (cost=100.43..504035.43 rows=1000
width=4) (actual time=0.937..30422.457 rows=1000 loops=1)
 Output: ft1.val, ft1.val2
 Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
   ->  Materialize  (cost=100.43..529035.43 rows=1000 width=4) (actual
time=0.826..33448.822 rows=1000 loops=1)
 Output: ft2.val, ft2.val2
 ->  Foreign Scan on public.ft2  (cost=100.43..504035.43
rows=1000 width=4) (actual time=0.818..31035.362 rows=1000 loops=1)
   Output: ft2.val, ft2.val2
   Remote SQL: SELECT val FROM public.lt ORDER BY val ASC
 Planning time: 163.161 ms
 Execution time: 73654.106 ms
(13 rows)

 avg_exe_time | std_dev_exe_time | min_exe_time | max_exe_time
--+--+--+--
  71881.15916 | 819.091605498189 |70197.312 |74653.314

It can be observed that the with the patch, merge join strategy is used
instead of hash join and the execution time reduces by approx 9%. A desired
effect is that the deviation in the execution time has reduced heavily
(almost by 75%).

2. Join between local and 

Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-08 Thread Robert Haas
On Wed, Dec 2, 2015 at 6:45 AM, Rushabh Lathia  wrote:
> Thanks Ashutosh.
>
> Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
> looks good to me.

This patch needs a rebase.

It's not going to work to say this is a patch proposed for commit when
it's still got a TODO comment in it that obviously needs to be
changed.   And the formatting of that long comment is pretty weird,
too, and not consistent with other functions in that same file (e.g.
get_remote_estimate, ec_member_matches_foreign, create_cursor).

Aside from that, I think before we commit this, somebody should do
some testing that demonstrates that this is actually a good idea.  Not
as part of the test case set for this patch, but just in general.
Merge joins are typically going to be relevant for large tables, but
the examples in the regression tests are necessarily tiny.  I'd like
to see some sample data and some sample queries that get appreciably
faster with this code.  If we can't find any, we don't need the code.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-12-02 Thread Rushabh Lathia
Thanks Ashutosh.

Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
looks good to me.


On Fri, Nov 27, 2015 at 3:02 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> Thanks Rushabh for your review and comments.
>
> On Thu, Nov 26, 2015 at 5:39 PM, Rushabh Lathia 
> wrote:
>
>> Hi Ashutosh,
>>
>> I reviewed your latest version of patch and over all the implementation
>> and other details look good to me.
>>
>> Here are few cosmetic issues which I found:
>>
>> 1) Patch not getting applied cleanly - white space warning
>>
>>
> Done.
>
>
>> 2)
>>
>> -List   *usable_pathkeys = NIL;
>> +List*useful_pathkeys_list = NIL;/* List of all pathkeys
>> */
>>
>> Code alignment is not correct with other declared variables.
>>
>>
> Incorporated the change in the patch.
>
> 3)
>>
>> +{
>> +PathKey*pathkey;
>> +List*pathkeys;
>> +
>> +pathkey = make_canonical_pathkey(root, cur_ec,
>> +
>> linitial_oid(cur_ec->ec_opfamilies),
>> +BTLessStrategyNumber,
>> +false);
>> +pathkeys = list_make1(pathkey);
>> +useful_pathkeys_list = lappend(useful_pathkeys_list,
>> pathkeys);
>> +}
>>
>> Code alignment need to fix at make_canonical_pathkey().
>>
>
> Incorporated the change in the patch.
>
> I have also removed the TODO item in the prologue of this function, since
> none has objected to externalization of make_canonical_pathkeys till now
> and it's not expected to be part of the final commit.
>
>
>>
>> 4)
>>
>> I don't understand the meaning of following added testcase into
>> postgres_fdw.
>>
>> +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
>> @@ -171,20 +171,35 @@ SELECT COUNT(*) FROM ft1 t1;
>>  -- join two tables
>>  SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3,
>> t1.c1 OFFSET 100 LIMIT 10;
>>  -- subquery
>>  SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <=
>> 10) ORDER BY c1;
>>  -- subquery+MAX
>>  SELECT * FROM ft1 t1 WHERE t1.c3 = (SELECT MAX(c3) FROM ft2 t2) ORDER BY
>> c1;
>>  -- used in CTE
>>  WITH t1 AS (SELECT * FROM ft1 WHERE c1 <= 10) SELECT t2.c1, t2.c2,
>> t2.c3, t2.c4 FROM t1, ft2 t2 WHERE t1.c1 = t2.c1 ORDER BY t1.c1;
>>  -- fixed values
>>  SELECT 'fixed', NULL FROM ft1 t1 WHERE c1 = 1;
>> +-- getting data sorted from the foreign table for merge join
>> +-- Since we are interested in merge join, disable other joins
>> +SET enable_hashjoin TO false;
>> +SET enable_nestloop TO false;
>> +-- inner join, expressions in the clauses appear in the equivalence
>> class list
>> +EXPLAIN (VERBOSE, COSTS false)
>> +SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 =
>> t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
>> +SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C
>> 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
>> +-- outer join, expression in the clauses do not appear in equivalence
>> class list
>> +-- but no output change as compared to the previous query
>> +EXPLAIN (VERBOSE, COSTS false)
>> +SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON
>> (t1.c1 = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
>> +SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 =
>> t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
>> +SET enable_hashjoin TO true;
>> +SET enable_nestloop TO true;
>>
>> Because, I removed the code changes of the patch and then I run the test
>> seem like it has nothing to do with the code changes. Above set of test
>> giving
>> same result with/without patch.
>>
>> Am I missing something ?
>>
>
> Actually, the output of merge join is always ordered by the pathkeys used
> for merge join. That routed through LIMIT node remains ordered. So, we
> actually do not need ORDER BY t1.c1 clause in the above queries. Without
> that clause, the tests will show difference output with and without patch.
> I have changed the attached patch accordingly.
>
>
>>
>> Apart from this I debugged the patch for each scenario (query pathkeys and
>> pathkeys arising out of the equivalence classes) and so far patch looks
>> good
>> to me.
>>
>>
> Thanks.
>
>
>> Attaching update version of patch by fixing the cosmetic changes.
>>
>>
> Attached version of patch contains your changes.
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>
>
>
>


-- 
Rushabh Lathia


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-27 Thread Ashutosh Bapat
Thanks Rushabh for your review and comments.

On Thu, Nov 26, 2015 at 5:39 PM, Rushabh Lathia 
wrote:

> Hi Ashutosh,
>
> I reviewed your latest version of patch and over all the implementation
> and other details look good to me.
>
> Here are few cosmetic issues which I found:
>
> 1) Patch not getting applied cleanly - white space warning
>
>
Done.


> 2)
>
> -List   *usable_pathkeys = NIL;
> +List*useful_pathkeys_list = NIL;/* List of all pathkeys */
>
> Code alignment is not correct with other declared variables.
>
>
Incorporated the change in the patch.

3)
>
> +{
> +PathKey*pathkey;
> +List*pathkeys;
> +
> +pathkey = make_canonical_pathkey(root, cur_ec,
> +
> linitial_oid(cur_ec->ec_opfamilies),
> +BTLessStrategyNumber,
> +false);
> +pathkeys = list_make1(pathkey);
> +useful_pathkeys_list = lappend(useful_pathkeys_list,
> pathkeys);
> +}
>
> Code alignment need to fix at make_canonical_pathkey().
>

Incorporated the change in the patch.

I have also removed the TODO item in the prologue of this function, since
none has objected to externalization of make_canonical_pathkeys till now
and it's not expected to be part of the final commit.


>
> 4)
>
> I don't understand the meaning of following added testcase into
> postgres_fdw.
>
> +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
> @@ -171,20 +171,35 @@ SELECT COUNT(*) FROM ft1 t1;
>  -- join two tables
>  SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3,
> t1.c1 OFFSET 100 LIMIT 10;
>  -- subquery
>  SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <=
> 10) ORDER BY c1;
>  -- subquery+MAX
>  SELECT * FROM ft1 t1 WHERE t1.c3 = (SELECT MAX(c3) FROM ft2 t2) ORDER BY
> c1;
>  -- used in CTE
>  WITH t1 AS (SELECT * FROM ft1 WHERE c1 <= 10) SELECT t2.c1, t2.c2, t2.c3,
> t2.c4 FROM t1, ft2 t2 WHERE t1.c1 = t2.c1 ORDER BY t1.c1;
>  -- fixed values
>  SELECT 'fixed', NULL FROM ft1 t1 WHERE c1 = 1;
> +-- getting data sorted from the foreign table for merge join
> +-- Since we are interested in merge join, disable other joins
> +SET enable_hashjoin TO false;
> +SET enable_nestloop TO false;
> +-- inner join, expressions in the clauses appear in the equivalence class
> list
> +EXPLAIN (VERBOSE, COSTS false)
> +SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 =
> t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
> +SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C
> 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
> +-- outer join, expression in the clauses do not appear in equivalence
> class list
> +-- but no output change as compared to the previous query
> +EXPLAIN (VERBOSE, COSTS false)
> +SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1
> = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
> +SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 =
> t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
> +SET enable_hashjoin TO true;
> +SET enable_nestloop TO true;
>
> Because, I removed the code changes of the patch and then I run the test
> seem like it has nothing to do with the code changes. Above set of test
> giving
> same result with/without patch.
>
> Am I missing something ?
>

Actually, the output of merge join is always ordered by the pathkeys used
for merge join. That routed through LIMIT node remains ordered. So, we
actually do not need ORDER BY t1.c1 clause in the above queries. Without
that clause, the tests will show difference output with and without patch.
I have changed the attached patch accordingly.


>
> Apart from this I debugged the patch for each scenario (query pathkeys and
> pathkeys arising out of the equivalence classes) and so far patch looks
> good
> to me.
>
>
Thanks.


> Attaching update version of patch by fixing the cosmetic changes.
>
>
Attached version of patch contains your changes.
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 866a09b..a9b0bd8 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -336,20 +336,93 @@ WITH t1 AS (SELECT * FROM ft1 WHERE c1 <= 10) SELECT t2.c1, t2.c2, t2.c3, t2.c4
  10 |  0 | 00010 | Sun Jan 11 00:00:00 1970 PST
 (10 rows)
 
 -- fixed values
 SELECT 'fixed', NULL FROM ft1 t1 WHERE c1 = 1;
  ?column? | ?column? 
 --+--
  fixed| 
 (1 row)
 
+-- getting data sorted from the foreign table for merge join
+-- Since we are interested in merge join, disable other joins
+-- Merge join always produces the result in the sorted order, so no need for
+-- separate ORDER BY clause to get ordered results.
+SET enable_hashjoin TO false;
+SET enable_nestloop 

Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-26 Thread Rushabh Lathia
Hi Ashutosh,

I reviewed your latest version of patch and over all the implementation
and other details look good to me.

Here are few cosmetic issues which I found:

1) Patch not getting applied cleanly - white space warning

2)

-List   *usable_pathkeys = NIL;
+List*useful_pathkeys_list = NIL;/* List of all pathkeys */

Code alignment is not correct with other declared variables.

3)

+{
+PathKey*pathkey;
+List*pathkeys;
+
+pathkey = make_canonical_pathkey(root, cur_ec,
+
linitial_oid(cur_ec->ec_opfamilies),
+BTLessStrategyNumber,
+false);
+pathkeys = list_make1(pathkey);
+useful_pathkeys_list = lappend(useful_pathkeys_list, pathkeys);
+}

Code alignment need to fix at make_canonical_pathkey().

4)

I don't understand the meaning of following added testcase into
postgres_fdw.

+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -171,20 +171,35 @@ SELECT COUNT(*) FROM ft1 t1;
 -- join two tables
 SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3,
t1.c1 OFFSET 100 LIMIT 10;
 -- subquery
 SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <= 10)
ORDER BY c1;
 -- subquery+MAX
 SELECT * FROM ft1 t1 WHERE t1.c3 = (SELECT MAX(c3) FROM ft2 t2) ORDER BY
c1;
 -- used in CTE
 WITH t1 AS (SELECT * FROM ft1 WHERE c1 <= 10) SELECT t2.c1, t2.c2, t2.c3,
t2.c4 FROM t1, ft2 t2 WHERE t1.c1 = t2.c1 ORDER BY t1.c1;
 -- fixed values
 SELECT 'fixed', NULL FROM ft1 t1 WHERE c1 = 1;
+-- getting data sorted from the foreign table for merge join
+-- Since we are interested in merge join, disable other joins
+SET enable_hashjoin TO false;
+SET enable_nestloop TO false;
+-- inner join, expressions in the clauses appear in the equivalence class
list
+EXPLAIN (VERBOSE, COSTS false)
+SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 =
t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C
1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+-- outer join, expression in the clauses do not appear in equivalence
class list
+-- but no output change as compared to the previous query
+EXPLAIN (VERBOSE, COSTS false)
+SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1
= t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 =
t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
+SET enable_hashjoin TO true;
+SET enable_nestloop TO true;

Because, I removed the code changes of the patch and then I run the test
seem like it has nothing to do with the code changes. Above set of test
giving
same result with/without patch.

Am I missing something ?

Apart from this I debugged the patch for each scenario (query pathkeys and
pathkeys arising out of the equivalence classes) and so far patch looks good
to me.

Attaching update version of patch by fixing the cosmetic changes.


On Fri, Nov 20, 2015 at 9:14 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
>
> On Thu, Nov 19, 2015 at 7:30 AM, Robert Haas 
> wrote:
>
>> On Tue, Nov 17, 2015 at 8:33 AM, Ashutosh Bapat
>>  wrote:
>> >> Although I'm usually on the side of marking things as extern whenever
>> >> we find it convenient, I'm nervous about doing that to
>> >> make_canonical_pathkey(), because it has side effects.  Searching the
>> >> list of canonical pathkeys for the one we need is reasonable, but is
>> >> it really right to ever think that we might create a new one at this
>> >> stage?  Maybe it is, but if so I'd like to hear a good explanation as
>> >> to why.
>> >
>> > For a foreign table no pathkeys are created before creating paths for
>> > individual foreign table since there are no indexes. For a regular
>> table,
>> > the pathkeys get created for useful indexes.
>>
>> OK.
>>
>> Could some portion of the second foreach loop inside
>> get_useful_pathkeys_for_relation be replaced by a call to
>> eclass_useful_for_merging?  The logic looks very similar.
>>
>
> Yes. I have incorporated this change in the patch attached.
>
>
>>
>> More broadly, I'm wondering about the asymmetries between this code
>> and pathkeys_useful_for_merging().  The latter has a
>> right_merge_direction() test and a case for non-EC-derivable join
>> clauses that aren't present in your code.
>
> I wonder if we could just
>> generate pathkeys speculatively and then test which ones survive
>> truncate_useless_pathkeys(), or something like that.  This isn't an
>> enormously complicated patch, but it duplicates more of the logic in
>> pathkeys.c than I'd like.
>>
>
>
> pathkeys_useful_for_merging(), truncate_useless_pathkeys() and host of
> functions in that area are crafted to assess the usefulness of given
> pathkeys which for regular tables are "speculated" from indexes on 

Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-20 Thread Ashutosh Bapat
On Thu, Nov 19, 2015 at 7:30 AM, Robert Haas  wrote:

> On Tue, Nov 17, 2015 at 8:33 AM, Ashutosh Bapat
>  wrote:
> >> Although I'm usually on the side of marking things as extern whenever
> >> we find it convenient, I'm nervous about doing that to
> >> make_canonical_pathkey(), because it has side effects.  Searching the
> >> list of canonical pathkeys for the one we need is reasonable, but is
> >> it really right to ever think that we might create a new one at this
> >> stage?  Maybe it is, but if so I'd like to hear a good explanation as
> >> to why.
> >
> > For a foreign table no pathkeys are created before creating paths for
> > individual foreign table since there are no indexes. For a regular table,
> > the pathkeys get created for useful indexes.
>
> OK.
>
> Could some portion of the second foreach loop inside
> get_useful_pathkeys_for_relation be replaced by a call to
> eclass_useful_for_merging?  The logic looks very similar.
>

Yes. I have incorporated this change in the patch attached.


>
> More broadly, I'm wondering about the asymmetries between this code
> and pathkeys_useful_for_merging().  The latter has a
> right_merge_direction() test and a case for non-EC-derivable join
> clauses that aren't present in your code.

I wonder if we could just
> generate pathkeys speculatively and then test which ones survive
> truncate_useless_pathkeys(), or something like that.  This isn't an
> enormously complicated patch, but it duplicates more of the logic in
> pathkeys.c than I'd like.
>


pathkeys_useful_for_merging(), truncate_useless_pathkeys() and host of
functions in that area are crafted to assess the usefulness of given
pathkeys which for regular tables are "speculated" from indexes on that
table. Foreign tables do not have indexes and neither we have information
about the indexes (if any) on foreign server, thus pathkeys to be assessed
are not readily available. Hence we need some other way of "speculating"
pathkeys for foreign tables. We can not just generate pathkeys because
there are infinite possible expressions on which pathkeys can be built. So,
we hunt for the expressions at various places like root_pathkeys, merge
joinable clauses etc. The seeming duplication of code is because the places
where we are hunting for useful pathkeys in case of foreign table are same
as the places used for assessing usefulness for pathkeys in case of regular
table. Thus in case of foreign tables, we always generate useful pathkeys,
which do not need any further assessment. For regular tables we have set of
pathkeys which need to be assessed. Because of this difference the code
differs in details.

right_merge_direction() compares whether a given pathkey (readily available
or derived from an index) has the same direction as required by
root->query_pathkeys or ascending direction. For foreign tables, the
pathkeys are themselves created from root->query_pathkeys, thus doesn't
require this assessment. The patch creates pathkeys other than those from
root->query_pathkeys in the ascending order (like other places in the code
eg. select_outer_pathkeys_for_merge()), which is what
right_merge_direction() assesses. So again we don't need to call
right_merge_direction().

non-EC-derivable join is interesting. Thanks for bringing it out. These are
mergejoinable clauses in an OUTER join, where columns from inner side can
be NULL, and thus not exactly equal. I will incorporate that change in the
patch. Again while doing so, we have to just pick up the right or left
equivalence class from a given RestrictInfo and don't need to assess it
further like pathkeys_useful_for_merging().

Added this change in the attached patch.

>
> I'm inclined to think that we shouldn't consider pathkeys that might
> be useful for merge-joining unless we're using remote estimates.  It
> seems more speculative than pushing down a toplevel sort.
>

I agree with you but let me elaborate why I agree. The pathkeys we are
generating are heauristic in nature and thus may not be useful in the same
sense as pathkeys for regular tables. If use_remote_estimate is ON, the
planner will spend time in explaining multiple queries, but it will be in
better position to cost the usage. If use_remote_estimate is OFF, we will
altogether loose chances of merge join, which doesn't look good to me. But
a speculative costing done in this case can result in choosing wrong plan
similar to the same case as toplevel sort. But then choosing a wrong join
has more serious implications than estimating wrong cost for top level join.


>
> This patch seems rather light on test cases.
>
>
Added tests. Let me know if you have any specific scenario in mind.
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 866a09b..8b05fcd 100644
--- 

Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-18 Thread Robert Haas
On Tue, Nov 17, 2015 at 8:33 AM, Ashutosh Bapat
 wrote:
>> Although I'm usually on the side of marking things as extern whenever
>> we find it convenient, I'm nervous about doing that to
>> make_canonical_pathkey(), because it has side effects.  Searching the
>> list of canonical pathkeys for the one we need is reasonable, but is
>> it really right to ever think that we might create a new one at this
>> stage?  Maybe it is, but if so I'd like to hear a good explanation as
>> to why.
>
> For a foreign table no pathkeys are created before creating paths for
> individual foreign table since there are no indexes. For a regular table,
> the pathkeys get created for useful indexes.

OK.

Could some portion of the second foreach loop inside
get_useful_pathkeys_for_relation be replaced by a call to
eclass_useful_for_merging?  The logic looks very similar.

More broadly, I'm wondering about the asymmetries between this code
and pathkeys_useful_for_merging().  The latter has a
right_merge_direction() test and a case for non-EC-derivable join
clauses that aren't present in your code.  I wonder if we could just
generate pathkeys speculatively and then test which ones survive
truncate_useless_pathkeys(), or something like that.  This isn't an
enormously complicated patch, but it duplicates more of the logic in
pathkeys.c than I'd like.

I'm inclined to think that we shouldn't consider pathkeys that might
be useful for merge-joining unless we're using remote estimates.  It
seems more speculative than pushing down a toplevel sort.

This patch seems rather light on test cases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-17 Thread Ashutosh Bapat
Thanks Robert for your comments. Please see my replies inlined. Updated
patch is attached.

On Fri, Nov 6, 2015 at 10:02 PM, Robert Haas  wrote:

>
> I think this approach is generally reasonable, but I suggested parts
> of it, so may be biased.  I would be interested in hearing the
> opinions of others.
>
> Random notes:
>
> "possibily" is a typo.
>

Fixed.


>
> usable_pklist is confusing because it seems like it might be talking
> about primary keys rather than pathkeys.  Also, I realize now, looking
> at this again, that you're saying "usable" when what I really think
> you mean is "useful". Lots of pathkeys are usable, but only a few of
> those are useful.  I suggest renaming usable_pathkeys to
> query_pathkeys


Done.


> and usable_pklist to useful_pathkeys.


pathkeys is being used to mean a list of pathkey nodes. What we want here
is list of pathkeys so I have renamed it as useful_pathkeys_list, somewhat
long but clear.


> Similarly, let's
> rename generate_pathkeys_for_relation() to
> get_useful_pathkeys_for_relation().
>

Done.


>
> Although I'm usually on the side of marking things as extern whenever
> we find it convenient, I'm nervous about doing that to
> make_canonical_pathkey(), because it has side effects.  Searching the
> list of canonical pathkeys for the one we need is reasonable, but is
> it really right to ever think that we might create a new one at this
> stage?  Maybe it is, but if so I'd like to hear a good explanation as
> to why.
>

For a foreign table no pathkeys are created before creating paths for
individual foreign table since there are no indexes. For a regular table,
the pathkeys get created for useful indexes. Exception to this is if the
expression appears in the ORDER BY clause, the pathkey for the same is
created while handling ORDER BY clause. So, we will always need to create
pathkey for a foreign table unless the corresponding expression does not
appear in the ORDER BY clause. This can be verified by breaking in
make_canonical_pathkey() while executing "explain verbose select * from ft1
join lt using (val);" ft1(val int, val2 int) is foreign table and lt (val
int, val2 int) is local table. You will hit the first breakpoint with stack
trace
Breakpoint 1, make_canonical_pathkey (root=0x231d858, eclass=0x231f030,
opfamily=1976, strategy=1, nulls_first=0 '\000') at pathkeys.c:60
(gdb) where
#0  make_canonical_pathkey (root=0x231d858, eclass=0x231f030,
opfamily=1976, strategy=1, nulls_first=0 '\000') at pathkeys.c:60
#1  0x7f6740077e39 in get_useful_pathkeys_for_relation (root=0x231d858,
rel=0x231e390) at postgres_fdw.c:663
#2  0x7f6740077f34 in postgresGetForeignPaths (root=0x231d858,
baserel=0x231e390, foreigntableid=16393) at postgres_fdw.c:705
#3  0x007079cf in set_foreign_pathlist (root=0x231d858,
rel=0x231e390, rte=0x231c488) at allpaths.c:600
#4  0x00707653 in set_rel_pathlist (root=0x231d858, rel=0x231e390,
rti=1, rte=0x231c488) at allpaths.c:395
#5  0x0070735f in set_base_rel_pathlists (root=0x231d858) at
allpaths.c:277

at this point

(gdb) p root->canon_pathkeys
$1 = (List *) 0x0

so get_useful_pathkeys_for_relation->make_canonical_pathkey() will create
the first pathkey.

I have left the corresponding TODO untouched, if anybody else wants to
review that part of the code. I will remove it in the final version of the
patch.


>
>  Is the comment "Equivalence classes covering relations other than the
> current one are of interest here" missing a "not"?
>

The sentence is correct. We need equivalence class covering relations other
than the current one, because only such equivalence classes indicate joins
between two relations. If an equivalence class covers only a single baserel
(has only a single member in ec_relids), it indicates equivalence between
columns/expressions of the same table, which can not result in a join. I
have changed to comment to be more elaborate.


>
> I don't find this comment illuminating:
>
> + * In case of child relation, we need to check that the
> + * equivalence class indicates a join to a relation other than
> + * parents, other children and itself (something similar to
> above).
> + * Otherwise we will end up creating useless paths. The code
> below is
> + * similar to generate_implied_equalities_for_column(), which
> might
> + * give a hint.
>
> That basically just says that we have to do it this way because the
> other way would be wrong.  But it doesn't say WHY the other way would
> be wrong.


There's no "other way" which is wrong. What's the other way you are talking
about?

Anway, I have updated the comment to be more readable.


> Then a few lines later, you have another comment which says
> the same thing again:
>
> +/*
> + * Ignore equivalence members which correspond to children
> + * or same relation or to parent relations
> + */
>
>
Updated this too.


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-16 Thread Ashutosh Bapat
On Mon, Nov 9, 2015 at 9:39 PM, Robert Haas  wrote:

> On Fri, Nov 6, 2015 at 2:03 PM, Kevin Grittner  wrote:
> > Has anyone taken a close look at what happens if the two sides of
> > the merge join have different implementations of the same collation
> > name?  Is there anything we should do to defend against the
> > problem?
>
> The issue of FDWs vs. collations has been thought about to some degree
> (see FDW_COLLATE_NONE/SAFE/UNSAFE), but I don't quite understand that
> stuff in detail> .
> >
>

collations arising from a foreign table's var are considered to be safer
(FDW_COLLATE_SAFE) to push down to the foreign server , since they are
either local default collation or are assumed to be same on foreign and
local server as per user declaration. The onus of making that sure is on
the user who declares particular collation for a foreign table var. As said
upthread, different glibc implementations can cause collation ordering
mismatch, this patch will be susceptible to the same problem as
master/standby problem.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-16 Thread Craig Ringer
On 16 November 2015 at 17:47, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:
>
>
> collations arising from a foreign table's var are considered to be safer
> (FDW_COLLATE_SAFE) to push down to the foreign server , since they are
> either local default collation or are assumed to be same on foreign and
> local server as per user declaration. The onus of making that sure is on
> the user who declares particular collation for a foreign table var. As said
> upthread, different glibc implementations can cause collation ordering
> mismatch, this patch will be susceptible to the same problem as
> master/standby problem.
>
>
Yeah. It's less bad than the related problems we already have:

* Upgrading glibc can cause your indexes to no longer match your local
collations
* Different glibc versions on master and replica(s) can have the same effect

I don't see a problem with adding another way this same issue can be
expressed, since there's no sane way to fix it _properly_ without either
versioned collations in glibc or bringing collation onboard into Pg.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-09 Thread Robert Haas
On Sat, Nov 7, 2015 at 5:42 PM, Greg Stark  wrote:
> On Fri, Nov 6, 2015 at 4:54 AM, Ashutosh Bapat
>  wrote:
>> PFA patch to get data sorted from the foreign server (postgres_fdw)
>> according to the pathkeys useful for merge join.
>
> An idle thought. There are going to be a lot of cases where different
> software systems actually disagree about collation rules. I wonder if
> it would be valuable to have a node that just checks that each row is
> in fact greater than the previous row and throws an error if not. That
> can be optional or a parameter of the FDW but it's probably cheap
> enough to have enabled by default. It would save a lot of difficult to
> heartache since the behaviour if the inputs aren't correctly sorted
> will be strangely incorrect join results. Often the results may just
> be missing or duplicated rows and that can be easy to miss and lead to
> corrupted databases or security problems.

It's not a bad thought, but it could come up even locally - we've had
more than one situation where indexes have gotten corrupted by
updating glibc.  The new glibc doesn't agree with the old one on what
the collation ordering is, and so the indexes are wrong with respect
to the new glibc version.  If we were going to design something like
this, rather than making it a separate node, I'd be inclined to create
it as a C-callable function that could be invoked anywhere we want to
check that the ordering is valid.  I suspect you're wrong about the
cost, though: I bet it wouldn't be too hard to find cases where it
imposes a really noticeable penalty.

Also, to be clear, I don't think this patch needs to solve that problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-09 Thread Robert Haas
On Fri, Nov 6, 2015 at 2:03 PM, Kevin Grittner  wrote:
> Has anyone taken a close look at what happens if the two sides of
> the merge join have different implementations of the same collation
> name?  Is there anything we should do to defend against the
> problem?

The issue of FDWs vs. collations has been thought about to some degree
(see FDW_COLLATE_NONE/SAFE/UNSAFE), but I don't quite understand that
stuff in detail.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-09 Thread Robert Haas
On Sat, Nov 7, 2015 at 12:07 PM, Corey Huinker  wrote:
> Sorry to barge in late, but I was wondering if what we've learned with this
> patch can be applied to the case of asserting a sort order on a query
> returning from dblink().

Nope.

Sorry to the bearer of bad news, but that would be a different and
much harder development project.  Set-returning functions have no way
to indicate anything about the ordering of their return values at
present.  We could invent something, but the work involved wouldn't
have much to do with this patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-09 Thread Amit Langote
On 2015/11/10 0:56, Robert Haas wrote:
> On Sat, Nov 7, 2015 at 12:07 PM, Corey Huinker  
> wrote:
>> Sorry to barge in late, but I was wondering if what we've learned with this
>> patch can be applied to the case of asserting a sort order on a query
>> returning from dblink().
> 
> Nope.
> 
> Sorry to the bearer of bad news, but that would be a different and
> much harder development project.  Set-returning functions have no way
> to indicate anything about the ordering of their return values at
> present.  We could invent something, but the work involved wouldn't
> have much to do with this patch.

There was a patch (which, it seems, was rejected [1]) to do this sort of
thing.

Thanks,
Amit

[1] https://commitfest.postgresql.org/4/88/



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-07 Thread Corey Huinker
On Fri, Nov 6, 2015 at 11:32 AM, Robert Haas  wrote:

> On Thu, Nov 5, 2015 at 11:54 PM, Ashutosh Bapat
>  wrote:
> > Hi All,
> > PFA patch to get data sorted from the foreign server (postgres_fdw)
> > according to the pathkeys useful for merge join.
> >
> > For a given base relation (extendable to join when that becomes
> available in
> > postgres_fdw), the patch tries to find merge joinable clauses. It then
> adds
> > paths with pathkeys extracted out of these merge joinable clauses. The
> merge
> > joinable clauses form equivalence classes. The patch searches in
> > root->eq_classes for equivalence members belonging to the given relation.
> > For every such expression it creates a single member pathkey list and
> > corresponding path. The test postgres_fdw.sql has an existing join which
> > uses merge join. With this patch the data is sorted on the foreign server
> > than locally.
> >
> > While mergejoinable clauses can be obtained from rel->joininfo as well.
> But
> > rel->joininfo contains other clauses as well and we need extra efforts to
> > remove duplicates if the same expression appears in multiple merge
> joinable
> > clauses.
> >
> > Two joining relations can have multiple merge joinable clauses, requiring
> > multi-member pathkeys so that merge join is efficient to the maximum
> extent.
> > The order in which the expressions appears in pathkeys can change the
> costs
> > of sorting the data according to the pathkeys, depending upon the
> > expressions and the presence of indexes containing those expressions.
> Thus
> > ideally we would need to club all the expressions appearing in all the
> > clauses for given two relations and create paths with pathkeys for every
> > order of these expressions.That explodes the number of possible paths. We
> > may restrict the number of paths created by considering only certain
> orders
> > like sort_inner_and_outer(). In any case, costing such paths increases
> the
> > planning time which may not be worth it. So, this patch uses a heuristic
> > approach of creating single member pathkeys path for every merge joinable
> > expression.
> >
> > The pathkeys need to be canonicalised using make_canonical_pathkey(),
> which
> > is a static function. I have added a TODO and comments in the patch
> > explaining possible ways to avoid "extern"alization of this function.
> >
> > Comments/suggestions are welcome.
>
> I think this approach is generally reasonable, but I suggested parts
> of it, so may be biased.  I would be interested in hearing the
> opinions of others.
>
> Random notes:
>
> "possibily" is a typo.
>
> usable_pklist is confusing because it seems like it might be talking
> about primary keys rather than pathkeys.  Also, I realize now, looking
> at this again, that you're saying "usable" when what I really think
> you mean is "useful".  Lots of pathkeys are usable, but only a few of
> those are useful.  I suggest renaming usable_pathkeys to
> query_pathkeys and usable_pklist to useful_pathkeys.  Similarly, let's
> rename generate_pathkeys_for_relation() to
> get_useful_pathkeys_for_relation().
>
> Although I'm usually on the side of marking things as extern whenever
> we find it convenient, I'm nervous about doing that to
> make_canonical_pathkey(), because it has side effects.  Searching the
> list of canonical pathkeys for the one we need is reasonable, but is
> it really right to ever think that we might create a new one at this
> stage?  Maybe it is, but if so I'd like to hear a good explanation as
> to why.
>
>  Is the comment "Equivalence classes covering relations other than the
> current one are of interest here" missing a "not"?
>
> I don't find this comment illuminating:
>
> + * In case of child relation, we need to check that the
> + * equivalence class indicates a join to a relation other than
> + * parents, other children and itself (something similar to
> above).
> + * Otherwise we will end up creating useless paths. The code
> below is
> + * similar to generate_implied_equalities_for_column(), which
> might
> + * give a hint.
>
> That basically just says that we have to do it this way because the
> other way would be wrong.  But it doesn't say WHY the other way would
> be wrong. Then a few lines later, you have another comment which says
> the same thing again:
>
> +/*
> + * Ignore equivalence members which correspond to children
> + * or same relation or to parent relations
> + */
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Sorry to barge in late, but I was wondering if what we've learned with this
patch can be applied to the case of asserting a 

Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-07 Thread Greg Stark
On Fri, Nov 6, 2015 at 4:54 AM, Ashutosh Bapat
 wrote:
> PFA patch to get data sorted from the foreign server (postgres_fdw)
> according to the pathkeys useful for merge join.


An idle thought. There are going to be a lot of cases where different
software systems actually disagree about collation rules. I wonder if
it would be valuable to have a node that just checks that each row is
in fact greater than the previous row and throws an error if not. That
can be optional or a parameter of the FDW but it's probably cheap
enough to have enabled by default. It would save a lot of difficult to
heartache since the behaviour if the inputs aren't correctly sorted
will be strangely incorrect join results. Often the results may just
be missing or duplicated rows and that can be easy to miss and lead to
corrupted databases or security problems.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-06 Thread Kevin Grittner
On Friday, November 6, 2015 10:32 AM, Robert Haas  wrote:

> I think this approach is generally reasonable, but I suggested
> parts of it, so may be biased.  I would be interested in hearing
> the opinions of others.

Has anyone taken a close look at what happens if the two sides of
the merge join have different implementations of the same collation
name?  Is there anything we should do to defend against the
problem?

We already face the issue of corrupted indexes when we have
different revisions of glibc on a primary and a standby or when the
OS on a server is updated, so this wouldn't be entirely a *new*
problem:

http://www.postgresql.org/message-id/ba6132ed-1f6b-4a0b-ac22-81278f5ab...@tripadvisor.com

... but it would be a brand-new way to hit it, and we might be able
to spot the problem in a merge join by watching for rows being fed
to either side of the join which are not in order according to the
machine doing the join.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server for merge join

2015-11-06 Thread Robert Haas
On Thu, Nov 5, 2015 at 11:54 PM, Ashutosh Bapat
 wrote:
> Hi All,
> PFA patch to get data sorted from the foreign server (postgres_fdw)
> according to the pathkeys useful for merge join.
>
> For a given base relation (extendable to join when that becomes available in
> postgres_fdw), the patch tries to find merge joinable clauses. It then adds
> paths with pathkeys extracted out of these merge joinable clauses. The merge
> joinable clauses form equivalence classes. The patch searches in
> root->eq_classes for equivalence members belonging to the given relation.
> For every such expression it creates a single member pathkey list and
> corresponding path. The test postgres_fdw.sql has an existing join which
> uses merge join. With this patch the data is sorted on the foreign server
> than locally.
>
> While mergejoinable clauses can be obtained from rel->joininfo as well. But
> rel->joininfo contains other clauses as well and we need extra efforts to
> remove duplicates if the same expression appears in multiple merge joinable
> clauses.
>
> Two joining relations can have multiple merge joinable clauses, requiring
> multi-member pathkeys so that merge join is efficient to the maximum extent.
> The order in which the expressions appears in pathkeys can change the costs
> of sorting the data according to the pathkeys, depending upon the
> expressions and the presence of indexes containing those expressions. Thus
> ideally we would need to club all the expressions appearing in all the
> clauses for given two relations and create paths with pathkeys for every
> order of these expressions.That explodes the number of possible paths. We
> may restrict the number of paths created by considering only certain orders
> like sort_inner_and_outer(). In any case, costing such paths increases the
> planning time which may not be worth it. So, this patch uses a heuristic
> approach of creating single member pathkeys path for every merge joinable
> expression.
>
> The pathkeys need to be canonicalised using make_canonical_pathkey(), which
> is a static function. I have added a TODO and comments in the patch
> explaining possible ways to avoid "extern"alization of this function.
>
> Comments/suggestions are welcome.

I think this approach is generally reasonable, but I suggested parts
of it, so may be biased.  I would be interested in hearing the
opinions of others.

Random notes:

"possibily" is a typo.

usable_pklist is confusing because it seems like it might be talking
about primary keys rather than pathkeys.  Also, I realize now, looking
at this again, that you're saying "usable" when what I really think
you mean is "useful".  Lots of pathkeys are usable, but only a few of
those are useful.  I suggest renaming usable_pathkeys to
query_pathkeys and usable_pklist to useful_pathkeys.  Similarly, let's
rename generate_pathkeys_for_relation() to
get_useful_pathkeys_for_relation().

Although I'm usually on the side of marking things as extern whenever
we find it convenient, I'm nervous about doing that to
make_canonical_pathkey(), because it has side effects.  Searching the
list of canonical pathkeys for the one we need is reasonable, but is
it really right to ever think that we might create a new one at this
stage?  Maybe it is, but if so I'd like to hear a good explanation as
to why.

 Is the comment "Equivalence classes covering relations other than the
current one are of interest here" missing a "not"?

I don't find this comment illuminating:

+ * In case of child relation, we need to check that the
+ * equivalence class indicates a join to a relation other than
+ * parents, other children and itself (something similar to above).
+ * Otherwise we will end up creating useless paths. The code below is
+ * similar to generate_implied_equalities_for_column(), which might
+ * give a hint.

That basically just says that we have to do it this way because the
other way would be wrong.  But it doesn't say WHY the other way would
be wrong. Then a few lines later, you have another comment which says
the same thing again:

+/*
+ * Ignore equivalence members which correspond to children
+ * or same relation or to parent relations
+ */

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server

2015-11-03 Thread Robert Haas
On Fri, Oct 30, 2015 at 6:19 AM, Ashutosh Bapat
 wrote:
> If there is a collate clause in the ORDER BY, the server crashes with
> assertion
> +Assert(loc_cxt.state == FDW_COLLATE_NONE ||
> +loc_cxt.state == FDW_COLLATE_SAFE);
>
>
> The assertion is fine as long as is_foreign_expr() tests only boolean
> expressions (appearing in quals). This patch uses the function to test an
> expression appearing in ORDER BY clause, which need not be boolean. Attached
> patch removed the assertion and instead makes the function return false,
> when the walker deems collation of the expression unsafe. The walker can not
> return false when it encounter unsafe expression since the subtree it's
> examining might be part of an expression which does not use the collation
> ultimately.

I've committed this version.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server

2015-11-03 Thread Ashutosh Bapat
On Tue, Nov 3, 2015 at 11:35 PM, Robert Haas  wrote:

> On Fri, Oct 30, 2015 at 6:19 AM, Ashutosh Bapat
>  wrote:
> > If there is a collate clause in the ORDER BY, the server crashes with
> > assertion
> > +Assert(loc_cxt.state == FDW_COLLATE_NONE ||
> > +loc_cxt.state == FDW_COLLATE_SAFE);
> >
> >
> > The assertion is fine as long as is_foreign_expr() tests only boolean
> > expressions (appearing in quals). This patch uses the function to test an
> > expression appearing in ORDER BY clause, which need not be boolean.
> Attached
> > patch removed the assertion and instead makes the function return false,
> > when the walker deems collation of the expression unsafe. The walker can
> not
> > return false when it encounter unsafe expression since the subtree it's
> > examining might be part of an expression which does not use the collation
> > ultimately.
>
> I've committed this version.
>
>
Thanks.


> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Getting sorted data from foreign server

2015-10-30 Thread Ashutosh Bapat
If there is a collate clause in the ORDER BY, the server crashes with
assertion
+Assert(loc_cxt.state == FDW_COLLATE_NONE ||
+loc_cxt.state == FDW_COLLATE_SAFE);


The assertion is fine as long as is_foreign_expr() tests only boolean
expressions (appearing in quals). This patch uses the function to test an
expression appearing in ORDER BY clause, which need not be boolean.
Attached patch removed the assertion and instead makes the function return
false, when the walker deems collation of the expression unsafe. The walker
can not return false when it encounter unsafe expression since the subtree
it's examining might be part of an expression which does not use the
collation ultimately.

On Wed, Oct 28, 2015 at 11:51 AM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> On Tue, Oct 27, 2015 at 6:44 PM, Fabrízio de Royes Mello <
> fabriziome...@gmail.com> wrote:
>
>>
>>
>> On Tue, Oct 27, 2015 at 5:26 AM, Ashutosh Bapat <
>> ashutosh.ba...@enterprisedb.com> wrote:
>> >
>> >
>> >
>> > On Fri, Oct 23, 2015 at 2:43 AM, Robert Haas 
>> wrote:
>> >>
>> >> On Wed, Oct 21, 2015 at 5:23 AM, Ashutosh Bapat
>> >>  wrote:
>> >> > Increasing the sorting cost factor (when use_remote_estimates =
>> false) from
>> >> > 1.1 to 1.2 makes the difference disappear.
>> >> >
>> >> > Since the startup costs for postgres_fdw are large portion of total
>> cost,
>> >> > extra 10% of rest of the cost is comparable to 1% fuzzy limit. IMO,
>> we
>> >> > shouldn't bother too much about it as the path costs are not much
>> different.
>> >>
>> >> My feeling is that cranking the sorting cost factor up to 20-25% would
>> >> be a good idea, just so we have less unnecessary plan churn.  I dunno
>> >> if sorting always costs that much, but if a 10% cost overhead is
>> >> really 1% because it only applies to a fraction of the cost, I don't
>> >> think that's good.  The whole point was to pick something large enough
>> >> that we wouldn't take the sorted path unless we will benefit from the
>> >> sort, and clearly that's not what happened here.
>> >>
>> >
>> > PFA patch with the default multiplication factor for sort bumped up to
>> 1.2.
>> >
>>
>> +/* If no remote estimates, assume a sort costs 10% extra */
>> +#define DEFAULT_FDW_SORT_MULTIPLIER 1.2
>>
>> The above comment should not be 20%?
>>
>> Ah! Here's patch with comment fixed.
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 697de60..3cb728f 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -186,23 +186,26 @@ is_foreign_expr(PlannerInfo *root,
 	 * Check that the expression consists of nodes that are safe to execute
 	 * remotely.
 	 */
 	glob_cxt.root = root;
 	glob_cxt.foreignrel = baserel;
 	loc_cxt.collation = InvalidOid;
 	loc_cxt.state = FDW_COLLATE_NONE;
 	if (!foreign_expr_walker((Node *) expr, _cxt, _cxt))
 		return false;
 
-	/* Expressions examined here should be boolean, ie noncollatable */
-	Assert(loc_cxt.collation == InvalidOid);
-	Assert(loc_cxt.state == FDW_COLLATE_NONE);
+	/*
+	 * If the expression has a valid collation that does not arise from a
+	 * foreign var, the expression can not be sent over.
+	 */
+	if (loc_cxt.state == FDW_COLLATE_UNSAFE)
+		return false;
 
 	/*
 	 * An expression which includes any mutable functions can't be sent over
 	 * because its result is not stable.  For example, sending now() remote
 	 * side could cause confusion from clock offsets.  Future versions might
 	 * be able to make this choice with more granularity.  (We check this last
 	 * because it requires a lot of expensive catalog lookups.)
 	 */
 	if (contain_mutable_functions((Node *) expr))
 		return false;
@@ -1870,10 +1873,57 @@ printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
  */
 static void
 printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 	   deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 	char	   *ptypename = format_type_with_typemod(paramtype, paramtypmod);
 
 	appendStringInfo(buf, "((SELECT null::%s)::%s)", ptypename, ptypename);
 }
+
+/*
+ * Deparse ORDER BY clause according to the given pathkeys for given base
+ * relation. From given pathkeys expressions belonging entirely to the given
+ * base relation are obtained and deparsed.
+ */
+void
+appendOrderByClause(StringInfo buf, PlannerInfo *root, RelOptInfo *baserel,
+	List *pathkeys)
+{
+	ListCell			*lcell;
+	deparse_expr_cxt	context;
+	int	nestlevel;
+	char*delim = " ";
+
+	/* Set up context struct for recursion */
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = buf;
+	context.params_list = NULL;
+
+	/* Make sure any constants in the exprs are printed portably */
+	nestlevel = 

Re: [HACKERS] Getting sorted data from foreign server

2015-10-28 Thread Ashutosh Bapat
On Tue, Oct 27, 2015 at 6:44 PM, Fabrízio de Royes Mello <
fabriziome...@gmail.com> wrote:

>
>
> On Tue, Oct 27, 2015 at 5:26 AM, Ashutosh Bapat <
> ashutosh.ba...@enterprisedb.com> wrote:
> >
> >
> >
> > On Fri, Oct 23, 2015 at 2:43 AM, Robert Haas 
> wrote:
> >>
> >> On Wed, Oct 21, 2015 at 5:23 AM, Ashutosh Bapat
> >>  wrote:
> >> > Increasing the sorting cost factor (when use_remote_estimates =
> false) from
> >> > 1.1 to 1.2 makes the difference disappear.
> >> >
> >> > Since the startup costs for postgres_fdw are large portion of total
> cost,
> >> > extra 10% of rest of the cost is comparable to 1% fuzzy limit. IMO, we
> >> > shouldn't bother too much about it as the path costs are not much
> different.
> >>
> >> My feeling is that cranking the sorting cost factor up to 20-25% would
> >> be a good idea, just so we have less unnecessary plan churn.  I dunno
> >> if sorting always costs that much, but if a 10% cost overhead is
> >> really 1% because it only applies to a fraction of the cost, I don't
> >> think that's good.  The whole point was to pick something large enough
> >> that we wouldn't take the sorted path unless we will benefit from the
> >> sort, and clearly that's not what happened here.
> >>
> >
> > PFA patch with the default multiplication factor for sort bumped up to
> 1.2.
> >
>
> +/* If no remote estimates, assume a sort costs 10% extra */
> +#define DEFAULT_FDW_SORT_MULTIPLIER 1.2
>
> The above comment should not be 20%?
>
> Ah! Here's patch with comment fixed.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 697de60..cb5c3ae 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -186,23 +186,26 @@ is_foreign_expr(PlannerInfo *root,
 	 * Check that the expression consists of nodes that are safe to execute
 	 * remotely.
 	 */
 	glob_cxt.root = root;
 	glob_cxt.foreignrel = baserel;
 	loc_cxt.collation = InvalidOid;
 	loc_cxt.state = FDW_COLLATE_NONE;
 	if (!foreign_expr_walker((Node *) expr, _cxt, _cxt))
 		return false;
 
-	/* Expressions examined here should be boolean, ie noncollatable */
-	Assert(loc_cxt.collation == InvalidOid);
-	Assert(loc_cxt.state == FDW_COLLATE_NONE);
+	/*
+	 * The collation of the expression should be none or originate from a
+	 * foreign var.
+	 */
+	Assert(loc_cxt.state == FDW_COLLATE_NONE ||
+			loc_cxt.state == FDW_COLLATE_SAFE);
 
 	/*
 	 * An expression which includes any mutable functions can't be sent over
 	 * because its result is not stable.  For example, sending now() remote
 	 * side could cause confusion from clock offsets.  Future versions might
 	 * be able to make this choice with more granularity.  (We check this last
 	 * because it requires a lot of expensive catalog lookups.)
 	 */
 	if (contain_mutable_functions((Node *) expr))
 		return false;
@@ -1870,10 +1873,57 @@ printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
  */
 static void
 printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 	   deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 	char	   *ptypename = format_type_with_typemod(paramtype, paramtypmod);
 
 	appendStringInfo(buf, "((SELECT null::%s)::%s)", ptypename, ptypename);
 }
+
+/*
+ * Deparse ORDER BY clause according to the given pathkeys for given base
+ * relation. From given pathkeys expressions belonging entirely to the given
+ * base relation are obtained and deparsed.
+ */
+void
+appendOrderByClause(StringInfo buf, PlannerInfo *root, RelOptInfo *baserel,
+	List *pathkeys)
+{
+	ListCell			*lcell;
+	deparse_expr_cxt	context;
+	int	nestlevel;
+	char*delim = " ";
+
+	/* Set up context struct for recursion */
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = buf;
+	context.params_list = NULL;
+
+	/* Make sure any constants in the exprs are printed portably */
+	nestlevel = set_transmission_modes();
+
+	appendStringInfo(buf, " ORDER BY");
+	foreach(lcell, pathkeys)
+	{
+		PathKey*pathkey = lfirst(lcell);
+		Expr*em_expr;
+
+		em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+		Assert(em_expr != NULL);
+
+		appendStringInfoString(buf, delim);
+		deparseExpr(em_expr, );
+		if (pathkey->pk_strategy == BTLessStrategyNumber)
+			appendStringInfoString(buf, " ASC");
+		else
+			appendStringInfoString(buf, " DESC");
+
+		if (pathkey->pk_nulls_first)
+			appendStringInfoString(buf, " NULLS FIRST");
+
+		delim = ", ";
+	}
+	reset_transmission_modes(nestlevel);
+}
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 65ea6e8..58bf76c 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -127,86 +127,82 @@ ALTER FOREIGN TABLE ft2 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 (2 rows)
 
 -- Now we should be able 

Re: [HACKERS] Getting sorted data from foreign server

2015-10-27 Thread Ashutosh Bapat
On Fri, Oct 23, 2015 at 2:43 AM, Robert Haas  wrote:

> On Wed, Oct 21, 2015 at 5:23 AM, Ashutosh Bapat
>  wrote:
> > Increasing the sorting cost factor (when use_remote_estimates = false)
> from
> > 1.1 to 1.2 makes the difference disappear.
> >
> > Since the startup costs for postgres_fdw are large portion of total cost,
> > extra 10% of rest of the cost is comparable to 1% fuzzy limit. IMO, we
> > shouldn't bother too much about it as the path costs are not much
> different.
>
> My feeling is that cranking the sorting cost factor up to 20-25% would
> be a good idea, just so we have less unnecessary plan churn.  I dunno
> if sorting always costs that much, but if a 10% cost overhead is
> really 1% because it only applies to a fraction of the cost, I don't
> think that's good.  The whole point was to pick something large enough
> that we wouldn't take the sorted path unless we will benefit from the
> sort, and clearly that's not what happened here.
>
>
PFA patch with the default multiplication factor for sort bumped up to 1.2.


> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 697de60..cb5c3ae 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -186,23 +186,26 @@ is_foreign_expr(PlannerInfo *root,
 	 * Check that the expression consists of nodes that are safe to execute
 	 * remotely.
 	 */
 	glob_cxt.root = root;
 	glob_cxt.foreignrel = baserel;
 	loc_cxt.collation = InvalidOid;
 	loc_cxt.state = FDW_COLLATE_NONE;
 	if (!foreign_expr_walker((Node *) expr, _cxt, _cxt))
 		return false;
 
-	/* Expressions examined here should be boolean, ie noncollatable */
-	Assert(loc_cxt.collation == InvalidOid);
-	Assert(loc_cxt.state == FDW_COLLATE_NONE);
+	/*
+	 * The collation of the expression should be none or originate from a
+	 * foreign var.
+	 */
+	Assert(loc_cxt.state == FDW_COLLATE_NONE ||
+			loc_cxt.state == FDW_COLLATE_SAFE);
 
 	/*
 	 * An expression which includes any mutable functions can't be sent over
 	 * because its result is not stable.  For example, sending now() remote
 	 * side could cause confusion from clock offsets.  Future versions might
 	 * be able to make this choice with more granularity.  (We check this last
 	 * because it requires a lot of expensive catalog lookups.)
 	 */
 	if (contain_mutable_functions((Node *) expr))
 		return false;
@@ -1870,10 +1873,57 @@ printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod,
  */
 static void
 printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 	   deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 	char	   *ptypename = format_type_with_typemod(paramtype, paramtypmod);
 
 	appendStringInfo(buf, "((SELECT null::%s)::%s)", ptypename, ptypename);
 }
+
+/*
+ * Deparse ORDER BY clause according to the given pathkeys for given base
+ * relation. From given pathkeys expressions belonging entirely to the given
+ * base relation are obtained and deparsed.
+ */
+void
+appendOrderByClause(StringInfo buf, PlannerInfo *root, RelOptInfo *baserel,
+	List *pathkeys)
+{
+	ListCell			*lcell;
+	deparse_expr_cxt	context;
+	int	nestlevel;
+	char*delim = " ";
+
+	/* Set up context struct for recursion */
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = buf;
+	context.params_list = NULL;
+
+	/* Make sure any constants in the exprs are printed portably */
+	nestlevel = set_transmission_modes();
+
+	appendStringInfo(buf, " ORDER BY");
+	foreach(lcell, pathkeys)
+	{
+		PathKey*pathkey = lfirst(lcell);
+		Expr*em_expr;
+
+		em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+		Assert(em_expr != NULL);
+
+		appendStringInfoString(buf, delim);
+		deparseExpr(em_expr, );
+		if (pathkey->pk_strategy == BTLessStrategyNumber)
+			appendStringInfoString(buf, " ASC");
+		else
+			appendStringInfoString(buf, " DESC");
+
+		if (pathkey->pk_nulls_first)
+			appendStringInfoString(buf, " NULLS FIRST");
+
+		delim = ", ";
+	}
+	reset_transmission_modes(nestlevel);
+}
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 65ea6e8..58bf76c 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -127,86 +127,82 @@ ALTER FOREIGN TABLE ft2 ALTER COLUMN c1 OPTIONS (column_name 'C 1');
 (2 rows)
 
 -- Now we should be able to run ANALYZE.
 -- To exercise multiple code paths, we use local stats on ft1
 -- and remote-estimate mode on ft2.
 ANALYZE ft1;
 ALTER FOREIGN TABLE ft2 OPTIONS (use_remote_estimate 'true');
 -- ===
 -- simple queries
 -- ===
--- 

Re: [HACKERS] Getting sorted data from foreign server

2015-10-27 Thread Fabrízio de Royes Mello
On Tue, Oct 27, 2015 at 5:26 AM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:
>
>
>
> On Fri, Oct 23, 2015 at 2:43 AM, Robert Haas 
wrote:
>>
>> On Wed, Oct 21, 2015 at 5:23 AM, Ashutosh Bapat
>>  wrote:
>> > Increasing the sorting cost factor (when use_remote_estimates = false)
from
>> > 1.1 to 1.2 makes the difference disappear.
>> >
>> > Since the startup costs for postgres_fdw are large portion of total
cost,
>> > extra 10% of rest of the cost is comparable to 1% fuzzy limit. IMO, we
>> > shouldn't bother too much about it as the path costs are not much
different.
>>
>> My feeling is that cranking the sorting cost factor up to 20-25% would
>> be a good idea, just so we have less unnecessary plan churn.  I dunno
>> if sorting always costs that much, but if a 10% cost overhead is
>> really 1% because it only applies to a fraction of the cost, I don't
>> think that's good.  The whole point was to pick something large enough
>> that we wouldn't take the sorted path unless we will benefit from the
>> sort, and clearly that's not what happened here.
>>
>
> PFA patch with the default multiplication factor for sort bumped up to
1.2.
>

+/* If no remote estimates, assume a sort costs 10% extra */
+#define DEFAULT_FDW_SORT_MULTIPLIER 1.2

The above comment should not be 20%?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] Getting sorted data from foreign server

2015-10-22 Thread Robert Haas
On Wed, Oct 21, 2015 at 5:23 AM, Ashutosh Bapat
 wrote:
> Increasing the sorting cost factor (when use_remote_estimates = false) from
> 1.1 to 1.2 makes the difference disappear.
>
> Since the startup costs for postgres_fdw are large portion of total cost,
> extra 10% of rest of the cost is comparable to 1% fuzzy limit. IMO, we
> shouldn't bother too much about it as the path costs are not much different.

My feeling is that cranking the sorting cost factor up to 20-25% would
be a good idea, just so we have less unnecessary plan churn.  I dunno
if sorting always costs that much, but if a 10% cost overhead is
really 1% because it only applies to a fraction of the cost, I don't
think that's good.  The whole point was to pick something large enough
that we wouldn't take the sorted path unless we will benefit from the
sort, and clearly that's not what happened here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server

2015-10-21 Thread Ashutosh Bapat
Here's patch with the regression fixed.

On Wed, Oct 21, 2015 at 2:53 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
>
> On Fri, Oct 16, 2015 at 11:33 PM, Robert Haas 
> wrote:
>
>> On Thu, Oct 15, 2015 at 6:28 AM, Ashutosh Bapat
>>  wrote:
>> > Attached is the patch which takes care of above comments.
>>
>> I spent some time on this patch today.  But it's still not right.
>>
>> I've attached a new version which fixes a serious problem with your
>> last version - having postgresGetForeignPaths do the costing of the
>> sorted path itself instead of delegating that to
>> estimate_path_cost_size is wrong.  In your version, 10% increment gets
>> applied to the network transmission costs as well as the cost of
>> generating the tupes - but only when use_remote_estimate == false.  I
>> fixed this and did some cosmetic cleanup.
>>
>
> That's better.
>
>
>>
>> But you'll notice if you try this some of postgres_fdw's regression
>> tests fail.  This is rather mysterious:
>>
>> ***
>> *** 697,715 
>>Sort
>>  Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
>>  Sort Key: t1.c1
>> !->  Nested Loop Semi Join
>>Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
>> !  Join Filter: (t1.c3 = t2.c3)
>>->  Foreign Scan on public.ft1 t1
>>  Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7,
>> t1.c8
>>  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
>> FROM "S 1"."T 1" WHERE (("C 1" < 20))
>> !  ->  Materialize
>>  Output: t2.c3
>> !->  Foreign Scan on public.ft2 t2
>>Output: t2.c3
>> !  Filter: (date(t2.c4) = '01-17-1970'::date)
>> !  Remote SQL: SELECT c3, c4 FROM "S 1"."T 1"
>> WHERE (("C 1" > 10))
>> ! (15 rows)
>>
>>   EXECUTE st2(10, 20);
>>c1 | c2 |  c3   |  c4  |c5
>>   | c6 | c7 | c8
>> --- 697,718 
>>Sort
>>  Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
>>  Sort Key: t1.c1
>> !->  Hash Join
>>Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
>> !  Hash Cond: (t1.c3 = t2.c3)
>>->  Foreign Scan on public.ft1 t1
>>  Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7,
>> t1.c8
>>  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
>> FROM "S 1"."T 1" WHERE (("C 1" < 20))
>> !  ->  Hash
>>  Output: t2.c3
>> !->  HashAggregate
>>Output: t2.c3
>> !  Group Key: t2.c3
>> !  ->  Foreign Scan on public.ft2 t2
>> !Output: t2.c3
>> !Filter: (date(t2.c4) = '01-17-1970'::date)
>> !Remote SQL: SELECT c3, c4 FROM "S 1"."T
>> 1" WHERE (("C 1" > 10))
>> ! (18 rows)
>>
>> What I think is happening here is that the planner notices that
>> instead of doing a parameterized nestloop, it could pull down the data
>> already sorted from the remote side, cheaply unique-ify it by using
>> the ordering provided by the remote side, and then do a standard hash
>> join. That might well be a sensible approach, but the ORDER BY that
>> would make it correct doesn't show up in the Remote SQL.  I don't know
>> why that's happening, but it's not good.
>>
>>
> The unique-ifying is happening through HashAggregate, so we do not need
> ORDER BY clause in RemoteSQL. So the plan is correct.
>
> Careful examination of paths created revealed that the change in plan is
> result of our fuzzy path selection logic. Let me explain it using the cost
> values I got on my machine. Please note that the costs are described as a
> tuple (startup cost, total cost, number of rows, present of pathkeys).
>
> With your patch, base relation paths have following costs:
>
> ft1 path without pathkeys - (100, 123.9, 20, no)
> ft1 path with pathkeys (100, 126.25, 20, yes)
> ft2 path without pathkeys (100, 143.645, 4, no)
> ft2 path without pathkeys with params (100.01, 125.365, 1, no)
>
> Notice the sorted path is only marginally costly (2.5%) compared to the
> non-sorted path for ft1. During the path creation process several nested
> loop paths are created, but except for two, they are too costly. The two
> nested loop paths of interest have costs (200, 268.755, 10, no) and (200,
> 271.105, 10, yes). The path with pathkeys kicks out the one without
> pathkeys because the later's costs are "fuzzily" equal and pathkeys give
> extra advantage. The "fuzzy" equality is effect of the marginally extra
> sorting cost. The nested loop path with pathkeys remains in the path list.
> Several hash join paths and merge join paths are created but are costlier
> than one particular hash path which has costs (243.677, 267.6624, 10, no).
> 

Re: [HACKERS] Getting sorted data from foreign server

2015-10-21 Thread Ashutosh Bapat
On Fri, Oct 16, 2015 at 11:33 PM, Robert Haas  wrote:

> On Thu, Oct 15, 2015 at 6:28 AM, Ashutosh Bapat
>  wrote:
> > Attached is the patch which takes care of above comments.
>
> I spent some time on this patch today.  But it's still not right.
>
> I've attached a new version which fixes a serious problem with your
> last version - having postgresGetForeignPaths do the costing of the
> sorted path itself instead of delegating that to
> estimate_path_cost_size is wrong.  In your version, 10% increment gets
> applied to the network transmission costs as well as the cost of
> generating the tupes - but only when use_remote_estimate == false.  I
> fixed this and did some cosmetic cleanup.
>

That's better.


>
> But you'll notice if you try this some of postgres_fdw's regression
> tests fail.  This is rather mysterious:
>
> ***
> *** 697,715 
>Sort
>  Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
>  Sort Key: t1.c1
> !->  Nested Loop Semi Join
>Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> !  Join Filter: (t1.c3 = t2.c3)
>->  Foreign Scan on public.ft1 t1
>  Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7,
> t1.c8
>  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
> FROM "S 1"."T 1" WHERE (("C 1" < 20))
> !  ->  Materialize
>  Output: t2.c3
> !->  Foreign Scan on public.ft2 t2
>Output: t2.c3
> !  Filter: (date(t2.c4) = '01-17-1970'::date)
> !  Remote SQL: SELECT c3, c4 FROM "S 1"."T 1"
> WHERE (("C 1" > 10))
> ! (15 rows)
>
>   EXECUTE st2(10, 20);
>c1 | c2 |  c3   |  c4  |c5
>   | c6 | c7 | c8
> --- 697,718 
>Sort
>  Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
>  Sort Key: t1.c1
> !->  Hash Join
>Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> !  Hash Cond: (t1.c3 = t2.c3)
>->  Foreign Scan on public.ft1 t1
>  Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7,
> t1.c8
>  Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
> FROM "S 1"."T 1" WHERE (("C 1" < 20))
> !  ->  Hash
>  Output: t2.c3
> !->  HashAggregate
>Output: t2.c3
> !  Group Key: t2.c3
> !  ->  Foreign Scan on public.ft2 t2
> !Output: t2.c3
> !Filter: (date(t2.c4) = '01-17-1970'::date)
> !Remote SQL: SELECT c3, c4 FROM "S 1"."T
> 1" WHERE (("C 1" > 10))
> ! (18 rows)
>
> What I think is happening here is that the planner notices that
> instead of doing a parameterized nestloop, it could pull down the data
> already sorted from the remote side, cheaply unique-ify it by using
> the ordering provided by the remote side, and then do a standard hash
> join. That might well be a sensible approach, but the ORDER BY that
> would make it correct doesn't show up in the Remote SQL.  I don't know
> why that's happening, but it's not good.
>
>
The unique-ifying is happening through HashAggregate, so we do not need
ORDER BY clause in RemoteSQL. So the plan is correct.

Careful examination of paths created revealed that the change in plan is
result of our fuzzy path selection logic. Let me explain it using the cost
values I got on my machine. Please note that the costs are described as a
tuple (startup cost, total cost, number of rows, present of pathkeys).

With your patch, base relation paths have following costs:

ft1 path without pathkeys - (100, 123.9, 20, no)
ft1 path with pathkeys (100, 126.25, 20, yes)
ft2 path without pathkeys (100, 143.645, 4, no)
ft2 path without pathkeys with params (100.01, 125.365, 1, no)

Notice the sorted path is only marginally costly (2.5%) compared to the
non-sorted path for ft1. During the path creation process several nested
loop paths are created, but except for two, they are too costly. The two
nested loop paths of interest have costs (200, 268.755, 10, no) and (200,
271.105, 10, yes). The path with pathkeys kicks out the one without
pathkeys because the later's costs are "fuzzily" equal and pathkeys give
extra advantage. The "fuzzy" equality is effect of the marginally extra
sorting cost. The nested loop path with pathkeys remains in the path list.
Several hash join paths and merge join paths are created but are costlier
than one particular hash path which has costs (243.677, 267.6624, 10, no).
This hash path is not beaten by the nested loop path since because of lower
total cost (which is beyond the fuzzy margins) and gets ultimately chosen
by planner to perform ft1 join ft2.

Interestingly, if the nested loop path with pathkeys had not kicked out
that 

Re: [HACKERS] Getting sorted data from foreign server

2015-10-16 Thread Robert Haas
On Thu, Oct 15, 2015 at 6:28 AM, Ashutosh Bapat
 wrote:
> Attached is the patch which takes care of above comments.

I spent some time on this patch today.  But it's still not right.

I've attached a new version which fixes a serious problem with your
last version - having postgresGetForeignPaths do the costing of the
sorted path itself instead of delegating that to
estimate_path_cost_size is wrong.  In your version, 10% increment gets
applied to the network transmission costs as well as the cost of
generating the tupes - but only when use_remote_estimate == false.  I
fixed this and did some cosmetic cleanup.

But you'll notice if you try this some of postgres_fdw's regression
tests fail.  This is rather mysterious:

***
*** 697,715 
   Sort
 Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
 Sort Key: t1.c1
!->  Nested Loop Semi Join
   Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
!  Join Filter: (t1.c3 = t2.c3)
   ->  Foreign Scan on public.ft1 t1
 Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
!  ->  Materialize
 Output: t2.c3
!->  Foreign Scan on public.ft2 t2
   Output: t2.c3
!  Filter: (date(t2.c4) = '01-17-1970'::date)
!  Remote SQL: SELECT c3, c4 FROM "S 1"."T 1"
WHERE (("C 1" > 10))
! (15 rows)

  EXECUTE st2(10, 20);
   c1 | c2 |  c3   |  c4  |c5
  | c6 | c7 | c8
--- 697,718 
   Sort
 Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
 Sort Key: t1.c1
!->  Hash Join
   Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
!  Hash Cond: (t1.c3 = t2.c3)
   ->  Foreign Scan on public.ft1 t1
 Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
!  ->  Hash
 Output: t2.c3
!->  HashAggregate
   Output: t2.c3
!  Group Key: t2.c3
!  ->  Foreign Scan on public.ft2 t2
!Output: t2.c3
!Filter: (date(t2.c4) = '01-17-1970'::date)
!Remote SQL: SELECT c3, c4 FROM "S 1"."T
1" WHERE (("C 1" > 10))
! (18 rows)

What I think is happening here is that the planner notices that
instead of doing a parameterized nestloop, it could pull down the data
already sorted from the remote side, cheaply unique-ify it by using
the ordering provided by the remote side, and then do a standard hash
join.  That might well be a sensible approach, but the ORDER BY that
would make it correct doesn't show up in the Remote SQL.  I don't know
why that's happening, but it's not good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 697de60..cb5c3ae 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -193,9 +193,12 @@ is_foreign_expr(PlannerInfo *root,
 	if (!foreign_expr_walker((Node *) expr, _cxt, _cxt))
 		return false;
 
-	/* Expressions examined here should be boolean, ie noncollatable */
-	Assert(loc_cxt.collation == InvalidOid);
-	Assert(loc_cxt.state == FDW_COLLATE_NONE);
+	/*
+	 * The collation of the expression should be none or originate from a
+	 * foreign var.
+	 */
+	Assert(loc_cxt.state == FDW_COLLATE_NONE ||
+			loc_cxt.state == FDW_COLLATE_SAFE);
 
 	/*
 	 * An expression which includes any mutable functions can't be sent over
@@ -1877,3 +1880,50 @@ printRemotePlaceholder(Oid paramtype, int32 paramtypmod,
 
 	appendStringInfo(buf, "((SELECT null::%s)::%s)", ptypename, ptypename);
 }
+
+/*
+ * Deparse ORDER BY clause according to the given pathkeys for given base
+ * relation. From given pathkeys expressions belonging entirely to the given
+ * base relation are obtained and deparsed.
+ */
+void
+appendOrderByClause(StringInfo buf, PlannerInfo *root, RelOptInfo *baserel,
+	List *pathkeys)
+{
+	ListCell			*lcell;
+	deparse_expr_cxt	context;
+	int	nestlevel;
+	char*delim = " ";
+
+	/* Set up context struct for recursion */
+	context.root = root;
+	context.foreignrel = baserel;
+	context.buf = buf;
+	context.params_list = NULL;
+
+	/* Make sure any constants in the exprs are printed portably */
+	nestlevel = set_transmission_modes();
+
+	appendStringInfo(buf, " ORDER BY");
+	foreach(lcell, pathkeys)
+	{
+		PathKey*pathkey = lfirst(lcell);
+		Expr*em_expr;
+
+		em_expr = find_em_expr_for_rel(pathkey->pk_eclass, baserel);
+		Assert(em_expr != NULL);
+
+		appendStringInfoString(buf, delim);

Re: [HACKERS] Getting sorted data from foreign server

2015-10-15 Thread Ashutosh Bapat
On Thu, Oct 15, 2015 at 2:27 AM, Robert Haas  wrote:

> On Wed, Oct 14, 2015 at 7:30 AM, Ashutosh Bapat
>  wrote:
> > The patch uses a factor of 1.1 (10% increase) to multiple the startup and
> > total costs in fpinfo for unsorted data.
> >
> > This change has caused the plans for few queries in the test
> postgres_fdw to
> > change. There is ORDER BY and LIMIT clause in the queries in postgres_fdw
> > testcase to keep test outputs sane and consistent. These ORDER BY clauses
> > are not being pushed down the foreign servers. I tried using values upto
> 2
> > for this but still the foreign paths with pathkeys won over those without
> > pathkeys.
>
> That's great.  Seeing unnecessary sorts disappear from the regression
> tests as a result of this patch is, IMHO, pretty cool.  But these
> compiler warnings might also be related:
>

Thanks for catching these warnings. My compiler (gcc (Ubuntu/Linaro
4.6.3-1ubuntu5) 4.6.3) didn't catch those.  Sorry for those mistakes.
Worst, values in fpinfo were being modified.


>
> postgres_fdw.c:605:7: error: variable 'rows' is used uninitialized
> whenever 'if'
>   condition is false [-Werror,-Wsometimes-uninitialized]
> if (fpinfo->use_remote_estimate)
> ^~~
> postgres_fdw.c:628:13: note: uninitialized use occurs here
>   ...rows,
>

Done.


>  ^~~~
> postgres_fdw.c:605:3: note: remove the 'if' if its condition is always true
> if (fpinfo->use_remote_estimate)
> ^~~~
>

That isn't true. Left the code as is.

> postgres_fdw.c:596:15: note: initialize the variable 'rows' to silence this
>   warning
> double  rows;
> ^
>  = 0.0
>

That suggestion isn't applicable either. Left the code as is.


> postgres_fdw.c:605:7: error: variable 'startup_cost' is used uninitialized
>   whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
> if (fpinfo->use_remote_estimate)
> ^~~
> postgres_fdw.c:629:13: note: uninitialized use occurs here
>   ...startup_cost,
>  ^~~~
>

Done.


> postgres_fdw.c:605:3: note: remove the 'if' if its condition is always true
> if (fpinfo->use_remote_estimate)
> ^~~~
>

That's not true. Left the code as is.


> postgres_fdw.c:598:21: note: initialize the variable 'startup_cost' to
> silence
>   this warning
> Coststartup_cost;
> ^
>  = 0.0
>

That suggestion isn't applicable either. Left the code as is.


> postgres_fdw.c:605:7: error: variable 'total_cost' is used uninitialized
>   whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
> if (fpinfo->use_remote_estimate)
> ^~~
>
postgres_fdw.c:630:13: note: uninitialized use occurs here
>   ...total_cost,
>  ^~
>

Done.


> postgres_fdw.c:605:3: note: remove the 'if' if its condition is always true
> if (fpinfo->use_remote_estimate)
> ^~~~
> postgres_fdw.c:599:19: note: initialize the variable 'total_cost' to
> silence
>   this warning
> Costtotal_cost;
>   ^
>= 0.0
>
>
Those suggestions aren't applicable.


> At least some of those look like actual mistakes.
>
>
All of those were mistakes. Can you please check if you see still see those
warnings?


> I would suggest that instead of the regression test you added, you
> instead change one of the existing tests to order by a non-pushable
> expression that produces the same final output ordering.  The revised
> EXPLAIN output shows that the existing tests are already testing that
> the sort is getting pushed down; we don't need another test for the
> same thing.


Right, removed the testcase.


> Instead, let's adjust one of the tests so that we can
> also show that we don't push down ORDER BY clauses when it would be
> wrong to do so.  For example, suppose the user specifies a collation
> in the ORDER BY clause.
>

Done. Modified single table with alias test to use tableoid in ORDER BY.
tableoid being a system column is unsafe to be pushed down and it being
constant doesn't change the order of result. Using collation might get into
problems of supported/default collation, which affect the ability to push
expressions down.


>
> appendOrderByClause could use a header comment, and its definition
> line is more than 80 characters, so it should be wrapped.
>

Done.


>
> DEFAULT_FDW_SORT_MULTIPLIER should have a one-line comment, like /* If
> no remote 

Re: [HACKERS] Getting sorted data from foreign server

2015-10-14 Thread Robert Haas
On Wed, Oct 14, 2015 at 7:30 AM, Ashutosh Bapat
 wrote:
> The patch uses a factor of 1.1 (10% increase) to multiple the startup and
> total costs in fpinfo for unsorted data.
>
> This change has caused the plans for few queries in the test postgres_fdw to
> change. There is ORDER BY and LIMIT clause in the queries in postgres_fdw
> testcase to keep test outputs sane and consistent. These ORDER BY clauses
> are not being pushed down the foreign servers. I tried using values upto 2
> for this but still the foreign paths with pathkeys won over those without
> pathkeys.

That's great.  Seeing unnecessary sorts disappear from the regression
tests as a result of this patch is, IMHO, pretty cool.  But these
compiler warnings might also be related:

postgres_fdw.c:605:7: error: variable 'rows' is used uninitialized whenever 'if'
  condition is false [-Werror,-Wsometimes-uninitialized]
if (fpinfo->use_remote_estimate)
^~~
postgres_fdw.c:628:13: note: uninitialized use occurs here
  ...rows,
 ^~~~
postgres_fdw.c:605:3: note: remove the 'if' if its condition is always true
if (fpinfo->use_remote_estimate)
^~~~
postgres_fdw.c:596:15: note: initialize the variable 'rows' to silence this
  warning
double  rows;
^
 = 0.0
postgres_fdw.c:605:7: error: variable 'startup_cost' is used uninitialized
  whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
if (fpinfo->use_remote_estimate)
^~~
postgres_fdw.c:629:13: note: uninitialized use occurs here
  ...startup_cost,
 ^~~~
postgres_fdw.c:605:3: note: remove the 'if' if its condition is always true
if (fpinfo->use_remote_estimate)
^~~~
postgres_fdw.c:598:21: note: initialize the variable 'startup_cost' to silence
  this warning
Coststartup_cost;
^
 = 0.0
postgres_fdw.c:605:7: error: variable 'total_cost' is used uninitialized
  whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
if (fpinfo->use_remote_estimate)
^~~
postgres_fdw.c:630:13: note: uninitialized use occurs here
  ...total_cost,
 ^~
postgres_fdw.c:605:3: note: remove the 'if' if its condition is always true
if (fpinfo->use_remote_estimate)
^~~~
postgres_fdw.c:599:19: note: initialize the variable 'total_cost' to silence
  this warning
Costtotal_cost;
  ^
   = 0.0

At least some of those look like actual mistakes.

I would suggest that instead of the regression test you added, you
instead change one of the existing tests to order by a non-pushable
expression that produces the same final output ordering.  The revised
EXPLAIN output shows that the existing tests are already testing that
the sort is getting pushed down; we don't need another test for the
same thing.  Instead, let's adjust one of the tests so that we can
also show that we don't push down ORDER BY clauses when it would be
wrong to do so.  For example, suppose the user specifies a collation
in the ORDER BY clause.

appendOrderByClause could use a header comment, and its definition
line is more than 80 characters, so it should be wrapped.

DEFAULT_FDW_SORT_MULTIPLIER should have a one-line comment, like /* If
no remote estimates, assume a sort costs 10% extra */.  The comment
inside postgresGetForeignPaths shouldn't refer specifically to the 10%
value, in case we change it.  So maybe: "Assume sorting costs
something, so that we won't ask for sorted results when they aren't
useful, but not too much, so that remote sorts will look attractive
when the ordering is useful to us."  Note that "attractive" is missing
a "t" in the patch.

Do you need to ignore ec_has_volatile equivalence classes in
find_em_expr_for_rel?  I bet yes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server

2015-10-14 Thread Ashutosh Bapat
PFA the patch with all the comments addressed.

On Tue, Oct 13, 2015 at 10:07 PM, Robert Haas  wrote:

> On Tue, Oct 13, 2015 at 3:29 AM, Ashutosh Bapat
>  wrote:
> >> - You consider pushing down ORDER BY if any prefix of the query
> >> pathkeys satisfy is_foreign_expr(), but that doesn't seem right to me.
> >> If the user says SELECT * FROM remotetab ORDER BY a, unsafe(a),
> >> ordering the result set by a doesn't help us much.  We've talked a few
> >> times about an incremental sort capability that would take a stream of
> >> tuples already ordered by one or more columns and sort each group by
> >> additional columns, but I don't think we have that currently.  Without
> >> that capability, I don't think there's much benefit in sorting by a
> >> prefix of the pathkeys.  I suspect that if we can't get them all, it's
> >> not worth doing.
> >
> > I somehow thought, we are using output, which is ordered by prefix of
> > pathkeys in Sort nodes. But as you rightly pointed out that's not the
> case.
> > Only complete pathkeys are useful.
>
> A truncated list of pathkeys is useful for merge joins, but not for
> toplevel ordering.
>

Ok. Taken care in the attached patch.


>
> >> - Right now, you have this code below the point where we bail out if
> >> use_remote_estimate is not set.  If we keep it like that, the comment
> >> needs updating.  But I suggest that we consider an ordered path even
> >> if we are not using remote estimates.  Estimate the sorted path to
> >> cost somewhat more than the unsorted path, so that we only choose that
> >> path if the sort actually benefits us.  I don't know exactly how to
> >> come up with a principled estimate given that we don't actually know
> >> whether the remote side will need an extra sort or not, but maybe a
> >> dumb estimate is still better than not trying a sorted path.
> >
> > I like that idea, although there are two questions
> > 1. How can we estimate cost of getting the data sorted? If there is an
> > appropriate index on foreign server we can get the data sorted at no
> extra
> > cost. If there isn't the cost of sorting is proportionate to NlogN where
> N
> > is the size of data. It seems unreasonable to arrive at the cost of
> sorting
> > by multiplying with some constant multiplier. Also, the constant
> multiplier
> > to the NlogN estimate depends heavily upon the properties of foreign
> server
> > e.g. size of memory available for sorting, disk and CPU speed etc. The
> last
> > two might have got factored into fdw_tuple_cost and fdw_startup_cost, so
> > that's probably taken care of. If the estimate we come up turns out to be
> > too pessimistic, we will not get sorted data even if that's the right
> thing
> > to do. If too optimistic, we will incur heavy cost at the time of
> execution.
> > Setting the cost estimate to some constant factor of NlogN would be too
> > pessimistic if there is an appropriate index on foreign server. Otherway
> > round if there isn't an appropriate index on foreign server.
> >
> > Even if we leave these arguments aside for a while, the question remains
> as
> > to what should be the constant factor 10% or 20% or 50% or 100% or
> something
> > else on top of the estimate for simple foreign table scan estimates (or
> > NlogN of that)? I am unable to justify any of these factors myself. What
> do
> > you say?
>
> I think we want to estimate the cost in such a way that we'll tend to
> pick the ordered path if it's useful, but skip it if it's not.  So,
> say we pick 10%.  That's definitely enough that we won't pick a remote
> sort when it's useless, but it's small enough that if a remote sort is
> useful, we will probably choose to do it.  I think that's what we
> want.  I believe we should err on the side of a small estimate because
> it's generally better to do as much work as possible on the remote
> side.  In some cases the sort may turn out to be free at execution
> time because the remote server was going to generate the results in
> that order anyway, and it may know that because of its own pathkeys,
> and thus be able to skip the explicit ordering step.
>

The patch uses a factor of 1.1 (10% increase) to multiple the startup and
total costs in fpinfo for unsorted data.

This change has caused the plans for few queries in the test postgres_fdw
to change. There is ORDER BY and LIMIT clause in the queries in
postgres_fdw testcase to keep test outputs sane and consistent. These ORDER
BY clauses are not being pushed down the foreign servers. I tried using
values upto 2 for this but still the foreign paths with pathkeys won over
those without pathkeys.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 697de60..25d8650 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -186,23 +186,26 @@ is_foreign_expr(PlannerInfo *root,
 	 

Re: [HACKERS] Getting sorted data from foreign server

2015-10-13 Thread Jeevan Chalke
> On Thu, Oct 8, 2015 at 9:39 PM, Robert Haas  wrote:
>
>>
>> In the interest of full disclosure, I asked Ashutosh to work on this
>> patch and have discussed the design with him several times.  I believe
>> that this is a good direction for PostgreSQL to be going.  It's
>> trivially easy right now to write a query against an FDW that performs
>> needlessly easy, because a join, or a sort, or an aggregate is
>> performed on the local server rather than the remote one.   I, and
>> EnterpriseDB, want that to get fixed.  However, we also want it to get
>> fixed in the best possible way, and not to do anything unless there is
>> consensus on it.  So, if anyone has opinions on this topic, please
>> jump in.
>>
>

Are we planning to push sorting on foreign server with parametrized
foreign path?

We might get a parametrized path when local table is small enough and
foreign table is bigger, like, for this query
SELECT big_ft.x FROM big_ft, small_lt WHERE big_ft.x = small_lt.y;
we might end up getting parametrized foreign path with remote query
like:
SELECT big_ft.x FROM big_ft WHERE big_ft.x = $1;

And with this, if we have an ORDER BY clause like "ORDER BY big_ft.x"
we will get remote query like:
SELECT big_ft.x FROM big_ft WHERE big_ft.x = $1 ORDER BY big_ft.x;

Is this possible???

If yes, then don't we need to sort again on local server?

Assume each row on local server matches three rows from foreign table,
then for each $1, we will have 3 rows returned from the foreign server,
of-course sorted. But then all these fetched rows in batch of 3, needs
to be re-sorted on local server, isn't it?
If yes, this will be much more costly than fetching unsorted rows and
sorting then locally only once.

Or am I missing something here?

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Getting sorted data from foreign server

2015-10-13 Thread Ashutosh Bapat
On Thu, Oct 8, 2015 at 9:39 PM, Robert Haas  wrote:

> On Tue, Oct 6, 2015 at 6:46 AM, Ashutosh Bapat
>  wrote:
> > standard_qp_callback() sets root->query_pathkeys to pathkeys on which the
> > result of query_planner are expected be sorted upon (see the function for
> > more details). The patch checks if any prefix of these pathkeys can be
> > entirely evaluated using the foreign relation and at the foreign server
> > under consideration. If yes, it gets estimates of costs involved and adds
> > paths with those pathkeys. There can be multiple pathkeyless paths added
> for
> > a given base relation. For every such path one path with pathkeys is
> added.
> > If there is an index matching on the foreign server, getting the data
> sorted
> > from foreign server improves execution time as seen from the results. The
> > patch adds this functionality entirely in postgres_fdw.
>
> In the interest of full disclosure, I asked Ashutosh to work on this
> patch and have discussed the design with him several times.  I believe
> that this is a good direction for PostgreSQL to be going.  It's
> trivially easy right now to write a query against an FDW that performs
> needlessly easy, because a join, or a sort, or an aggregate is
> performed on the local server rather than the remote one.   I, and
> EnterpriseDB, want that to get fixed.  However, we also want it to get
> fixed in the best possible way, and not to do anything unless there is
> consensus on it.  So, if anyone has opinions on this topic, please
> jump in.
>
> That having been said, here are some review comments on this patch:
>
> - You consider pushing down ORDER BY if any prefix of the query
> pathkeys satisfy is_foreign_expr(), but that doesn't seem right to me.
> If the user says SELECT * FROM remotetab ORDER BY a, unsafe(a),
> ordering the result set by a doesn't help us much.  We've talked a few
> times about an incremental sort capability that would take a stream of
> tuples already ordered by one or more columns and sort each group by
> additional columns, but I don't think we have that currently.  Without
> that capability, I don't think there's much benefit in sorting by a
> prefix of the pathkeys.  I suspect that if we can't get them all, it's
> not worth doing.
>

I somehow thought, we are using output, which is ordered by prefix of
pathkeys in Sort nodes. But as you rightly pointed out that's not the case.
Only complete pathkeys are useful.


>
> - Right now, you have this code below the point where we bail out if
> use_remote_estimate is not set.  If we keep it like that, the comment
> needs updating.  But I suggest that we consider an ordered path even
> if we are not using remote estimates.  Estimate the sorted path to
> cost somewhat more than the unsorted path, so that we only choose that
> path if the sort actually benefits us.  I don't know exactly how to
> come up with a principled estimate given that we don't actually know
> whether the remote side will need an extra sort or not, but maybe a
> dumb estimate is still better than not trying a sorted path.
>

I like that idea, although there are two questions
1. How can we estimate cost of getting the data sorted? If there is an
appropriate index on foreign server we can get the data sorted at no extra
cost. If there isn't the cost of sorting is proportionate to NlogN where N
is the size of data. It seems unreasonable to arrive at the cost of sorting
by multiplying with some constant multiplier. Also, the constant multiplier
to the NlogN estimate depends heavily upon the properties of foreign server
e.g. size of memory available for sorting, disk and CPU speed etc. The last
two might have got factored into fdw_tuple_cost and fdw_startup_cost, so
that's probably taken care of. If the estimate we come up turns out to be
too pessimistic, we will not get sorted data even if that's the right thing
to do. If too optimistic, we will incur heavy cost at the time of
execution. Setting the cost estimate to some constant factor of NlogN would
be too pessimistic if there is an appropriate index on foreign server.
Otherway round if there isn't an appropriate index on foreign server.

Even if we leave these arguments aside for a while, the question remains as
to what should be the constant factor 10% or 20% or 50% or 100% or
something else on top of the estimate for simple foreign table scan
estimates (or NlogN of that)? I am unable to justify any of these factors
myself. What do you say?

Given that we save on the local resources required for sorting if we get
the data sorted from the foreign server, it might be better to always get
it sorted from the foreign server, but our costing model doesn't account
for that benefit today.

We might be able to do a good job there if we know more things about the
foreign server/table e.g. indexes on the foreign table, memory available
for sorting etc. that leads to your next comment.


>
> - In the long run, we 

Re: [HACKERS] Getting sorted data from foreign server

2015-10-13 Thread Ashutosh Bapat
On Tue, Oct 13, 2015 at 1:48 PM, Jeevan Chalke <
jeevan.cha...@enterprisedb.com> wrote:

>
> On Thu, Oct 8, 2015 at 9:39 PM, Robert Haas  wrote:
>>
>>>
>>> In the interest of full disclosure, I asked Ashutosh to work on this
>>> patch and have discussed the design with him several times.  I believe
>>> that this is a good direction for PostgreSQL to be going.  It's
>>> trivially easy right now to write a query against an FDW that performs
>>> needlessly easy, because a join, or a sort, or an aggregate is
>>> performed on the local server rather than the remote one.   I, and
>>> EnterpriseDB, want that to get fixed.  However, we also want it to get
>>> fixed in the best possible way, and not to do anything unless there is
>>> consensus on it.  So, if anyone has opinions on this topic, please
>>> jump in.
>>>
>>
>
> Are we planning to push sorting on foreign server with parametrized
> foreign path?
>
> We might get a parametrized path when local table is small enough and
> foreign table is bigger, like, for this query
> SELECT big_ft.x FROM big_ft, small_lt WHERE big_ft.x = small_lt.y;
> we might end up getting parametrized foreign path with remote query
> like:
> SELECT big_ft.x FROM big_ft WHERE big_ft.x = $1;
>
> And with this, if we have an ORDER BY clause like "ORDER BY big_ft.x"
> we will get remote query like:
> SELECT big_ft.x FROM big_ft WHERE big_ft.x = $1 ORDER BY big_ft.x;
>
> Is this possible???
>
> If yes, then don't we need to sort again on local server?
>
> Assume each row on local server matches three rows from foreign table,
> then for each $1, we will have 3 rows returned from the foreign server,
> of-course sorted. But then all these fetched rows in batch of 3, needs
> to be re-sorted on local server, isn't it?
> If yes, this will be much more costly than fetching unsorted rows and
> sorting then locally only once.
>
> Or am I missing something here?
>
>
Thanks a lot for the catch. Per add_path() prologue
368   * ... First, we treat all parameterized paths
 369  *as having NIL pathkeys, so that they cannot win comparisons on
the
 370  *basis of sort order.

So, anyway those paths were getting eliminated by add_path().

I will take care of this one in the next version of patch.


> --
> Jeevan B Chalke
> Principal Software Engineer, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
>


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Getting sorted data from foreign server

2015-10-13 Thread Robert Haas
On Tue, Oct 13, 2015 at 3:29 AM, Ashutosh Bapat
 wrote:
>> - You consider pushing down ORDER BY if any prefix of the query
>> pathkeys satisfy is_foreign_expr(), but that doesn't seem right to me.
>> If the user says SELECT * FROM remotetab ORDER BY a, unsafe(a),
>> ordering the result set by a doesn't help us much.  We've talked a few
>> times about an incremental sort capability that would take a stream of
>> tuples already ordered by one or more columns and sort each group by
>> additional columns, but I don't think we have that currently.  Without
>> that capability, I don't think there's much benefit in sorting by a
>> prefix of the pathkeys.  I suspect that if we can't get them all, it's
>> not worth doing.
>
> I somehow thought, we are using output, which is ordered by prefix of
> pathkeys in Sort nodes. But as you rightly pointed out that's not the case.
> Only complete pathkeys are useful.

A truncated list of pathkeys is useful for merge joins, but not for
toplevel ordering.

>> - Right now, you have this code below the point where we bail out if
>> use_remote_estimate is not set.  If we keep it like that, the comment
>> needs updating.  But I suggest that we consider an ordered path even
>> if we are not using remote estimates.  Estimate the sorted path to
>> cost somewhat more than the unsorted path, so that we only choose that
>> path if the sort actually benefits us.  I don't know exactly how to
>> come up with a principled estimate given that we don't actually know
>> whether the remote side will need an extra sort or not, but maybe a
>> dumb estimate is still better than not trying a sorted path.
>
> I like that idea, although there are two questions
> 1. How can we estimate cost of getting the data sorted? If there is an
> appropriate index on foreign server we can get the data sorted at no extra
> cost. If there isn't the cost of sorting is proportionate to NlogN where N
> is the size of data. It seems unreasonable to arrive at the cost of sorting
> by multiplying with some constant multiplier. Also, the constant multiplier
> to the NlogN estimate depends heavily upon the properties of foreign server
> e.g. size of memory available for sorting, disk and CPU speed etc. The last
> two might have got factored into fdw_tuple_cost and fdw_startup_cost, so
> that's probably taken care of. If the estimate we come up turns out to be
> too pessimistic, we will not get sorted data even if that's the right thing
> to do. If too optimistic, we will incur heavy cost at the time of execution.
> Setting the cost estimate to some constant factor of NlogN would be too
> pessimistic if there is an appropriate index on foreign server. Otherway
> round if there isn't an appropriate index on foreign server.
>
> Even if we leave these arguments aside for a while, the question remains as
> to what should be the constant factor 10% or 20% or 50% or 100% or something
> else on top of the estimate for simple foreign table scan estimates (or
> NlogN of that)? I am unable to justify any of these factors myself. What do
> you say?

I think we want to estimate the cost in such a way that we'll tend to
pick the ordered path if it's useful, but skip it if it's not.  So,
say we pick 10%.  That's definitely enough that we won't pick a remote
sort when it's useless, but it's small enough that if a remote sort is
useful, we will probably choose to do it.  I think that's what we
want.  I believe we should err on the side of a small estimate because
it's generally better to do as much work as possible on the remote
side.  In some cases the sort may turn out to be free at execution
time because the remote server was going to generate the results in
that order anyway, and it may know that because of its own pathkeys,
and thus be able to skip the explicit ordering step.

>> - In the long run, we should probably either add some configuration so
>> that the local side can make better estimates even without
>> use_remote_estimate, or maybe have a way for the FDW to keep a cache
>> of data in the system catalogs that is updated by ANALYZE.  Then,
>> analyzing a foreign table could store information locally about
>> indexes and so forth, instead of starting each query planning cycle
>> with no knowledge about the remote side.  That's not a matter for this
>> patch, I don't think, but it seems like something we should do.
>
> To an extent knowing which indexes are available on the tables on foreign
> server will help. Now, I do understand that not every foreign server will
> have indexes like PostgreSQL, but as long as whatever they have can be
> translated into a language that PostgreSQL can understand it should be fine.
> From that point of view, will it help if we have declarative indexes on
> foreign tables similar to the declarative constraints? Obviously, we will be
> burdening user with extra work of maintaining the declarative indexes in
> sync like we do for constraints. But we might ease the 

Re: [HACKERS] Getting sorted data from foreign server

2015-10-08 Thread Robert Haas
On Tue, Oct 6, 2015 at 6:46 AM, Ashutosh Bapat
 wrote:
> standard_qp_callback() sets root->query_pathkeys to pathkeys on which the
> result of query_planner are expected be sorted upon (see the function for
> more details). The patch checks if any prefix of these pathkeys can be
> entirely evaluated using the foreign relation and at the foreign server
> under consideration. If yes, it gets estimates of costs involved and adds
> paths with those pathkeys. There can be multiple pathkeyless paths added for
> a given base relation. For every such path one path with pathkeys is added.
> If there is an index matching on the foreign server, getting the data sorted
> from foreign server improves execution time as seen from the results. The
> patch adds this functionality entirely in postgres_fdw.

In the interest of full disclosure, I asked Ashutosh to work on this
patch and have discussed the design with him several times.  I believe
that this is a good direction for PostgreSQL to be going.  It's
trivially easy right now to write a query against an FDW that performs
needlessly easy, because a join, or a sort, or an aggregate is
performed on the local server rather than the remote one.   I, and
EnterpriseDB, want that to get fixed.  However, we also want it to get
fixed in the best possible way, and not to do anything unless there is
consensus on it.  So, if anyone has opinions on this topic, please
jump in.

That having been said, here are some review comments on this patch:

- You consider pushing down ORDER BY if any prefix of the query
pathkeys satisfy is_foreign_expr(), but that doesn't seem right to me.
If the user says SELECT * FROM remotetab ORDER BY a, unsafe(a),
ordering the result set by a doesn't help us much.  We've talked a few
times about an incremental sort capability that would take a stream of
tuples already ordered by one or more columns and sort each group by
additional columns, but I don't think we have that currently.  Without
that capability, I don't think there's much benefit in sorting by a
prefix of the pathkeys.  I suspect that if we can't get them all, it's
not worth doing.

- Right now, you have this code below the point where we bail out if
use_remote_estimate is not set.  If we keep it like that, the comment
needs updating.  But I suggest that we consider an ordered path even
if we are not using remote estimates.  Estimate the sorted path to
cost somewhat more than the unsorted path, so that we only choose that
path if the sort actually benefits us.  I don't know exactly how to
come up with a principled estimate given that we don't actually know
whether the remote side will need an extra sort or not, but maybe a
dumb estimate is still better than not trying a sorted path.

- In the long run, we should probably either add some configuration so
that the local side can make better estimates even without
use_remote_estimate, or maybe have a way for the FDW to keep a cache
of data in the system catalogs that is updated by ANALYZE.  Then,
analyzing a foreign table could store information locally about
indexes and so forth, instead of starting each query planning cycle
with no knowledge about the remote side.  That's not a matter for this
patch, I don't think, but it seems like something we should do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server

2015-10-08 Thread Jeremy Harris
On 08/10/15 17:09, Robert Haas wrote:
> - You consider pushing down ORDER BY if any prefix of the query
> pathkeys satisfy is_foreign_expr(), but that doesn't seem right to me.
> If the user says SELECT * FROM remotetab ORDER BY a, unsafe(a),
> ordering the result set by a doesn't help us much.  We've talked a few
> times about an incremental sort capability that would take a stream of
> tuples already ordered by one or more columns and sort each group by
> additional columns, but I don't think we have that currently.  Without
> that capability, I don't think there's much benefit in sorting by a
> prefix of the pathkeys.  I suspect that if we can't get them all, it's
> not worth doing.

That depends how often the additional columns affect the sorted
order, if the sort method takes advantage of sorted input.
-- 
Jeremy



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server

2015-10-08 Thread Robert Haas
On Thu, Oct 8, 2015 at 3:04 PM, Jeremy Harris  wrote:
> That depends how often the additional columns affect the sorted
> order, if the sort method takes advantage of sorted input.

What I'm saying is - ours doesn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers