We are having an issue with a query that returns no results when the planner chooses a merge join against a foreign table, but (correctly) returns results when it uses a hash join.
Here is the situation on the "remote" database (9.5):

# \d+ table_with_en_us_utf8_encoding
            Table "public.table_with_en_us_utf8_encoding"
 Column |          Type          | Modifiers | Storage  | Stats target | Description
--------+------------------------+-----------+----------+--------------+-------------
 id     | bigint                 |           | plain    |              |
 str1   | character varying(255) |           | extended |              |
 str2   | character varying(255) |           | extended |              |
 str3   | character varying(255) |           | extended |              |
 str4   | character varying(3)   |           | extended |              |

analytics=# select encoding, datcollate, datctype from pg_database where datname = current_database();
 encoding | datcollate  |  datctype
----------+-------------+-------------
        6 | en_US.UTF-8 | en_US.UTF-8

And here's what we do on the local side (9.6):

# select encoding, datcollate, datctype from pg_database where datname = current_database();
 encoding | datcollate | datctype
----------+------------+----------
        6 | C          | C

# import foreign schema public limit to (table_with_en_us_utf8_encoding) from server primary_replica into public;

# \d+ table_with_en_us_utf8_encoding
                  Foreign table "public.table_with_en_us_utf8_encoding"
 Column |          Type          | Collation | Nullable | Default |     FDW options      | Storage  | Stats target | Description
--------+------------------------+-----------+----------+---------+----------------------+----------+--------------+-------------
 id     | bigint                 |           |          |         | (column_name 'id')   | plain    |              |
 str1   | character varying(255) |           |          |         | (column_name 'str1') | extended |              |
 str2   | character varying(255) |           |          |         | (column_name 'str2') | extended |              |
 str3   | character varying(255) |           |          |         | (column_name 'str3') | extended |              |
 str4   | character varying(3)   |           |          |         | (column_name 'str4') | extended |              |
Server: primary_replica
FDW options: (schema_name 'public', table_name 'table_with_en_us_utf8_encoding')

# create temporary table tmp_on_c_collated_foreign_server (str2 text);
# insert into tmp_on_c_collated_foreign_server (str2) values ('576228972');
# insert into tmp_on_c_collated_foreign_server (str2) values ('576228972');
# insert into tmp_on_c_collated_foreign_server (str2) values ('576228972');

--
-- query with merge join, returns zero rows
--
# explain (analyze, verbose)
  select e.str1, e.str2, e.str3
  from tmp_on_c_collated_foreign_server c
  left join table_with_en_us_utf8_encoding e on c.str2 = e.str2
  where e.str4 = '2';
                                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=18041.88..22322.92 rows=229221 width=1548) (actual time=102.849..102.849 *rows=0* loops=1)
   Output: e.str1, e.str2, e.str3
   Merge Cond: ((e.str2)::text = c.str2)
   ->  Foreign Scan on public.table_with_en_us_utf8_encoding e  (cost=17947.50..18705.95 rows=33709 width=93) (actual time=102.815..102.815 rows=1 loops=1)
         Output: e.id, e.str1, e.str2, e.str3, e.str4
         Remote SQL: *SELECT str1, str2, str3 FROM public.table_with_en_us_utf8_encoding WHERE ((str4 = '2'::text)) ORDER BY str2 ASC NULLS LAST*
   ->  Sort  (cost=94.38..97.78 rows=1360 width=32) (actual time=0.028..0.029 rows=7 loops=1)
         Output: c.str2
         Sort Key: c.str2
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on pg_temp_3.tmp_on_c_collated_foreign_server c  (cost=0.00..23.60 rows=1360 width=32) (actual time=0.010..0.011 rows=7 loops=1)
               Output: c.str2
 Planning time: 4.285 ms
 Execution time: 104.458 ms
(14 rows)

--
-- query with hash join, returns rows
--
-- the default for the foreign server is to use remote estimates, so we turn that off...
# alter foreign table table_with_en_us_utf8_encoding OPTIONS (ADD use_remote_estimate 'false');
ALTER FOREIGN TABLE

-- and then run the same query again
# explain (analyze, verbose)
  select e.str1, e.str2, e.str3
  from tmp_on_c_collated_foreign_server c
  left join table_with_en_us_utf8_encoding e on c.str2 = e.str2
  where e.str4 = '2';
                                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=110.68..139.45 rows=7 width=1548) (actual time=154.280..154.286 *rows=7* loops=1)
   Output: e.str1, e.str2, e.str3
   Hash Cond: (c.str2 = (e.str2)::text)
   ->  Seq Scan on pg_temp_3.tmp_on_c_collated_foreign_server c  (cost=0.00..23.60 rows=1360 width=32) (actual time=0.006..0.008 rows=7 loops=1)
         Output: c.str2
   ->  Hash  (cost=110.67..110.67 rows=1 width=1548) (actual time=154.264..154.264 rows=33418 loops=1)
         Output: e.str1, e.str2, e.str3
         Buckets: 65536 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 4003kB
         ->  Foreign Scan on public.table_with_en_us_utf8_encoding e  (cost=100.00..110.67 rows=1 width=1548) (actual time=8.289..144.210 rows=33418 loops=1)
               Output: e.str1, e.str2, e.str3
               Remote SQL: *SELECT str1, str2, str3 FROM public.table_with_en_us_utf8_encoding WHERE ((str4 = '2'::text))*
 Planning time: 0.153 ms
 Execution time: 156.557 ms
(13 rows)

So we get different answers depending on whether the planner chooses a merge join (which pushes an ORDER BY clause down to the remote server) or a hash join (no ORDER BY).
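To make the failure mode concrete: a merge join can only advance its two input streams correctly if both are sorted under the same ordering it uses to compare keys. In the plan above, the pushed-down ORDER BY is evaluated under the remote database's en_US.UTF-8 collation, while the local side sorts and compares under C, so the merge can walk past a matching key without ever pairing it. Here is a minimal Python sketch of that effect (not PostgreSQL's actual code; the key values and comparators are invented for illustration):

```python
def merge_join(left, right, cmp):
    """Equality-join two lists that are assumed to be sorted under cmp."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        c = cmp(left[i], right[j])
        if c < 0:
            i += 1          # left key sorts first: assume no partner exists, skip it
        elif c > 0:
            j += 1          # right key sorts first: assume no partner exists, skip it
        else:
            out.append((left[i], right[j]))
            i += 1
            j += 1
    return out

def bytewise(a, b):
    """C-collation-style comparison: plain code-point order ('B' < 'a')."""
    return (a > b) - (a < b)

keys = ["apple", "Banana", "cherry"]

local = sorted(keys)                          # C-style order: ['Banana', 'apple', 'cherry']
remote_en_us = ["apple", "Banana", "cherry"]  # en_US-like order: 'a' < 'B' < 'c'

# Remote stream arrives sorted under a different collation than the
# comparator assumes: 'Banana' slips past and the match is silently lost.
bad = merge_join(local, remote_en_us, bytewise)     # 2 pairs, 'Banana' missing

# Both streams sorted under the comparator's own order: all keys pair up.
good = merge_join(local, sorted(remote_en_us), bytewise)  # 3 pairs
```

The hash join is immune because it only tests equality and never relies on sort order. As far as I can tell the usual remedy is the one the postgres_fdw documentation's collation caveats point at: make the foreign table's declared column collations match what the remote server actually uses, rather than letting them default to the local database's C collation.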