(2011/12/15 11:30), Etsuro Fujita wrote:
> (2011/12/14 15:34), Shigeru Hanada wrote:
>> I think this patch could be marked as "Ready for committer" with some
>> minor fixes.  Please find attached a revised patch (v6.1).

I've tried to make pgsql_fdw work with this feature, and found that few
static functions to be needed to exported to implement ANALYZE handler
in short-cut style.  The "Short-cut style" means the way to generate
statistics (pg_class and pg_statistic) for foreign tables without
retrieving sample data from foreign server.

Attached patch (export_funcs.patch) exports examine_attribute and
update_attstats which are necessary to implement ANALYZE handler for
pgsql_fdw.  In addition to exporting, update_attstats is also renamed to
vac_update_attstats to fit with already exported function
vac_update_relstats.

I also attached archive of WIP pgsql_fdw with ANALYZE support.  This
version has better estimation than original pgsql_fdw, because it can
use selectivity of qualifiers evaluated on local side to estimate number
of result rows.  To show the effect of ANALYZE clearly, WHERE push-down
feature is disabled.  Please see pgsqlAnalyzeForeignTable and
store_remote_stats in pgsql_fdw.c.

I used pgbench_accounts tables with 30000 records, and got reasonable
rows estimation for queries below.

<on remote side>
postgres=# UPDATE pgbench_accounts SET filler = NULL
postgres-# WHERE aid % 3 = 0;
postgres=# ANALYZE;

<on local side>
postgres=# ANALYZE pgbench_accounts;  -- needs explicit table name
postgres=# EXPLAIN SELECT * FROM pgbench_accounts WHERE filler IS NULL;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Foreign Scan on pgbench_accounts  (cost=100.00..40610.00 rows=100030
width=97)
   Filter: (filler IS NULL)
   Remote SQL: DECLARE pgsql_fdw_cursor_13 SCROLL CURSOR FOR SELECT aid,
bid, abalance, filler FROM public.pgbench_accounts
(3 rows)

postgres=# EXPLAIN SELECT * FROM pgbench_accounts WHERE aid < 100;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Foreign Scan on pgbench_accounts  (cost=100.00..40610.00 rows=96 width=97)
   Filter: (aid < 100)
   Remote SQL: DECLARE pgsql_fdw_cursor_14 SCROLL CURSOR FOR SELECT aid,
bid, abalance, filler FROM public.pgbench_accounts
(3 rows)

postgres=# EXPLAIN SELECT * FROM pgbench_accounts WHERE aid < 1000;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Foreign Scan on pgbench_accounts  (cost=100.00..40610.00 rows=1004
width=97)
   Filter: (aid < 1000)
   Remote SQL: DECLARE pgsql_fdw_cursor_15 SCROLL CURSOR FOR SELECT aid,
bid, abalance, filler FROM public.pgbench_accounts
(3 rows)

In implementing ANALYZE handler, hardest part was copying anyarray
values from remote to local.  If we can make it common in core, it would
help FDW authors who want to implement ANALYZE handler without
retrieving sample rows from remote server.

Regards,
-- 
Shigeru Hanada
commit bb28cb5a69aae3bd9c7fbebc8b9483d23711bec4
Author: Shigeru Hanada <shigeru.han...@gmail.com>
Date:   Thu Feb 9 16:06:14 2012 +0900

    Export functions which are useful for FDW analyze support.
    
    Export examine_attribute and update_attstats (with renaming to
    vac_update_attstats) which are useful (and nealy required) to implement
    short-cut version of ANALYZE handler in FDWs.

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 6a22d49..d0a323a 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -94,8 +94,6 @@ static void compute_index_stats(Relation onerel, double 
totalrows,
                                        AnlIndexData *indexdata, int nindexes,
                                        HeapTuple *rows, int numrows,
                                        MemoryContext col_context);
-static VacAttrStats *examine_attribute(Relation onerel, int attnum,
-                                 Node *index_expr);
 static int acquire_sample_rows(Relation onerel,
                                                           HeapTuple *rows, int 
targrows,
                                                           double *totalrows, 
double *totaldeadrows,
@@ -105,8 +103,6 @@ static int acquire_inherited_sample_rows(Relation onerel,
                                                          double *totalrows, 
double *totaldeadrows,
                                                          BlockNumber 
*totalpages, int elevel);
 static int     compare_rows(const void *a, const void *b);
-static void update_attstats(Oid relid, bool inh,
-                               int natts, VacAttrStats **vacattrstats);
 static Datum std_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
 static Datum ind_fetch_func(VacAttrStatsP stats, int rownum, bool *isNull);
 
@@ -215,9 +211,9 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, 
BufferAccessStrategy bstrategy)
        }
 
        /*
-        * We can ANALYZE any table except pg_statistic.  See update_attstats.  
In
-        * addition, we can ANALYZE foreign tables if AnalyzeForeignTable 
callback
-        * routines of underlying foreign-data wrappers are implemented.
+        * We can ANALYZE any table except pg_statistic.  See 
vac_update_attstats.
+        * In addition, we can ANALYZE foreign tables if AnalyzeForeignTable
+        * callback routines of underlying foreign-data wrappers are 
implemented.
         */
        if (RelationGetRelid(onerel) == StatisticRelationId)
        {
@@ -283,7 +279,7 @@ analyze_rel(Oid relid, VacuumStmt *vacstmt, 
BufferAccessStrategy bstrategy)
         * Close source relation now, but keep lock so that no one deletes it
         * before we commit.  (If someone did, they'd fail to clean up the 
entries
         * we made in pg_statistic.  Also, releasing the lock before commit 
would
-        * expose us to concurrent-update failures in update_attstats.)
+        * expose us to concurrent-update failures in vac_update_attstats.)
         */
        relation_close(onerel, NoLock);
 
@@ -551,15 +547,15 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt, int 
elevel,
                 * previous statistics for the target columns.  (If there are 
stats in
                 * pg_statistic for columns we didn't process, we leave them 
alone.)
                 */
-               update_attstats(RelationGetRelid(onerel), inh,
-                                               attr_cnt, vacattrstats);
+               vac_update_attstats(RelationGetRelid(onerel), inh,
+                                                       attr_cnt, vacattrstats);
 
                for (ind = 0; ind < nindexes; ind++)
                {
                        AnlIndexData *thisdata = &indexdata[ind];
 
-                       update_attstats(RelationGetRelid(Irel[ind]), false,
-                                                       thisdata->attr_cnt, 
thisdata->vacattrstats);
+                       vac_update_attstats(RelationGetRelid(Irel[ind]), false,
+                                                               
thisdata->attr_cnt, thisdata->vacattrstats);
                }
        }
 
@@ -842,7 +838,7 @@ compute_index_stats(Relation onerel, double totalrows,
  * If index_expr isn't NULL, then we're trying to analyze an expression index,
  * and index_expr is the expression tree representing the column's data.
  */
-static VacAttrStats *
+VacAttrStats *
 examine_attribute(Relation onerel, int attnum, Node *index_expr)
 {
        Form_pg_attribute attr = onerel->rd_att->attrs[attnum - 1];
@@ -1583,7 +1579,7 @@ acquire_inherited_sample_rows(Relation onerel, HeapTuple 
*rows, int targrows,
 
 
 /*
- *     update_attstats() -- update attribute statistics for one relation
+ *     vac_update_attstats() -- update attribute statistics for one relation
  *
  *             Statistics are stored in several places: the pg_class row for 
the
  *             relation has stats about the whole relation, and there is a
@@ -1604,8 +1600,8 @@ acquire_inherited_sample_rows(Relation onerel, HeapTuple 
*rows, int targrows,
  *             ANALYZE the same table concurrently.  Presently, we lock that 
out
  *             by taking a self-exclusive lock on the relation in 
analyze_rel().
  */
-static void
-update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
+void
+vac_update_attstats(Oid relid, bool inh, int natts, VacAttrStats 
**vacattrstats)
 {
        Relation        sd;
        int                     attno;
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1530970..b165953 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -152,6 +152,10 @@ extern void vac_update_relstats(Relation relation,
                                        BlockNumber num_all_visible_pages,
                                        bool hasindex,
                                        TransactionId frozenxid);
+extern void vac_update_attstats(Oid relid,
+                                                               bool inh,
+                                                               int natts,
+                                                               VacAttrStats 
**vacattrstats);
 extern void vacuum_set_xid_limits(int freeze_min_age, int freeze_table_age,
                                          bool sharedRel,
                                          TransactionId *oldestXmin,
@@ -177,6 +181,8 @@ extern void do_analyze_rel(Relation onerel, VacuumStmt 
*vacstmt, int elevel,
 extern double random_fract(void);
 extern double init_selection_state(int n);
 extern double get_next_S(double t, int n, double *stateptr);
+extern VacAttrStats *examine_attribute(Relation onerel, int attnum,
+                                                                          Node 
*index_expr);
 
 
 #endif   /* VACUUM_H */

Attachment: pgsql_fdw.tar.gz
Description: application/gzip

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to