Last year I was working on a patch to postgres_fdw where the fetch_size could be set at the table level and the server level.
I was able to get the settings parsed, and they would show up in pg_foreign_table and pg_foreign_server. Unfortunately, I'm not very familiar with how foreign data wrappers work, so I wasn't able to figure out how to get these custom values passed from the PgFdwRelationInfo struct into the query's PgFdwScanState struct.

I bring this up only because it might be a simpler solution, in that the table designer could set the fetch size very high for narrow tables, and lower or default for wider tables. It's also a very clean syntax: just another option on the table and/or server creation. My incomplete patch is attached.

On Tue, Jan 27, 2015 at 4:24 AM, Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> Thank you for the comment.
>
> The automatic way to determine the fetch_size looks to be too much for
> the purpose. An example of a non-automatic way is a new foreign table
> option like 'fetch_size', but this exposes the internals too much...
> Which do you think is preferable?
>
> Thu, 22 Jan 2015 11:17:52 -0500, Tom Lane <t...@sss.pgh.pa.us> wrote in <24503.1421943...@sss.pgh.pa.us>
> > Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> writes:
> > > Hello, as in the discussion on async fetching in postgres_fdw, FETCH
> > > with a data-size limitation would be useful for keeping the memory
> > > usage of postgres_fdw stable.
> > >
> > > Could such a feature and syntax be allowed to be added?
> >
> > This seems like a lot of work, and frankly an incredibly ugly API,
> > for a benefit that is entirely hypothetical.  Have you got numbers
> > showing any actual performance win for postgres_fdw?
>
> The API is a rush job to make a path for the new parameter (but, yes, I
> did too much for the purpose of use from postgres_fdw...), and it could
> be any saner syntax, but it's not the time for that yet.
>
> The data-size limitation, at any limit size, would give a significant
> gain, especially for small rows.
>
> This patch began from the fact that it runs about twice as fast when
> fetch_size = 10000 than when it is 100.
>
> http://www.postgresql.org/message-id/20150116.171849.109146500.horiguchi.kyot...@lab.ntt.co.jp
>
> I took execution times to fetch 1M rows from localhost via postgres_fdw,
> which showed the following numbers.
>
> =# SELECT a from ft1;
>  fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1)
>  (local)                       0.75s
>  100         60                6.2s      6000 (0.006)
>  10000       60                2.7s    600000 (0.6  )
>  33333       60                2.2s   1999980 (2.0  )
>  66666       60                2.4s   3999960 (4.0  )
>
> =# SELECT a, b, c from ft1;
>  fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1)
>  (local)                       0.8s
>  100         204               12 s     20400 (0.02 )
>  1000        204               10 s    204000 (0.2  )
>  10000       204               5.8s   2040000 (2    )
>  20000       204               5.9s   4080000 (4    )
>
> =# SELECT a, b, d from ft1;
>  fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1)
>  (local)                       0.8s
>  100         1356              17 s    135600 (0.136)
>  1000        1356              15 s   1356000 (1.356)
>  1475        1356              13 s   2000100 (2.0  )
>  2950        1356              13 s   4000200 (4.0  )
>
> The definitions of the environment are the following.
>
> CREATE SERVER sv1 FOREIGN DATA WRAPPER postgres_fdw
>   OPTIONS (host 'localhost', dbname 'postgres');
> CREATE USER MAPPING FOR PUBLIC SERVER sv1;
> CREATE TABLE lt1 (a int, b timestamp, c text, d text);
> CREATE FOREIGN TABLE ft1 (a int, b timestamp, c text, d text)
>   SERVER sv1 OPTIONS (table_name 'lt1');
> INSERT INTO lt1 (SELECT a, now(), repeat('x', 128), repeat('x', 1280)
>   FROM generate_series(0, 999999) a);
>
> The "avg row size" is alloced_mem/fetch_size, where alloced_mem is the
> sum of HeapTuple[fetch_size] and (HEAPTUPLESIZE + tup->t_len) for all
> stored tuples on the receiver side, in fetch_more_data() in postgres_fdw.
>
> The gain is about 50% for the smaller tuple size and 25% for the larger.
> They look to be optimal where alloced_mem is around 2MB, for reasons
> unknown to me. Anyway, the difference seems to be significant.
>
> > Even if we wanted to do something like this, I strongly object to
> > measuring size by heap_compute_data_size.  That's not a number that
> > users would normally have any direct knowledge of; nor does it have
> > anything at all to do with the claimed use-case, where what you'd
> > really need to measure is bytes transmitted down the wire.  (The
> > difference is not small: for instance, toasted values would likely
> > still be toasted at the point where you're measuring.)
>
> Sure. Finally, the attached patch #1 does the following things.
>
> - The sender limits the number of tuples using the sum of the net
>   lengths of the column values to be sent, not including protocol
>   overhead. This is calculated in the added function
>   slot_compute_attr_size(), using the raw length for compressed values.
>
> - postgres_fdw calculates the fetch limit in bytes by the following
>   formula:
>
>   MAX_FETCH_MEM - MAX_FETCH_SIZE * (estimated overhead per tuple);
>
> The result of the patch is as follows, with MAX_FETCH_MEM = 2MiB and
> MAX_FETCH_SIZE = 30000.
>
>  fetch_size, avg row size(*1), time, max alloced_mem/fetch(Mbytes)
>  (auto)      60                2.4s  1080000 ( 1.08)
>  (auto)      204               7.3s   536400 ( 0.54)
>  (auto)      1356              15 s   430236 ( 0.43)
>
> This is meaningfully fast, but the patch looks too big, and the meaning
> of the new parameter is hard to understand... :(
>
> On the other hand, the cause of the variation in alloced_mem shown above
> is the per-tuple overhead, the sum of which is unknown before execution.
> The second patch makes FETCH accept the tuple overhead in bytes. The
> result seems pretty good, but I think this might be too specialized to
> this usage.
>
> MAX_FETCH_SIZE = 30000 and MAX_FETCH_MEM = 2MiB:
>
>  max_fetch_size, avg row size(*1), time, max alloced_mem/fetch(MiBytes)
>  30000           60                2.3s  1080000 ( 1.0)
>  9932            204               5.7s  1787760 ( 1.7)
>  1376            1356              13 s  1847484 ( 1.8)
>
> MAX_FETCH_SIZE = 25000 and MAX_FETCH_MEM = 1MiB:
>
>  max_fetch_size, avg row size(*1), time, max alloced_mem/fetch(MiBytes)
>  25000           60                2.4s   900000 ( 0.86)
>  4358            204               6.6s   816840 ( 0.78)
>  634             1356              16 s   844488 ( 0.81)
>
> MAX_FETCH_SIZE = 10000 and MAX_FETCH_MEM = 0.5MiB:
>
>  max_fetch_size, avg row size(*1), time, max alloced_mem/fetch(MiBytes)
>  10000           60                2.8s   360000 ( 0.35)
>  2376            204               7.8s   427680 ( 0.41)
>  332             1356              17 s   442224 ( 0.42)
>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index e1d4c47..0d3426f 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -113,6 +113,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS)
 			(void) defGetBoolean(def);
 		}
 		else if (strcmp(def->defname, "fdw_startup_cost") == 0 ||
+				 strcmp(def->defname, "fetch_size") == 0 ||
 				 strcmp(def->defname, "fdw_tuple_cost") == 0)
 		{
 			/* these must have a non-negative numeric value */
@@ -155,6 +156,9 @@ InitPgFdwOptions(void)
 		/* updatable is available on both server and table */
 		{"updatable", ForeignServerRelationId, false},
 		{"updatable", ForeignTableRelationId, false},
+		/* fetch_size is available on both server and table */
+		{"fetch_size", ForeignServerRelationId, false},
+		{"fetch_size", ForeignTableRelationId, false},
 		{NULL, InvalidOid, false}
 	};
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 5de1835..2729ba3 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -46,6 +46,9 @@ PG_MODULE_MAGIC;
 /* Default CPU cost to process 1 row (above and beyond cpu_tuple_cost). */
 #define DEFAULT_FDW_TUPLE_COST	0.01
 
+/* default fetch size */
+#define DEFAULT_FETCH_SIZE 100
+
 /*
  * FDW-specific planner information kept in RelOptInfo.fdw_private for a
  * foreign table.  This information is collected by postgresGetForeignRelSize.
@@ -73,6 +76,7 @@ typedef struct PgFdwRelationInfo
 	bool		use_remote_estimate;
 	Cost		fdw_startup_cost;
 	Cost		fdw_tuple_cost;
+	int			fetch_size;
 
 	/* Cached catalog information. */
 	ForeignTable *table;
@@ -156,6 +160,9 @@ typedef struct PgFdwScanState
 	/* working memory contexts */
 	MemoryContext batch_cxt;	/* context holding current batch of tuples */
 	MemoryContext temp_cxt;		/* context for per-tuple temporary data */
+
+	/* fetch size */
+	int			fetch_size;		/* how many rows to get per fetch */
 } PgFdwScanState;
 
 /*
@@ -395,12 +402,13 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	fpinfo->server = GetForeignServer(fpinfo->table->serverid);
 
 	/*
-	 * Extract user-settable option values.  Note that per-table setting of
+	 * Extract user-settable option values.  Note that per-table settings of
 	 * use_remote_estimate overrides per-server setting.
 	 */
 	fpinfo->use_remote_estimate = false;
 	fpinfo->fdw_startup_cost = DEFAULT_FDW_STARTUP_COST;
 	fpinfo->fdw_tuple_cost = DEFAULT_FDW_TUPLE_COST;
+	fpinfo->fetch_size = DEFAULT_FETCH_SIZE;
 
 	foreach(lc, fpinfo->server->options)
 	{
@@ -412,16 +420,17 @@ postgresGetForeignRelSize(PlannerInfo *root,
 			fpinfo->fdw_startup_cost = strtod(defGetString(def), NULL);
 		else if (strcmp(def->defname, "fdw_tuple_cost") == 0)
 			fpinfo->fdw_tuple_cost = strtod(defGetString(def), NULL);
+		else if (strcmp(def->defname, "fetch_size") == 0)
+			fpinfo->fetch_size = strtod(defGetString(def), NULL);
 	}
 
 	foreach(lc, fpinfo->table->options)
 	{
 		DefElem    *def = (DefElem *) lfirst(lc);
 
 		if (strcmp(def->defname, "use_remote_estimate") == 0)
-		{
 			fpinfo->use_remote_estimate = defGetBoolean(def);
-			break;				/* only need the one value */
-		}
+		else if (strcmp(def->defname, "fetch_size") == 0)
+			fpinfo->fetch_size = strtod(defGetString(def), NULL);
 	}
 
 	/*
@@ -979,6 +988,14 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 		fsstate->param_values = (const char **) palloc0(numParams * sizeof(char *));
 	else
 		fsstate->param_values = NULL;
+
+
+
+	ForeignTable *table;
+	ForeignServer *server;
+
+	fsstate->fetch_size =
+		fpinfo->fetch_size = DEFAULT_FETCH_SIZE;
 }
 
 /*