On Mon, Feb 7, 2011 at 16:01, Shigeru HANADA <han...@metrosystems.co.jp> wrote:
> This patch is based on latest FDW API patches which are posted in
> another thread "SQL/MED FDW API", and copy_export-20110104.patch which
> was posted by Itagaki-san.

I have questions about estimate_costs().

* What value does baserel->tuples have?
  Foreign tables are never analyzed for now, so is the number reliable?
  (A rough alternative is sketched after the quoted code below.)

* Your previous measurement showed that the scan actually has a much higher
  startup cost. When you removed ReScan, it took a long time, but the planner
  still didn't choose a materialized plan. That might be because the estimated
  startup cost is too low.

* Why do you use lstat() in it?
  Even if the file is a symlink, the subsequent COPY will read the linked
  file, so I think it should be stat() rather than lstat()
  (see the sketch after this list).

+estimate_costs(const char *filename, RelOptInfo *baserel,
+              double *startup_cost, double *total_cost)
+{
...
+   /* get size of the file */
+   if (lstat(filename, &stat) == -1)
+   {
+       ereport(ERROR,
+               (errcode_for_file_access(),
+                errmsg("could not stat file \"%s\": %m", filename)));
+   }
+
+   /*
+    * The way to estimate costs is almost same as cost_seqscan(), but there
+    * are some differences:
+    * - DISK costs are estimated from file size.
+    * - CPU costs are 10x of seq scan, for overhead of parsing records.
+    */
+   pages = stat.st_size / BLCKSZ + (stat.st_size % BLCKSZ > 0 ? 1 : 0);
+   run_cost += seq_page_cost * pages;
+
+   *startup_cost += baserel->baserestrictcost.startup;
+   cpu_per_tuple = cpu_tuple_cost + baserel->baserestrictcost.per_tuple;
+   run_cost += cpu_per_tuple * 10 * baserel->tuples;
+   *total_cost = *startup_cost + run_cost;
+
+   return stat.st_size;
+}
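For the first point, if baserel->tuples can't be trusted until foreign tables
can be analyzed, one possible direction (only a sketch; the 100-byte average
tuple width is a placeholder I made up, not anything in the patch) would be to
derive the row estimate from the file size itself:

    double  ntuples;

    /* guess the row count from the file size; 100 is a placeholder width */
    ntuples = clamp_row_est((double) stat.st_size / 100.0);

    run_cost += cpu_per_tuple * 10 * ntuples;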

-- 
Itagaki Takahiro
