Re: [HACKERS] FDW for PostgreSQL

2012-11-06 Thread

On 2012/11/07, at 1:35, Tom Lane t...@sss.pgh.pa.us wrote:
 
 Isn't it possible to pick-up only columns to be used in targetlist or
 local qualifiers, without modification of baserestrictinfo?
 
 What the doc means to suggest is that you can look through the
 baserestrictinfo list and then record information elsewhere about
 interesting clauses you find.  If the FDW is actually *modifying* that
 list, I agree that seems like a bad idea.
 
Kaigai-san might have misunderstood that postgresql_fdw changes
baserestrictinfo, since it did so in old implementation.

ClassifyConditions creates new lists, local_conds and remote_conds,
which have cells which point RestrictInfo(s) in baserestrictinfo.
It doesn't copy RestrictInfo for new lists, but I think it's ok
because baserestrictinfo list itself and RestrictInfo(s) pointed by
it are never modified by postgresql_fdw.

I don't recall anything in
 the core system that does that, so it seems fragile.  The closest
 parallel I can think of in the core system is indexscans pulling out
 restriction clauses to use as index quals.  That code doesn't modify
 the baserestrictinfo list, only make new lists with some of the same
 entries.

Thanks for the advise.  I found relation_excluded_by_constraints
which is called by set_rel_size() creates new RestrictInfo list from
baserestrictinfo, and this seems like what postgresql_fdw does in
GetForeignRelSize, from the perspective of relation size estimation.

Regards,
--
Shigeru HANADA
shigeru.han...@gmail.com






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2012-10-17 Thread
Hi Tomas,

On 2012/10/17, at 20:45, Tomas Vondra t...@fuzzy.cz wrote:
 
 Dne 17.10.2012 12:34, Shigeru HANADA napsal:
 Performance test
 
 I tested 1000 tables case (each is copy of pgbench_branches with 10
 rows) on 1GB shared_buffers server.  Please note that I tested on
 MacBook air, i.e. storage is not HDD but SSD.  Here is the test procedure:
 
 1) loop 1000 times
 1-1) create copy table of pgbench_accounts as accounts$i
 1-2) load 10 rows
 1-3) add primary key
 1-4) select all rows to cache pages in shared buffer
 2) BEGIN
 3) loop 1000 times
 3-1) DROP TABLE accounts$i
 4) COMMIT
 
 I don't think the 'load rows' and 'select all rows' is really necessary.
 And AFAIK sequential scans use small circular buffer not to pollute shared
 buffers, so I'd guess the pages are not cached in shared buffers anyway.
 Have you verified that, e.g. by pg_buffercache?

Oops, you're right.  I omitted 1-3 and 1-4 in actual measurement, but IMO 
loading data is necessary to fill the shared buffer up, because # of buffers 
which are deleted during COMMIT is major factor of this patch.  And, yes, I 
verified that all shared buffers are used after all loading have been finished.

 
 Our system creates a lot of working tables (even 100.000) and we need
 to perform garbage collection (dropping obsolete tables) regularly. This
 often took ~ 1 hour, because we're using big AWS instances with lots of
 RAM (which tends to be slower than RAM on bare hw). After applying this
 patch and dropping tables in groups of 100, the gc runs in less than 4
 minutes (i.e. a 15x speed-up).
 
 Hm, my environment seems very different from yours.  Could you show the
 setting of shared_buffers in your environment?  I'd like to make my test
 environment as similar as possible to yours.
 
 We're using m2.4xlarge instances (70G of RAM) with 10GB shared buffers.

Thank you, it's more huge than I expected.  I'm not sure whether my boss allows 
me to use such rich environment... :(


Here are results of additional measurements on my MBA.

* stats of 1000 bare DROP TABLE statements

90%ile of patched PG is just 2% slower than Master, so it would be acceptable.

 |  Patched   |   Master
-++
 Average |   1.595 ms |   1.634 ms
 Median  |   1.791 ms |   1.900 ms
 90%ile  |   2.517 ms |   2.477 ms
 Max |  37.526 ms |  24.553 ms

* Total time to complete 1000 DROP TABLEs and COMMIT

   | Patched |  Master
---+-+-
 Bare  | 1595 ms | 1634 ms
 In TX |  672 ms | 1459 ms

Regards,
--
Shigeru HANADA
shigeru.han...@gmail.com






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in SQL/MED?

2011-06-30 Thread
(2011/06/29 21:23), Albe Laurenz wrote:
 If you invoke any of the SQL/MED CREATE or ALTER commands,
 the validator function is only called if an option list was given.
 
 That means that you cannot enforce required options at object creation
 time, because the validator function is not always called.
 I consider that unexpected an undesirable behaviour.
 
 Example:
 CREATE EXTENSION file_fdw;
 CREATE FOREIGN DATA WRAPPER file HANDLER file_fdw_handler VALIDATOR
 file_fdw_validator;
 CREATE SERVER file FOREIGN DATA WRAPPER file;
 CREATE FOREIGN TABLE flat (id integer) SERVER file OPTIONS (format
 'csv');
 SELECT * FROM flat;
 ERROR:  filename is required for file_fdw foreign tables
 
 Now file_fdw does not try to enforce the filename option, but it
 would be nice to be able to do so.
 
 The attached patch would change the behaviour so that the validator
 function
 is always called.
 
 
 If that change meets with approval, should file_fdw be changed so that
 it
 requires filename at table creation time?

I think this proposal is reasonable.  IMHO this fix is enough trivial to
be merged into 9.1 release.

I attached a patch which fixes file_fdw to check required option
(filename) in its validator function.  I think that such requirement
should be checked again in PlanForeignScan(), as it had been so far.
Note that this patch requires fdw.patch has been applied.

With this patch, file_fdw rejects commands which eliminate filename
option as result.  Please see attached sample.txt for detail.

BTW, I noticed that current document says just the validator function
is called for CREATE FDW/SERVER/FOREIGN TABLE, and doesn't mention
ALTER command or USER MAPPING.  I'll post another patch for this issue
later.

Regards,
-- 
Shigeru Hanada

diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 466c015..57e522f 100644
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
*** file_fdw_validator(PG_FUNCTION_ARGS)
*** 215,220 
--- 215,230 
 */
ProcessCopyOptions(NULL, true, other_options);
  
+   /*
+* filename is a required option.  Validity of other options including
+* relative ones have been checked in ProcessCopyOptions().
+* Note: We don't care its value even though it might be empty, because
+* COPY comand doesn't.
+*/
+   if (catalog == ForeignTableRelationId  filename == NULL)
+   ereport(ERROR,
+   (errmsg(filename is required for file_fdw 
foreign tables)));
+ 
PG_RETURN_VOID();
  }
  
*** fileGetOptions(Oid foreigntableid,
*** 286,295 
}
prev = lc;
}
-   if (*filename == NULL)
-   ereport(ERROR,
-   (errcode(ERRCODE_FDW_UNABLE_TO_CREATE_REPLY),
-errmsg(filename is required for file_fdw 
foreign tables)));
*other_options = options;
  }
  
--- 296,301 
*** filePlanForeignScan(Oid foreigntableid,
*** 308,313 
--- 314,323 
  
/* Fetch options --- we only need filename at this point */
fileGetOptions(foreigntableid, filename, options);
+   if (filename == NULL)
+   ereport(ERROR,
+   (errcode(ERRCODE_FDW_UNABLE_TO_CREATE_REPLY),
+errmsg(filename is required for file_fdw 
foreign tables)));
  
/* Construct FdwPlan with cost estimates */
fdwplan = makeNode(FdwPlan);
diff --git a/contrib/file_fdw/input/file_fdw.source 
b/contrib/file_fdw/input/file_fdw.source
index 9ff7235..8d6dfa3 100644
*** a/contrib/file_fdw/input/file_fdw.source
--- b/contrib/file_fdw/input/file_fdw.source
*** CREATE FOREIGN TABLE tbl () SERVER file_
*** 59,64 
--- 59,65 
  ');   -- ERROR
  CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv', null '
  ');   -- ERROR
+ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv');
   -- ERROR
  
  CREATE FOREIGN TABLE agg_text (
a   int2,
diff --git a/contrib/file_fdw/output/file_fdw.source 
b/contrib/file_fdw/output/file_fdw.source
index 2ba36c9..6cc6746 100644
*** a/contrib/file_fdw/output/file_fdw.source
--- b/contrib/file_fdw/output/file_fdw.source
*** ERROR:  COPY delimiter cannot be newline
*** 75,80 
--- 75,82 
  CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv', null '
  ');   -- ERROR
  ERROR:  COPY null representation cannot use newline or carriage return
+ CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv');
   -- ERROR
+ ERROR:  filename is required for file_fdw foreign tables
  CREATE FOREIGN TABLE agg_text (
a   int2,
b   float4
postgres=# CREATE FOREIGN TABLE foo () SERVER file;
ERROR:  filename is required for file_fdw foreign tables
postgres=# CREATE FOREIGN TABLE