2012/1/23 Robert Haas <robertmh...@gmail.com>:
> On Sun, Jan 22, 2012 at 10:48 AM, Kohei KaiGai <kai...@kaigai.gr.jp> wrote:
>> I tried to implement a fdw module that is designed to utilize GPU
>> devices to execute qualifiers of sequential-scan on foreign tables
>> managed by this module.
>>
>> It was named PG-Strom, and the following wikipage gives a brief
>> overview of this module.
>> http://wiki.postgresql.org/wiki/PGStrom
>>
>> In our measurement, it achieves about x10 times faster on
>> sequential-scan with complex-qualifiers, of course, it quite
>> depends on type of workloads.
>
> That's pretty neat.  In terms of tuning the non-GPU based
> implementation, have you done any profiling?  Sometimes that leads to
> an "oh, woops" moment.
>
Not yet, except for \timing.
What options are available to see how much of the workload each
component accounts for within a particular query? I tried googling
some keywords, but did not find anything useful.

As an aside, I also tried modifying is_device_executable_qual() to
always return false, which disables qualifier push-down.
In this case, 2100ms of the 7679ms was consumed within this module,
so I guess the remaining 5500ms or so was mostly consumed by
ExecQual(), although that is just an estimate...

postgres=# SET pg_strom.exec_profile = on;
SET
Time: 1.075 ms
postgres=# SELECT count(*) FROM ftbl WHERE sqrt((x-25.6)^2 + (y-12.8)^2) < 10;
INFO:  PG-Strom Exec Profile on "ftbl"
INFO:  Total PG-Strom consumed time: 2100.898 ms
INFO:  Time to JIT Compile GPU code: 0.000 ms
INFO:  Time to initialize devices: 0.000 ms
INFO:  Time to Load column-stores: 7.013 ms
INFO:  Time to Scan column-stores: 1219.746 ms
INFO:  Time to Fetch virtual tuples: 874.095 ms
INFO:  Time of GPU Synchronization: 0.000 ms
INFO:  Time of Async memcpy: 0.000 ms
INFO:  Time of Async kernel exec: 0.000 ms
 count
-------
  3159
(1 row)

Time: 7679.342 ms

Thanks,
-- 
KaiGai Kohei <kai...@kaigai.gr.jp>