(2013/07/23 20:02), Greg Smith wrote:
> On 7/23/13 2:16 AM, Satoshi Nagayasu wrote:
>> I've been working on a new pgstattuple function to allow
>> block sampling [1] in order to reduce block reads while
>> scanning a table. A PoC patch is attached.
>
> Take a look at all of the messages linked in
> https://commitfest.postgresql.org/action/patch_view?id=778
>
> Jaime and I tried to do what you're working on then, including a random
> block sampling mechanism modeled on the stats_target mechanism. We
> didn't do that as part of pgstattuple though, which was a mistake.
>
> Noah created some test cases as part of his thorough review that were
> not computing the correct results. Getting the results correct for all
> of the various types of PostgreSQL tables and indexes ended up being
> much harder than the sampling part. See
> http://www.postgresql.org/message-id/20120222052747.ge8...@tornado.leadboat.com
> in particular for that.
Thanks for the info. I have read the previous discussion.

I'm looking forward to seeing more feedback on this approach,
in terms of both design and performance improvement. So, I have
submitted this for the next CF.

>> This new function, pgstattuple2(), samples only 3,000 blocks
>> (which accounts for 24MB) from the table randomly, and estimates
>> several parameters of the entire table.
>
> There should be an input parameter to the function for how much sampling
> to do, and if it's possible to make the scale for it look like
> ANALYZE, that's helpful too.
>
> I have a project for this summer that includes reviving this topic and
> making sure it works on some real-world systems. If you want to work on
> this too, I can easily combine that project into what you're doing.

Yeah, I'm interested in that. Is there something that can be shared?

Regards,
--
Satoshi Nagayasu <sn...@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
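P.S. To make the sampling-and-scaling idea above concrete: read a fixed
number of random blocks, count tuples in just those blocks, then scale the
totals up by total_blocks / sampled_blocks. The sketch below is an
illustrative Python model only — the actual PoC patch operates on PostgreSQL
heap blocks in C, and the function name and data layout here are
hypothetical, not part of the patch.

```python
import random

BLOCK_SAMPLE_SIZE = 3000  # blocks sampled, as in the proposed pgstattuple2()
BLOCK_SIZE = 8192         # default PostgreSQL block size; 3,000 blocks ~= 24MB

def estimate_table_stats(blocks, sample_size=BLOCK_SAMPLE_SIZE, rng=None):
    """Estimate whole-table tuple counts from a random block sample.

    `blocks` is a list of (live_tuples, dead_tuples) pairs, one entry per
    block of a hypothetical table. This is a sketch of the estimation idea,
    not the actual patch logic.
    """
    rng = rng or random.Random()
    n_blocks = len(blocks)
    k = min(sample_size, n_blocks)
    sampled = rng.sample(range(n_blocks), k)  # distinct random block numbers
    live = sum(blocks[i][0] for i in sampled)
    dead = sum(blocks[i][1] for i in sampled)
    scale = n_blocks / k  # scale the sample totals up to the full table
    return {
        "est_live_tuples": live * scale,
        "est_dead_tuples": dead * scale,
        "blocks_read": k,
        "total_blocks": n_blocks,
    }
```

With a 10,000-block table this reads only 3,000 blocks (30% of the I/O), and
making `sample_size` a function parameter is exactly the knob Greg suggests
exposing, ideally on a scale resembling ANALYZE's stats target.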