Re: [Proposal] Global temporary tables

Julien Rouhaud Sat, 02 Nov 2019 01:56:04 -0700

On Sat, Nov 2, 2019 at 8:23 AM Pavel Stehule <[email protected]> wrote:
>
> so 2. 11. 2019 v 8:18 odesílatel Julien Rouhaud <[email protected]> napsal:
>>
>> On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <[email protected]> wrote:
>> >
>> > pá 1. 11. 2019 v 17:09 odesílatel Konstantin Knizhnik 
>> > <[email protected]> napsal:
>> >>
>> >> On 01.11.2019 18:26, Robert Haas wrote:
>> >> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>> >> > <[email protected]> wrote:
>> >> >> It seems to me that I have found quite elegant solution for 
>> >> >> per-backend statistic for GTT: I just inserting it in backend's 
>> >> >> catalog cache, but not in pg_statistic table itself.
>> >> >> To do it I have to add InsertSysCache/InsertCatCache functions which 
>> >> >> insert pinned entry in the correspondent cache.
>> >> >> I wonder if there are some pitfalls of such approach?
>> >> > That sounds pretty hackish. You'd have to be very careful, for
>> >> > example, that if the tables were dropped or re-analyzed, all of the
>> >> > old entries got removed --
>> >>
>> >> I have checked it:
>> >> - when table is reanalyzed, then cache entries are replaced.
>> >> - when table is dropped, then cache entries are removed.
>> >>
>> >> > and then it would still fail if any code
>> >> > tried to access the statistics directly from the table, rather than
>> >> > via the caches. My assumption is that the statistics ought to be
>> >> > stored in some backend-private data structure designed for that
>> >> > purpose, and that the code that needs the data should be taught to
>> >> > look for it there when the table is a GTT.
>> >>
>> >> Yes, if you do "select * from pg_statistic" then you will not see
>> >> statistic for GTT in this case.
>> >> But I do not think that it is so critical. I do not believe that anybody
>> >> is trying to manually interpret values in this table.
>> >> And optimizer is retrieving statistic through sys-cache mechanism and so
>> >> is able to build correct plan in this case.
>> >
>> >
>> > Years ago, when I though about it, I wrote patch with similar design. It's 
>> > working, but surely it's ugly.
>> >
>> > I have another idea. Can be pg_statistics view instead a table?
>> >
>> > Some like
>> >
>> > SELECT * FROM pg_catalog.pg_statistics_rel
>> > UNION ALL
>> > SELECT * FROM pg_catalog.pg_statistics_gtt();
>> >
>> > Internally - when stat cache is filled, then there can be used 
>> > pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there 
>> > was not possibility to work with queries, only with just relations.
>>
>> It'd be a loss if you lose the ability to see the statistics, as there
>> are valid use cases where you need to see the stats, eg. understanding
>> why you don't get the plan you wanted.  There's also at least one
>> extension [1] that allows you to backup and use restored statistics,
>> so there are definitely people interested in it.
>>
>> [1]: https://github.com/ossc-db/pg_dbms_stats
>
>
> I don't think - the extensions can use UNION and the content will be same as 
> caches used by planner.


Yes, I agree that changing pg_statistics to be a view as you showed
would fix the problem.  I was answering Konstantin's point:

>> >> But I do not think that it is so critical. I do not believe that anybody
>> >> is trying to manually interpret values in this table.
>> >> And optimizer is retrieving statistic through sys-cache mechanism and so
>> >> is able to build correct plan in this case.

which is IMHO a wrong assumption.

Re: [Proposal] Global temporary tables

Reply via email to