Re: [HACKERS] Traffic jams in fn_extra
On Sun, Nov 24, 2013 at 4:21 PM, Simon Riggs si...@2ndquadrant.com wrote:
> Why do you need to do this dance with fn_extra? It's possible to allocate a hash table in a Transaction-lifetime memory context on first call into a function then cache things there.

fn_extra gives a handle per function call site.

It sounds to me like the complexity is coming not from having many Postgres functions but from having lots of infrastructure backing up those functions. So if many of their Postgres functions call a C function to do various things, and the C function wants to cache something somewhere related to the object it has been passed, then the natural thing to do is have the Postgres function pass fn_extra down to the C function. But if they have many such C functions it gets a bit tricky.

But you could declare fn_extra to be a hash table yourself, since it's your Postgres function that gets to decide how fn_extra is going to be used. You could then pass that hash table down to the various C functions to cache state.

However, that might still be a bit odd. If you call the same C function twice from the same Postgres function, it'll get the same hash table for both calls. fn_extra is per Postgres function call site.

--
greg
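A rough sketch of the idea above, written as a standalone model rather than against the PostgreSQL headers. A small linear-scan table stands in for the hash table, and every name in it (StateTable, state_slot, helper_call_count) is hypothetical. It illustrates the caveat that two calls to the same helper see the same state, and one possible way around it (an assumption, not something proposed in the thread): have the caller pass a distinct key per use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Hypothetical per-call-site table; in a real function this is what
 * fn_extra would point to. Keys are stored by pointer, so they must
 * outlive the table (string literals are used here). */
typedef struct StateTable {
    const char *key[MAX_ENTRIES];
    void       *value[MAX_ENTRIES];
    int         used;
} StateTable;

/* Return the address of the value stored under key, adding an empty
 * entry if the key is new. Returns NULL when the table is full. */
void **state_slot(StateTable *t, const char *key)
{
    for (int i = 0; i < t->used; i++)
        if (strcmp(t->key[i], key) == 0)
            return &t->value[i];
    if (t->used == MAX_ENTRIES)
        return NULL;
    t->key[t->used] = key;
    t->value[t->used] = NULL;
    return &t->value[t->used++];
}

/* Hypothetical helper C function: it caches a counter under the key it
 * is given and returns how often it has run against that key.
 * Returns -1 if the state could not be created. */
int helper_call_count(StateTable *t, const char *key)
{
    void **slot = state_slot(t, key);
    if (slot == NULL)
        return -1;
    if (*slot == NULL)
        *slot = calloc(1, sizeof(int));   /* first use: create the state */
    if (*slot == NULL)
        return -1;
    return ++*(int *) *slot;
}
```

Calling the helper twice with the same key shares one counter, which is the "same hash table for both calls" effect; a second key gives the second use its own state.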
Re: [HACKERS] Traffic jams in fn_extra
On Sunday, November 24, 2013 at 4:42 PM, Tom Lane wrote:
> The real question of course is whether transaction-level caching is appropriate for what they're storing. If they want only statement-level caching then using fn_extra is often the right thing.

I'm not certain it is... we get some great effects out of the statement-level stuff, and it really works great except for those cases where somebody else is already taking the slot (SRF_*).

> Also note that having the cache go away is the easy part. The hard part is knowing whether you've created it yet in the current transaction, and finding it if you have. The usual method is to keep a static variable pointing to it, and plugging into the transaction cleanup callback mechanism with a routine that'll reset the pointer to NULL at transaction end. For examples, look for callers of RegisterXactCallback().

I'm glad you said that, because I felt too stoopid to ask :) My previous spelunking through memory contexts showed that it's easy to get the memory, hard to find it again. Which is why fn_extra is so appealing: it's just sitting there, and the parent context goes away at the end of the query, so it's wonderfully simple to use... if there's not already someone in it.

Thanks for the advice, it could be very helpful for my pointcloud work, since having access to a cache object during SRF_* stuff could improve performance quite a bit, so the complexity trade-off of using the transactional context could be well worth it.

P.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Traffic jams in fn_extra
On 19 November 2013 23:08, Paul Ramsey pram...@cleverelephant.ca wrote:
> On the solution, I wasn't suggesting another void* slot, but rather a slot that holds a hash table, so that an arbitrary number of things can be stuffed in. Overkill, really, since in 99.9% of times only one thing would be in there, and in the other 0.1% of times two things. In our own GenericCacheCollection, we just statically allocate 16 slots.

Why do you need to do this dance with fn_extra? It's possible to allocate a hash table in a Transaction-lifetime memory context on first call into a function then cache things there.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Traffic jams in fn_extra
Hi Simon,

We do the dance because it's how we always have and don't know any other way, any better way. :) The usual explanation. Is there any place you can point to that demonstrates your technique?

Thanks!

P

--
Paul Ramsey
http://cleverelephant.ca/
http://postgis.net/

On Sunday, November 24, 2013 at 8:21 AM, Simon Riggs wrote:
> On 19 November 2013 23:08, Paul Ramsey pram...@cleverelephant.ca wrote:
>> On the solution, I wasn't suggesting another void* slot, but rather a slot that holds a hash table, so that an arbitrary number of things can be stuffed in. Overkill, really, since in 99.9% of times only one thing would be in there, and in the other 0.1% of times two things. In our own GenericCacheCollection, we just statically allocate 16 slots.
>
> Why do you need to do this dance with fn_extra? It's possible to allocate a hash table in a Transaction-lifetime memory context on first call into a function then cache things there.
Re: [HACKERS] Traffic jams in fn_extra
On 24 November 2013 16:02, Paul Ramsey pram...@cleverelephant.ca wrote:
> We do the dance because it's how we always have and don't know any other way, any better way. :) The usual explanation. Is there any place you can point to that demonstrates your technique?

src/backend/utils/mmgr/README

You can create memory contexts as children of other contexts, so for example you might create a "PostGIS Cache Context" as a sub-context of TopTransactionContext. So it can be created dynamically as needed and will automatically go away at end of xact. Or you could use CurTransactionContext if you want to do things at subtransaction level.

This is all used very heavily within Postgres itself, including the various caches in different parts of the code.

Obviously, if you start caching too much then people will claim that PostGIS is leaking memory, so it depends how far you go. But then you might alleviate that with a postgis.session_cache parameter to acknowledge and allow control.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Traffic jams in fn_extra
Simon Riggs si...@2ndquadrant.com writes:
> On 24 November 2013 16:02, Paul Ramsey pram...@cleverelephant.ca wrote:
>> We do the dance because it's how we always have and don't know any other way, any better way. :) The usual explanation. Is there any place you can point to that demonstrates your technique?
>
> src/backend/utils/mmgr/README
>
> You can create memory contexts as children of other contexts, so for example you might create PostGIS Cache Context as a sub-context of TopTransactionContext. So it can be created dynamically as needed and will automatically go away at end of xact.

The real question of course is whether transaction-level caching is appropriate for what they're storing. If they want only statement-level caching then using fn_extra is often the right thing.

Also note that having the cache go away is the easy part. The hard part is knowing whether you've created it yet in the current transaction, and finding it if you have. The usual method is to keep a static variable pointing to it, and plugging into the transaction cleanup callback mechanism with a routine that'll reset the pointer to NULL at transaction end. For examples, look for callers of RegisterXactCallback().

regards, tom lane
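The "static variable plus cleanup callback" method can be modeled in a few lines. This is a self-contained sketch that imitates, rather than calls, PostgreSQL: mock_register_xact_callback stands in for RegisterXactCallback(), mock_xact_alloc for allocating in a child of TopTransactionContext, and mock_end_transaction for the backend ending a transaction. All mock_ names, MyCache and get_my_cache are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* ---- Mock backend (stand-ins, not the real PostgreSQL API) ---- */
typedef enum { MOCK_XACT_COMMIT, MOCK_XACT_ABORT } MockXactEvent;
typedef void (*MockXactCallback)(MockXactEvent event, void *arg);

typedef struct Chunk { struct Chunk *next; } Chunk;
static Chunk *xact_chunks = NULL;          /* memory owned by the transaction */
static MockXactCallback registered_cb = NULL;
static void *registered_arg = NULL;

void mock_register_xact_callback(MockXactCallback cb, void *arg)
{
    registered_cb = cb;
    registered_arg = arg;
}

/* Allocate zeroed memory that lives until the mock transaction ends. */
void *mock_xact_alloc(size_t size)
{
    Chunk *c = calloc(1, sizeof(Chunk) + size);
    if (c == NULL)
        abort();
    c->next = xact_chunks;
    xact_chunks = c;
    return c + 1;
}

/* End of transaction: notify the callback, then release all memory. */
void mock_end_transaction(MockXactEvent event)
{
    if (registered_cb != NULL)
        registered_cb(event, registered_arg);
    while (xact_chunks != NULL) {
        Chunk *next = xact_chunks->next;
        free(xact_chunks);
        xact_chunks = next;
    }
}

/* ---- Extension side: the pattern described above ---- */
typedef struct MyCache { int hits; } MyCache;

static MyCache *my_cache = NULL;       /* how the cache is found again */
static int callback_registered = 0;

static void my_xact_callback(MockXactEvent event, void *arg)
{
    (void) event;
    (void) arg;
    my_cache = NULL;   /* the memory goes away with the transaction */
}

MyCache *get_my_cache(void)
{
    if (!callback_registered) {
        mock_register_xact_callback(my_xact_callback, NULL);
        callback_registered = 1;
    }
    if (my_cache == NULL)              /* not yet created in this transaction */
        my_cache = mock_xact_alloc(sizeof(MyCache));
    return my_cache;
}
```

The static pointer answers both hard questions from the message: whether the cache exists yet (non-NULL) and where it is. The callback only resets the pointer; freeing is left to the transaction-lifetime memory.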
[HACKERS] Traffic jams in fn_extra
As an extension with a lot of CPU load, we (postgis) tend to use flinfo->fn_extra a lot, for caching things that are intensive to calculate at the start of a query and reuse throughout subsequent function calls:

- coordinate projection objects
- indexes of the edges of large geometries
- other kinds of indexes of the edges of large geometries :)
- schemas of lidar pointcloud collections

As we've added different kinds of caching, in our own project, we've banged up against problems of multiple functions trying to stuff information into the same pointer, and ended up putting an extra container of our own into fn_extra, to hold the different kinds of stuff we might want to store, a GenericCacheCollection:

https://github.com/postgis/postgis/blob/svn-trunk/libpgcommon/lwgeom_cache.c#L46-L48

As (by now) a connoisseur of fn_extra caching, I've noticed while reading bits of PgSQL code that there are far more places that stuff state into fn_extra than I ever knew, and that they do so without any substantial concern that other state might already be there. (Well, that's not true, they usually check for NULL and they give up if fn_extra is already in use.) I was surprised to find the range types doing some performance caching in fn_extra. The set-returning function macros use it to hold state. And many others, I'm sure.

Would it be good/wise to add another, better managed, slot to flinfo, one that isn't just void* but is a hash? (Or one that has some management macros to handle it and make it a hash* if it's null, whatever API makes sense.) That way multiple bits of code can cache state over function calls without banging into one another. flinfo->fn_extra_hash perhaps?

If this sounds OK, I'd be honored to try and make it my first submission to PgSQL.

P.

--
Paul Ramsey
http://cleverelephant.ca
http://postgis.net
Re: [HACKERS] Traffic jams in fn_extra
Paul Ramsey pram...@cleverelephant.ca writes:
> As we've added different kinds of caching, in our own project, we've banged up against problems of multiple functions trying to stuff information into the same pointer, and ended up putting an extra container of our own into fn_extra, to hold the different kinds of stuff we might want to store, a GenericCacheCollection

TBH, I fail to understand what you're on about here. Any one function owns the value of fn_extra in calls to it, and is solely responsible for its usage; furthermore, there's no way for any other code to mangle that pointer unless the owner explicitly makes it available. So where is the problem? And if there is a problem, how does adding another field of exactly the same kind make it better?

regards, tom lane
Re: [HACKERS] Traffic jams in fn_extra
On Tue, Nov 19, 2013 at 7:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Paul Ramsey pram...@cleverelephant.ca writes:
>> As we've added different kinds of caching, in our own project, we've banged up against problems of multiple functions trying to stuff information into the same pointer, and ended up putting an extra container of our own into fn_extra, to hold the different kinds of stuff we might want to store, a GenericCacheCollection
>
> TBH, I fail to understand what you're on about here. Any one function owns the value of fn_extra in calls to it, and is solely responsible for its usage; furthermore, there's no way for any other code to mangle that pointer unless the owner explicitly makes it available. So where is the problem? And if there is a problem, how does adding another field of exactly the same kind make it better?

Right, sorry, I'm reasoning overly aggressively from the specific to the general. The specific problems have been:

- two lines of geometry caching code, either of which can be called within a single function, depending on the inputs, which mostly didn't result in conflicts, except when they did, which was eventually rectified by layering a GenericCacheCollection into the function
- a cached lidar schema object which would have been very useful to have around in an SRF, but couldn't because the SRF needed the fn_extra slot

The first case is an application-specific problem, and since we've coded around it, the only advantage to a pgsql-specific fix would be to allow others who also need to cache several independent things to not reinvent that wheel.

The second case is one of the instances where the pgsql code itself is getting in the way and cannot be worked around at the application level. My solution was just not to do caching for that function.

So, that's what I perceive the problem to be. Now that you point it out to me, yes, it's pretty small bore stuff.

On the solution, I wasn't suggesting another void* slot, but rather a slot that holds a hash table, so that an arbitrary number of things can be stuffed in. Overkill, really, since in 99.9% of times only one thing would be in there, and in the other 0.1% of times two things. In our own GenericCacheCollection, we just statically allocate 16 slots.

P.
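For readers who have not seen the pattern, here is a minimal standalone sketch of a fixed-slot container hung off a single void* pointer. It is not the PostGIS code: MockFmgrInfo is a stand-in for the one field of FmgrInfo that matters here, and the slot names and the get_cache/set_cache functions are hypothetical. Only the count of 16 slots comes from the message above. A real extension would presumably allocate the container in the function's memory context instead of calling calloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the relevant part of PostgreSQL's FmgrInfo. */
typedef struct MockFmgrInfo {
    void *fn_extra;              /* NULL until first use at this call site */
} MockFmgrInfo;

#define NUM_CACHE_SLOTS 16

/* Hypothetical slot numbers: one per kind of cached object. */
enum { PROJ_CACHE_SLOT = 0, EDGE_INDEX_SLOT = 1, SCHEMA_CACHE_SLOT = 2 };

typedef struct CacheCollection {
    void *slot[NUM_CACHE_SLOTS];
} CacheCollection;

/* Return the collection for this call site, creating it on first use.
 * Returns NULL only if allocation fails. */
CacheCollection *get_collection(MockFmgrInfo *flinfo)
{
    if (flinfo->fn_extra == NULL)
        flinfo->fn_extra = calloc(1, sizeof(CacheCollection));
    return (CacheCollection *) flinfo->fn_extra;
}

/* Fetch the cache stored in a slot, or NULL if nothing is there yet. */
void *get_cache(MockFmgrInfo *flinfo, int slot_id)
{
    assert(slot_id >= 0 && slot_id < NUM_CACHE_SLOTS);
    CacheCollection *c = get_collection(flinfo);
    return c != NULL ? c->slot[slot_id] : NULL;
}

/* Store a cache in a slot without disturbing the other slots. */
void set_cache(MockFmgrInfo *flinfo, int slot_id, void *cache)
{
    assert(slot_id >= 0 && slot_id < NUM_CACHE_SLOTS);
    CacheCollection *c = get_collection(flinfo);
    if (c != NULL)
        c->slot[slot_id] = cache;
}
```

Because each kind of cache has its own index, two independent caching paths in one function no longer compete for the pointer. This does nothing for the SRF case, where the set-returning function machinery itself claims fn_extra.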