Re: [HACKERS] Traffic jams in fn_extra

2013-11-27 Thread Greg Stark
On Sun, Nov 24, 2013 at 4:21 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Why do you need to do this dance with fn_extra?

 It's possible to allocate a hash table in a Transaction-lifetime
 memory context on first call into a function then cache things there.


fn_extra gives a handle per function call site.

It sounds to me like the complexity is coming not from having many Postgres
functions but from having lots of infrastructure backing up those
functions. So if many of their Postgres functions call a C function to do
various things and the C function wants to cache something somewhere
related to the object they've been passed then the natural thing to do is
have the Postgres function pass fn_extra down to the C function but if they
have many such C functions it gets a bit tricky.


But you could declare fn_extra to be a hash table yourself since it's your
Postgres function that gets to decide how fn_extra is going to be used. You
could then pass that hash table down to the various C functions to cache
state. However that might still be a bit odd. If you call the same C
function twice from the same Postgres function it'll get the same hash
table for both calls. fn_extra is per Postgres function call site.


-- 
greg


Re: [HACKERS] Traffic jams in fn_extra

2013-11-26 Thread Paul Ramsey
On Sunday, November 24, 2013 at 4:42 PM, Tom Lane wrote:
 The real question of course is whether transaction-level caching is
 appropriate for what they're storing. If they want only statement-level
 caching then using fn_extra is often the right thing.

I'm not certain it is... we get some great effects out of the statement level 
stuff, and it really works great except for those cases where somebody else is 
already taking the slot (SRF_*)
 
 Also note that having the cache go away is the easy part. The hard part
 is knowing whether you've created it yet in the current transaction, and
 finding it if you have. The usual method is to keep a static variable
 pointing to it, and plugging into the transaction cleanup callback
 mechanism with a routine that'll reset the pointer to NULL at transaction
 end. For examples, look for callers of RegisterXactCallback().
 

I'm glad you said that, because I felt too stoopid to ask :) my previous 
spelunking through memory contexts showed that it's easy to get the memory, 
hard to find it again. Which is why fn_extra is so appealing, it's just sitting 
there, and the parent context goes away at the end of the query, so it's 
wonderfully simple to use... if there's not already someone in it.

Thanks for the advice, it could be very helpful for my pointcloud work, since 
having access to a cache object during SRF_* stuff could improve performance 
quite a bit, so the complexity trade-off of using the transactional context 
could be well worth it.

P.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Traffic jams in fn_extra

2013-11-24 Thread Simon Riggs
On 19 November 2013 23:08, Paul Ramsey pram...@cleverelephant.ca wrote:

 On the solution, I wasn't suggesting another void* slot, but rather a
 slot that holds a hash table, so that an arbitrary number of things
 can be stuffed in. Overkill, really, since in 99.9% of times only one
 thing would be in there, and in the other 0.1% of times two things. In
 our own GenericCacheCollection, we just statically allocate 16 slots.

Why do you need to do this dance with fn_extra?

It's possible to allocate a hash table in a Transaction-lifetime
memory context on first call into a function then cache things there.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Traffic jams in fn_extra

2013-11-24 Thread Paul Ramsey
Hi Simon,

We do the dance because it’s how we always have and don’t know any other way, 
any better way. :) The usual explanation. Is there any place you can point to 
that demonstrates your technique?  

Thanks!

P

--  
Paul Ramsey
http://cleverelephant.ca/
http://postgis.net/


On Sunday, November 24, 2013 at 8:21 AM, Simon Riggs wrote:

 On 19 November 2013 23:08, Paul Ramsey pram...@cleverelephant.ca 
 (mailto:pram...@cleverelephant.ca) wrote:
  
  On the solution, I wasn't suggesting another void* slot, but rather a
  slot that holds a hash table, so that an arbitrary number of things
  can be stuffed in. Overkill, really, since in 99.9% of times only one
  thing would be in there, and in the other 0.1% of times two things. In
  our own GenericCacheCollection, we just statically allocate 16 slots.
  
  
  
 Why do you need to do this dance with fn_extra?
  
 It's possible to allocate a hash table in a Transaction-lifetime
 memory context on first call into a function then cache things there.
  
 --  
 Simon Riggs http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Traffic jams in fn_extra

2013-11-24 Thread Simon Riggs
On 24 November 2013 16:02, Paul Ramsey pram...@cleverelephant.ca wrote:

 We do the dance because it’s how we always have and don’t know any other way, 
 any better way. :) The usual explanation. Is there any place you can point to 
 that demonstrates your technique?

src/backend/utils/mmgr/README

You can create memory contexts as children of other contexts, so for
example you might create PostGIS Cache Context as a sub-context of
TopTransactionContext. So it can be created dynamically as needed and
will automatically go away at end of xact.

Or you could use CurTransactionContext if you want to do things at
subtransaction level.

This is all used very heavily within Postgres itself, including the
various caches in different parts of the code.

Obviously, if you start cacheing too much then people will claim that
PostGIS is leaking memory, so it depends how far you go. But then you
might alleviate that with a postgis.session_cache parameter to
acknowledge and allow control.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Traffic jams in fn_extra

2013-11-24 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 24 November 2013 16:02, Paul Ramsey pram...@cleverelephant.ca wrote:
 We do the dance because it’s how we always have and don’t know any other 
 way, any better way. :) The usual explanation. Is there any place you can 
 point to that demonstrates your technique?

 src/backend/utils/mmgr/README

 You can create memory contexts as children of other contexts, so for
 example you might create PostGIS Cache Context as a sub-context of
 TopTransactionContext. So it can be created dynamically as needed and
 will automatically go away at end of xact.

The real question of course is whether transaction-level caching is
appropriate for what they're storing.  If they want only statement-level
caching then using fn_extra is often the right thing.

Also note that having the cache go away is the easy part.  The hard part
is knowing whether you've created it yet in the current transaction, and
finding it if you have.  The usual method is to keep a static variable
pointing to it, and plugging into the transaction cleanup callback
mechanism with a routine that'll reset the pointer to NULL at transaction
end.  For examples, look for callers of RegisterXactCallback().

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Traffic jams in fn_extra

2013-11-19 Thread Paul Ramsey
As an extension with a lot of CPU load, we (postgis) tend to use 
flinfo-fn_extra a lot, for caching things that are intensive to calculate at 
the start of a query and reuse throughout subsequent functions calls. 

- coordinate projection objects
- indexes of the edges of large geometries
- other kinds of indexes of the edges of large geometries :)
- schemas of lidar pointcloud collections

As we've added different kinds of caching, in our own project, we've banged up 
against problems of multiple functions trying to stuff information into the 
same pointer, and ended up putting an extra container of our own into fn_extra, 
to hold the different kinds of stuff we might want to store, a 
GenericCacheCollection

https://github.com/postgis/postgis/blob/svn-trunk/libpgcommon/lwgeom_cache.c#L46-L48

As (by now) a connoisseur of fn_extra caching, I've noticed while reading bits 
of PgSQL code that there are far more places that stuff state into fn_extra 
than I ever knew, and that they do so without any substantial concern that 
other state might already be there. (Well, that's not true, they usually check 
for NULL and they give up if fn_extra is already in use.) The range types, I 
was surprised to find doing some performance caching in fn_extra. The 
set-returning function macros use it to hold state. And many others I'm sure.

Would it be good/wise to add another, better managed, slot to flinfo, one that 
isn't just void* but is a hash? (Or, has some management macros to handle it 
and make it a hash* if it's null, whatever API makes sense) so that multiple 
bits of code can cache state over function calls without banging into one 
another?

flinfo-fn_extra_hash perhaps?

If this sounds OK, I'd be honored to try and make it my first submission to 
PgSQL.

P. 

-- 
Paul Ramsey
http://cleverelephant.ca
http://postgis.net




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Traffic jams in fn_extra

2013-11-19 Thread Tom Lane
Paul Ramsey pram...@cleverelephant.ca writes:
 As we've added different kinds of caching, in our own project, we've banged 
 up against problems of multiple functions trying to stuff information into 
 the same pointer, and ended up putting an extra container of our own into 
 fn_extra, to hold the different kinds of stuff we might want to store, a 
 GenericCacheCollection

TBH, I fail to understand what you're on about here.  Any one function
owns the value of fn_extra in calls to it, and is solely responsible for
its usage; furthermore, there's no way for any other code to mangle that
pointer unless the owner explicitly makes it available.  So where is
the problem?  And if there is a problem, how does adding another field
of exactly the same kind make it better?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Traffic jams in fn_extra

2013-11-19 Thread Paul Ramsey
On Tue, Nov 19, 2013 at 7:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Paul Ramsey pram...@cleverelephant.ca writes:
 As we've added different kinds of caching, in our own project, we've banged 
 up against problems of multiple functions trying to stuff information into 
 the same pointer, and ended up putting an extra container of our own into 
 fn_extra, to hold the different kinds of stuff we might want to store, a 
 GenericCacheCollection

 TBH, I fail to understand what you're on about here.  Any one function
 owns the value of fn_extra in calls to it, and is solely responsible for
 its usage; furthermore, there's no way for any other code to mangle that
 pointer unless the owner explicitly makes it available.  So where is
 the problem?  And if there is a problem, how does adding another field
 of exactly the same kind make it better?

Right, sorry, I'm reasoning overly aggressively from the specific to
the general. The specific problems have been

- two lines of geometry caching code, either of which can be called
within a single function, depending on the inputs, which mostly didn't
result in conflicts, except when they did, which was eventually
rectified by layering a GenericCacheCollection into the function
- a cached lidar schema object which would have been very useful to
have around in an SRF, but couldn't because the SRF needed the
fn_extra slot

The first case is an application-specific problem, and since we've
coded around it, the only advantage to a pgsql-specific fix would be
to allow others who also need to cache several independent things to
not reinvent that wheel.

The second case is one of the instances where the pgsql code itself is
getting in the way and cannot be worked around at the application
level. My solution was just not to do caching for that function.

So, that's what I perceive the problem to be. Now that you point it
out to me, yes, it's pretty small bore stuff.

On the solution, I wasn't suggesting another void* slot, but rather a
slot that holds a hash table, so that an arbitrary number of things
can be stuffed in. Overkill, really, since in 99.9% of times only one
thing would be in there, and in the other 0.1% of times two things. In
our own GenericCacheCollection, we just statically allocate 16 slots.

P.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers