Re: [GENERAL] Pl/Python runtime overhead

2013-08-12 Thread Seref Arikan
Thanks for the confirmation Peter,
I guess I'll take a good look at the existing implementations.

All the best
Seref





Re: [GENERAL] Pl/Python runtime overhead

2013-08-09 Thread Peter Eisentraut
On 8/7/13 10:43 AM, Seref Arikan wrote:
> When a pl/python based function is invoked, does it keep a python
> runtime running across calls to same function? That is, if I use
> connection pooling, can I save on the python runtime initialization and
> loading costs?

The Python interpreter is initialized once during a session, normally
when the first PL/Python function is called.  So yes, connection pooling
can be helpful here.
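
To make the reasoning concrete, here is a toy model in plain Python. It does not talk to PostgreSQL at all; ToySession, run_without_pool and run_with_pool are hypothetical names invented for this illustration. It only counts how often a once-per-session initialization would be paid with and without session reuse.

```python
# Toy model only: ToySession stands in for a PostgreSQL backend session.
# Nothing here talks to a real database.
class ToySession:
    """One backend session; the interpreter is set up on first use only."""

    def __init__(self):
        self.interpreter_ready = False
        self.init_count = 0

    def call_plpython_function(self):
        if not self.interpreter_ready:
            # The expensive step, paid once per session
            self.init_count += 1
            self.interpreter_ready = True
        return "ok"


def run_without_pool(n_calls):
    """Open a fresh session for every call; return total initializations."""
    total = 0
    for _ in range(n_calls):
        session = ToySession()
        session.call_plpython_function()
        total += session.init_count
    return total


def run_with_pool(n_calls, pool_size):
    """Reuse a fixed set of sessions round-robin; return total initializations."""
    pool = [ToySession() for _ in range(pool_size)]
    for i in range(n_calls):
        pool[i % pool_size].call_plpython_function()
    return sum(s.init_count for s in pool)


inits_without_pool = run_without_pool(100)
inits_with_pool = run_with_pool(100, 5)
```

In this model the number of initializations tracks the number of sessions rather than the number of calls, which is why reusing sessions through a pool can help.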

> Are there any documents/books etc. you'd recommend to get a good
> understanding of extending postgres with languages like python? I'd
> really like to get a good grip of the architecture of this type of
> extension, and possibly attempt to introduce a language of my own
> choosing. The docs I've seen so far are mostly too specific, making it a
> bit hard for me to see the forest for the trees.

The basic documentation is here:
http://www.postgresql.org/docs/devel/static/plhandler.html.  The rest is
mainly experience and copying from existing language handler
implementations.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pl/Python runtime overhead

2013-08-08 Thread Seref Arikan
Thanks Sergey,
This is going to help for sure. I'll also look at the URL. What I've been
trying to understand is when the Python runtime is invoked during function
execution (its lifecycle?). Maybe looking at plpython's source will help me
understand that.

Regards
Seref




[GENERAL] Pl/Python runtime overhead

2013-08-07 Thread Seref Arikan
Greetings,
Somehow I have failed to find the appropriate keywords for successful
results for my question.

When a PL/Python based function is invoked, does it keep a Python runtime
running across calls to the same function? That is, if I use connection
pooling, can I save on the Python runtime initialization and loading costs?

Are there any documents/books etc. you'd recommend to get a good
understanding of extending Postgres with languages like Python? I'd really
like to get a good grip on the architecture of this type of extension, and
possibly attempt to introduce a language of my own choosing. The docs I've
seen so far are mostly too specific, making it a bit hard for me to see
the forest for the trees.

Regards
Seref


Re: [GENERAL] Pl/Python runtime overhead

2013-08-07 Thread Sergey Konoplev
On Wed, Aug 7, 2013 at 7:43 AM, Seref Arikan
serefari...@kurumsalteknoloji.com wrote:
> When a pl/python based function is invoked, does it keep a python runtime
> running across calls to same function? That is, if I use connection pooling,
> can I save on the python runtime initialization and loading costs?

You can use the following wrapping technique to cache the function's body,
which will save you some resources and time. On the first call it stores
main() in SD, the built-in dictionary that PL/Python keeps private to each
function within a session, and on later calls it retrieves it from there,
so plpython does not need to process main() every time the stored function
is called.

CREATE OR REPLACE FUNCTION some_plpython_function()
 RETURNS integer
 LANGUAGE plpythonu
AS $function$
"""An example of a function's body caching and error handling"""

sdNamespace = 'some_plpython_function'

if sdNamespace not in SD:

    def main():
        """The function is assumed to be cached in SD and reused"""

        result = None

        # Do whatever you need here

        return result

    # Cache body in SD
    SD[sdNamespace] = main

try:
    return SD[sdNamespace]()
except Exception:
    # Log the traceback; the function then falls through and returns NULL
    import traceback
    plpy.info(traceback.format_exc())

$function$;
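
The caching pattern can also be tried outside PostgreSQL. The sketch below is only an approximation: SD is replaced by an ordinary dict, the definitions list is a hypothetical counter added for the demonstration, and the value 42 is a placeholder for real work.

```python
# Stand-in for PL/Python's SD dictionary (assumption: a plain dict that
# outlives individual calls, as SD does within one session).
SD = {}
definitions = []  # hypothetical counter: one entry each time main() is defined


def some_plpython_function():
    """Mimics what the stored function's body does on each invocation."""
    sd_namespace = 'some_plpython_function'

    if sd_namespace not in SD:
        definitions.append(sd_namespace)

        def main():
            result = 42  # placeholder for the real work
            return result

        # Cache body in SD
        SD[sd_namespace] = main

    return SD[sd_namespace]()


results = [some_plpython_function() for _ in range(3)]
```

After three calls main() has been defined once and then fetched from the dictionary twice.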

I also recommend caching query plans, as plpython does not do it itself.
The code below also uses SD, this time to store prepared plans and
retrieve them, so you avoid preparing the same query every time you
execute it. This is what plpgsql does, but done manually.

planKey = '%s_somePlan' % sdNamespace
if planKey in SD:
    somePlan = SD[planKey]
else:
    somePlan = plpy.prepare(...)
    # Store the plan so later calls in this session reuse it
    SD[planKey] = somePlan
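
A runnable approximation of the plan cache, for experimenting outside the database. FakePlpy is a hypothetical stand-in for the real plpy module that only counts prepare() calls. The get_plan helper and the sample query are likewise invented for this sketch. Inside a real function you would call plpy.prepare and use SD directly.

```python
class FakePlpy:
    """Hypothetical stand-in for the plpy module; counts prepare() calls."""

    def __init__(self):
        self.prepare_calls = 0

    def prepare(self, query, argtypes=None):
        self.prepare_calls += 1
        # A real plan object is opaque; a tuple is enough for this sketch
        return ("plan", query, tuple(argtypes or ()))


plpy = FakePlpy()
SD = {}  # stand-in for the function's SD dictionary


def get_plan(sd_namespace, name, query, argtypes):
    """Return the cached plan for this key, preparing it on first use."""
    key = '%s_%s' % (sd_namespace, name)
    if key not in SD:
        # Prepare once and store the result under the same key we look up
        SD[key] = plpy.prepare(query, argtypes)
    return SD[key]


first = get_plan('some_plpython_function', 'somePlan',
                 'SELECT $1 + 1', ['integer'])
second = get_plan('some_plpython_function', 'somePlan',
                  'SELECT $1 + 1', ['integer'])
```

The lookup and the store have to use the same key. If they differ, or if the prepared plan is never written back, every call prepares the query again.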


> Are there any documents/books etc. you'd recommend to get a good
> understanding of extending postgres with languages like python? I'd really
> like to get a good grip of the architecture of this type of extension, and
> possibly attempt to introduce a language of my own choosing. The docs I've
> seen so far are mostly too specific, making it a bit hard for me to see
> the forest for the trees.

AFAIK, this one is the best one
http://www.postgresql.org/docs/9.2/interactive/plpython.html.

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com

