Re: [PERFORM] Question about SQL performance

PFC Tue, 05 Jun 2007 05:59:46 -0700

What sort of speed increase is there usually with binding parameters (and thus preparing statements) vs. straight SQL with interpolated variables? Will PostgreSQL realize that the following queries are effectively the same (and thus re-use the query plan) or will it think they are different?
        SELECT * FROM mytable WHERE item = 5;
        SELECT * FROM mytable WHERE item = 10;

No, if you send the above as text (not prepared) they are two different queries.

Postgres' query executor is so fast that parsing and planning can sometimes take longer than query execution. This is true of very simple selects like the ones above, and of some very complex queries which take a long time to plan but don't actually process a lot of rows.

I had this huge query (1 full page of SQL) with 5 joins, aggregates and subqueries, returning about 30 rows; it executed in about 5 ms, and planning and parsing time was significant...
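To make the difference concrete, here is a toy model of one connection. It is only a sketch, not PostgreSQL's real internals: ToySession, setup_count and the statement name find_item are made up for illustration. It counts how often the parse/plan work is paid for the two queries above, sent as plain text versus prepared once and executed twice. It is simplified on purpose; a real server may still re-plan a prepared statement for particular parameter values.

```python
class ToySession(object):
    """Toy model of one connection; not PostgreSQL's real internals."""

    def __init__(self):
        self.prepared = {}     # statement name -> SQL text with $n placeholders
        self.setup_count = 0   # how many times the parse/plan work was paid

    def query(self, sql):
        # Plain text query: the setup work is paid on every call,
        # whether or not similar text was sent before.
        self.setup_count += 1

    def prepare(self, name, sql):
        # PREPARE: the setup work is paid once and stored under a name.
        self.setup_count += 1
        self.prepared[name] = sql

    def execute(self, name, *params):
        # EXECUTE: reuse the stored statement, only the values change.
        return self.prepared[name], params


as_text = ToySession()
as_text.query("SELECT * FROM mytable WHERE item = 5")
as_text.query("SELECT * FROM mytable WHERE item = 10")

as_prepared = ToySession()
as_prepared.prepare("find_item", "SELECT * FROM mytable WHERE item = $1")
as_prepared.execute("find_item", 5)
as_prepared.execute("find_item", 10)

print("text queries:     setup paid %d times" % as_text.setup_count)
print("prepared queries: setup paid %d times" % as_prepared.setup_count)
```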

Obviously to me or you they could use the same plan. From what I understand (correct me if I'm wrong), if you use parameter binding - like "SELECT * FROM mytable WHERE item = ?" - PostgreSQL will know that the queries can re-use the query plan, but I don't know if the system will recognize this in the above situation.

It depends on whether your client library is smart enough to prepare the statements...

Also, what's the difference between prepared statements (using PREPARE and EXECUTE) and regular functions (CREATE FUNCTION)? How do they impact performance? From what I understand there is no exact parallel to stored procedures (as in MS SQL or Oracle, that are completely precompiled) in PostgreSQL. At the same time, the documentation (and other sites as well, probably because they don't know what they're talking about when it comes to databases) is vague because PL/pgSQL is often said to be able to write stored procedures but nowhere does it say that PL/pgSQL programs are precompiled.

PG stores stored procedures as text. On first invocation in each connection they are "compiled", i.e. all statements in the SP are prepared, so the first invocation in a connection is slower than subsequent ones. This is a problem if you do not use persistent connections.
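A small sketch of why connection lifetime matters here. This is a toy model with made-up names (ToyConnection, slow_calls, my_func), not a measurement: it only counts how many calls hit the slow first-invocation path when the connection is kept open versus reopened for every request.

```python
class ToyConnection(object):
    """Hypothetical stand-in for one database connection (toy model)."""

    def __init__(self):
        self.compiled = set()   # functions already "compiled" on this connection

    def call(self, func_name):
        """Return True if this call had to pay the first-invocation cost."""
        first_time = func_name not in self.compiled
        self.compiled.add(func_name)
        return first_time


def slow_calls(n_calls, persistent):
    """Count the slow (first-invocation) calls over n_calls requests."""
    conn = ToyConnection()
    slow = 0
    for _ in range(n_calls):
        if not persistent:
            conn = ToyConnection()   # new connection per request: nothing is kept
        if conn.call("my_func"):
            slow += 1
    return slow


print("persistent connection : %d slow calls out of 100" % slow_calls(100, True))
print("connection per request: %d slow calls out of 100" % slow_calls(100, False))
```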

A simple select, when prepared, will take about 25 microseconds inside a SP and 50-100 microseconds as a query over the network. If not prepared, about 150 µs, or 2-3x slower.
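For a rough feel of what these figures mean, the snippet below turns each per-query time into an upper bound on strictly sequential queries per second on one connection. It is plain arithmetic on the numbers quoted above, nothing measured; the helper name max_queries_per_second is just for illustration.

```python
# Per-query times quoted above, in microseconds.
timings_us = [
    ("prepared, inside a stored procedure", 25),
    ("prepared, over the network (low end)", 50),
    ("prepared, over the network (high end)", 100),
    ("not prepared", 150),
]


def max_queries_per_second(latency_us):
    # Upper bound for strictly sequential queries on a single connection.
    return 1000000 // latency_us


for label, us in timings_us:
    print("%-40s %4d us -> at most %6d queries/s"
          % (label, us, max_queries_per_second(us)))
```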

FYI, Postgres beats MyISAM on "small simple selects" if you use prepared queries.



        I use the following Python code to auto-prepare my queries :

db = PGConn( db_connector )   # db_connector : a function that returns a DB connection

db.prep_exec( "SELECT * FROM stuff WHERE id = %s", 1 )        # prepares and executes

db.prep_exec( "SELECT * FROM stuff WHERE id = %s", 2 )        # executes only


import traceback        # used by prep_exec() to report failures

class PGConn( object ):
        
        def __init__( self, db_connector ):
                self.db_connector = db_connector
                self.reconnect()
        
        def reconnect( self ):
                self.prep_cache = {}
                self.db = self.db_connector()
                self.db.set_isolation_level( 0 ) # autocommit
        
        def cursor( self ):
#               return self.db.cursor( cursor_factory=psycopg2.extras.DictCursor )
                return self.db.cursor(  )
                
        def execute( self, sql, *args ):
                cursor = self.cursor()
                try:
                        cursor.execute( sql, args )
                except:
                        cursor.execute( "ROLLBACK" )
                        raise
                return cursor

        def executemany( self, sql, *args ):
                cursor = self.cursor()
                try:
                        cursor.executemany( sql, args )
                except:
                        cursor.execute( "ROLLBACK" )
                        raise
                return cursor

        def prep_exec( self, sql, *args ):
                cursor = self.cursor()
                stmt = self.prep_cache.get( sql )
                if stmt is None:
                        name = "stmt_%s" % (len( self.prep_cache ) + 1)
                        if args:
                                # replace each %s placeholder with $1, $2, ...
                                prep = sql % tuple( "$%d"%(x+1) for x in range( len( args )) )
                        else:
                                prep = sql
                        prep = "PREPARE %s AS %s" % (name, prep)
                        cursor.execute( prep )
                        if args:
                                stmt = "EXECUTE %s( %s )" % (name, ", ".join( ["%s"] * len( args ) ))
                        else:
                                stmt = "EXECUTE %s" % (name,)
                        self.prep_cache[ sql ] = stmt
                        
                try:
                        cursor.execute( stmt, args )
                except Exception:
                        traceback.print_exc()
                        print( "Error while executing prepared SQL statement : %s" % (stmt,) )
                        print( "Arguments : %s" % (args,) )
                        print( "Original SQL is : %s" % (sql,) )
                        cursor.execute( "ROLLBACK" )
                        raise
                
                return cursor
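
To see what prep_exec() actually sends to the server without needing a database, the string building can be pulled out into a standalone function. This is just a sketch mirroring the logic above; build_statements is a made-up helper name, not part of psycopg2.

```python
def build_statements(sql, name, nargs):
    """Return (PREPARE text, EXECUTE template) for a query written with %s placeholders."""
    if nargs:
        # "%s" -> "$1", "$2", ... for the server-side PREPARE
        body = sql % tuple("$%d" % (i + 1) for i in range(nargs))
        # the EXECUTE keeps one %s per argument, filled in by the driver later
        execute = "EXECUTE %s( %s )" % (name, ", ".join(["%s"] * nargs))
    else:
        body = sql
        execute = "EXECUTE %s" % (name,)
    return "PREPARE %s AS %s" % (name, body), execute


prepare, execute = build_statements("SELECT * FROM stuff WHERE id = %s", "stmt_1", 1)
print(prepare)
print(execute)
```

Note that because the placeholders are substituted with Python's % operator, a query containing a literal % (for example in a LIKE pattern) would need it written as %% for this approach to work.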

