Re: [PyGreSQL] query formatting is nearly 10x slower than query

Justin Pryzby Sat, 19 Jan 2019 09:25:13 -0800

I propose addressing this with a few related changes:

1) in 5.0, document that relative to query, query_formatted has an overhead
"which can be significant for queries repeated many times", and document that
the mitigation is to use inline=True; or, use prepared statements "available
since 5.1".  Note that for simple queries like INSERT, the significant overhead
is in pygres, but for complex queries like JOINs/large inheritance trees/etc,
more of the overhead is in planning.


2) For 5.1.1 (and maybe 5.0), something to mitigate the cost of isinstance() in
pg and pgdb.

3) In 5.1 (but probably not 5.0?), consider changing query_formatted default to
inline=True.  In my test, this inserted 30% faster (!) even with no 2nd patch.

|$ python2.7 ./testinsert.py
|diff 192.718273878
|vs
|$ python2.7 ./testinsert.py
|diff 309.562824965
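
For anyone wanting to reproduce this kind of comparison, here is a minimal
timing sketch. It is not the testinsert.py used above; the helper name
time_calls, the table t and the statement name 'ins' are made up for
illustration, and the prepare/query_prepared calls assume the 5.1 DB wrapper
signatures.

```python
import time


def time_calls(run_once, repeat=10000):
    """Return elapsed wall-clock seconds for `repeat` calls of run_once(i)."""
    start = time.time()
    for i in range(repeat):
        run_once(i)
    return time.time() - start


# Hypothetical usage against a scratch table t(i int), assuming PyGreSQL 5.1:
#
#   import pg
#   db = pg.DB('ts')
#   sql = 'INSERT INTO t VALUES (%s)'
#   print('params   %s' % time_calls(
#       lambda i: db.query_formatted(sql, [i])))
#   print('inline   %s' % time_calls(
#       lambda i: db.query_formatted(sql, [i], inline=True)))
#   # Assumed 5.1 signatures: prepare(name, command), query_prepared(name, *args)
#   db.prepare('ins', 'INSERT INTO t VALUES ($1)')
#   print('prepared %s' % time_calls(
#       lambda i: db.query_prepared('ins', i)))
```

Passing the statement as a callable keeps the loop overhead identical across
the variants, so differences in the totals come from the call being measured.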

That might be good to consider for other reasons: there's 1) pqExec vs 2)
pqExecParams.  1) supports multiple commands; but 2) allows binary protocol
(which pygres doesn't currently support).  Binary protocol (or anything using
pqExecParams) will never support multiple commands.

If there aren't params, query_formatted currently calls query and pqExec, to
allow the possibility of including multiple commands.  I wonder whether
(starting in v5.1) pygres should instead call pqExecParams() even when there
are no params.  Otherwise it's odd that query_formatted calls pqExec only
sometimes, and that conditional complicates any future support for things
like binary format.  I realize that binary format isn't going to happen
anytime soon, if ever, but 5.1 is maybe an opportunity to make that change.

Maybe multiple commands should be documented, but it's odd to write that q_f
supports multiple commands "if there are no params, or if inline=True".  It'd be
better to be able to say "multiple commands are not supported except when
inline=True"; or, if that was the default, "multiple commands are not supported
if inline=False".
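
To make the current rule concrete, here is a small model of the behaviour as
described in this message. It is only a sketch: the function names are
hypothetical and this is not code from pg.py. The returned strings use the
spellings from this message for the two libpq entry points.

```python
def libpq_call(has_params, inline=False):
    """Model which libpq call query_formatted ends up in, per this message.

    With inline=True the parameters are formatted into the SQL text, so no
    separate parameters are left to send and the plain query path is used.
    """
    if inline or not has_params:
        return 'pqExec'
    return 'pqExecParams'


def allows_multiple_commands(has_params, inline=False):
    """Only the pqExec path accepts several commands in one string."""
    return libpq_call(has_params, inline) == 'pqExec'


# The documentation would have to spell out this whole table today:
for has_params in (False, True):
    for inline in (False, True):
        print('params=%-5s inline=%-5s -> %-12s multiple=%s' % (
            has_params, inline, libpq_call(has_params, inline),
            allows_multiple_commands(has_params, inline)))
```

Of the four combinations, only "params and inline=False" rejects multiple
commands, which is why the documented rule ends up with an awkward "or" in it.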

If pg always (or defaulted) used inline=True, we could always use multiple
commands (even with pqExecParams), and it would be similar to pgdb.  But maybe
that's moving in the wrong direction, too..

Or maybe that should wait until binary protocol is on the table :)

I realize this message is addressing multiple things and maybe not very
focused, but a few of these are kind of connected.

Cheers,
Justin

>>> pg.DB('ts').query_formatted('SELECT %s; SELECT %s;', [1,1]).getresult()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pg.py", line 1879, in query_formatted
    command, parameters, types, inline, prepare))
  File "pg.py", line 1861, in query
    return self.db.query(command, args)
pg.ProgrammingError: ERROR:  cannot insert multiple commands into a prepared statement

>>> pg.DB('ts').query_formatted('SELECT %s; SELECT %s;', [1,1],
...     inline=True).getresult()
[(1,)]

_______________________________________________
PyGreSQL mailing list
PyGreSQL@Vex.Net
https://mail.vex.net/mailman/listinfo/pygresql
