On 11/8/20 2:21 PM, Daniele Varrazzo wrote:
On Sun, 8 Nov 2020 at 20:35, Adrian Klaver <adrian.kla...@aklaver.com> wrote:

Alright I understand now.
More below.


In psycopg3 the idea is to use a more advanced protocol, which
separates query and parameters. It brings several benefits: can use
prepared statements (send a query once, several parameters later),
passing large data doesn't bloat the parser (the params don't hit the
lexer/parser), can use binary format (useful to pass large binary
blobs without escaping them in a textual form), the format of the data
is more homogeneous (no need to quoting), so we can use Python objects
in COPY instead of limiting the interface for the copy functions to
file-like objects only.

Both in psycopg2 and 3 there is an adaptation from Python types to
Postgres string representation. In pg2 there is additional quoting,
because apart from numbers and bools you need to quote a literal
string to merge it to the query and make it syntactically valid.

So the issue in the psycopg3 protocol is making the parameters that are
passed in separately match up correctly in type to what the server is
expecting(or can cast implicitly)?

Yes, correct. What we have to choose is which Postgres oid to map to
each Python type.

Sometimes the mapping is trivial (e.g. `datetime.date` -> `date` in
Postgres, `uuid.UUID` -> `uuid`...)

Sometimes it might be ambiguous: is a `datetime.datetime` a
`timestamp` or a `timestamptz`? In some cases we don't care (here we
can say `timestamptz` no problem: if the Python datetime doesn't have
tzinfo, Postgres will use the `TimeZone` setting).

Sometimes it's messy: what Python type corresponds to a Postgres
`jsonb`? It might be a dict, or a list, or types that have other
representations too (numbers, strings, bools). In this case, as in
psycopg2, there can be a wrapper, e.g. `Json`, to tell psycopg that
this dict, or list, or whatever else, must be jsonified for the db.

When there are mismatches, sometimes the database cast rules help
(e.gi in the timestamp[tz] case). Sometimes not: if we say `text` to a
jsonb field, it will raise an error. Sometimes a cast is automatic on
inserting in a table but not on passing a function parameter.

Numbers are messy, as they usually are: Python has int, float,
Decimal, Postgres has int2, int4, int8, float4, float8, numeric. The
mappings float -> float8 and Decimal -> numeric are more or less
straightforward. `int` is not, as in Python it's unbounded. If you say
`select 10` in psql, the server understands "unknown type, but a
number", and can try if either int* or numeric fit the context. But we
don't have the help from the syntax that psql has: because 10 doesn't
have quotes, Postgres is sure that it is a number, and not a string,
but executing query/params separately we lose that expressivity: we
cannot quote the strings and not the number. So choices are:

1. If we specify `numeric` or `int8` as oid, inserting in an int field
in a table will work ok, but some functions/operators won't (e.g. "1

What is not working here?

%s").
2. If we specify `int4` it would work for those few functions defined
as `integer`, but if we try to write a number that doesn't fit in 32
bits into a Postgres bigint field I assume something will overflow
along the way, even if both python and postgres can handle it.
3. If we specify `unknown` it might work more often, but
`cursor.execute("select %s", [10]) will return the string "10" instead
of a number.

So I wonder what's the best compromise to do here: the less bad seems
1. 3. might work in more contexts, but it's a very counterintuitive
behaviour, and roundtripping other objects (dates, uuid) works no
problem: they don't come back as strings.

There is a lot to digest here. I'm going to have to do some thinking on this.


-- Daniele



--
Adrian Klaver
adrian.kla...@aklaver.com


Reply via email to