A wild idea: support both client-side (like in psycopg2) and server-side binding. Keep old '%s' syntax and provide a separate method for client-side binding (not in 'cur.execute()'). This could alleviate cases like parameters roundtripping and other cases of safe query composition. At the same time use '$N' for true server-side binding. Is it an overcomplication or there are valid use-cases of that?
On Sun, 8 Nov 2020 at 18:19, Vladimir Ryabtsev <greatvo...@gmail.com> wrote: > Hello, > > From what I understood from your messages, I like passing 'unknown' for > both strings and numbers. > Roundtripping parameters seems to be a less common case (with a possible > fix if it's necessary). > Is there anything else that does not work or works counterintuitively with > Python 'int' -> 'unknown'? > > BTW, may I ask another question regarding parameters? > Don't you want to step away from '%s' syntax and use '$1, $2, ...' which > seems to be more traditional in the database world? > '%s' feels like old-school string formatting, new server-side parameter > binding may want to give some new impression. > Moreover, it appears more convenient when you have parameters numbered and > can reuse them a few times in a query. > > Vladimir > > On Sun, 8 Nov 2020 at 14:22, Daniele Varrazzo <daniele.varra...@gmail.com> > wrote: > >> On Sun, 8 Nov 2020 at 20:35, Adrian Klaver <adrian.kla...@aklaver.com> >> wrote: >> >> > Alright I understand now. >> > More below. >> > >> > > >> > > In psycopg3 the idea is to use a more advanced protocol, which >> > > separates query and parameters. It brings several benefits: can use >> > > prepared statements (send a query once, several parameters later), >> > > passing large data doesn't bloat the parser (the params don't hit the >> > > lexer/parser), can use binary format (useful to pass large binary >> > > blobs without escaping them in a textual form), the format of the data >> > > is more homogeneous (no need to quoting), so we can use Python objects >> > > in COPY instead of limiting the interface for the copy functions to >> > > file-like objects only. >> > > >> > > Both in psycopg2 and 3 there is an adaptation from Python types to >> > > Postgres string representation. In pg2 there is additional quoting, >> > > because apart from numbers and bools you need to quote a literal >> > > string to merge it to the query and make it syntactically valid. >> > >> > So the issue in the psycopg3 protocol is making the parameters that are >> > passed in separately match up correctly in type to what the server is >> > expecting(or can cast implicitly)? >> >> Yes, correct. What we have to choose is which Postgres oid to map to >> each Python type. >> >> Sometimes the mapping is trivial (e.g. `datetime.date` -> `date` in >> Postgres, `uuid.UUID` -> `uuid`...) >> >> Sometimes it might be ambiguous: is a `datetime.datetime` a >> `timestamp` or a `timestamptz`? In some cases we don't care (here we >> can say `timestamptz` no problem: if the Python datetime doesn't have >> tzinfo, Postgres will use the `TimeZone` setting). >> >> Sometimes it's messy: what Python type corresponds to a Postgres >> `jsonb`? It might be a dict, or a list, or types that have other >> representations too (numbers, strings, bools). In this case, as in >> psycopg2, there can be a wrapper, e.g. `Json`, to tell psycopg that >> this dict, or list, or whatever else, must be jsonified for the db. >> >> When there are mismatches, sometimes the database cast rules help >> (e.gi in the timestamp[tz] case). Sometimes not: if we say `text` to a >> jsonb field, it will raise an error. Sometimes a cast is automatic on >> inserting in a table but not on passing a function parameter. >> >> Numbers are messy, as they usually are: Python has int, float, >> Decimal, Postgres has int2, int4, int8, float4, float8, numeric. The >> mappings float -> float8 and Decimal -> numeric are more or less >> straightforward. `int` is not, as in Python it's unbounded. If you say >> `select 10` in psql, the server understands "unknown type, but a >> number", and can try if either int* or numeric fit the context. But we >> don't have the help from the syntax that psql has: because 10 doesn't >> have quotes, Postgres is sure that it is a number, and not a string, >> but executing query/params separately we lose that expressivity: we >> cannot quote the strings and not the number. So choices are: >> >> 1. If we specify `numeric` or `int8` as oid, inserting in an int field >> in a table will work ok, but some functions/operators won't (e.g. "1 >> >> %s"). >> 2. If we specify `int4` it would work for those few functions defined >> as `integer`, but if we try to write a number that doesn't fit in 32 >> bits into a Postgres bigint field I assume something will overflow >> along the way, even if both python and postgres can handle it. >> 3. If we specify `unknown` it might work more often, but >> `cursor.execute("select %s", [10]) will return the string "10" instead >> of a number. >> >> So I wonder what's the best compromise to do here: the less bad seems >> 1. 3. might work in more contexts, but it's a very counterintuitive >> behaviour, and roundtripping other objects (dates, uuid) works no >> problem: they don't come back as strings. >> >> -- Daniele >> >> >>