Re: [HACKERS] Linking libpq statically to libssl
On Fri, Oct 27, 2017 at 2:37 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Daniele Varrazzo <daniele.varra...@gmail.com> writes: >> I have a problem building binary packages for psycopg2. Binary >> packages ship with their own copies of libpq and libssl; however if >> another python package links to libssl the library will be imported >> twice with conflicting symbols, likely resulting in a segfault (see >> https://github.com/psycopg/psycopg2/issues/543). This happens e.g. if >> a python script both connects to postgres and opens an https resource. > > Basically, you're doing it wrong. Shipping your own copy of libssl, > rather than depending on whatever packaging the platform provides, > is just asking for pain --- and not only of this sort. You're also > now on the hook to update your package whenever libssl fixes a bug > or a security vulnerability, which happens depressingly often. > > The same applies to libpq, really. You don't want to be in the > business of shipping bits that you are not the originator of. > > When I worked at Red Hat, there was an ironclad policy against > building packages that incorporated other packages statically. > I would imagine that other distros have similar policies for > similar reasons. Just because you *can* ignore those policies > doesn't mean you *should*. Distros do compile the library from source and against the system package, and everyone using the package directly can still do so; the binary packages are only installed by the Python package manager. Psycopg is more complex to install than the average Python package (it needs python and postgres dev files, pg_config available somewhere etc) and a no-dependencies package provides a much smoother experience. It also happens that the libpq and libssl versions used tend to be more up-to-date than the system one (people can already use the new libpq 10 features without waiting for debian packages). 
I am confronted with the reality of Python developers as of the 2010s, and shipping binary packages has proven a generally positive feature, even factoring in the need to ship updated binary packages when the need arises. The benefit of a simple-to-use library is for the Postgres project at large; it is not for my personal gain. So while I know the shortcomings of binary packages and static libraries, I would still be interested in knowing the best way to fix the problem in the subject. Feel free to write in private if you want to avoid the public shaming. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Linking libpq statically to libssl
Hello, I have a problem building binary packages for psycopg2. Binary packages ship with their own copies of libpq and libssl; however if another python package links to libssl the library will be imported twice with conflicting symbols, likely resulting in a segfault (see https://github.com/psycopg/psycopg2/issues/543). This happens e.g. if a python script both connects to postgres and opens an https resource. A solution to work around the problem could be to change the libssl symbol names, in the binaries or in the source code (https://github.com/psycopg/psycopg2-wheels/issues/7). But maybe a simpler workaround could be just to build libpq linking to libssl statically. Is this possible? ...and to other libs too, I guess: the problem was found with libssl, but conflicts with other libs are possible too; ldap comes to mind... Any help to solve the problem would be appreciated. Thank you very much. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Help required to debug pg_repack breaking logical replication
Hello, we have received reports, and I have experienced a couple of times myself, of pg_repack breaking logical replication. - https://github.com/reorg/pg_repack/issues/135 - https://github.com/2ndQuadrant/pglogical/issues/113 In my experience, after the botched run, the replication slot was "stuck", and any attempt at reading it (including pg_logical_slot_peek_changes()) blocked until ctrl-c. I've tried replicating the issue but my first attempts have failed to fail. In the above issue #113, Petr Jelinek commented: > From quick look at pg_repack, the way it does table rewrite is almost > guaranteed > to break logical decoding unless there is zero unconsumed changes for a given > table > as it does not build the necessary mappings info for logical decoding that > standard > heap rewrite in postgres does. unfortunately he didn't follow up to the requests for further details. I've started drilling down into the problem, observing that the swap function, swap_heap_or_index_files() [1], was cargoculted by the original author from the CLUSTER command code as of PG 8.2 [2] (with a custom addition to update the relfrozenxid, which seems backwards to me as it sets the older frozen xid on the new table [3]). [1] https://github.com/reorg/pg_repack/blob/ver_1.4.1/lib/repack.c#L1082 [2] https://github.com/postgres/postgres/blob/REL8_2_STABLE/src/backend/commands/cluster.c#L783 [3] https://github.com/reorg/pg_repack/issues/152 so that code is effectively missing a good 10 years of development. Before jumping into fast-forwarding it, I would like to ask for some help, i.e. - Is Petr's diagnosis right, and is the freezing of logical replication to be blamed on the missing mappings? - Can you suggest a test to reproduce the issue reliably? - What are mapped relations anyway? Thank you in advance for any help (either info about how to fix the issue properly, or a patch by someone who happens to really understand what we are talking about).
-- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [pgsql-translators] [HACKERS] translatable string fixes
On Tue, May 23, 2017 at 1:15 AM, Alvaro Herrera wrote: > It took me a very long time to figure out how to translate these 9.6-new > strings in the AM validate routines: > > msgid "gin operator family \"%s\" contains support procedure %s with > cross-type registration" > > The problem I had was that the term "cross-type registration" is not > used anywhere else, it's not clear what it means, and (worst from my > POV) I couldn't think of any decent phrasing in Spanish for it. After > staring at the code for a while, I translated them roughly as: > > "gin operator family %s contains support procedure %s registered with > differing types" > > which I think is a tad clearer ... but as a user confronted with such a > message, I would be at a complete loss on what to do about it. I did something similar, translating the equivalent of "across different types". I had to look at the source code to understand the meaning of the sentence. Maybe "cross-type registration" is a bit too cryptic. > Maybe we can use this phrasing: > "gin operator family %s contains support procedure %s registered with > different left and right types" > > > The other complaint I have about this one and also other amvalidate > functions is the hardcoded AM name, so it's actually one string per AM, > which is annoying (a total of twenty-something messages which are > actually only four or five different ones). Ignoring the second part of > the phrase now, we could use this: > "operator family %s of access method %s contains support procedure %s with > cross-type registration" Yup, that was boring and error-prone :\ -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detecting schema changes during logical replication
On Mon, May 8, 2017 at 3:48 AM, Craig Ringer wrote: > Sounds like you're reimplementing pglogical > (http://2ndquadrant.com/pglogical) on top of a json protocol. The fact the protocol is JSON is more a detail, but it's a good start as it's human-readable. > [...] > I have no reason to object to your doing it yourself, and you're welcome to > use pglogical as a reference for how to do things (see the license). It just > seems like a waste. Logical replication, for the first time, offers a way to implement a replication solution that is not several layers away from the database. Or even: for the first time it is something I understand. Using logical replication we can perform some manipulation of the data I will want to use (tables not necessarily in the same places, schemas not necessarily matching). In particular, one not-really-minor annoyance of the current system is that adding a column on the master regularly breaks the replica, and pglogical doesn't resolve this problem. We currently use certain features of Londiste (tables living in different schemas, slightly different data types with the same textual representation, extra columns on the slave...) that are against pglogical requirements, but which can be implemented no problem in a customised replication solution built on top of streaming replication. All in all I'm more thrilled by the idea of having a database throwing a stream of changes at me in a format I can reinterpret, allowing me to write into a target database with a high degree of configurability in the middle (i.e. I just have a Python script receiving the data, munging them and then performing queries), than by a complete but schema-rigid replication solution. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
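To make the "munging" idea above concrete, here is a toy sketch (not pglogical, not production code) of the kind of Python script described: it consumes one wal2json-style change message and rewrites it for a target where the schema name differs and the slave has an extra column. The payload field names follow wal2json's output format; the mapping tables and function names are invented for illustration.

```python
import json

SCHEMA_MAP = {"public": "replica"}   # hypothetical schema mapping
EXTRA_COLS = {"synced": "true"}      # hypothetical extra column on the slave

def change_to_sql(msg):
    # Turn a wal2json-like "insert" change into an INSERT for the target db,
    # remapping the schema and appending the extra columns.
    change = json.loads(msg)
    if change["kind"] != "insert":
        raise NotImplementedError(change["kind"])
    schema = SCHEMA_MAP.get(change["schema"], change["schema"])
    cols = change["columnnames"] + list(EXTRA_COLS)
    placeholders = ["%s"] * len(change["columnnames"]) + list(EXTRA_COLS.values())
    sql = "INSERT INTO %s.%s (%s) VALUES (%s)" % (
        schema, change["table"], ", ".join(cols), ", ".join(placeholders))
    return sql, change["columnvalues"]

# a sample change message, in the shape wal2json emits per tuple
msg = json.dumps({
    "kind": "insert", "schema": "public", "table": "t",
    "columnnames": ["id", "x"], "columnvalues": [1, "a"],
})
sql, params = change_to_sql(msg)
```

The query and parameters would then be handed to the target connection (e.g. a psycopg cursor), which is where the configurability in the middle lives.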
Re: [HACKERS] Detecting schema changes during logical replication
On Sun, May 7, 2017 at 8:04 PM, Andres Freund <and...@anarazel.de> wrote: > Hi, > > On 2017-05-07 19:27:08 +0100, Daniele Varrazzo wrote: >> I'm putting together a replication system based on logical >> replication. > > Interesting. If you very briefly could recap what it's about... ;) I need to replicate some tables from a central database into the database that should run a secondary system. For a similar use case we have used Londiste in the past, which has served us well, but its usage has not been problem-free. Logical decoding seems much less invasive on the source database than a trigger-based replication solution, and has fewer moving parts to care about and maintain. For the moment I'm hacking into a fork of Euler's project for wal decoding into json (https://github.com/dvarrazzo/wal2json), mostly adding configurability, so that we may be able to replicate only the tables we need, skip certain fields etc. I'm also taking a look at minimising the amount of information produced: sending over and over the column names and types for every record seems a waste, hence my question. >> I would like to send table information only the first >> time a table is seen by the 'change_cb' callback, but of course there >> could be some schema change after replication started. So I wonder: is >> there any information I can find in the 'Relation' structure of the >> change callback, which may suggest that there could have been a change >> in the table schema, hence a new schema should be sent to the client? > > The best way I can think of - which is also what is implemented in the > in-core replication framework - is to have a small cache on-top of the > relcache. That cache is kept coherent using > CacheRegisterRelcacheCallback(). Then whenever there's a change you > look up that change in that cache, and send the schema information if > it's been invalidated since you last sent something.
That's also how > the new stuff in v10 essentially works: > src/backend/replication/pgoutput/pgoutput.c > > pgoutput_change(), does a lookup for its own metadata using > get_rel_sync_entry() > which then checks relentry->schema_sent. Invalidation unsets > schema_sent in rel_sync_cache_relation_cb. Thank you very much, it seems exactly what I need. I'll try hacking around this callback. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
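For what it's worth, the scheme described above can be sketched as a toy Python analogue (the real thing is C code on top of the relcache; all names here are invented for illustration): schema information is sent only the first time a relation is seen, and an invalidation, e.g. triggered by an ALTER TABLE, forces it to be re-sent on the next change.

```python
class SchemaCache:
    """Toy analogue of the schema_sent cache in pgoutput (not the real API)."""

    def __init__(self):
        self._sent = {}  # relation oid -> schema already sent to the client?

    def on_change(self, oid, schema, send):
        # called from the change callback: emit the schema only when needed
        if not self._sent.get(oid):
            send(schema)
            self._sent[oid] = True

    def invalidate(self, oid):
        # analogue of the relcache invalidation callback: any invalidation
        # of the relation forces the schema to be sent again
        self._sent[oid] = False

sent = []
cache = SchemaCache()
cache.on_change(16384, {"cols": ["a"]}, sent.append)        # first time: sent
cache.on_change(16384, {"cols": ["a"]}, sent.append)        # cached: skipped
cache.invalidate(16384)                                     # e.g. ALTER TABLE
cache.on_change(16384, {"cols": ["a", "b"]}, sent.append)   # re-sent
```

The point of hooking the invalidation to the relcache callback, rather than diffing the Relation on every change, is that it is both cheap and coherent with any DDL that touched the table.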
[HACKERS] Detecting schema changes during logical replication
Hello, I'm putting together a replication system based on logical replication. I would like to send table information only the first time a table is seen by the 'change_cb' callback, but of course there could be some schema change after replication started. So I wonder: is there any information I can find in the 'Relation' structure of the change callback, which may suggest that there could have been a change in the table schema, hence a new schema should be sent to the client? Thank you very much, -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 'text' instead of 'unknown' in Postgres 10
On Mon, Feb 13, 2017 at 3:09 AM, Jim Nasby <jim.na...@bluetreble.com> wrote: > On 2/7/17 9:16 AM, Daniele Varrazzo wrote: >> >> Thank you for the clarification: I will assume the behaviour cannot be >> maintained on PG 10 and think whether the treatment of '{}' is too >> magical and drop it instead. > > > BTW, I would hope that passing '{}' into a defined array field still works, > since an empty array isn't treated the same as NULL, which means you need > some way to create an empty array. Yes, that didn't change. The issue was only reading data from postgres to python. There is a beta of next psycopg version available on testpypi which implements the new behaviour, if you want to take a look at it. You can use pip install -i https://testpypi.python.org/pypi psycopg2==2.7b1 to install it. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 'text' instead of 'unknown' in Postgres 10
On Tue, Feb 7, 2017 at 2:59 PM, Andreas Karlsson <andr...@proxel.se> wrote: > On 02/07/2017 03:14 PM, Daniele Varrazzo wrote: >> >> In psycopg '{}'::unknown is treated specially as an empty array and >> converted into an empty list, which allows empty lists to be passed to >> the server as arrays and returned back to python. Without the special >> case, empty lists behave differently from non-empty ones. It seems >> this behaviour cannot be maintained on PG 10 and instead users need to >> specify some form of cast for their placeholder. Previously this would >> have worked "as expected" and the 4th argument would have been an >> empty list: >> >> cur.execute("SELECT %s, %s, %s, %s", (['x'], [42], [date(2017,1,1)], >> [])); cur.fetchone() >> (['x'], [42], [datetime.date(2017, 1, 1)], '{}') > > > As Tom wrote this is the result of an intentional change, but no matter if > that change is a good thing or not the above behavior sounds rather fragile. > To me it does not seem safe to by default just assume that '{}' means the > empty array, it might also have been intended to be the Python string "{}", > the empty JSON object, or entirely something different. Yes, it could be actually the case to drop it. The case for it is quite thin anyway: if something comes from a query it will usually have a type attached. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 'text' instead of 'unknown' in Postgres 10
On Tue, Feb 7, 2017 at 2:42 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Daniele Varrazzo <daniele.varra...@gmail.com> writes: >> testing with psycopg2 against Postgres 10 I've found a difference in >> behaviour regarding literals, which are returned as text instead of >> unknown. ... >> Is this behaviour here to stay? Is there documentation for this change? > > Yup, see > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1e7c4bb0049732ece651d993d03bb6772e5d281a > > The expectation is that clients will never see "unknown" output columns > anymore. Ok thank you, I'll document the change in behaviour. > I don't have enough context to suggest a better definition for psycopg > ... but maybe you could pay some attention to the Python type of the value > you're handed? In python the only type is the list, there is no specific "list of integer" or such. >> It seems >> this behaviour cannot be maintained on PG 10 and instead users need to >> specify some form of cast for their placeholder. > > Well, no version of PG has ever allowed this without a cast: > > regression=# select array[]; > ERROR: cannot determine type of empty array > > so I'm not sure it's inconsistent for the same restriction to apply in the > case you're describing. I'm also unclear on why you are emphasizing the > point of the array being empty, because '{1,2,3}'::unknown would have the > same behavior. The inconsistency is on our side: on python list [1,2,3] we generate 'ARRAY[1,2,3]', and empty lists are instead converted to '{}' precisely because there is no such thing like unknown[] - nor we can generate array[]::int[] because the Python list is empty and we don't know if it would have contained integers or other stuff. 
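For reference, the adaptation rules described above can be sketched roughly as follows (helper names invented, this is not psycopg2's actual code): a non-empty Python list can become an ARRAY[...] literal, whose element type the server infers from the elements, while an empty list can only fall back to the untyped '{}' literal.

```python
from datetime import date

def adapt(v):
    # Rough sketch of client-side adaptation of Python values to SQL literals.
    if isinstance(v, str):
        return "'%s'" % v.replace("'", "''")
    if isinstance(v, date):
        return "'%s'::date" % v.isoformat()
    if isinstance(v, list):
        if not v:
            # no elements: the element type is unknowable, so the best we
            # can emit is an untyped literal (formerly of type "unknown")
            return "'{}'"
        return "ARRAY[%s]" % ", ".join(adapt(x) for x in v)
    return str(v)
```

With the unknown-to-text change in PG 10, the '{}' branch comes back from the server as text rather than being coercible to the context's array type, which is the inconsistency discussed above.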
Of course this only works because we merge arguments in the adapter: moving to use PQexecParams we couldn't allow that anymore and the user should be uniformly concerned with adding casts to their queries (this is a non-backward compatible change so only planned for a mythical psycopg3 version I've long desired to write but for which I have no resource). Thank you for the clarification: I will assume the behaviour cannot be maintained on PG 10 and think whether the treatment of '{}' is too magical and drop it instead. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] 'text' instead of 'unknown' in Postgres 10
Hello, testing with psycopg2 against Postgres 10 I've found a difference in behaviour regarding literals, which are returned as text instead of unknown. In previous versions:

In [2]: cnn = psycopg2.connect('')
In [3]: cur = cnn.cursor()
In [7]: cur.execute("select 'x'")
In [9]: cur.description[0][1]
Out[9]: 705

In pg10 master:

In [10]: cnn = psycopg2.connect('dbname=postgres host=localhost port=54310')
In [11]: cur = cnn.cursor()
In [12]: cur.execute("select 'x'")
In [13]: cur.description[0][1]
Out[13]: 25

What is somewhat surprising is that unknown seems promoted to text "on the way out" from a query; in previous versions both columns of this query would have been "unknown".

postgres=# select pg_typeof('x'), pg_typeof(foo) from (select 'x' as foo) x;
 pg_typeof | pg_typeof
-----------+-----------
 unknown   | text

Is this behaviour here to stay? Is there documentation for this change? In psycopg '{}'::unknown is treated specially as an empty array and converted into an empty list, which allows empty lists to be passed to the server as arrays and returned back to python. Without the special case, empty lists behave differently from non-empty ones. It seems this behaviour cannot be maintained on PG 10 and instead users need to specify some form of cast for their placeholder. Previously this would have worked "as expected" and the 4th argument would have been an empty list: cur.execute("SELECT %s, %s, %s, %s", (['x'], [42], [date(2017,1,1)], [])); cur.fetchone() (['x'], [42], [datetime.date(2017, 1, 1)], '{}') Should I just take this test off from the test suite and document the adapter as behaving differently on PG 10? Thank you very much -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Redundant error messages in policy.c
There are 5 different strings (one has a whitespace error), they could be 2. Patch attached.

postgresql/src/backend$ grep errmsg commands/policy.c | grep policy | sed 's/^ *//'
errmsg("policy \"%s\" for relation \"%s\" already exists",
errmsg("policy \"%s\" on table \"%s\" does not exist",
errmsg("policy \"%s\" for table \"%s\" already exists",
errmsg("policy \"%s\" for table \"%s\" does not exist",
errmsg("policy \"%s\" for table  \"%s\" does not exist",

-- Daniele

diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c
index 6e95ba2..11efc9f 100644
--- a/src/backend/commands/policy.c
+++ b/src/backend/commands/policy.c
@@ -563,7 +563,7 @@ CreatePolicy(CreatePolicyStmt *stmt)
 	if (HeapTupleIsValid(policy_tuple))
 		ereport(ERROR,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
-				 errmsg("policy \"%s\" for relation \"%s\" already exists",
+				 errmsg("policy \"%s\" for table \"%s\" already exists",
 						stmt->policy_name, RelationGetRelationName(target_table))));

 	values[Anum_pg_policy_polrelid - 1] = ObjectIdGetDatum(table_id);
@@ -735,7 +735,7 @@ AlterPolicy(AlterPolicyStmt *stmt)
 	if (!HeapTupleIsValid(policy_tuple))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_OBJECT),
-				 errmsg("policy \"%s\" on table \"%s\" does not exist",
+				 errmsg("policy \"%s\" for table \"%s\" does not exist",
 						stmt->policy_name, RelationGetRelationName(target_table))));
@@ -977,7 +977,7 @@ get_relation_policy_oid(Oid relid, const char *policy_name, bool missing_ok)
 		if (!missing_ok)
 			ereport(ERROR,
 					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("policy \"%s\" for table  \"%s\" does not exist",
+					 errmsg("policy \"%s\" for table \"%s\" does not exist",
							policy_name, get_rel_name(relid))));

 		policy_oid = InvalidOid;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Spurious full-stop in message
postgresql/src/backend$ grep "must be superuser to change bypassrls attribute" commands/user.c | sed 's/ \+//'
errmsg("must be superuser to change bypassrls attribute.")));
errmsg("must be superuser to change bypassrls attribute")));

Patch to fix attached.

diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index 3b381c5..5b20994 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -301,7 +301,7 @@ CreateRole(CreateRoleStmt *stmt)
 		if (!superuser())
 			ereport(ERROR,
 					(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-					 errmsg("must be superuser to change bypassrls attribute.")));
+					 errmsg("must be superuser to change bypassrls attribute")));
 	}
 	else
 	{

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] libpq connection status and closed fd
Hello, a psycopg user is reporting [1] that the library is not marking the connection as closed and/or bad after certain errors, such as a connection timeout. He is emulating the error by closing the connection fd (I don't know if the two conditions result in the same effect, but I'll stick to this hypothesis for now). [1] https://github.com/psycopg/psycopg2/issues/263 Testing with a short C program [2] it seems that indeed, after closing the fd and causing an error in `PQconsumeInput`, the connection is deemed in good state by `PQstatus`. A similar test gives the same result after `PQexec` is attempted on a connection whose fd is closed (tests performed with PostgreSQL 9.3.5). [2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f Is this intentional? Is there a better way to check for a broken connection? If we mark the connection as closed every time `PQconsumeInput` returns 0 (or `PQexec` returns null, which are the two code paths affecting psycopg) would it be too aggressive and cause false positives? Thank you very much. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
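As a minimal illustration of the failure mode with plain file descriptors (no libpq involved, names invented): a read on a closed fd fails immediately with EBADF, so a client could mark the connection bad on the first failed read, which is the aggressive policy asked about above.

```python
import errno
import os

class Conn:
    """Toy stand-in for a connection object (not psycopg2 or libpq)."""

    def __init__(self, fd):
        self.fd = fd
        self.status_ok = True  # analogue of PQstatus() == CONNECTION_OK

    def consume_input(self):
        # analogue of PQconsumeInput(): 1 on success, 0 on error
        try:
            os.read(self.fd, 1)
            return 1
        except OSError as e:
            assert e.errno == errno.EBADF  # fd was closed under us
            self.status_ok = False         # aggressive: mark the conn bad
            return 0

r, w = os.pipe()
os.write(w, b"x")
conn = Conn(r)
assert conn.consume_input() == 1 and conn.status_ok

os.close(r)   # simulate the fd being closed behind libpq's back
os.close(w)
```

After the close, consume_input() returns 0 and the connection is flagged bad; whether libpq should flip its own status on such errors, or whether that would produce false positives on transient errors, is exactly the open question above.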
Re: [HACKERS] libpq connection status and closed fd
On Mon, Sep 22, 2014 at 8:17 AM, Dmitriy Igrishin <dmit...@gmail.com> wrote: > 2014-09-22 10:42 GMT+04:00 Daniele Varrazzo <daniele.varra...@gmail.com>: >> [2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f > Why are you using close() instead of PQfinish()? Because I'm testing for an error, please read my message and the original bug report. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fixed redundant i18n strings in json
I'd definitely replace s/arg/argument/. Furthermore I'd avoid the form "argument 1: something is wrong": the string is likely to end up in sentences with other colons so I'd rather use "something is wrong at argument 1". Is the patch attached better?

Cheers,

-- Daniele

On Thu, Jul 31, 2014 at 1:57 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Jul 30, 2014 at 2:59 PM, Daniele Varrazzo
> <daniele.varra...@gmail.com> wrote:
>> Please find attached a small tweak to a couple of strings found while
>> updating translations. The %d version is already used elsewhere in the
>> same file.
>
> Probably not a bad idea, but aren't these strings pretty awful anyway?
> At a minimum, "arg" as an abbreviation for "argument" doesn't seem good.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 2f99908..0c55e9a 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -1910,7 +1910,7 @@ json_object_agg_transfn(PG_FUNCTION_ARGS)
 	if (val_type == InvalidOid || val_type == UNKNOWNOID)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("arg 1: could not determine data type")));
+				 errmsg("could not determine data type for argument %d", 1)));

 	add_json(arg, false, state, val_type, true);
@@ -1934,7 +1934,7 @@ json_object_agg_transfn(PG_FUNCTION_ARGS)
 	if (val_type == InvalidOid || val_type == UNKNOWNOID)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("arg 2: could not determine data type")));
+				 errmsg("could not determine data type for argument %d", 2)));

 	add_json(arg, PG_ARGISNULL(2), state, val_type, false);
@@ -1980,7 +1980,8 @@ json_build_object(PG_FUNCTION_ARGS)
 	if (nargs % 2 != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("invalid number or arguments: object must be matched key value pairs")));
+				 errmsg("invalid number or arguments"),
+				 errhint("Object must be matched key value pairs.")));

 	result = makeStringInfo();
@@ -1994,7 +1995,8 @@ json_build_object(PG_FUNCTION_ARGS)
 		if (PG_ARGISNULL(i))
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-					 errmsg("arg %d: key cannot be null", i + 1)));
+					 errmsg("argument %d cannot be null", i + 1),
+					 errhint("Object keys should be text.")));

 		val_type = get_fn_expr_argtype(fcinfo->flinfo, i);
 		/*
@@ -2016,7 +2018,8 @@ json_build_object(PG_FUNCTION_ARGS)
 		if (val_type == InvalidOid || val_type == UNKNOWNOID)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-					 errmsg("arg %d: could not determine data type", i + 1)));
+					 errmsg("could not determine data type for argument %d",
+							i + 1)));
 		appendStringInfoString(result, sep);
 		sep = ", ";
 		add_json(arg, false, result, val_type, true);
@@ -2042,7 +2045,8 @@ json_build_object(PG_FUNCTION_ARGS)
 		if (val_type == InvalidOid || val_type == UNKNOWNOID)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-					 errmsg("arg %d: could not determine data type", i + 2)));
+					 errmsg("could not determine data type for argument %d",
+							i + 2)));

 		add_json(arg, PG_ARGISNULL(i + 1), result, val_type, false);
 	}
@@ -2099,7 +2103,8 @@ json_build_array(PG_FUNCTION_ARGS)
 		if (val_type == InvalidOid || val_type == UNKNOWNOID)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-					 errmsg("arg %d: could not determine data type", i + 1)));
+					 errmsg("could not determine data type for argument %d",
+							i + 1)));
 		appendStringInfoString(result, sep);
 		sep = ", ";
 		add_json(arg, PG_ARGISNULL(i), result, val_type, false);

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Fixed redundant i18n strings in json
Please find attached a small tweak to a couple of strings found while updating translations. The %d version is already used elsewhere in the same file.

-- Daniele

diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 2f99908..2aa27cc 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -1910,7 +1910,7 @@ json_object_agg_transfn(PG_FUNCTION_ARGS)
 	if (val_type == InvalidOid || val_type == UNKNOWNOID)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("arg 1: could not determine data type")));
+				 errmsg("arg %d: could not determine data type", 1)));

 	add_json(arg, false, state, val_type, true);
@@ -1934,7 +1934,7 @@ json_object_agg_transfn(PG_FUNCTION_ARGS)
 	if (val_type == InvalidOid || val_type == UNKNOWNOID)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("arg 2: could not determine data type")));
+				 errmsg("arg %d: could not determine data type", 2)));

 	add_json(arg, PG_ARGISNULL(2), state, val_type, false);

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 6:54 PM, Josh Berkus <j...@agliodbs.com> wrote: > The actual storage upgrade of hstore->hstore2 is fairly painless from the > user perspective; they don't have to do anything. The problem is that the > input/output strings are different, something which I didn't think to check > for (and Peter did), and which will break applications relying on Hstore, > since the drivers which support Hstore (like psycopg2) rely on > string-parsing to convert it. I haven't regression-tested hstore2 against > psycopg2 since I don't have a good test, but that would be a useful thing > to do. Hello, psycopg developer here. Not following the entire thread as it's quite articulated and not of my direct interest (nor comprehension). But if you throw at me a few test cases I can make sure psycopg can parse them much before hstore2 is released. FYI I have a trigger that highlights me the -hackers messages mentioning psycopg, so just mentioning it is enough for me to take a better look. But if you want a more active collaboration just ask. Thank you, -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 6, 2014 at 9:10 PM, Peter Geoghegan <p...@heroku.com> wrote: > On Thu, Mar 6, 2014 at 12:58 PM, Daniele Varrazzo <daniele.varra...@gmail.com> wrote: >> Hello, psycopg developer here. Not following the entire thread as it's >> quite articulated and not of my direct interest (nor comprehension). But if >> you throw at me a few test cases I can make sure psycopg can parse them >> much before hstore2 is released. > > I don't think that'll be necessary. Any break in compatibility in the > hstore format has been ruled a non-starter for having hstore support nested > data structures. I believe on balance we're content to let hstore continue > to be hstore. jsonb support would certainly be interesting, though. Cool, just let me know how you would expect a well-behaved client library to behave. -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Changing schema on the fly
Hello dear -hackers, I'm maintaining pg_reorg/pg_repack, which you may know effectively allow online VACUUM FULL or CLUSTER. It works by installing logging triggers to keep data up-to-date during the migration, creating a copy of the table, and eventually swapping the tables relfilenodes. The new table is forced to keep exactly the same physical structure, e.g. restoring dropped columns too. Failing to do so was apparently a big mistake, looking at this commit [1]. My knowledge of Postgres at that level is limited: what I imagine is that cached plans keep the offset of the field in the row, so data ends up read/written in the wrong position if such offset changes. The commit message mentions views and stored procedures being affected. Is there a way to force invalidation of all the cache that may hold a reference to the columns offset? Or is the problem an entirely different one and the above cache invalidation wouldn't be enough? If we managed to allow schema change in pg_repack we could allow many more online manipulations features: changing data types, reordering columns, really dropping columns freeing up space etc. Thank you very much, -- Daniele [1] https://github.com/reorg/pg_repack/commit/960930b645df8eeeda15f176c95d3e450786f78a -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
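A toy illustration of the stale-offset hazard described above (invented data, not Postgres internals): a cached "plan" that remembers a column's physical position reads the wrong slot, or past the end of the row, once the table is rewritten without the dropped-column hole.

```python
# Physical layout of the old table: the dropped column still occupies
# a position, and restoring it keeps cached attribute numbers valid.
physical_before = ["id", "........pg.dropped.2........", "name"]
cached_attnum = physical_before.index("name")   # a cached plan remembers: 2

# If the rewrite removed the hole, the same logical row is laid out
# differently, and the cached position no longer points at "name".
physical_after = ["id", "name"]
row = ["42", "alice"]                           # laid out per the new order

# reading with the stale position goes out of bounds (or reads junk)
stale_read = row[cached_attnum] if cached_attnum < len(row) else None
```

This is only an analogy for why every cached reference to the old layout would need invalidating before the rewritten table could drop the restored columns; whether Postgres exposes a safe way to force that invalidation is the question above.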
[HACKERS] A few string fixed
Hello, while translating the new PostgreSQL 9.3 strings I've found a couple of questionable ones. Patches attached.

Cheers,

-- 
Daniele

0001-Fixed-MultiXactIds-string-warning.patch
Description: Binary data

0002-Fixed-pasto-in-hint-string-about-making-views-deleta.patch
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A few string fixed
On Wed, Mar 20, 2013 at 2:01 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Daniele Varrazzo <daniele.varra...@gmail.com> writes:
>> while translating the new PostgreSQL 9.3 strings I've found a couple of questionable ones. Patches attached.
>
> Hmm ... I agree with the MultiXactId -> MultiXactIds changes, but not with this one:
>
> -errhint("To make the view updatable, provide an unconditional ON DELETE DO INSTEAD rule or an INSTEAD OF DELETE trigger.")));
> +errhint("To make the view deletable, provide an unconditional ON DELETE DO INSTEAD rule or an INSTEAD OF DELETE trigger.")));
>
> We use the phrase "updatable view", we don't say "deletable view" (and this usage is also found in the SQL standard). We could possibly make the message say "To make the view updatable in this way", or "... for this purpose", but that seems a bit long-winded to me.

Ok, I'd just thought it was a pasto.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [RFC] ideas for a new Python DBAPI driver (was Re: [HACKERS] libpq test suite)
On Fri, Feb 15, 2013 at 9:28 PM, Peter Eisentraut <pete...@gmx.net> wrote:
> On 2/14/13 2:42 PM, Marko Tiikkaja wrote:
>>> I think the reason this doesn't work is that in order to prepare a query you need to know the parameter types, but you don't know that in Python, or at least with the way the DB-API works. For example, if you write
>>>
>>>     cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2))
>>>
>>> what types will you pass to PQsendQueryParams?
>>
>> Pardon me if this is obvious, but why would you need to pass any types at all? Assuming we're still talking about PQsendQueryParams and not an explicit prepare/execute cycle..
>
> Well, PQsendQueryParams() requires types to be passed, doesn't it?

No, not necessarily: they are inferred from the context if they are not specified.

I've had in mind for a long time to use the *Params() functions in psycopg (although it would be largely not backwards compatible, hence to be done on user request and not by default). Psycopg has all the degrees of freedom in keeping the two implementations alive (the non-*Params for backward compatibility, the *Params for future usage). I'd drafted a plan on the psycopg ML some time ago. But I don't have a timeline for that: it's a major work and without pressing motivations to do it.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/Python result object str handler
On Tue, Jan 8, 2013 at 9:32 AM, Magnus Hagander <mag...@hagander.net> wrote:
> On Tue, Jan 8, 2013 at 3:58 AM, Peter Eisentraut <pete...@gmx.net> wrote:
>> For debugging PL/Python functions, I'm often tempted to write something like
>>
>>     rv = plpy.execute(...)
>>     plpy.info(rv)
>>
>> which prints something unhelpful like
>>
>>     <PLyResult object at 0xb461d8d8>
>>
>> By implementing a str handler for the result object, it now prints something like
>>
>>     <PLyResult status=5 nrows=2 rows=[{'foo': 1, 'bar': '11'}, {'foo': 2, 'bar': '22'}]>

This looks more like a repr-style format to me (if you implement repr but not str, the latter will default to the former).

>> Patch attached for review.
>
> How does it work if there are many rows in there? Say the result contains 10,000 rows - will the string contain all of them? If so, might it be worthwhile to cap the number of rows shown and then follow with a "..." or something?

I think it would: old Django versions were a pain in the neck because, when a page broke, a dump of gigantic queries was often included in the debug info.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
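The fallback mentioned above, and the row capping Magnus suggests, can both be sketched in plain Python. `FakeResult` below is an illustrative stand-in written for this sketch, not the real PLyResult class: it defines only `__repr__`, so `str()` falls back to it, and it truncates the row list with a trailing "...".

```python
class FakeResult:
    """Illustrative stand-in for a PL/Python result object (not PLyResult)."""

    MAX_SHOWN = 3  # cap on the number of rows included in the repr

    def __init__(self, status, rows):
        self.status = status
        self.rows = rows

    def __repr__(self):
        # Show at most MAX_SHOWN rows, then an ellipsis.
        shown = ", ".join(repr(row) for row in self.rows[:self.MAX_SHOWN])
        if len(self.rows) > self.MAX_SHOWN:
            shown += ", ..."
        return "<FakeResult status=%d nrows=%d rows=[%s]>" % (
            self.status, len(self.rows), shown)


r = FakeResult(5, [{'foo': i} for i in range(10000)])

# Defining only __repr__ is enough: str() falls back to it.
print(str(r) == repr(r))  # True
print(repr(r))  # <FakeResult status=5 nrows=10000 rows=[{'foo': 0}, {'foo': 1}, {'foo': 2}, ...]>
```

The same fallback is why a repr-style implementation covers both `plpy.info(rv)` and explicit `str(rv)` calls.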
Re: [HACKERS] allowing multiple PQclear() calls
On Tue, Dec 11, 2012 at 12:43 PM, Josh Kupershmidt <schmi...@gmail.com> wrote:
> Ah, well. I guess using a macro like:
>
> #define SafeClear(res) do {PQclear(res); res = NULL;} while (0);
>
> will suffice for me.

Psycopg uses:

#define IFCLEARPGRES(pgres) if (pgres) {PQclear(pgres); pgres = NULL;}

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Resetting libpq connections after an app error
On Sat, Jul 21, 2012 at 5:36 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Martijn van Oosterhout <klep...@svana.org> writes:
>> On Sat, Jul 21, 2012 at 01:08:58AM +0100, Daniele Varrazzo wrote:
>>> In a libpq application, if there is an application error between PQsendQuery and PQgetResult, is there a way to revert a connection back to a usable state (i.e. from transaction status ACTIVE to IDLE) without using the network in a blocking way? We are currently doing
>>
>> There is PQreset(), which also exists in a non-blocking variant.
>
> Note that PQreset essentially kills the connection and establishes a new one, which might not be what Daniele is looking for. The alternative approach is to issue PQcancel and then just let the query flush out as you normally would in an async application, ie PQisBusy/PQconsumeInput until ready, then PQgetResult (which you throw away), repeat until you get NULL.

I'm back on this issue. I've tested the PQcancel approach, but it blocks on send()/recv() as well, and given the constrained resources it is bound to use (to be callable from a signal handler) I assume there is no workaround for it.

The PQreset approach has the shortcoming of discarding the session configuration that somebody may have created in Pythonland: a state with a connection made but not configured would be something a client program is probably not prepared to handle. I think a plain disconnection would be much easier to handle for a long-running Python program: a long-running one should probably already cope with a broken connection, and a short-lived one would benefit from a working SIGINT.

If you have other suggestions I'd be glad to know, thank you.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_reorg in core?
On Fri, Sep 21, 2012 at 9:45 AM, M.Sakamoto <sakamoto_masahiko...@lab.ntt.co.jp> wrote:
> Hi, I'm sakamoto, maintainer of reorg.
>
>>> What could be also great is to move the project directly into github to facilitate its maintenance and development.
>>
>> No argument from me there, especially as I have my own fork in github, but that's up to the current maintainers.
>
> Yup, I am thinking development on CVS (on Pgfoundry) is a bit awkward for me and github would be a suitable place.

Hello Sakamoto-san,

I have created a reorg organization on github: https://github.com/reorg/

You are welcome to become one of the owners of the organization. I have already added Itagaki Takahiro as an owner because he has a github account. If you open a github account or give me the email of one you own I will invite you as organization owner. Michael is also a member of the organization.

I have re-converted the original CVS repository, as Michael's conversion was missing the commit email info, but I have rebased his commits on the new master. My intention is to track CVS commits in the cvs branch of the repo and merge them into master, until official development is moved to git. The repository is at https://github.com/reorg/pg_reorg. Because I'm not sure yet about a few details (from the development model to the committer emails) it may be rebased in the near future, until everything has been decided.

Thank you very much.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_reorg in core?
On Fri, Sep 21, 2012 at 5:17 AM, Josh Kupershmidt <schmi...@gmail.com> wrote:
> If the argument for moving pg_reorg into core is faster and easier development, well I don't really buy that.

I don't see any problem in having pg_reorg on PGXN instead. I've tried adding a META.json to the project and it seems to work fine with the pgxn client. It is together with other patches in my own github fork: https://github.com/dvarrazzo/pg_reorg/

I haven't submitted it to PGXN as I prefer the original author to keep the ownership.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Resetting libpq connections after an app error
Hello, apologies for bumping the question to -hackers but I got no answer from -general. If there is a better ML to post it to, let me know.

In a libpq application, if there is an application error between PQsendQuery and PQgetResult, is there a way to revert a connection back to a usable state (i.e. from transaction status ACTIVE to IDLE) without using the network in a blocking way? We are currently doing

    while (NULL != (res = PQgetResult(conn->pgconn))) { PQclear(res); }

but this is blocking, and if the error had been caused by the network being down, we'll just get stuck in a poll() waiting for a timeout.

Thank you very much.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On Thu, Feb 23, 2012 at 8:35 PM, Daniele Varrazzo <daniele.varra...@gmail.com> wrote:
> On Mon, Feb 20, 2012 at 12:09 AM, Jan Urbański <wulc...@wulczer.org> wrote:
>> BTW, that tool is quite handy, I'll have to try running it over psycopg2.
>
> Indeed. I'm having a play with it. It is reporting several issues to clean up (mostly on failure at module import). It's also tracebacking here and there: I'll send the author some feedback/patches. I'm patching psycopg in the gcc-python-plugin branch in my dev repos (https://github.com/dvarrazzo/psycopg).

The just-released psycopg 2.4.5 has had its codebase cleaned up using the gcc static checker. I haven't managed to run it without false positives; however, it's been very useful to find missing return value checks and other nuances. No live memory leak has been found: there were a few missing decrefs, but only at module init, not in the live code.

Full report about the changes in this message: https://fedorahosted.org/pipermail/gcc-python-plugin/2012-March/000229.html

Cheers,

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: improve selectivity estimation for IN/NOT IN
On Sun, Mar 4, 2012 at 1:44 PM, Euler Taveira de Oliveira <eu...@timbira.com> wrote:
> On 04-03-2012 00:20, Daniele Varrazzo wrote:
>> It looks like you have grand plans for array estimation. My patch has a much more modest scope, and I'm hoping it could be applied to currently maintained PG versions, as I consider the currently produced estimations a bug.
>
> We don't normally add new features to stable branches unless it is a bug. In the optimizer case, planner regression is a bug (that this case is not).

Please note that we are talking about planning errors leading to estimates of records in the millions instead of in the units, in queries where all the elements are known (most common elements, with the right stats, included in the query), even with a partial index whose size could never physically contain all the estimated rows screaming "something broken here!". So it's not a regression, as the computation has always been broken, but I think it can hardly be considered not a bug.

OTOH, I expect the decision of leaving things as they are could be taken on the basis of the possibility that some program may stop working in reaction to an altered query plan: I'm not going to argue about this, although I feel it's unlikely.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Patch: improve selectivity estimation for IN/NOT IN
Hello, the attached patch improves the array selectivity estimation for = ANY and <> ALL, hence for the IN/NOT IN operators, to avoid the shortcoming described in http://archives.postgresql.org/pgsql-performance/2012-03/msg6.php.

Cheers,

-- 
Daniele

From d10e60dd3dec340b96b372792e4a0650ec10da92 Mon Sep 17 00:00:00 2001
From: Daniele Varrazzo <daniele.varra...@gmail.com>
Date: Sat, 3 Mar 2012 19:27:16 +
Subject: [PATCH] Improve array selectivity estimation for = ANY and <> ALL

Assume distinct array elements, hence disjoint probabilities instead of
independent. Using the wrong probability model can easily lead to serious
selectivity overestimation for the IN operator and underestimation for
NOT IN.
---
 src/backend/utils/adt/selfuncs.c | 37 +++++++++++++++++++++++++++++++++++++----
 1 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 0a685aa..b0816ca 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1797,10 +1797,29 @@ scalararraysel(PlannerInfo *root,
 										ObjectIdGetDatum(operator),
 										PointerGetDatum(args),
 										Int32GetDatum(varRelid)));
+			/*
+			 * For generic operators, assume the selection probabilities as
+			 * independent for each array element. But for the equality the
+			 * probabilities are disjoint, so the total probability is just
+			 * the sum of the probabilities of the single elements. This is
+			 * true only if the array doesn't contain dups, but the check is
+			 * expensive: just assume it, to avoid penalizing well-written
+			 * queries in favour of poorly-written ones.
+			 */
 			if (useOr)
-				s1 = s1 + s2 - s1 * s2;
+			{
+				if (oprselproc.fn_addr == eqsel)
+					s1 = s1 + s2;
+				else
+					s1 = s1 + s2 - s1 * s2;
+			}
 			else
-				s1 = s1 * s2;
+			{
+				if (oprselproc.fn_addr == neqsel)
+					s1 = s1 + s2 - 1.0;
+				else
+					s1 = s1 * s2;
+			}
 		}
 	}
 	else if (rightop && IsA(rightop, ArrayExpr) &&
@@ -1840,9 +1859,19 @@ scalararraysel(PlannerInfo *root,
 										PointerGetDatum(args),
 										Int32GetDatum(varRelid)));
 			if (useOr)
-				s1 = s1 + s2 - s1 * s2;
+			{
+				if (oprselproc.fn_addr == eqsel)
+					s1 = s1 + s2;
+				else
+					s1 = s1 + s2 - s1 * s2;
+			}
 			else
-				s1 = s1 * s2;
+			{
+				if (oprselproc.fn_addr == neqsel)
+					s1 = s1 + s2 - 1.0;
+				else
+					s1 = s1 * s2;
+			}
 		}
 	}
 	else
-- 
1.7.5.4

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
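The two probability models the patch distinguishes can be illustrated with plain arithmetic (a Python sketch of the two formulas, not PostgreSQL code): folding s1 + s2 - s1*s2 treats each "x = element" event as independent, while for distinct elements the events are mutually exclusive and the OR selectivity is the plain sum.

```python
def or_independent(sels):
    """Fold s1 = s1 + s2 - s1*s2 over the per-element selectivities,
    i.e. P(A or B) under the independence assumption."""
    s1 = 0.0
    for s2 in sels:
        s1 = s1 + s2 - s1 * s2
    return s1


def or_disjoint(sels):
    """P(A or B) for mutually exclusive events: the plain sum.
    This is the exact IN selectivity when the array elements are distinct."""
    return sum(sels)


# Three distinct values, each matching 20% of the rows:
sels = [0.2, 0.2, 0.2]
print(or_independent(sels))  # 0.488
print(or_disjoint(sels))     # 0.6, the true fraction of matching rows
```

The gap between the two models grows with the number of elements and their individual selectivities, which is why the choice of formula matters for long IN lists over skewed columns.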
Re: [HACKERS] Patch: improve selectivity estimation for IN/NOT IN
On Sun, Mar 4, 2012 at 2:38 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Daniele Varrazzo <daniele.varra...@gmail.com> writes:
>> the attached patch improves the array selectivity estimation for = ANY and <> ALL, hence for the IN/NOT IN operators, to avoid the shortcoming described in http://archives.postgresql.org/pgsql-performance/2012-03/msg6.php.
>
> In connection with Alexander Korotkov's array-estimation patch, I just committed some code into scalararraysel() that checks whether the operator is equality or inequality of the array element type.

It looks like you have grand plans for array estimation. My patch has a much more modest scope, and I'm hoping it could be applied to currently maintained PG versions, as I consider the currently produced estimations a bug.

> It does that by consulting the default btree or hash opclass for the element type. I did that with the thought that it could be used to attack this issue too, but I see that you've done it another way, ie check to see whether the operator uses eqsel() or neqsel() as selectivity estimator. I'm not sure offhand which way is better. It could be argued that yours is more appropriate because if the operator isn't btree equality, but acts enough like it to use eqsel() as estimator, then it's still appropriate for scalararraysel() to treat it as equality. On the other side of the coin, an operator might be equality but have reason to use some operator-specific estimator rather than eqsel(). We have real examples of the former (such as the approximate-equality geometric operators) but I think the latter case is just hypothetical. Another thing that has to be thought about is that there are numerous cross-type operators that use eqsel, such as date-vs-timestamp, and it's far from clear whether it's appropriate for scalararraysel() to use the modified stats calculation when dealing with one of these. The btree-based logic is impervious to that since it won't match any cross-type operator. Thoughts?
> (BTW, in any case I don't trust function pointer comparison to be portable. It'd be a lot safer to look at the function OID.)

My original idea was to compare the comparison function with the catalog: I just don't know how to inspect the catalog/perform the check. Studying how to do it, I noticed the fn_addr referring to the well-known eqsel/neqsel functions and thought about using it as an indicator of the right operator semantics.

If you are still interested in the patch, for the sake of bugfixing, and somebody provides an example in the source of how to compare the function OIDs, I can try to improve it.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c
On Mon, Feb 20, 2012 at 12:09 AM, Jan Urbański <wulc...@wulczer.org> wrote:
> BTW, that tool is quite handy, I'll have to try running it over psycopg2.

Indeed. I'm having a play with it. It is reporting several issues to clean up (mostly on failure at module import). It's also tracebacking here and there: I'll send the author some feedback/patches.

I'm patching psycopg in the gcc-python-plugin branch in my dev repos (https://github.com/dvarrazzo/psycopg).

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Packaging
On Wed, Apr 27, 2011 at 1:48 PM, Dimitri Fontaine <dimi...@2ndquadrant.fr> wrote:
> Tom Lane <t...@sss.pgh.pa.us> writes:
>> If you didn't change the install script then it's not necessary to execute ALTER EXTENSION ... UPGRADE. You seem to be assuming that the pg_extensions catalog has to reflect the bug fix level of an extension, but that is *not* the intention. If it did reflect that, you'd need N times as many upgrade scripts, most of them identical, to deal with updating from different bug fix levels of the prior version.
>
> +1 — but this discussion shows we're not exactly finished here. Probably what is needed is only a clarification that the version number is only about schema objects, not revision, patch level, release status or whatever else semantically meaningful.

I've attached a patch for the docs about the point.

>> IMO it'd be better if the bug fix level was tracked outside the database, for instance via an RPM package version/release number. I'm not sure whether PGXN has anything for that at the moment.
>
> -0.5
>
> What I think would be useful here is to have both version and revision in the control file and pg_extension catalog. Then an extension can easily be at version 1.2 and revision 1.2.3. Now, that means that ALTER EXTENSION UPGRADE should accept to upgrade the revision in the control file when nothing else changes.

A less invasive change would be to just update the extension comment on ALTER EXTENSION UPGRADE. This means that the revision would be just informative and not metadata available to depending code, but that's on purpose. I think that, if an extension requires its patchlevel to be known, e.g. because depending code has to take different actions based on the revision, it should really provide an inspection function, such as foo_revision(), so that pre-9.1 code can work with it as well.
-- 
Daniele

From 03fa593a46f1dae0a8e83b4bccd6dea51e2c102c Mon Sep 17 00:00:00 2001
From: Daniele Varrazzo <daniele.varra...@gmail.com>
Date: Thu, 28 Apr 2011 14:02:08 +0100
Subject: [PATCH] Added paragraph about the distinction between extension
 version and patch level.

---
 doc/src/sgml/extend.sgml | 14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml
index 4ca17ef..ad26f5a 100644
--- a/doc/src/sgml/extend.sgml
+++ b/doc/src/sgml/extend.sgml
@@ -767,6 +767,20 @@ SELECT pg_catalog.pg_extension_config_dump('my_config', 'WHERE NOT standard_entr
  </para>

  <para>
+  Note that version names are only meant to give an identity to the set of
+  objects in the database schema and should not be used to track outside
+  objects such as shared libraries or data files. Specifically, if the
+  extension has a concept of <firstterm>revision</> or <firstterm>patch
+  level</> (maybe loaded with semantic meaning such as revisions order or
+  release status), setting a version equal to the patch level is
+  discouraged as it would require a large number of mostly equal (or empty)
+  upgrade scripts. For example, if a bug is found in the C code of the
+  extension <literal>foo</> version <literal>1.0</> you may want to release
+  a revision <literal>1.0.1</> but you should leave the version as
+  <literal>1.0</> if no object in the database schema is changed.
+ </para>
+
+ <para>
   Sometimes it is useful to provide <quote>downgrade</> scripts, for example
   <literal>foo--1.1--1.0.sql</> to allow reverting the changes associated
   with version <literal>1.1</>. If you do that, be careful

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Packaging
On Thu, Apr 28, 2011 at 2:21 PM, Marko Kreen <mark...@gmail.com> wrote:
> On Thu, Apr 28, 2011 at 4:07 PM, Daniele Varrazzo <daniele.varra...@gmail.com> wrote:
>> On Wed, Apr 27, 2011 at 1:48 PM, Dimitri Fontaine <dimi...@2ndquadrant.fr> wrote:
>>> Tom Lane <t...@sss.pgh.pa.us> writes:
>>>> If you didn't change the install script then it's not necessary to execute ALTER EXTENSION ... UPGRADE. You seem to be assuming that the pg_extensions catalog has to reflect the bug fix level of an extension, but that is *not* the intention. If it did reflect that, you'd need N times as many upgrade scripts, most of them identical, to deal with updating from different bug fix levels of the prior version.
>>>
>>> +1 — but this discussion shows we're not exactly finished here. Probably what is needed is only a clarification that the version number is only about schema objects, not revision, patch level, release status or whatever else semantically meaningful.
>>
>> I've attached a patch for the docs about the point.
>
> How about each .so containing a version callback? Thus you can show what is the version of underlying implementation without needing to mess with catalogs just to keep track of patchlevel of C code.

On this line, it would be easier to add a "revision" parameter to the control file and have a function pg_revision(ext) to return it, possibly showing it in the \dx output. But this still assumes the revision is just a string, and if it has a semantic meaning then it requires parsing to extract meaning from it (whereas foo_revision() may return everything the author of foo thinks is important for code depending on it to know, e.g. it may return an integer 90102 or a record (major, minor, patch, status, svn-rev, name-of-my-last-daughter)). I don't think we want to force any convention, such as the revision being a semver number - even if PGXN restricts extensions to this subset of strings.
-- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Packaging
On Thu, Apr 28, 2011 at 3:04 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Daniele Varrazzo <daniele.varra...@gmail.com> writes:
>> On Thu, Apr 28, 2011 at 2:21 PM, Marko Kreen <mark...@gmail.com> wrote:
>>> How about each .so containing a version callback? Thus you can show what is the version of underlying implementation without needing to mess with catalogs just to keep track of patchlevel of C code.
>>
>> On this line, it would be easier to add a "revision" parameter to the control file and have a function pg_revision(ext) to return it, possibly showing it in the \dx output.
>
> I think what we're discussing here is bug-fix revisions that don't affect the SQL declarations for the extension. Presumably, that means a change in the C code, so the shared library is the right place to keep the revision number. A version number in the control file seems to carry a nontrivial risk of being out of sync with the actual code in the shared library.

There is also the case of extensions whose data files matter: for instance I've packaged the Italian text search dictionary as an extension (http://pgxn.org/dist/italian_fts/): it contains no .so but it may happen for the dictionary files to be changed. Its version is 1.2 and will stay so as long as the sql doesn't change, but its revision is currently 1.2.1 and may bump to 1.2.2 should the dict content change. For this extension, just spotting the 1.2.1 in the \dx output would be more than enough; I don't see any use for the revision number returned in an api call. As long as the extension is installed via make install the .control shouldn't drift away from the extension files it represents.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] maximum digits for NUMERIC
On Wed, Apr 27, 2011 at 4:47 AM, Bruce Momjian <br...@momjian.us> wrote:
> Alvaro Herrera wrote:
>> Excerpts from Bruce Momjian's message of mar abr 26 12:58:19 -0300 2011:
>>> Wow, I am so glad someone documented this. I often do factorial(4000) which generates 12673 digits when teaching classes, and it bugged me that we documented the limit as 1000 digits.
>>
>> I keep wondering why you want to know factorial(4000) so often.
>
> It is just to impress folks, and it is impressive. An instant screenful of digits is pretty cool.

If you are into impressing people with big numbers (or maybe doing something useful with them too) you may take a look at http://pgmp.projects.postgresql.org/

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
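Incidentally, the digit count of factorial(4000) is easy to double-check outside the database, since Python integers are arbitrary precision like numeric (a quick illustration, not anything PostgreSQL-specific):

```python
import math

# 4000! computed exactly; count the digits of its decimal representation.
digits = len(str(math.factorial(4000)))
print(digits)  # well beyond the 1000-digit figure once documented
```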
Re: [HACKERS] Extension Packaging
On Sun, Apr 24, 2011 at 10:46 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Daniele Varrazzo <daniele.varra...@gmail.com> writes:
>> On Thu, Apr 21, 2011 at 4:16 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> If you did not actually change the contents of the install script, you should not change its version number either.
>>
>> Sorry, I'm not entirely convinced. If I release an extension 1.0, then find a bug in the C code and fix it in 1.0.1, arguably make install will put the .so in the right place and the 1.0.1 code will be picked up by new sessions. But pg_extension still knows 1.0 as the extension version, and ALTER EXTENSION ... UPGRADE fails because no update path is known.
>
> If you didn't change the install script then it's not necessary to execute ALTER EXTENSION ... UPGRADE. You seem to be assuming that the pg_extensions catalog has to reflect the bug fix level of an extension, but that is *not* the intention. If it did reflect that, you'd need N times as many upgrade scripts, most of them identical, to deal with updating from different bug fix levels of the prior version.

Yes, I was assuming that the pg_extension catalog should have included the bug fix level, and I noticed the explosion of upgrade paths required.

> IMO it'd be better if the bug fix level was tracked outside the database, for instance via an RPM package version/release number. I'm not sure whether PGXN has anything for that at the moment.

PGXN requires a version for the extension, possibly including the patchlevel (well, actually forcing a patchlevel, as per the semver spec), and I/David/probably everybody else were thinking that such a version ought to be the same as the one specified in the .control file.
I see that a better guideline would be to have '1.0' specified in the control and '1.0.X' in the metadata submitted on PGXN, which I think is not currently the case - see for example http://api.pgxn.org/src/pair/pair-0.1.2/pair.control -- Daniele -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Packaging
On Thu, Apr 21, 2011 at 4:16 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> David E. Wheeler <da...@kineticode.com> writes:
>> * Another, minor point: If I release a new version with no changes to the code (as I've done today, just changing the make stuff and documentation), it's kind of annoying that I'd need to have a migration script from the old version to the new version that's…empty. But I dunno, maybe not such a big deal. It's useful to have it there with a comment in it: “No changes.”
>
> If you did not actually change the contents of the install script, you should not change its version number either.

Sorry, I'm not entirely convinced. If I release an extension 1.0, then find a bug in the C code and fix it in 1.0.1, arguably make install will put the .so in the right place and the 1.0.1 code will be picked up by new sessions. But pg_extension still knows 1.0 as the extension version, and ALTER EXTENSION ... UPGRADE fails because no update path is known.

There is also a dangerous asymmetry: if I'm not mistaken the library .so has no version number, so there can be only one version in the system: an update changing code and sql requires ALTER EXTENSION to be run as soon as possible, or some sql function from the old extension may try to call non-existing functions in the library - or worse, the wrong ones, or with wrong parameters. OTOH, library-only changes don't strictly require ALTER EXTENSION - and trying to issue the command would fail if no path to the default version is available (leaving the admin puzzled about whether he installed the upgrade properly).

Is an empty upgrade file the only way to get the extension metadata right? I wouldn't find it particularly bad; better to have that than library-metadata mismatches or inconsistencies in the upgrade procedures, I think.

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extension Packaging
On Thu, Apr 21, 2011 at 1:14 AM, David E. Wheeler <da...@kineticode.com> wrote:
> I finally got round to updating a couple of my extensions to support 9.1 extensions. Unlike the contrib extensions, these needed to target older versions of PostgreSQL, as well. Case in point, the semver data type; you might find the Makefile of particular interest: https://github.com/theory/pg-semver/blob/master/Makefile

Hi David, thanks for sharing. I've recently packaged an extension for PG 8.4-9.1 and had to wrestle with the Makefile too. You may take a look at it and check if there is any solution useful for your extension too: https://github.com/dvarrazzo/pgmp/blob/master/Makefile

Specifically, I parse the version from the control file using:

    PGMP_VERSION=$(shell grep default_version pgmp.control | sed -e "s/default_version = '\(.*\)'/\1/")

so the Makefile doesn't have to be maintained for it. To tell apart < 9.1 and >= 9.1 I've used instead:

    PG91 = $(shell $(PG_CONFIG) --version | grep -qE " 8\.| 9\.0" && echo pre91 || echo 91)
    ifeq ($(PG91),91)
    ...
    else
    ...

For my extension I'm less concerned by having the install sql named in different ways, or by the upgrade sql, as all these files are generated by scripts. You may find this one useful to generate the upgrade sql from the install sql: https://github.com/dvarrazzo/pgmp/blob/master/tools/sql2extension.py

For my extension I require Python and have all the sql files generated by the Makefile at install time; if you don't want this dependency you may generate the sql before packaging and ship the result instead.

Cheers,

-- 
Daniele

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
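The extraction done by the grep/sed pipeline above can also be sketched in Python; `control_version` below is a hypothetical helper written for illustration (it is not part of the pgmp build), matching the same `default_version = '...'` line a .control file contains:

```python
import re

def control_version(text):
    """Return the default_version declared in a .control file body,
    or None if the line is absent (hypothetical helper, for illustration)."""
    m = re.search(r"default_version\s*=\s*'([^']*)'", text)
    return m.group(1) if m else None


control = "# pgmp extension\ncomment = 'Multiple Precision Arithmetic'\ndefault_version = '1.0.2'\n"
print(control_version(control))  # 1.0.2
```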
[HACKERS] getting composite types info from libpq
Hello, when a query returns a composite type, the libpq PQftype() function reports the oid of the record type. In psycopg:

    >>> cur.execute("select (1,2)")
    >>> cur.description
    (('row', 2249, None, -1, None, None, None),)

    test=# select typname from pg_type where oid = 2249;
     typname
    ---------
     record

Is there a way to recursively retrieve the types for the record components?

Thanks,

-- Daniele
Re: [HACKERS] getting composite types info from libpq
On Wed, Dec 15, 2010 at 6:56 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
> On Wed, Dec 15, 2010 at 1:25 PM, Daniele Varrazzo <daniele.varra...@gmail.com> wrote:
>> Hello, when a query returns a composite type, the libpq PQftype() function
>> reports the oid of the record type. Is there a way to recursively retrieve
>> the types for the record components?
>
> not without talking to the server, unless you had previously pulled
> pg_attribute data. select * from pg_attribute where attrelid = 2249;

No, there is no such info in pg_attribute: 2249 is the oid of the generic record type, not of a specific composite type.

> This question is more appropriate for -general, but what are you trying to do?

Added -general in copy: please remove -hackers from your reply if you think this thread is out of place.

I'm hacking on psycopg. Currently it uses PQftype, PQfname and related functions to inspect the PQresult received after a query, in order to build the Python representation of the record. But the inspection is flat: if the record contains a composite structure, it is currently returned as an unparsed string:

    >>> cur.execute("select ('1'::int, current_date), current_date")
    >>> # the date outside the record is easily parsed; not the one inside
    >>> cur.fetchone()
    ('(1,2010-12-16)', datetime.date(2010, 12, 16))
    >>> cur.description  # name and oid are the first two fields
    (('row', 2249, None, -1, None, None, None),
     ('date', 1082, None, 4, None, None, None))

As the record is created on the fly, I assume there is no structure left in the catalog for it.
If I instead explicitly create the type, I see how to inspect it:

    test=# create type intdate as (an_int integer, a_date date);
    CREATE TYPE

    >>> cur.execute("select (1, current_date)::intdate, current_date")
    >>> cur.fetchone()
    ('(1,2010-12-16)', datetime.date(2010, 12, 16))
    >>> cur.description
    (('row', 650308, None, -1, None, None, None),
     ('date', 1082, None, 4, None, None, None))

    test=# select attname, atttypid from pg_attribute where attrelid = 650306;
     attname | atttypid
    ---------+----------
     an_int  |       23
     a_date  |     1082

but even in this case it seems it would take a second query to inspect the type, and even here it doesn't seem I could use PQgetvalue/PQgetlength to read the internal components of the composite values.

The goal would be to have the query above translated into e.g. a nested tuple in Python:

    ((1, datetime.date(2010, 12, 16)), datetime.date(2010, 12, 16))

and I'd like to know:

1. do I get enough info in the PGresult to inspect anonymous composite types?
2. do I get such info for composite types for which I have schema info in the catalog, without issuing a second query? (which I don't feel is a driver's job)
3. is there any libpq facility to split the string returned for a composite type into its single components, without having to write a parser to deal with commas and quotes?

       >>> cur.execute("select ('a'::text, 'b,c'::text, 'd''e'::text, 'f\g'::text)")
       >>> print cur.fetchone()[0]
       (a,"b,c",d'e,fg)

4. is this info by any chance passed over the network, maybe available in an internal libpq structure, but not accessible through the libpq interface?

Thank you very much.

-- Daniele
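Regarding question 3: absent a libpq facility, the splitting can be done client-side. A minimal sketch of such a parser (a hypothetical helper, not part of libpq or psycopg), handling the composite output quoting rules - double-quoted fields, doubled quotes, backslash escapes, and an empty unquoted field meaning NULL:

```python
def parse_record(s):
    """Split the text form of a PostgreSQL composite value into fields."""
    if not (s.startswith("(") and s.endswith(")")):
        raise ValueError("not a composite literal: %r" % s)
    body = s[1:-1]
    fields, buf = [], []
    quoted = had_quotes = False
    i = 0
    while i <= len(body):
        # a sentinel comma past the end terminates the last field
        ch = body[i] if i < len(body) else ","
        if quoted:
            if ch == "\\":              # backslash escapes the next char
                buf.append(body[i + 1])
                i += 2
            elif ch == '"':
                if i + 1 < len(body) and body[i + 1] == '"':
                    buf.append('"')     # doubled quote inside quotes
                    i += 2
                else:
                    quoted = False      # closing quote
                    i += 1
            else:
                buf.append(ch)
                i += 1
        elif ch == '"':
            quoted = had_quotes = True
            i += 1
        elif ch == ",":
            # an empty, unquoted field represents SQL NULL
            fields.append("".join(buf) if (buf or had_quotes) else None)
            buf, had_quotes = [], False
            i += 1
        else:
            buf.append(ch)
            i += 1
    return fields

print(parse_record('(1,2010-12-16)'))     # ['1', '2010-12-16']
print(parse_record('(a,"b,c","d""e",)'))  # ['a', 'b,c', 'd"e', None]
```

The fields come back as strings (or None); converting each one to the right Python type would still require the per-column oids from pg_attribute, which is the subject of questions 1 and 2.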
Re: [HACKERS] psycopg and two phase commit
On Fri, Nov 5, 2010 at 5:16 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> 2010/11/4 Daniele Varrazzo <daniele.varra...@gmail.com>:
>> Just wanted to warn you that I have implemented the 2pc protocol in psycopg.
>
> I read a notice, but I didn't find a link for download. Where is it, please?

We have just released a beta package including this and other features. All the details at http://initd.org/psycopg/articles/2010/11/06/psycopg-230-beta1-released/

Cheers

-- Daniele
Re: [HACKERS] psycopg and two phase commit
On Sat, Sep 18, 2010 at 5:01 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> Hello, who is the psycopg maintainer, please? Can somebody explain to me
> why psycopg still doesn't support two-phase commit, although some
> implementation was done in summer 2008?

Hello Pavel,

Just wanted to warn you that I have implemented the 2pc protocol in psycopg. I have this and other patches ready to be merged into the trunk: more details are available at http://initd.org/psycopg/articles/2010/11/02/new-features-upcoming-psycopg-release/

I will try to have the patches released soon: definitely before the PyCon. If you fancy some testing you are welcome.

Best regards,

-- Daniele
Re: [HACKERS] psycopg and two phase commit
On Sat, Sep 18, 2010 at 5:01 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> Hello, who is the psycopg maintainer, please?

Here is one. The others can usually be reached on the psycopg mailing list, which is currently down and being recovered.

> Can somebody explain to me why psycopg still doesn't support two-phase
> commit, although some implementation was done in summer 2008?

Probably because nobody has asked before and it has been nobody's itch to scratch. The work you probably refer to seems to be James Henstridge's. I didn't know anything about it, but I've bcc'd him so he can provide details. https://code.launchpad.net/~jamesh/psycopg/two-phase-commit

> Now two-phase commit is part of the DB-API, so it can be implemented.
> Are there some barriers?

I see none at a first glance. I just don't get the intricacies of the .xid() method suggested in the DB-API (http://www.python.org/dev/peps/pep-0249/) when a regular string would do - the xid has to be converted into a string anyway to be passed to the Postgres TPC statements. So I'm tempted to allow the tpc_*() methods to accept a simple string as parameter too, also because otherwise psycopg wouldn't be able to manipulate a transaction prepared by other tools (e.g. retrieving a xid using tpc_recover() and passing it to tpc_commit()/tpc_rollback()), a limitation that can be avoided.

I can work on the feature; first I'd like to review James's code and possibly hear his impressions.

-- Daniele
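The conversion in question can be sketched quickly. This is a hypothetical encoding made up for illustration (it is not necessarily what psycopg ended up using): a PEP 249 xid triple (format_id, gtrid, bqual) is flattened into the single string that PREPARE TRANSACTION accepts, and recovered from it:

```python
import base64

def xid_to_string(format_id, gtrid, bqual):
    # base64 keeps the "_" delimiter unambiguous whatever the ids contain
    return "%d_%s_%s" % (
        format_id,
        base64.b64encode(gtrid.encode("utf-8")).decode("ascii"),
        base64.b64encode(bqual.encode("utf-8")).decode("ascii"),
    )

def string_to_xid(s):
    # split on the first two delimiters only; base64 never contains "_"
    format_id, gtrid, bqual = s.split("_", 2)
    return (int(format_id),
            base64.b64decode(gtrid).decode("utf-8"),
            base64.b64decode(bqual).decode("utf-8"))

tid = xid_to_string(42, "transaction-id", "connection-1")
assert string_to_xid(tid) == (42, "transaction-id", "connection-1")
```

With a reversible encoding like this, a driver can both expose the structured xid of the DB-API and still round-trip transaction ids prepared by other tools as plain strings.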
Re: [HACKERS] psycopg and two phase commit
On Sun, Sep 19, 2010 at 6:38 PM, Daniele Varrazzo <daniele.varra...@gmail.com> wrote:
> I just don't get the intricacies of the .xid() method suggested in the
> DB-API (http://www.python.org/dev/peps/pep-0249/) when a regular string
> would do - the xid has to be converted into a string anyway to be passed
> to the Postgres TPC statements. So I'm tempted to allow the tpc_*()
> methods to accept a simple string as parameter too, also because otherwise
> psycopg wouldn't be able to manipulate a transaction prepared by other
> tools (e.g. retrieving a xid using tpc_recover() and passing it to
> tpc_commit()/tpc_rollback()), a limitation that can be avoided.

I've read the XA specification, which inspired the DB-API extensions for TPC, and the discussion in the Python DB-SIG that led to them (http://mail.python.org/pipermail/db-sig/2008-January/thread.html). The methods proposed in the DB-API don't map 1-to-1 onto the PG TPC statements, so some work in the driver will be necessary to implement the proposed interface. But TPC being all about interoperability, I think it is a necessary complication.

-- Daniele
[HACKERS] Adding regexp_match() function
Hello,

I'd like to contribute a regexp_match() function, as discussed in bug #5469 [1]. The aim is to overcome the limitation outlined in that thread.

PostgreSQL currently offers the function regexp_matches(), a SRF (which, unless invoked with the 'g' flag, returns at most one result). A user interested in the non-global behaviour, i.e. expecting at most one match, has to take special care to avoid records being dropped from the target list when no match is found. As this is a rather common use case, I'd like to provide a function with a less astonishing behaviour, i.e. returning a text[] instead of a set of text[], and returning NULL when no match is found (the 'g' flag wouldn't be supported).

The proposed name is regexp_match(), to underline the semantics, very similar to regexp_matches() but returning a single value as result. While the symmetry between the names is a pro, an excessive similarity may be confusing, so I wouldn't be surprised if a better name were found.

The implementation of the function is very simple, given the infrastructure already available for the other regexp-related functions. I've actually already implemented it (mostly to check how easy or hard it would be: I had never written a C procedure for PG before): a preliminary patch is attached. If the idea is accepted I will submit a complete patch including documentation and tests.

Best regards,

-- Daniele

[1] http://archives.postgresql.org/pgsql-bugs/2010-05/msg00227.php

diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c
index c3fd23e..422a94d 100644
--- a/src/backend/utils/adt/regexp.c
+++ b/src/backend/utils/adt/regexp.c
@@ -817,6 +817,42 @@ regexp_matches_no_flags(PG_FUNCTION_ARGS)
 }
 
 /*
+ * regexp_match()
+ *		Return the first match of a pattern within a string.
+ */
+Datum
+regexp_match(PG_FUNCTION_ARGS)
+{
+	regexp_matches_ctx *matchctx;
+
+	matchctx = setup_regexp_matches(PG_GETARG_TEXT_PP(0),
+									PG_GETARG_TEXT_PP(1),
+									PG_GETARG_TEXT_PP_IF_EXISTS(2),
+									true, false, true);
+
+	if (matchctx->nmatches)
+	{
+		ArrayType  *result_ary;
+
+		matchctx->elems = (Datum *) palloc(sizeof(Datum) * matchctx->npatterns);
+		matchctx->nulls = (bool *) palloc(sizeof(bool) * matchctx->npatterns);
+		result_ary = build_regexp_matches_result(matchctx);
+		PG_RETURN_ARRAYTYPE_P(result_ary);
+	}
+	else
+	{
+		PG_RETURN_NULL();
+	}
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_match_no_flags(PG_FUNCTION_ARGS)
+{
+	return regexp_match(fcinfo);
+}
+
+/*
  * setup_regexp_matches --- do the initial matching for regexp_matches()
  * or regexp_split()
  *
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f2751a4..970036a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2241,6 +2241,10 @@
 DESCR("return all match groups for regexp");
 DATA(insert OID = 2764 ( regexp_matches  PGNSP PGUID 12 1 1000 0 f f f t t i 3 0 1009 "25 25 25" _null_ _null_ _null_ _null_ regexp_matches _null_ _null_ _null_ ));
 DESCR("return all match groups for regexp");
+DATA(insert OID = 3778 ( regexp_match  PGNSP PGUID 12 1 1 0 f f f t f i 2 0 1009 "25 25" _null_ _null_ _null_ _null_ regexp_match_no_flags _null_ _null_ _null_ ));
+DESCR("return first match group for regexp");
+DATA(insert OID = 3779 ( regexp_match  PGNSP PGUID 12 1 1 0 f f f t f i 3 0 1009 "25 25 25" _null_ _null_ _null_ _null_ regexp_match _null_ _null_ _null_ ));
+DESCR("return first match group for regexp");
 DATA(insert OID = 2088 ( split_part  PGNSP PGUID 12 1 0 0 f f f t f i 3 0 25 "25 25 23" _null_ _null_ _null_ _null_ split_text _null_ _null_ _null_ ));
 DESCR("split string by field_sep and return field_num");
 DATA(insert OID = 2765 ( regexp_split_to_table  PGNSP PGUID 12 1 1000 0 f f f t t i 2 0 25 "25 25" _null_ _null_ _null_ _null_ regexp_split_to_table_no_flags _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index e8f38a6..f6320f1 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -525,6 +525,8 @@
 extern Datum textregexreplace(PG_FUNCTION_ARGS);
 extern Datum similar_escape(PG_FUNCTION_ARGS);
 extern Datum regexp_matches(PG_FUNCTION_ARGS);
 extern Datum regexp_matches_no_flags(PG_FUNCTION_ARGS);
+extern Datum regexp_match(PG_FUNCTION_ARGS);
+extern Datum regexp_match_no_flags(PG_FUNCTION_ARGS);
 extern Datum regexp_split_to_table(PG_FUNCTION_ARGS);
 extern Datum regexp_split_to_table_no_flags(PG_FUNCTION_ARGS);
 extern Datum regexp_split_to_array(PG_FUNCTION_ARGS);
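To make the motivation concrete, here is the behavioural difference the function is meant to fix. The table and data are made up for the example, and regexp_match is the function proposed above, not an existing built-in:

```sql
create table tickets (id int, note text);
insert into tickets values (1, 'see #42'), (2, 'no reference');

-- The SRF silently drops the non-matching row:
select id, regexp_matches(note, '#(\d+)') from tickets;
--  id | regexp_matches
-- ----+----------------
--   1 | {42}

-- The proposed function keeps it, yielding NULL instead:
select id, regexp_match(note, '#(\d+)') from tickets;
--  id | regexp_match
-- ----+--------------
--   1 | {42}
--   2 |
```

With the current SRF, keeping the non-matching rows requires a workaround such as a LATERAL subquery or a CASE guarding the call.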