Re: [HACKERS] Linking libpq statically to libssl

2017-10-27 Thread Daniele Varrazzo
On Fri, Oct 27, 2017 at 2:37 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Daniele Varrazzo <daniele.varra...@gmail.com> writes:
>> I have a problem building binary packages for psycopg2. Binary
>> packages ship with their own copies of libpq and libssl; however if
>> another python package links to libssl the library will be imported
>> twice with conflicting symbols, likely resulting in a segfault (see
>> https://github.com/psycopg/psycopg2/issues/543). This happens e.g. if
>> a python script both connects to postgres and opens an https resource.
>
> Basically, you're doing it wrong.  Shipping your own copy of libssl,
> rather than depending on whatever packaging the platform provides,
> is just asking for pain --- and not only of this sort.  You're also
> now on the hook to update your package whenever libssl fixes a bug
> or a security vulnerability, which happens depressingly often.
>
> The same applies to libpq, really.  You don't want to be in the
> business of shipping bits that you are not the originator of.
>
> When I worked at Red Hat, there was an ironclad policy against
> building packages that incorporated other packages statically.
> I would imagine that other distros have similar policies for
> similar reasons.  Just because you *can* ignore those policies
> doesn't mean you *should*.

Distros do compile the library from source and against the system
package, and everyone using the package directly can still do so; the
binary packages are only installed by the Python package manager.
Psycopg is more complex to install than the average Python package (it
needs Python and Postgres dev files, pg_config available somewhere,
etc.) and a no-dependencies package provides a much smoother
experience. It also happens that the libpq and libssl versions we ship
tend to be more up-to-date than the system ones (people can already use
the new libpq 10 features without waiting for Debian packages).

I am confronted with the reality of Python developers in the 2010s,
and shipping binary packages has proven to be a generally positive
feature, even factoring in the need to ship updated binary packages when
the need arises. The benefit of a simple-to-use library is for the
Postgres project at large, not for my personal gain. So while I
know the shortcomings of binary packages and static libraries, I would
still be interested in knowing the best way to fix the problem in the
subject. Feel free to write in private if you want to avoid the public
shaming.

-- Daniele




[HACKERS] Linking libpq statically to libssl

2017-10-27 Thread Daniele Varrazzo
Hello,

I have a problem building binary packages for psycopg2. Binary
packages ship with their own copies of libpq and libssl; however if
another python package links to libssl the library will be imported
twice with conflicting symbols, likely resulting in a segfault (see
https://github.com/psycopg/psycopg2/issues/543). This happens e.g. if
a python script both connects to postgres and opens an https resource.

A solution to work around the problem could be to change the libssl
symbol names, in the binaries or in the source code
(https://github.com/psycopg/psycopg2-wheels/issues/7). But maybe a
simpler workaround could be just to build libpq linking to libssl
statically. Is this possible?

...and other libs too, I guess: the problem was found with libssl,
but conflicts with other libs are possible too; ldap comes to mind...

Any help to solve the problem would be appreciated. Thank you very much.

-- Daniele




[HACKERS] Help required to debug pg_repack breaking logical replication

2017-10-07 Thread Daniele Varrazzo
Hello,

we have received reports, and I have experienced a couple of times
myself, of pg_repack breaking logical replication:

- https://github.com/reorg/pg_repack/issues/135
- https://github.com/2ndQuadrant/pglogical/issues/113

In my experience, after the botched run the replication slot was
"stuck", and any attempt at reading it (including
pg_logical_slot_peek_changes()) blocked until ctrl-C. I've tried
replicating the issue, but my first attempts have failed to fail.

In the above issue #113, Petr Jelinek commented:

> From quick look at pg_repack, the way it does table rewrite is almost
> guaranteed to break logical decoding unless there is zero unconsumed
> changes for a given table as it does not build the necessary mappings
> info for logical decoding that standard heap rewrite in postgres does.

unfortunately he didn't follow up on requests for further details.

I've started drilling down the problem, observing that the swap
function, swap_heap_or_index_files() [1], was cargo-culted by the
original author from the CLUSTER command code as of PG 8.2 [2] (with a
custom addition to update the relfrozenxid, which seems backwards to
me as it sets the older frozen xid on the new table [3]).

[1] https://github.com/reorg/pg_repack/blob/ver_1.4.1/lib/repack.c#L1082
[2] 
https://github.com/postgres/postgres/blob/REL8_2_STABLE/src/backend/commands/cluster.c#L783
[3] https://github.com/reorg/pg_repack/issues/152

so that code is effectively missing a good 10 years of development.
Before jumping into fast-forwarding it, I would like to ask for some
help, i.e.:

- Is Petr's diagnosis right, and is the freezing of logical replication
to be blamed on the missing mappings?
- Can you suggest a test to reproduce the issue reliably?
- What are mapped relations anyway?

Thank you in advance for any help (either info about how to fix the
issue properly, or a patch by someone who happens to really understand
what we are talking about).

-- Daniele




Re: [pgsql-translators] [HACKERS] translatable string fixes

2017-05-24 Thread Daniele Varrazzo
On Tue, May 23, 2017 at 1:15 AM, Alvaro Herrera
 wrote:
> It took me a very long time to figure out how to translate these 9.6-new
> strings in the AM validate routines:
>
> msgid "gin operator family \"%s\" contains support procedure %s with 
> cross-type registration"
>
> The problem I had was that the term "cross-type registration" is not
> used anywhere else, it's not clear what it means, and (worst from my
> POV) I couldn't think of any decent phrasing in Spanish for it.  After
> staring at the code for a while, I translated them roughly as:
>
> "gin operator family %s contains support procedure %s registered with 
> differing types"
>
> which I think is a tad clearer ... but as a user confronted with such a
> message, I would be at a complete loss on what to do about it.

I did something similar, translating the equivalent of "across
different types". Had to look at the source code to understand the
meaning of the sentence. Maybe "cross-type registration" is a bit too
cryptic.


> Maybe we can use this phrasing:
> "gin operator family %s contains support procedure %s registered with 
> different left and right types"
>
>
> The other complaint I have about this one and also other amvalidate
> functions is the hardcoded AM name, so it's actually one string per AM,
> which is annoying (a total of twenty-something messages which are
> actually only four or five different ones).  Ignoring the second part of
> the phrase now, we could use this:
>   "operator family %s of access method %s contains support procedure %s with 
> cross-type registration"

Yup, that was boring and error-prone :\


-- Daniele




Re: [HACKERS] Detecting schema changes during logical replication

2017-05-08 Thread Daniele Varrazzo
On Mon, May 8, 2017 at 3:48 AM, Craig Ringer
 wrote:

> Sounds like you're reimplementing pglogical
> (http://2ndquadrant.com/pglogical) on top of a json protocol.

The fact that the protocol is JSON is more of a detail, but it's a
good start as it's human-readable.

> [...]
> I have no reason to object to your doing it yourself, and you're welcome to
> use pglogical as a reference for how to do things (see the license). It just
> seems like a waste.

Logical replication, for the first time, offers a way to implement a
replication solution that is not several layers away from the
database. Or even: for the first time, it is something I understand.

Using logical replication we can perform the manipulation of the data
that I want (tables not necessarily in the same places, schemas not
necessarily matching). In particular, one not-really-minor annoyance of
the current system is that adding a column on the master regularly
breaks the replica, and pglogical doesn't solve this problem. We
currently use certain features of Londiste (tables living in different
schemas, slightly different data types with the same textual
representation, extra columns on the slave...) that go against
pglogical's requirements, but which can be implemented with no problem
in a customised replication solution built on top of streaming
replication.

All in all I'm more thrilled by the idea of having a database throwing
a stream of changes at me in a format I can reinterpret, allowing me
to write into a target database with a high degree of configurability
in the middle (i.e. I just have a Python script receiving the data,
munging it and then performing queries), than by a complete but
schema-rigid replication solution.


-- Daniele




Re: [HACKERS] Detecting schema changes during logical replication

2017-05-07 Thread Daniele Varrazzo
On Sun, May 7, 2017 at 8:04 PM, Andres Freund <and...@anarazel.de> wrote:
> Hi,
>
> On 2017-05-07 19:27:08 +0100, Daniele Varrazzo wrote:
>> I'm putting together a replication system based on logical
>> replication.
>
> Interesting.  If you very briefly could recap what it's about... ;)

I need to replicate some tables from a central database into the
database that should run a secondary system. For a similar use case we
have used Londiste in the past, which has served us well, but its
usage has not been problem-free. Logical decoding seems much less
invasive on the source database than a trigger-based replication
solution, and has fewer moving parts to care about and maintain.

For the moment I'm hacking on a fork of Euler's project for WAL
decoding into JSON (https://github.com/dvarrazzo/wal2json), mostly
adding configurability, so that we may be able to replicate only the
tables we need, skip certain fields etc. I'm also taking a look at
minimising the amount of information produced: sending over and over
the column names and types for every record seems a waste, hence my
question.

>> I would like to send table information only the first
>> time a table is seen by the 'change_cb' callback, but of course there
>> could be some schema change after replication started. So I wonder: is
>> there any information I can find in the 'Relation' structure of the
>> change callback, which may suggest that there could have been a change
>> in the table schema, hence a new schema should be sent to the client?
>
> The best way I can think of - which is also what is implemented in the
> in-core replication framework - is to have a small cache on-top of the
> relcache.  That cache is kept coherent using
> CacheRegisterRelcacheCallback().  Then whenever there's a change you
> look up that change in that cache, and send the schema information if
> it's been invalidated since you last sent something.  That's also how
> the new stuff in v10 essentially works:
> src/backend/replication/pgoutput/pgoutput.c
>
> pgoutput_change(), does a lookup for its own metadata using 
> get_rel_sync_entry()
> which then checks relentry->schema_sent.  Invalidation unsets
> schema_sent in rel_sync_cache_relation_cb.

Thank you very much, it seems exactly what I need. I'll try hacking
around this callback.
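
A minimal sketch of the kind of cache Andres describes, with made-up
names (not the actual pgoutput ones); hash table creation and error
handling omitted:

#include "postgres.h"
#include "utils/hsearch.h"
#include "utils/inval.h"

/* one entry per relation seen by the output plugin */
typedef struct RelMetaEntry
{
	Oid		relid;			/* hash key */
	bool	schema_sent;	/* false => send schema before next change */
} RelMetaEntry;

static HTAB *relmeta_hash = NULL;

/* relcache invalidation callback: flag the schema as not sent */
static void
relmeta_invalidate_cb(Datum arg, Oid relid)
{
	RelMetaEntry *entry;

	if (relmeta_hash == NULL)
		return;

	if (!OidIsValid(relid))
	{
		/* InvalidOid means all relations are invalidated */
		HASH_SEQ_STATUS status;

		hash_seq_init(&status, relmeta_hash);
		while ((entry = hash_seq_search(&status)) != NULL)
			entry->schema_sent = false;
		return;
	}

	entry = hash_search(relmeta_hash, &relid, HASH_FIND, NULL);
	if (entry != NULL)
		entry->schema_sent = false;
}

/* registered once at plugin startup:
 *     CacheRegisterRelcacheCallback(relmeta_invalidate_cb, (Datum) 0);
 */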

-- Daniele




[HACKERS] Detecting schema changes during logical replication

2017-05-07 Thread Daniele Varrazzo
Hello,

I'm putting together a replication system based on logical
replication. I would like to send table information only the first
time a table is seen by the 'change_cb' callback, but of course there
could be some schema change after replication started. So I wonder: is
there any information I can find in the 'Relation' structure of the
change callback, which may suggest that there could have been a change
in the table schema, hence a new schema should be sent to the client?

Thank you very much,

-- Daniele




Re: [HACKERS] 'text' instead of 'unknown' in Postgres 10

2017-02-13 Thread Daniele Varrazzo
On Mon, Feb 13, 2017 at 3:09 AM, Jim Nasby <jim.na...@bluetreble.com> wrote:
> On 2/7/17 9:16 AM, Daniele Varrazzo wrote:
>>
>> Thank you for the clarification: I will assume the behaviour cannot be
>> maintained on PG 10 and think whether the treatment of '{}' is too
>> magical and drop it instead.
>
>
> BTW, I would hope that passing '{}' into a defined array field still works,
> since an empty array isn't treated the same as NULL, which means you need
> some way to create an empty array.

Yes, that didn't change. The issue was only about reading data from
postgres into Python.

There is a beta of next psycopg version available on testpypi which
implements the new behaviour, if you want to take a look at it. You
can use

pip install -i https://testpypi.python.org/pypi psycopg2==2.7b1

to install it.


-- Daniele




Re: [HACKERS] 'text' instead of 'unknown' in Postgres 10

2017-02-07 Thread Daniele Varrazzo
On Tue, Feb 7, 2017 at 2:59 PM, Andreas Karlsson <andr...@proxel.se> wrote:
> On 02/07/2017 03:14 PM, Daniele Varrazzo wrote:
>>
>> In psycopg '{}'::unknown is treated specially as an empty array and
>> converted into an empty list, which allows empty lists to be passed to
>> the server as arrays and returned back to python. Without the special
>> case, empty lists behave differently from non-empty ones. It seems
>> this behaviour cannot be maintained on PG 10 and instead users need to
>> specify some form of cast for their placeholder. Previously this would
>> have worked "as expected" and the 4th argument would have been an
>> empty list:
>>
>> cur.execute("SELECT %s, %s, %s, %s", (['x'], [42], [date(2017,1,1)],
>> [])); cur.fetchone()
>> (['x'], [42], [datetime.date(2017, 1, 1)], '{}')
>
>
> As Tom wrote this is the result of an intentional change, but no matter if
> that change is a good thing or not the above behavior sounds rather fragile.
> To me it does not seem safe to by default just assume that '{}' means the
> empty array, it might also have been intended to be the Python string "{}",
> the empty JSON object, or entirely something different.

Yes, it may actually be best to drop it. The case for it is
quite thin anyway: if something comes from a query it will usually
have a type attached.

-- Daniele




Re: [HACKERS] 'text' instead of 'unknown' in Postgres 10

2017-02-07 Thread Daniele Varrazzo
On Tue, Feb 7, 2017 at 2:42 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Daniele Varrazzo <daniele.varra...@gmail.com> writes:
>> testing with psycopg2 against Postgres 10 I've found a difference in
>> behaviour regarding literals, which are returned as text instead of
>> unknown. ...
>> Is this behaviour here to stay? Is there documentation for this change?
>
> Yup, see
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1e7c4bb0049732ece651d993d03bb6772e5d281a
>
> The expectation is that clients will never see "unknown" output columns
> anymore.

Ok thank you, I'll document the change in behaviour.


> I don't have enough context to suggest a better definition for psycopg
> ... but maybe you could pay some attention to the Python type of the value
> you're handed?

In Python the only type is the list; there is no specific "list of
integers" or such.


>> It seems
>> this behaviour cannot be maintained on PG 10 and instead users need to
>> specify some form of cast for their placeholder.
>
> Well, no version of PG has ever allowed this without a cast:
>
> regression=# select array[];
> ERROR:  cannot determine type of empty array
>
> so I'm not sure it's inconsistent for the same restriction to apply in the
> case you're describing.  I'm also unclear on why you are emphasizing the
> point of the array being empty, because '{1,2,3}'::unknown would have the
> same behavior.

The inconsistency is on our side: for the Python list [1,2,3] we
generate 'ARRAY[1,2,3]', whereas empty lists are converted to '{}',
precisely because there is no such thing as unknown[] - nor can we
generate array[]::int[], because the Python list is empty and we don't
know whether it would have contained integers or other stuff. Of
course this only works because we merge arguments in the adapter:
moving to PQexecParams we couldn't allow that anymore, and the user
would have to be uniformly concerned with adding casts to their
queries (this is a non-backward-compatible change, so it is only
planned for a mythical psycopg3 version I've long desired to write but
for which I have no resources).

Thank you for the clarification: I will assume the behaviour cannot be
maintained on PG 10 and think whether the treatment of '{}' is too
magical and drop it instead.


-- Daniele




[HACKERS] 'text' instead of 'unknown' in Postgres 10

2017-02-07 Thread Daniele Varrazzo
Hello,

testing with psycopg2 against Postgres 10 I've found a difference in
behaviour regarding literals, which are returned as text instead of
unknown. In previous versions:

In [2]: cnn = psycopg2.connect('')
In [3]: cur = cnn.cursor()
In [7]: cur.execute("select 'x'")
In [9]: cur.description[0][1]
Out[9]: 705

In pg10 master:

In [10]: cnn = psycopg2.connect('dbname=postgres host=localhost port=54310')
In [11]: cur = cnn.cursor()
In [12]: cur.execute("select 'x'")
In [13]: cur.description[0][1]
Out[13]: 25

what is somewhat surprising is that unknown seems promoted to text "on
the way out" from a query; in previous versions both columns of this
query would have been "unknown".

postgres=# select pg_typeof('x'), pg_typeof(foo) from (select 'x' as foo) x;
 pg_typeof | pg_typeof
-----------+-----------
 unknown   | text

Is this behaviour here to stay? Is there documentation for this change?

In psycopg '{}'::unknown is treated specially as an empty array and
converted into an empty list, which allows empty lists to be passed to
the server as arrays and returned back to python. Without the special
case, empty lists behave differently from non-empty ones. It seems
this behaviour cannot be maintained on PG 10 and instead users need to
specify some form of cast for their placeholder. Previously this would
have worked "as expected" and the 4th argument would have been an
empty list:

cur.execute("SELECT %s, %s, %s, %s", (['x'], [42], [date(2017,1,1)],
[])); cur.fetchone()
(['x'], [42], [datetime.date(2017, 1, 1)], '{}')

Should I just remove this test from the test suite and document the
adapter as behaving differently on PG 10?

Thank you very much

-- Daniele




[HACKERS] Redundant error messages in policy.c

2015-07-06 Thread Daniele Varrazzo
There are 5 different strings (one has a whitespace error); they
could be 2. Patch attached.

postgresql/src/backend$ grep errmsg commands/policy.c | grep policy |
sed 's/^ *//'
 errmsg("policy \"%s\" for relation \"%s\" already exists",
 errmsg("policy \"%s\" on table \"%s\" does not exist",
 errmsg("policy \"%s\" for table \"%s\" already exists",
 errmsg("policy \"%s\" for table \"%s\" does not exist",
 errmsg("policy \"%s\" for table  \"%s\" does not exist",

-- Daniele
diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c
index 6e95ba2..11efc9f 100644
--- a/src/backend/commands/policy.c
+++ b/src/backend/commands/policy.c
@@ -563,7 +563,7 @@ CreatePolicy(CreatePolicyStmt *stmt)
 	if (HeapTupleIsValid(policy_tuple))
 		ereport(ERROR,
 				(errcode(ERRCODE_DUPLICATE_OBJECT),
-				 errmsg("policy \"%s\" for relation \"%s\" already exists",
+				 errmsg("policy \"%s\" for table \"%s\" already exists",
 						stmt->policy_name, RelationGetRelationName(target_table))));
 
 	values[Anum_pg_policy_polrelid - 1] = ObjectIdGetDatum(table_id);
@@ -735,7 +735,7 @@ AlterPolicy(AlterPolicyStmt *stmt)
 	if (!HeapTupleIsValid(policy_tuple))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_OBJECT),
-				 errmsg("policy \"%s\" on table \"%s\" does not exist",
+				 errmsg("policy \"%s\" for table \"%s\" does not exist",
 						stmt->policy_name,
 						RelationGetRelationName(target_table))));
 
@@ -977,7 +977,7 @@ get_relation_policy_oid(Oid relid, const char *policy_name, bool missing_ok)
 		if (!missing_ok)
 			ereport(ERROR,
 					(errcode(ERRCODE_UNDEFINED_OBJECT),
-					 errmsg("policy \"%s\" for table  \"%s\" does not exist",
+					 errmsg("policy \"%s\" for table \"%s\" does not exist",
 							policy_name, get_rel_name(relid))));
 
 		policy_oid = InvalidOid;



[HACKERS] Spurious full-stop in message

2015-07-06 Thread Daniele Varrazzo
postgresql/src/backend$ grep "must be superuser to change bypassrls
attribute" commands/user.c | sed 's/   \+//'
errmsg("must be superuser to change bypassrls attribute.")));
 errmsg("must be superuser to change bypassrls attribute")));

Patch to fix attached.
diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index 3b381c5..5b20994 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -301,7 +301,7 @@ CreateRole(CreateRoleStmt *stmt)
 		if (!superuser())
 			ereport(ERROR,
 	(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-errmsg("must be superuser to change bypassrls attribute.")));
+errmsg("must be superuser to change bypassrls attribute")));
 	}
 	else
 	{



[HACKERS] libpq connection status and closed fd

2014-09-22 Thread Daniele Varrazzo
Hello,

a psycopg user is reporting [1] that the library is not marking the
connection as closed and/or bad after certain errors, such as a
connection timeout. He is emulating the error by closing the
connection fd (I don't know if the two conditions result in the same
effect, but I'll stick to this hypothesis for now).

[1] https://github.com/psycopg/psycopg2/issues/263

Testing with a short C program [2] it seems that indeed, after closing
the fd and causing an error in `PQconsumeInput`, the connection is
deemed in good state by `PQstatus`. A similar test gives the same
result after `PQexec` is attempted on a connection whose fd is closed
(tests performed with PostgreSQL 9.3.5).

[2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f

Is this intentional? Is there a better way to check for a broken connection?

If we mark the connection as closed every time `PQconsumeInput`
returns 0 (or `PQexec` returns null, which are the two code paths
affecting psycopg), would it be too aggressive and cause false
positives?
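
For context, the check amounts to something like this sketch (the
helper name is hypothetical; `PQconsumeInput` returns 0 on a hard
error such as the closed fd):

#include <libpq-fe.h>

/* Return 1 if the connection looks broken, 0 otherwise. */
static int
conn_is_broken(PGconn *conn)
{
    if (PQstatus(conn) == CONNECTION_BAD)
        return 1;               /* libpq already marked it bad */
    if (!PQconsumeInput(conn))  /* 0 on error, e.g. on a closed fd */
        return 1;               /* details in PQerrorMessage(conn) */
    return 0;
}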

Thank you very much.

-- Daniele




Re: [HACKERS] libpq connection status and closed fd

2014-09-22 Thread Daniele Varrazzo
On Mon, Sep 22, 2014 at 8:17 AM, Dmitriy Igrishin <dmit...@gmail.com> wrote:
> 2014-09-22 10:42 GMT+04:00 Daniele Varrazzo <daniele.varra...@gmail.com>:
>
>> [2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f
>
> Why are you using close() instead of PQfinish()?

Because I'm testing for an error, please read my message and the
original bug report.

-- Daniele




Re: [HACKERS] Fixed redundant i18n strings in json

2014-08-02 Thread Daniele Varrazzo
I'd definitely replace /arg/argument/. Furthermore I'd avoid the form
"argument 1: something is wrong": the string is likely to end up in
sentences with other colons, so I'd rather use "something is wrong at
argument 1".

Is the patch attached better?

Cheers,

-- Daniele

On Thu, Jul 31, 2014 at 1:57 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Jul 30, 2014 at 2:59 PM, Daniele Varrazzo
> <daniele.varra...@gmail.com> wrote:
>> Please find attached a small tweak to a couple of strings found while
>> updating translations. The %d version is already used elsewhere in the
>> same file.
>
> Probably not a bad idea, but aren't these strings pretty awful anyway?
> At a minimum, "arg" as an abbreviation for "argument" doesn't seem
> good.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 2f99908..0c55e9a 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -1910,7 +1910,7 @@ json_object_agg_transfn(PG_FUNCTION_ARGS)
 	if (val_type == InvalidOid || val_type == UNKNOWNOID)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("arg 1: could not determine data type")));
+				 errmsg("could not determine data type for argument %d", 1)));
 
 	add_json(arg, false, state, val_type, true);
 
@@ -1934,7 +1934,7 @@ json_object_agg_transfn(PG_FUNCTION_ARGS)
 	if (val_type == InvalidOid || val_type == UNKNOWNOID)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("arg 2: could not determine data type")));
+				 errmsg("could not determine data type for argument %d", 2)));
 
 	add_json(arg, PG_ARGISNULL(2), state, val_type, false);
 
@@ -1980,7 +1980,8 @@ json_build_object(PG_FUNCTION_ARGS)
 	if (nargs % 2 != 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("invalid number or arguments: object must be matched key value pairs")));
+				 errmsg("invalid number or arguments"),
+				 errhint("Object must be matched key value pairs.")));
 
 	result = makeStringInfo();
 
@@ -1994,7 +1995,8 @@ json_build_object(PG_FUNCTION_ARGS)
 		if (PG_ARGISNULL(i))
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-					 errmsg("arg %d: key cannot be null", i + 1)));
+					 errmsg("argument %d cannot be null", i + 1),
+					 errhint("Object keys should be text.")));
 		val_type = get_fn_expr_argtype(fcinfo->flinfo, i);
 
 		/*
@@ -2016,7 +2018,8 @@ json_build_object(PG_FUNCTION_ARGS)
 		if (val_type == InvalidOid || val_type == UNKNOWNOID)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-					 errmsg("arg %d: could not determine data type", i + 1)));
+					 errmsg("could not determine data type for argument %d",
+							i + 1)));
 		appendStringInfoString(result, sep);
 		sep = ", ";
 		add_json(arg, false, result, val_type, true);
@@ -2042,7 +2045,8 @@ json_build_object(PG_FUNCTION_ARGS)
 		if (val_type == InvalidOid || val_type == UNKNOWNOID)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-					 errmsg("arg %d: could not determine data type", i + 2)));
+					 errmsg("could not determine data type for argument %d",
+							i + 2)));
 		add_json(arg, PG_ARGISNULL(i + 1), result, val_type, false);
 
 	}
@@ -2099,7 +2103,8 @@ json_build_array(PG_FUNCTION_ARGS)
 		if (val_type == InvalidOid || val_type == UNKNOWNOID)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-					 errmsg("arg %d: could not determine data type", i + 1)));
+					 errmsg("could not determine data type for argument %d",
+							i + 1)));
 		appendStringInfoString(result, sep);
 		sep = ", ";
 		add_json(arg, PG_ARGISNULL(i), result, val_type, false);



[HACKERS] Fixed redundant i18n strings in json

2014-07-30 Thread Daniele Varrazzo
Please find attached a small tweak to a couple of strings found while
updating translations. The %d version is already used elsewhere in the
same file.

-- Daniele
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 2f99908..2aa27cc 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -1910,7 +1910,7 @@ json_object_agg_transfn(PG_FUNCTION_ARGS)
 	if (val_type == InvalidOid || val_type == UNKNOWNOID)
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("arg 1: could not determine data type")));
+ errmsg("arg %d: could not determine data type", 1)));
 
 	add_json(arg, false, state, val_type, true);
 
@@ -1934,7 +1934,7 @@ json_object_agg_transfn(PG_FUNCTION_ARGS)
 	if (val_type == InvalidOid || val_type == UNKNOWNOID)
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("arg 2: could not determine data type")));
+ errmsg("arg %d: could not determine data type", 2)));
 
 	add_json(arg, PG_ARGISNULL(2), state, val_type, false);
 



Re: [HACKERS] jsonb and nested hstore

2014-03-06 Thread Daniele Varrazzo
On Thu, Mar 6, 2014 at 6:54 PM, Josh Berkus <j...@agliodbs.com> wrote:
>
> The actual storage upgrade of hstore-->hstore2 is fairly painless from
> the user perspective; they don't have to do anything.  The problem is
> that the input/output strings are different, something which I didn't
> think to check for (and Peter did), and which will break applications
> relying on Hstore, since the drivers which support Hstore (like
> psycopg2) rely on string-parsing to convert it.  I haven't
> regression-tested hstore2 against psycopg2 since I don't have a good
> test, but that would be a useful thing to do.

Hello, psycopg developer here. Not following the entire thread as it's
quite articulated and not of my direct interest (nor comprehension).
But if you throw at me a few test cases I can make sure psycopg can
parse them much before hstore2 is released.

FYI I have a trigger that highlights for me the -hackers messages
mentioning psycopg, so just mentioning it is enough to make me take a
better look. But if you want a more active collaboration, just ask.

Thank you,

-- Daniele




Re: [HACKERS] jsonb and nested hstore

2014-03-06 Thread Daniele Varrazzo
On Thu, Mar 6, 2014 at 9:10 PM, Peter Geoghegan <p...@heroku.com> wrote:
> On Thu, Mar 6, 2014 at 12:58 PM, Daniele Varrazzo
> <daniele.varra...@gmail.com> wrote:
>> Hello, psycopg developer here. Not following the entire thread as it's
>> quite articulated and not of my direct interest (nor comprehension).
>> But if you throw at me a few test cases I can make sure psycopg can
>> parse them much before hstore2 is released.
>
> I don't think that'll be necessary. Any break in compatibility in the
> hstore format has been ruled a non-starter for having hstore support
> nested data structures. I believe on balance we're content to let
> hstore continue to be hstore. jsonb support would certainly be
> interesting, though.

Cool, just let me know how you would expect a well-behaved client
library to behave.

-- Daniele




[HACKERS] Changing schema on the fly

2013-04-17 Thread Daniele Varrazzo
Hello dear -hackers,

I'm maintaining pg_reorg/pg_repack, which you may know effectively
allow online VACUUM FULL or CLUSTER. It works by installing logging
triggers to keep data up-to-date during the migration, creating a copy
of the table, and eventually swapping the tables' relfilenodes.

The new table is forced to keep exactly the same physical structure,
e.g. restoring dropped columns too. Failing to do so was apparently a
big mistake, looking at this commit [1]. My knowledge of Postgres at
that level is limited: what I imagine is that cached plans keep the
offset of the field in the row, so data ends up read/written in the
wrong position if such offset changes. The commit message mentions
views and stored procedures being affected.

Is there a way to force invalidation of all the cache that may hold a
reference to the columns offset? Or is the problem an entirely
different one and the above cache invalidation wouldn't be enough?

If we managed to allow schema change in pg_repack we could allow many
more online manipulations features: changing data types, reordering
columns, really dropping columns freeing up space etc.

Thank you very much,

-- Daniele

[1] 
https://github.com/reorg/pg_repack/commit/960930b645df8eeeda15f176c95d3e450786f78a




[HACKERS] A few string fixed

2013-03-20 Thread Daniele Varrazzo
Hello,

while translating the new PostgreSQL 9.3 strings I've found a couple
of questionable ones. Patches attached.

Cheers,

-- Daniele


0001-Fixed-MultiXactIds-string-warning.patch
Description: Binary data


0002-Fixed-pasto-in-hint-string-about-making-views-deleta.patch
Description: Binary data



Re: [HACKERS] A few string fixed

2013-03-20 Thread Daniele Varrazzo
On Wed, Mar 20, 2013 at 2:01 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Daniele Varrazzo <daniele.varra...@gmail.com> writes:
>> while translating the new PostgreSQL 9.3 strings I've found a couple
>> of questionable ones. Patches attached.
>
> Hmm ... I agree with the MultiXactId->MultiXactIds changes, but not with
> this one:
>
> -errhint("To make the view updatable, provide an unconditional ON DELETE DO INSTEAD rule or an INSTEAD OF DELETE trigger.")));
> +errhint("To make the view deletable, provide an unconditional ON DELETE DO INSTEAD rule or an INSTEAD OF DELETE trigger.")));
>
> We use the phrase "updatable view", we don't say "deletable view"
> (and this usage is also found in the SQL standard).  We could possibly
> make the message say "To make the view updatable in this way", or
> "... for this purpose", but that seems a bit long-winded to me.

Ok, I'd just thought it was a pasto.

-- Daniele




Re: [RFC] ideas for a new Python DBAPI driver (was Re: [HACKERS] libpq test suite)

2013-02-15 Thread Daniele Varrazzo
On Fri, Feb 15, 2013 at 9:28 PM, Peter Eisentraut <pete...@gmx.net> wrote:
> On 2/14/13 2:42 PM, Marko Tiikkaja wrote:
>> I think the reason this doesn't work is that in order to prepare a query
>> you need to know the parameter types, but you don't know that in Python,
>> or at least with the way the DB-API works.  For example, if you write
>>
>> cur.execute("SELECT * FROM tbl WHERE a = %s AND b = %s", (val1, val2))
>>
>> what types will you pass to PQsendQueryParams?
>>
>> Pardon me if this is obvious, but why would you need to pass any types
>> at all?  Assuming we're still talking about PQsendQueryParams and not an
>> explicit prepare/execute cycle..
>
> Well, PQsendQueryParams() requires types to be passed, doesn't it?

No, not necessarily: they are inferred by the context if they are not specified.
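
A sketch of what that looks like (text format assumed; passing NULL
for paramTypes leaves each parameter's type for the server to infer):

const char *values[2] = {"42", "hello"};

PQsendQueryParams(conn,
                  "SELECT * FROM tbl WHERE a = $1 AND b = $2",
                  2,
                  NULL,     /* paramTypes: let the server infer them */
                  values,
                  NULL,     /* paramLengths: unused for text format */
                  NULL,     /* paramFormats: all text */
                  0);       /* text results */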

I've had in mind for a long time to use the *Params() functions in
psycopg (although it would be largely not backwards compatible, hence
to be done on user request and not by default). Psycopg has all the
degrees of freedom needed to keep the two implementations alive (the
non-*Params ones for backward compatibility, the *Params ones for
future usage). I'd drafted a plan on the psycopg ML some time ago, but
I don't have a timeline for that: it's major work, without pressing
motivations to do it.

-- Daniele




Re: [HACKERS] PL/Python result object str handler

2013-01-08 Thread Daniele Varrazzo
On Tue, Jan 8, 2013 at 9:32 AM, Magnus Hagander <mag...@hagander.net> wrote:
> On Tue, Jan 8, 2013 at 3:58 AM, Peter Eisentraut <pete...@gmx.net> wrote:
>> For debugging PL/Python functions, I'm often tempted to write something
>> like
>>
>> rv = plpy.execute(...)
>> plpy.info(rv)
>>
>> which prints something unhelpful like
>>
>> <PLyResult object at 0xb461d8d8>
>>
>> By implementing a str handler for the result object, it now prints
>> something like
>>
>> <PLyResult status=5 nrows=2 rows=[{'foo': 1, 'bar': '11'}, {'foo': 2, 'bar': '22'}]>

This looks more like a repr-style format to me (if you implement repr
but not str, the latter will default to the former).


>> Patch attached for review.
>
> How does it work if there are many rows in there? Say the result
> contains 10,000 rows - will the string contain all of them? If so,
> might it be worthwhile to cap the number of rows shown and then follow
> with a "..." or something?

I think it would: old Django versions were a pain in the neck because
when a page broke, an entire dump of gigantic queries was often
included as debug info.

-- Daniele




Re: [HACKERS] allowing multiple PQclear() calls

2012-12-11 Thread Daniele Varrazzo
On Tue, Dec 11, 2012 at 12:43 PM, Josh Kupershmidt <schmi...@gmail.com> wrote:
>
> Ah, well. I guess using a macro like:
>
> #define SafeClear(res) do {PQclear(res); res = NULL;} while (0);
>
> will suffice for me.

Psycopg uses:

#define IFCLEARPGRES(pgres)  if (pgres) {PQclear(pgres); pgres = NULL;}

-- Daniele




Re: [HACKERS] Resetting libpq connections after an app error

2012-10-06 Thread Daniele Varrazzo
On Sat, Jul 21, 2012 at 5:36 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Martijn van Oosterhout <klep...@svana.org> writes:
>> On Sat, Jul 21, 2012 at 01:08:58AM +0100, Daniele Varrazzo wrote:
>>> In a libpq application, if there is an application error between
>>> PQsendQuery and PQgetResult, is there a way to revert a connection
>>> back to a usable state (i.e. from transaction status ACTIVE to IDLE)
>>> without using the network in a blocking way? We are currently doing
>>
>> There is PQreset(), which also exists in a non-blocking variant.
>
> Note that PQreset essentially kills the connection and establishes a new
> one, which might not be what Daniele is looking for.  The alternative
> approach is to issue PQcancel and then just let the query flush out as
> you normally would in an async application, ie PQisBusy/PQconsumeInput
> until ready, then PQgetResult (which you throw away), repeat until you
> get NULL.
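
For reference, a sketch of that recipe in its blocking form (an async
application would wait on the socket and use PQconsumeInput()/PQisBusy()
before each PQgetResult()):

PGcancel *cancel = PQgetCancel(conn);

if (cancel != NULL)
{
    char errbuf[256];

    PQcancel(cancel, errbuf, sizeof(errbuf));   /* best effort */
    PQfreeCancel(cancel);
}

/* then flush the now-cancelled query as usual */
PGresult *res;
while ((res = PQgetResult(conn)) != NULL)
    PQclear(res);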

I'm back on this issue. I've tested the PQcancel approach, but it
blocks as well on send()/recv(), and given the constrained resources
it is bound to use (it has to be callable from a signal handler) I
assume there is no workaround for it.

The PQreset approach has the shortcoming of discarding the session
configuration that somebody may have created in Pythonland: a state
with a connection made but not configured would be something a client
program is probably not prepared to handle.

I think a plain disconnection would be much easier to handle for a
long-running Python program: a long-running one should probably
already cope with a broken connection, and a short-lived one would
benefit from a working SIGINT.

If you have other suggestions I'd be glad to know, thank you.


-- Daniele




Re: [HACKERS] pg_reorg in core?

2012-09-22 Thread Daniele Varrazzo
On Fri, Sep 21, 2012 at 9:45 AM, M.Sakamoto
<sakamoto_masahiko...@lab.ntt.co.jp> wrote:
> Hi,
> I'm sakamoto, maintainer of reorg.
>
>> What could be also great is to move the project directly into github to
>> facilitate its maintenance and development.
>> No argument from me there, especially as I have my own fork in github,
>> but that's up to the current maintainers.
>
> Yup, I am thinking development on CVS (on Pgfoundry) is a bit awkward for
> me and github would be a suitable place.

Hello Sakamoto-san

I have created a reorg organization on github: https://github.com/reorg/
You are welcome to become one of the owners of the organization. I
have already added Itagaki Takahiro as an owner because he has a github
account. If you open a github account, or give me the email of one you
own, I will invite you as an organization owner. Michael is also a
member of the organization.

I have re-converted the original CVS repository, as Michael's
conversion was missing the commit email info, but I have rebased his
commits on the new master. My intention is to track CVS commits in the
cvs branch of the repo and merge them into master, until official
development is moved to git.

The repository is at https://github.com/reorg/pg_reorg. Because I'm
not sure yet about a few details (from the development model to the
committers' emails) it may be rebased in the near future, until
everything has been decided.

Thank you very much.

-- Daniele




Re: [HACKERS] pg_reorg in core?

2012-09-21 Thread Daniele Varrazzo
On Fri, Sep 21, 2012 at 5:17 AM, Josh Kupershmidt <schmi...@gmail.com> wrote:
>
> If the argument for moving pg_reorg into core is faster and easier
> development, well I don't really buy that.

I don't see any problem in having pg_reorg in PGXN instead.

I've tried adding a META.json to the project and it seems to work
fine with the pgxn client. It is together with other patches in my own
github fork.

https://github.com/dvarrazzo/pg_reorg/

I haven't submitted it to PGXN as I prefer the original author to keep
the ownership.

-- Daniele




[HACKERS] Resetting libpq connections after an app error

2012-07-20 Thread Daniele Varrazzo
Hello,

apologies for bumping the question to -hackers, but I got no answer
from -general. If there is a better ML to post to, let me know.

In a libpq application, if there is an application error between
PQsendQuery and PQgetResult, is there a way to revert a connection
back to a usable state (i.e. from transaction status ACTIVE to IDLE)
without using the network in a blocking way? We are currently doing

while (NULL != (res = PQgetResult(conn->pgconn))) {
    PQclear(res);
}

but this is blocking, and if the error had been caused by the network
down, we'll just get stuck in a poll() waiting for a timeout.

Thank you very much.

-- Daniele



Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c

2012-03-29 Thread Daniele Varrazzo
On Thu, Feb 23, 2012 at 8:35 PM, Daniele Varrazzo
<daniele.varra...@gmail.com> wrote:
> On Mon, Feb 20, 2012 at 12:09 AM, Jan Urbański <wulc...@wulczer.org> wrote:
>
>> BTW, that tool is quite handy, I'll have to try running it over psycopg2.
>
> Indeed. I'm having a play with it. It is reporting several issues to
> clean up (mostly on failure at module import). It's also tracebacking
> here and there: I'll send the author some feedback/patches.
>
> I'm patching psycopg in the gcc-python-plugin branch in my dev repos
> (https://github.com/dvarrazzo/psycopg).

The just released psycopg 2.4.5 has had its codebase cleaned up using
the gcc static checker.

I haven't managed to run it without false positives; however, it's
been very useful to find missing return value checks and other
nuances. No live memory leak has been found: there were a few missing
decrefs, but only at module init, not in the live code.

Full report about the changes in this message:
https://fedorahosted.org/pipermail/gcc-python-plugin/2012-March/000229.html.

Cheers,

-- Daniele



Re: [HACKERS] Patch: improve selectivity estimation for IN/NOT IN

2012-03-04 Thread Daniele Varrazzo
On Sun, Mar 4, 2012 at 1:44 PM, Euler Taveira de Oliveira
<eu...@timbira.com> wrote:
> On 04-03-2012 00:20, Daniele Varrazzo wrote:
>> It looks like you have grand plans for array estimation. My patch has
>> a much more modest scope, and I'm hoping it could be applied to
>> currently maintained PG versions, as I consider the currently produced
>> estimations a bug.
>
> We don't normally add new features to stable branches unless it is a bug. In
> the optimizer case, planner regression is a bug (that this case is not).

Please note that we are talking about planning errors leading to
estimates of records in the millions instead of in the units, in
queries where all the elements are known (most common elements, with
the right stats, included in the query), even with a partial index
whose size could never physically contain all the estimated rows,
screaming "something broken here!". So, it's not a regression, as the
computation has always been broken, but I think it can hardly be
considered not a bug.

OTOH, I expect the decision to leave things as they are could be
taken on the basis of the possibility that some program may stop
working in reaction to an altered query plan: I'm not going to argue
about this, although I feel it is unlikely.

-- Daniele



[HACKERS] Patch: improve selectivity estimation for IN/NOT IN

2012-03-03 Thread Daniele Varrazzo
Hello,

the attached patch improves the array selectivity estimation for = ANY
and <> ALL, hence for the IN/NOT IN operators, to avoid the
shortcoming described in
http://archives.postgresql.org/pgsql-performance/2012-03/msg6.php.
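
For intuition, a worked example with made-up selectivities (not taken
from the linked thread): suppose x IN (a, b) where each element has
equality selectivity s_a = s_b = 0.4. The two probability models give

    P_indep(x IN (a,b))     = s_a + s_b - s_a * s_b = 0.64
    P_disjoint(x IN (a,b))  = s_a + s_b             = 0.80

    P_indep(x NOT IN (a,b))    = (1 - s_a)(1 - s_b) = 0.36
    P_disjoint(x NOT IN (a,b)) = 1 - (s_a + s_b)    = 0.20

and the gap compounds as the number of elements grows, which is how
the estimates can end up orders of magnitude away from the actual row
counts.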

Cheers,

-- Daniele
From d10e60dd3dec340b96b372792e4a0650ec10da92 Mon Sep 17 00:00:00 2001
From: Daniele Varrazzo daniele.varra...@gmail.com
Date: Sat, 3 Mar 2012 19:27:16 +
Subject: [PATCH] Improve array selectivity estimation for = ANY and <> ALL

Assume distinct array elements, hence disjoint probabilities instead of
independent. Using the wrong probability model can easily lead to serious
selectivity underestimation for the IN operator and overestimation for NOT IN.
---
 src/backend/utils/adt/selfuncs.c |   37 +
 1 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 0a685aa..b0816ca 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1797,10 +1797,29 @@ scalararraysel(PlannerInfo *root,
   ObjectIdGetDatum(operator),
   PointerGetDatum(args),
   Int32GetDatum(varRelid)));
+			/*
+			 * For generic operators, assume the selection probabilities as
+			 * independent for each array element. But for the equality the
+			 * probabilities are disjoint, so the total probability is just
+			 * the sum of the probabilities of the single elements. This is
+			 * true only if the array doesn't contain dups, but the check is
+			 * expensive: just assume it, to avoid penalizing well-written
+			 * queries in favour of poorly-written ones.
+			 */
 			if (useOr)
-s1 = s1 + s2 - s1 * s2;
+			{
+if (oprselproc.fn_addr == eqsel)
+	s1 = s1 + s2;
+else
+	s1 = s1 + s2 - s1 * s2;
+			}
 			else
-s1 = s1 * s2;
+			{
+if (oprselproc.fn_addr == neqsel)
+	s1 = s1 + s2 - 1.0;
+else
+	s1 = s1 * s2;
+			}
 		}
 	}
	else if (rightop && IsA(rightop, ArrayExpr) &&
@@ -1840,9 +1859,19 @@ scalararraysel(PlannerInfo *root,
   PointerGetDatum(args),
   Int32GetDatum(varRelid)));
 			if (useOr)
-s1 = s1 + s2 - s1 * s2;
+			{
+if (oprselproc.fn_addr == eqsel)
+	s1 = s1 + s2;
+else
+	s1 = s1 + s2 - s1 * s2;
+			}
 			else
-s1 = s1 * s2;
+			{
+if (oprselproc.fn_addr == neqsel)
+	s1 = s1 + s2 - 1.0;
+else
+	s1 = s1 * s2;
+			}
 		}
 	}
 	else
-- 
1.7.5.4




Re: [HACKERS] Patch: improve selectivity estimation for IN/NOT IN

2012-03-03 Thread Daniele Varrazzo
On Sun, Mar 4, 2012 at 2:38 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Daniele Varrazzo <daniele.varra...@gmail.com> writes:
>> the attached patch improves the array selectivity estimation for = ANY
>> and <> ALL, hence for the IN/NOT IN operators, to avoid the
>> shortcoming described in
>> http://archives.postgresql.org/pgsql-performance/2012-03/msg6.php.
>
> In connection with Alexander Korotkov's array-estimation patch,
> I just committed some code into scalararraysel() that checks whether the
> operator is equality or inequality of the array element type.

It looks like you have grand plans for array estimation. My patch has
a much more modest scope, and I'm hoping it could be applied to
currently maintained PG versions, as I consider the currently produced
estimations a bug.

> It does
> that by consulting the default btree or hash opclass for the element
> type.  I did that with the thought that it could be used to attack this
> issue too, but I see that you've done it another way, ie check to see
> whether the operator uses eqsel() or neqsel() as selectivity estimator.
>
> I'm not sure offhand which way is better.  It could be argued that yours
> is more appropriate because if the operator isn't btree equality, but acts
> enough like it to use eqsel() as estimator, then it's still appropriate
> for scalararraysel() to treat it as equality.  On the other side of the
> coin, an operator might be equality but have reason to use some
> operator-specific estimator rather than eqsel().  We have real examples
> of the former (such as the approximate-equality geometric operators)
> but I think the latter case is just hypothetical.  Another thing that
> has to be thought about is that there are numerous cross-type operators
> that use eqsel, such as date-vs-timestamp, and it's far from clear
> whether it's appropriate for scalararraysel() to use the modified stats
> calculation when dealing with one of these.  The btree-based logic is
> impervious to that since it won't match any cross-type operator.
>
> Thoughts?
>
> (BTW, in any case I don't trust function pointer comparison to be
> portable.  It'd be a lot safer to look at the function OID.)

My original idea was to compare the comparison function with the
catalog: I just don't know how to inspect the catalog / perform the
check. While studying how to do it I noticed the fn_addr referring to
the well-known eqsel/neqsel functions, and thought about using it as an
indicator of the right operator semantics.

If you are still interested in the patch, for the sake of bugfixing,
and somebody provides an example in the source of how to compare the
function OIDs, I can try to improve it.
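
For reference, a sketch of the OID-based check (assuming the built-in
estimators; F_EQSEL and F_NEQSEL are the fmgroids.h constants for
eqsel/neqsel):

#include "utils/fmgroids.h"
#include "utils/lsyscache.h"

/* compare the operator's restriction estimator by OID rather than by
 * C function pointer */
Oid		oprrest = get_oprrest(operator);	/* pg_operator.oprrest */
bool	is_eq_like = (oprrest == F_EQSEL);
bool	is_neq_like = (oprrest == F_NEQSEL);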

-- Daniele



Re: [HACKERS] Potential reference miscounts and segfaults in plpython.c

2012-02-23 Thread Daniele Varrazzo
On Mon, Feb 20, 2012 at 12:09 AM, Jan Urbański <wulc...@wulczer.org> wrote:
>
> BTW, that tool is quite handy, I'll have to try running it over psycopg2.

Indeed. I'm having a play with it. It is reporting several issues to
clean up (mostly on failure at module import). It's also tracebacking
here and there: I'll send the author some feedback/patches.

I'm patching psycopg in the gcc-python-plugin branch in my dev repos
(https://github.com/dvarrazzo/psycopg).

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Packaging

2011-04-28 Thread Daniele Varrazzo
On Wed, Apr 27, 2011 at 1:48 PM, Dimitri Fontaine
<dimi...@2ndquadrant.fr> wrote:
> Tom Lane <t...@sss.pgh.pa.us> writes:
>> If you didn't change the install script then it's not necessary to
>> execute ALTER EXTENSION ... UPGRADE.  You seem to be assuming that the
>> pg_extensions catalog has to reflect the bug fix level of an extension,
>> but that is *not* the intention.  If it did reflect that, you'd need
>> N times as many upgrade scripts, most of them identical, to deal with
>> updating from different bug fix levels of the prior version.
>
> +1 — but this discussion shows we're not exactly finished here.

Probably what is needed is only a clarification that the version
number is only about schema objects, not revision, patch level, release
status or whatever else is semantically meaningful. I've attached a
patch for the docs about this point.


>> IMO it'd be better if the bug fix level was tracked outside the
>> database, for instance via an RPM package version/release number.
>> I'm not sure whether PGXN has anything for that at the moment.
>
> -0.5
>
> What I think would be useful here is to have both version and revision
> in the control file and pg_extension catalog.  Then an extension can
> easily be at version 1.2 and revision 1.2.3.
>
> Now, that means that ALTER EXTENSION UPGRADE should accept to upgrade
> the revision in the control file when nothing else changes.

A less invasive change would be to just update the extension comment
on ALTER EXTENSION UPGRADE. This means that the revision would be just
informative, not metadata available to depending code, but that is on
purpose. I think that, if an extension requires its patchlevel to be
known, e.g. because depending code has to take different actions based
on the revision, it should really provide an inspection function, such
as foo_revision(), so that pre-9.1 code can work with it as well.
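
A sketch of what such an inspection function could look like for a
hypothetical extension "foo", with the revision string baked into the
shared library at build time:

#include "postgres.h"
#include "fmgr.h"
#include "utils/builtins.h"

#define FOO_REVISION "1.0.1"	/* e.g. substituted by the build system */

PG_FUNCTION_INFO_V1(foo_revision);

Datum
foo_revision(PG_FUNCTION_ARGS)
{
	PG_RETURN_TEXT_P(cstring_to_text(FOO_REVISION));
}

/* and in the extension script:
 *   CREATE FUNCTION foo_revision() RETURNS text
 *       AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE;
 */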


-- Daniele
From 03fa593a46f1dae0a8e83b4bccd6dea51e2c102c Mon Sep 17 00:00:00 2001
From: Daniele Varrazzo daniele.varra...@gmail.com
Date: Thu, 28 Apr 2011 14:02:08 +0100
Subject: [PATCH] Added paragraph about the distinction between extension version and patch level.

---
 doc/src/sgml/extend.sgml |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml
index 4ca17ef..ad26f5a 100644
--- a/doc/src/sgml/extend.sgml
+++ b/doc/src/sgml/extend.sgml
@@ -767,6 +767,20 @@ SELECT pg_catalog.pg_extension_config_dump('my_config', 'WHERE NOT standard_entr
 </para>
 
 <para>
+ Note that version names are only meant to give an identity to the set of
+ objects in the database schema and should not be used to track outside
+ objects such as shared libraries or data files.  Specifically, if the
+ extension has a concept of <firstterm>revision</> or <firstterm>patch
+ level</> (maybe loaded with semantic meaning such as revisions order or
+ release status), setting a version equal to the patch level is
+ discouraged as it would require a large number of mostly equal (or empty)
+ upgrade scripts. For example, if a bug is found in the C code of the
+ extension <literal>foo</> version <literal>1.0</> you may want to release
+ a revision <literal>1.0.1</> but you should leave the version as
+ <literal>1.0</> if no object in the database schema is changed.
+</para>
+
+<para>
  Sometimes it is useful to provide <quote>downgrade</> scripts, for
  example <literal>foo--1.1--1.0.sql</> to allow reverting the changes
  associated with version <literal>1.1</>.  If you do that, be careful
-- 
1.7.1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Packaging

2011-04-28 Thread Daniele Varrazzo
On Thu, Apr 28, 2011 at 2:21 PM, Marko Kreen <mark...@gmail.com> wrote:
 On Thu, Apr 28, 2011 at 4:07 PM, Daniele Varrazzo
 <daniele.varra...@gmail.com> wrote:
 On Wed, Apr 27, 2011 at 1:48 PM, Dimitri Fontaine
 <dimi...@2ndquadrant.fr> wrote:
 Tom Lane <t...@sss.pgh.pa.us> writes:
 If you didn't change the install script then it's not necessary to
 execute ALTER EXTENSION ... UPGRADE.  You seem to be assuming that the
 pg_extensions catalog has to reflect the bug fix level of an extension,
 but that is *not* the intention.  If it did reflect that, you'd need
 N times as many upgrade scripts, most of them identical, to deal with
 updating from different bug fix levels of the prior version.

 +1 — but this discussion shows we're not exactly finished here.

 Probably what is needed is only a clarification that the version
 number is only about schema objects, not revision, patch level, release
 status or whatever else is semantically meaningful. I've attached a doc
 patch about this point.

 How about each .so containing a version callback?

 Thus you can show what is the version of underlying implementation
 without needing to mess with catalogs just to keep track of patchlevel
 of C code.

Along this line, it would be easier to add a revision parameter to the
control file and have a function pg_revision(ext) to return it,
possibly showing it in the \dx output. But this still assumes the
revision is just a string, and if it has a semantic meaning then it
requires parsing to extract that meaning (whereas foo_revision() may
return everything the author of foo thinks is important for depending
code to know, e.g. it may return an integer 90102 or a record (major,
minor, patch, status, svn-rev, name-of-my-last-daughter)). I don't
think we want to force any convention, such as the revision being a
semver number - even if PGXN restricts extensions to this subset of
strings.
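
For instance, a richer, record-returning variant could look like this
(a sketch, names entirely hypothetical - the point is that the author
of foo decides the structure, nothing is imposed):

CREATE TYPE foo_revision_t AS (major int, minor int, patch int, status text);

CREATE FUNCTION foo_revision() RETURNS foo_revision_t
    LANGUAGE sql IMMUTABLE
    AS $$ SELECT (1, 0, 2, 'stable')::foo_revision_t $$;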

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Packaging

2011-04-28 Thread Daniele Varrazzo
On Thu, Apr 28, 2011 at 3:04 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
 Daniele Varrazzo <daniele.varra...@gmail.com> writes:
 On Thu, Apr 28, 2011 at 2:21 PM, Marko Kreen <mark...@gmail.com> wrote:
 How about each .so containing a version callback?

 Thus you can show what is the version of underlying implementation
 without needing to mess with catalogs just to keep track of patchlevel
 of C code.

 Along this line, it would be easier to add a revision parameter to the
 control file and have a function pg_revision(ext) to return it,
 possibly showing it in the \dx output.

 I think what we're discussing here is bug-fix revisions that don't
 affect the SQL declarations for the extension.  Presumably, that means a
 change in the C code, so the shared library is the right place to keep
 the revision number.  A version number in the control file seems to
 carry a nontrivial risk of being out of sync with the actual code in the
 shared library.

There is also the case of extensions whose data files matter: for
instance I've packaged the Italian text search dictionary as an
extension (http://pgxn.org/dist/italian_fts/): it contains no .so, but
the dictionary files may change. Its version is 1.2 and will stay so
as long as the sql doesn't change, but its revision is currently 1.2.1
and may bump to 1.2.2 should the dictionary content change. For this
extension, just spotting the 1.2.1 in the \dx output would be more
than enough; I don't see any use for a revision number returned by an
API call.

As long as the extension is installed via make install, the .control
file shouldn't drift away from the extension files it represents.

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] maximum digits for NUMERIC

2011-04-27 Thread Daniele Varrazzo
On Wed, Apr 27, 2011 at 4:47 AM, Bruce Momjian <br...@momjian.us> wrote:
 Alvaro Herrera wrote:
 Excerpts from Bruce Momjian's message of mar abr 26 12:58:19 -0300 2011:

  Wow, I am so glad someone documented this.  I often do factorial(4000)
  which generates 12673 digits when teaching classes, and it bugged me
  that we documented the limit as 1000 digits.

 I keep wondering why you want to know factorial(4000) so often.

 It is just to impress folks, and it is impressive.  An instant
 screenful of digits is pretty cool.

If you are into impressing people with big numbers (or maybe doing
something useful with them too) you may take a look at
http://pgmp.projects.postgresql.org/

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Packaging

2011-04-24 Thread Daniele Varrazzo
On Sun, Apr 24, 2011 at 10:46 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
 Daniele Varrazzo <daniele.varra...@gmail.com> writes:
 On Thu, Apr 21, 2011 at 4:16 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
 If you did not actually change the contents of the install script, you
 should not change its version number either.

 Sorry, I'm not entirely convinced. If I release an extension 1.0, then
 find a bug in the C code and fix it in 1.0.1, arguably make install
 will put the .so in the right place and the 1.0.1 code will be picked
 up by new sessions. But pg_extension still knows 1.0 as the extension
 version, and ALTER EXTENSION ... UPGRADE fails because no update path
 is known.

 If you didn't change the install script then it's not necessary to
 execute ALTER EXTENSION ... UPGRADE.  You seem to be assuming that the
 pg_extensions catalog has to reflect the bug fix level of an extension,
 but that is *not* the intention.  If it did reflect that, you'd need
 N times as many upgrade scripts, most of them identical, to deal with
 updating from different bug fix levels of the prior version.

Yes, I was assuming that the pg_extension catalog should have included
the bug fix level, and I noticed the explosion of upgrade paths
required.


 IMO it'd be better if the bug fix level was tracked outside the
 database, for instance via an RPM package version/release number.
 I'm not sure whether PGXN has anything for that at the moment.

PGXN requires a version for the extension, possibly including the
patch level (well, actually forcing a patch level, as per the semver
spec), and I/David/probably everybody else were thinking that this
version ought to be the same as the one specified in the .control
file. I see that a better guideline would be to have '1.0' specified
in the control file and '1.0.X' in the metadata submitted to PGXN,
which I think is not currently the case - see for example
http://api.pgxn.org/src/pair/pair-0.1.2/pair.control


-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Packaging

2011-04-23 Thread Daniele Varrazzo
On Thu, Apr 21, 2011 at 4:16 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
 David E. Wheeler <da...@kineticode.com> writes:

 * Another, minor point: If I release a new version with no changes to the 
 code (as I've done today, just changing the make stuff and documentation), 
 it's kind of annoying that I'd need to have a migration script from the old 
 version to the new version that's…empty. But I dunno, maybe not such a big 
 deal. It's useful to have it there with a comment in it: “No changes.”

 If you did not actually change the contents of the install script, you
 should not change its version number either.

Sorry, I'm not entirely convinced. If I release an extension 1.0, then
find a bug in the C code and fix it in 1.0.1, arguably make install
will put the .so in the right place and the 1.0.1 code will be picked
up by new sessions. But pg_extension still knows 1.0 as the extension
version, and ALTER EXTENSION ... UPGRADE fails because no update path
is known.

There is also a dangerous asymmetry: if I'm not mistaken the library
.so has no version number, so there can be only one version in the
system. An update changing both code and sql requires ALTER EXTENSION
to be run as soon as possible, or some sql function from the old
extension may try to call non-existing functions in the library - or
worse, the wrong ones or with the wrong parameters. OTOH library-only
changes don't strictly require ALTER EXTENSION - and trying to issue
the command would fail if no path to the default version is available
(leaving the admin puzzled about whether he installed the upgrade
properly).

Is an empty upgrade file the only way to get the extension metadata
right? I wouldn't find it particularly bad: better to have it than to
have library-metadata mismatches or inconsistencies in the upgrade
procedure, I think.
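
For reference, such a file would literally contain nothing but a
comment, along these lines (file name and wording hypothetical):

-- foo--1.0--1.0.1.sql: no schema changes in this release.
-- This script only exists so that ALTER EXTENSION can record
-- the new version in pg_extension.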

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Packaging

2011-04-20 Thread Daniele Varrazzo
On Thu, Apr 21, 2011 at 1:14 AM, David E. Wheeler <da...@kineticode.com> wrote:

 I finally got round to updating a couple of my extensions to support 9.1 
 extensions. Unlike the contrib extensions, these needed to target older 
 versions of PostgreSQL, as well. Case in point, the semver data type; you 
 might find the Makefile of particular interest:

  https://github.com/theory/pg-semver/blob/master/Makefile

Hi David, thanks for sharing.

I've recently packaged an extension for PG 8.4-9.1 and had to wrestle
with the Makefile too. You may take a look at it and check whether any
of its solutions are useful for your extension too:
https://github.com/dvarrazzo/pgmp/blob/master/Makefile.

Specifically, I parse the version from the control file using:

PGMP_VERSION = $(shell grep default_version pgmp.control | sed -e \
    "s/default_version = '\(.*\)'/\1/")

so the Makefile doesn't have to be maintained for it. To tell apart
< 9.1 and >= 9.1 I've used instead:

PG91 = $(shell $(PG_CONFIG) --version | grep -qE " 8\.| 9\.0" && \
    echo pre91 || echo 91)
ifeq ($(PG91),91)
...
else
...

For my extension I'm less concerned about having the install sql named
in different ways, or about the upgrade sql, as all these files are
generated by scripts. You may find this script useful to generate the
upgrade sql from the install sql:
https://github.com/dvarrazzo/pgmp/blob/master/tools/sql2extension.py
For my extension I require Python and have all the sql files generated
by the Makefile at install time; if you don't want this dependency you
may generate the sql before packaging and ship the result instead.

Cheers,

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] getting composite types info from libpq

2010-12-15 Thread Daniele Varrazzo
Hello,

when a query returns a composite type, the libpq PQftype() function
reports the oid of the record type. In psycopg:

>>> cur.execute("select (1,2)")
>>> cur.description
(('row', 2249, None, -1, None, None, None),)

test=# select typname from pg_type where oid = 2249;
 typname
---------
 record

Is there a way to recursively retrieve the types for the record components?

Thanks,

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] getting composite types info from libpq

2010-12-15 Thread Daniele Varrazzo
On Wed, Dec 15, 2010 at 6:56 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
 On Wed, Dec 15, 2010 at 1:25 PM, Daniele Varrazzo
 <daniele.varra...@gmail.com> wrote:
 Hello,

 when a query returns a composite type, the libpq PQftype() function
 reports the oid of the record type. In psycopg:

     >>> cur.execute("select (1,2)")
     >>> cur.description
     (('row', 2249, None, -1, None, None, None),)

     test=# select typname from pg_type where oid = 2249;
      typname
     ---------
      record

 Is there a way to recursively retrieve the types for the record components?

 not without talking to the server, unless you had previously pulled
 pg_attribute data.

 select * from pg_attribute where attrelid = 2249;

No, there is no such info in pg_attribute: 2249 is the oid of the
generic record type, not of a specific composite type.

 This question is more appropriate for -general, but what are you trying to do?

Added -general in CC: please drop -hackers from your reply if you
think this thread is out of place.

I'm hacking on psycopg. Currently it uses PQftype, PQfname and related
functions to inspect the PQresult received after a query in order to
build the python representation of the record. But the inspection is
flat: if the record contains a composite structure it is currently
returned as an unparsed string:

>>> cur.execute("select ('1'::int, current_date), current_date")
# the date outside the record is easily parsed, not the one inside
# the record
>>> cur.fetchone()
('(1,2010-12-16)', datetime.date(2010, 12, 16))
>>> cur.description  # name and oid are the first two fields
(('row', 2249, None, -1, None, None, None),
 ('date', 1082, None, 4, None, None, None))

As the record is created on the fly, I assume there is no structure
left in the catalog for it. If I instead explicitly create the type I
see how to inspect it:

test=> create type intdate as (an_int integer, a_date date);
CREATE TYPE

>>> cur.execute("select (1, current_date)::intdate, current_date")
>>> cur.fetchone()
('(1,2010-12-16)', datetime.date(2010, 12, 16))
>>> cur.description
(('row', 650308, None, -1, None, None, None),
 ('date', 1082, None, 4, None, None, None))

test=> select attname, atttypid from pg_attribute where attrelid = 650306;
 attname | atttypid
---------+----------
 an_int  |       23
 a_date  |     1082

but even in this case it seems it would take a second query to inspect
the type, and even here it doesn't seem I could use
PQgetvalue/PQgetlength to read the individual components of the
composite value.
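
For reference, this is the kind of second query I mean: a sketch
resolving the record's columns through pg_type.typrelid, using the oid
reported in cur.description above:

SELECT a.attname, a.atttypid
  FROM pg_type t
  JOIN pg_attribute a ON a.attrelid = t.typrelid
 WHERE t.oid = 650308
   AND a.attnum > 0 AND NOT a.attisdropped;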

The goal would be to have the select query in the session above
translated into e.g. a nested tuple in python:

((1, datetime.date(2010, 12, 16)), datetime.date(2010, 12, 16))

and I'd like to know:

1. do I get enough info in the PGresult to inspect anonymous composite types?
2. do I get such info for composite types for which I have schema info
in the catalog, without issuing a second query? (which I don't feel is
a driver's job)
3. is there any libpq facility to split the string returned for a
composite type into its individual components, without having to write
a parser to deal with commas and quotes?

>>> cur.execute("select ('a'::text, 'b,c'::text, 'd''e'::text, 'f\g'::text)")
>>> print cur.fetchone()[0]
(a,"b,c",d'e,fg)

4. is this info by any chance passed over the network, maybe available
in an internal libpq structure, but then not accessible from the libpq
interface?

Thank you very much.

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psycopg and two phase commit

2010-11-06 Thread Daniele Varrazzo
On Fri, Nov 5, 2010 at 5:16 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
 2010/11/4 Daniele Varrazzo <daniele.varra...@gmail.com>:

 Just wanted to let you know that I have implemented the 2pc protocol in psycopg.

 I read a notice, but I didn't find a link for download, where is it, please?

We have just released a beta package including this and other
features. All the details at
http://initd.org/psycopg/articles/2010/11/06/psycopg-230-beta1-released/.

Cheers

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psycopg and two phase commit

2010-11-04 Thread Daniele Varrazzo
On Sat, Sep 18, 2010 at 5:01 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
 Hello

 who is psycopg maintainer, please? Can somebody explains to me, why
 psycopg doesn't support twophase commit still, although some
 implementation was done in summer 2008?

Hello Pavel,

Just wanted to let you know that I have implemented the 2pc protocol
in psycopg.

I have this and other patches ready to be merged into the trunk: more
details are available at
http://initd.org/psycopg/articles/2010/11/02/new-features-upcoming-psycopg-release/

I will try to have the patches released soon: definitely before the
PyCon. If you fancy some testing you are welcome.

Best regards,

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psycopg and two phase commit

2010-09-19 Thread Daniele Varrazzo
On Sat, Sep 18, 2010 at 5:01 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
 Hello

 who is psycopg maintainer, please?

Here is one. The others can usually be reached on the psycopg mailing
list, which is currently down and being recovered.

 Can somebody explains to me, why
 psycopg doesn't support twophase commit still, although some
 implementation was done in summer 2008?

Probably because nobody has asked before and it has been nobody's itch
to scratch.

The work you refer to is probably James Henstridge's. I didn't know
anything about it, but I've bcc'd him so he can provide details.

  https://code.launchpad.net/~jamesh/psycopg/two-phase-commit

 Now two phase commit is part of DB-API, so can be implemented.

 There are some barriers?

I see none at a first glance. I just don't get the intricacies of the
.xid() method suggested in the dbapi
(http://www.python.org/dev/peps/pep-0249/) when a regular string
would do - and the xid has to be converted into a string anyway to be
passed to the Postgres TPC statements. So I'm tempted to allow the
tpc_*() methods to accept a simple string as parameter too, also
because otherwise psycopg wouldn't be able to manipulate a transaction
prepared by other tools (e.g. retrieving a xid using tpc_recover() and
passing it to tpc_commit()/tpc_rollback()), a limitation that can be
avoided.
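
For context, these are the server-side statements the tpc_*() methods
would ultimately wrap - a sketch, with the xid spelled as a plain
string literal:

BEGIN;
-- ... transactional work ...
PREPARE TRANSACTION 'my-xid';              -- tpc_prepare()

-- later, possibly from another session or tool:
COMMIT PREPARED 'my-xid';                  -- tpc_commit(xid)
-- or: ROLLBACK PREPARED 'my-xid';         -- tpc_rollback(xid)

SELECT gid FROM pg_prepared_xacts;         -- what tpc_recover() builds on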

I can work on the feature; first I'd like to review James's code and
possibly hear his impressions.

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psycopg and two phase commit

2010-09-19 Thread Daniele Varrazzo
On Sun, Sep 19, 2010 at 6:38 PM, Daniele Varrazzo
<daniele.varra...@gmail.com> wrote:
 On Sat, Sep 18, 2010 at 5:01 PM, Pavel Stehule <pavel.steh...@gmail.com>
 wrote:

  There are some barriers?

 I see none at a first glance. I just don't get the intricacies of the
 .xid() method suggested in the dbapi
 (http://www.python.org/dev/peps/pep-0249/) when a regular string
 would do - and the xid has to be converted into a string anyway to be
 passed to the Postgres TPC statements. So I'm tempted to allow the
 tpc_*() methods to accept a simple string as parameter too, also
 because otherwise psycopg wouldn't be able to manipulate a transaction
 prepared by other tools (e.g. retrieving a xid using tpc_recover() and
 passing it to tpc_commit()/tpc_rollback()), a limitation that can be
 avoided.

I've read the XA specification, which has inspired the DBAPI extension
for TPC, and the discussion in the Python DB-SIG leading to such
extensions
(http://mail.python.org/pipermail/db-sig/2008-January/thread.html).
The methods proposed in the DBAPI don't map 1-1 to the PG TPC
statements, so some work will be needed in the driver to implement the
proposed interface. But since TPC is all about interoperability, I
think it is a necessary complication.

-- Daniele

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Adding regexp_match() function

2010-05-30 Thread Daniele Varrazzo
Hello,

I'd like to contribute a regexp_match() function as discussed in bug
#5469 [1]. The aim is to overcome the limitation outlined in that
thread.

[1] http://archives.postgresql.org/pgsql-bugs/2010-05/msg00227.php

PostgreSQL currently offers the function regexp_matches(), an SRF
(which, unless invoked with the 'g' flag, returns at most one result).
A user interested in the non-global behaviour, i.e. expecting at most
one match, has to take special care to avoid records being dropped
from the target list when no match is found. Since this is a rather
common use case, I'd like to provide a function with a less
astonishing behaviour, i.e. returning a text[] instead of a set of
text[], and returning NULL when no match is found (the 'g' flag
wouldn't be supported).
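
A sketch of the intended difference, on a hypothetical table t(id, txt):

-- today: rows where the pattern doesn't match disappear from the result
SELECT id, regexp_matches(txt, '(o+)') FROM t;

-- proposed: one output row per input row, NULL where there is no match
SELECT id, regexp_match(txt, '(o+)') FROM t;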

The proposed name is regexp_match(), to underline the semantics: very
similar to regexp_matches(), but returning a single value as result.
While the symmetry between the names is a pro, an excessive similarity
may prove confusing, so I wouldn't be surprised if a better name were
found. The implementation of the function is very simple, given the
infrastructure already available for the other regexp-related
functions. I've actually already implemented it (mostly to check how
easy or hard it would be: I had never written a C procedure for PG
before): a preliminary patch is attached.

If the idea is accepted I will submit a complete patch including
documentation and tests.

Best regards,


-- Daniele
diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c
index c3fd23e..422a94d 100644
--- a/src/backend/utils/adt/regexp.c
+++ b/src/backend/utils/adt/regexp.c
@@ -817,6 +817,42 @@ regexp_matches_no_flags(PG_FUNCTION_ARGS)
 }
 
 /*
+ * regexp_match()
+ *		Return the first match of a pattern within a string.
+ */
+Datum
+regexp_match(PG_FUNCTION_ARGS)
+{
+	regexp_matches_ctx *matchctx;
+
+	matchctx = setup_regexp_matches(PG_GETARG_TEXT_PP(0),
+									PG_GETARG_TEXT_PP(1),
+									PG_GETARG_TEXT_PP_IF_EXISTS(2),
+									true, false, true);
+
+	if (matchctx->nmatches)
+	{
+		ArrayType  *result_ary;
+
+		matchctx->elems = (Datum *) palloc(sizeof(Datum) * matchctx->npatterns);
+		matchctx->nulls = (bool *) palloc(sizeof(bool) * matchctx->npatterns);
+		result_ary = build_regexp_matches_result(matchctx);
+		PG_RETURN_ARRAYTYPE_P(result_ary);
+	}
+	else
+	{
+		PG_RETURN_NULL();
+	}
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_match_no_flags(PG_FUNCTION_ARGS)
+{
+	return regexp_match(fcinfo);
+}
+
+/*
  * setup_regexp_matches --- do the initial matching for regexp_matches()
  *		or regexp_split()
  *
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f2751a4..970036a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
 DATA(insert OID =  2763 ( regexp_matches   PGNSP PGUID 12 1 1 0 f f f t t i 2 0 1009 "25 25" _null_ _null_ _null_ _null_ regexp_matches_no_flags _null_ _null_ _null_ ));
 DESCR("return all match groups for regexp");
 DATA(insert OID =  2764 ( regexp_matches   PGNSP PGUID 12 1 10 0 f f f t t i 3 0 1009 "25 25 25" _null_ _null_ _null_ _null_ regexp_matches _null_ _null_ _null_ ));
 DESCR("return all match groups for regexp");
+DATA(insert OID =  3778 ( regexp_match PGNSP PGUID 12 1 1 0 f f f t f i 2 0 1009 "25 25" _null_ _null_ _null_ _null_ regexp_match_no_flags _null_ _null_ _null_ ));
+DESCR("return first match group for regexp");
+DATA(insert OID =  3779 ( regexp_match PGNSP PGUID 12 1 1 0 f f f t f i 3 0 1009 "25 25 25" _null_ _null_ _null_ _null_ regexp_match _null_ _null_ _null_ ));
+DESCR("return first match group for regexp");
 DATA(insert OID =  2088 ( split_part   PGNSP PGUID 12 1 0 0 f f f t f i 3 0 25 "25 25 23" _null_ _null_ _null_ _null_	split_text _null_ _null_ _null_ ));
 DESCR("split string by field_sep and return field_num");
 DATA(insert OID =  2765 ( regexp_split_to_table PGNSP PGUID 12 1 1000 0 f f f t t i 2 0 25 "25 25" _null_ _null_ _null_ _null_	regexp_split_to_table_no_flags _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index e8f38a6..f6320f1 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -525,6 +525,8 @@ extern Datum textregexreplace(PG_FUNCTION_ARGS);
 extern Datum similar_escape(PG_FUNCTION_ARGS);
 extern Datum regexp_matches(PG_FUNCTION_ARGS);
 extern Datum regexp_matches_no_flags(PG_FUNCTION_ARGS);
+extern Datum regexp_match(PG_FUNCTION_ARGS);
+extern Datum regexp_match_no_flags(PG_FUNCTION_ARGS);
 extern Datum regexp_split_to_table(PG_FUNCTION_ARGS);
 extern Datum regexp_split_to_table_no_flags(PG_FUNCTION_ARGS);
 extern Datum regexp_split_to_array(PG_FUNCTION_ARGS);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers