date:20100813

Re: [HACKERS] WIP partial replication patch

2010-08-13 Thread Boszormenyi Zoltan

Andres Freund írta:
> On Fri, Aug 13, 2010 at 09:36:00PM +0200, Boszormenyi Zoltan wrote:
>   
>> Tom Lane írta:
>> 
>>> Boszormenyi Zoltan  writes:
>>>
>>>   
 attached is a WIP patch that will eventually implement
 partial replication, with the following syntax:
 
>>> This fundamentally cannot work, as it relies on system catalogs to be
>>> valid during recovery.
>>>   
>> Just like Hot Standby, no? What is the difference here?
>> Sorry for being ignorant.
>> 
> In HS you can only connect after youve found a restartpoint - only
> after that you know that you have reached a consistent point for the
> system.
>   

And in this patch, the startup process only tries to connect
after signalling the postmaster that a consistent state is reached.
And the connection has a reasonable timeout built in.

> I think this is fixable by keeping more wal on the standby's but I
> need to think more about it.
>
> Andres
>
>   

Best regards,
Zoltán Böszörményi


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] BUG #5608: array_agg() consumes too much memory

2010-08-13 Thread Hitoshi Harada

2010/8/14 Itagaki Takahiro :
> 2010/8/10 Tom Lane :
>> Eventually it might be nice to have some sort of way to specify the
>> estimate to use for any aggregate function --- but for a near-term
>> fix maybe we should just hard-wire a special case for array_agg in
>> count_agg_clauses_walker().
>
> The attached patch is the near-term fix; it adds ALLOCSET_DEFAULT_INITSIZE
> bytes to memory assumption.
>
> We might need the same adjustment for string_agg(), that consumes
> 1024 bytes for the transit condition. array_agg() and string_agg()
> are only aggregates that have "internal" for aggtranstype.

So, is it better to generalize as it adds ALLOCSET_DEFAULT_INITSIZE if
the transtype is internal, rather than specifying individual function
OID as the patch stands?

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Recognize functional dependency on primary keys.

2010-08-13 Thread Tom Lane

Bruce Momjian  writes:
> Tom Lane wrote:
>> I'm not sure whether there is any clear rule for what rows you get when
>> grouping by a non-PK column in mysql, but it'll let you do it.

> I understand this.  The issue is how many people who complained about
> our GROUP BY behavior were grouping by something that was a primary key,
> and how many were not using a primary key?  The former will no longer
> complain.

No doubt, but the TODO entry you removed is still 100% accurately
worded, and what's more the archive entry it links to clearly describes
exactly the point at issue, namely that grouping by a PK *isn't*
indeterminate.  You were wrong to remove it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Recognize functional dependency on primary keys.

2010-08-13 Thread Bruce Momjian

Tom Lane wrote:
> Stephen Frost  writes:
> > * Bruce Momjian (br...@momjian.us) wrote:
> >> My guess is our new 9.1 functionality will reduce requests for this
> >> features, so we can just not list it anymore.  If they still ask, we can
> >> re-added this not-wanted item.
> 
> > I'm not so sure...  I expect we're going to get people complaining that
> > it doesn't work the way MySQL's does now instead of complaints we don't
> > have it.
> 
> Yes.  Please compare PG HEAD with mysql 5.1.48 (ok, it's last month's
> version):
> 
> regression=# create table t1 (f1 int primary key, f2 int, f3 int);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for 
> table "t1"
> CREATE TABLE
> regression=# select * from t1 group by f1;
>  f1 | f2 | f3 
> ++
> (0 rows)
> 
> regression=# select * from t1 group by f2;
> ERROR:  column "t1.f1" must appear in the GROUP BY clause or be used in an 
> aggregate function
> LINE 1: select * from t1 group by f2;
>^
> 
> 
> 
> mysql> create table t1 (f1 int primary key, f2 int, f3 int);
> Query OK, 0 rows affected (0.07 sec)
> 
> mysql> select * from t1 group by f1;
> Empty set (0.00 sec)
> 
> mysql> select * from t1 group by f2;
> Empty set (0.00 sec)
> 
> 
> I'm not sure whether there is any clear rule for what rows you get when
> grouping by a non-PK column in mysql, but it'll let you do it.

I understand this.  The issue is how many people who complained about
our GROUP BY behavior were grouping by something that was a primary key,
and how many were not using a primary key?  The former will no longer
complain.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Recognize functional dependency on primary keys.

2010-08-13 Thread Tom Lane

Stephen Frost  writes:
> * Bruce Momjian (br...@momjian.us) wrote:
>> My guess is our new 9.1 functionality will reduce requests for this
>> features, so we can just not list it anymore.  If they still ask, we can
>> re-added this not-wanted item.

> I'm not so sure...  I expect we're going to get people complaining that
> it doesn't work the way MySQL's does now instead of complaints we don't
> have it.

Yes.  Please compare PG HEAD with mysql 5.1.48 (ok, it's last month's
version):

regression=# create table t1 (f1 int primary key, f2 int, f3 int);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for 
table "t1"
CREATE TABLE
regression=# select * from t1 group by f1;
 f1 | f2 | f3 
++
(0 rows)

regression=# select * from t1 group by f2;
ERROR:  column "t1.f1" must appear in the GROUP BY clause or be used in an 
aggregate function
LINE 1: select * from t1 group by f2;
   ^



mysql> create table t1 (f1 int primary key, f2 int, f3 int);
Query OK, 0 rows affected (0.07 sec)

mysql> select * from t1 group by f1;
Empty set (0.00 sec)

mysql> select * from t1 group by f2;
Empty set (0.00 sec)


I'm not sure whether there is any clear rule for what rows you get when
grouping by a non-PK column in mysql, but it'll let you do it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Recognize functional dependency on primary keys.

2010-08-13 Thread Tom Lane

Bruce Momjian  writes:
> Well, as worded, it says we have to group by everything, which is not
> true in 9.1, so I figured let's see what feedback we get and we can add
> a new one if we want, but our old argument is invalid, since we did
> implement part of what we said we wouldn't.  ;-)

Uh, no.  What we said we wouldn't implement is "Indeterminate behavior
for the GROUP BY clause".  We haven't implemented any part of that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Recognize functional dependency on primary keys.

2010-08-13 Thread Bruce Momjian

Stephen Frost wrote:
-- Start of PGP signed section.
> Bruce,
> 
> * Bruce Momjian (br...@momjian.us) wrote:
> > My guess is our new 9.1 functionality will reduce requests for this
> > features, so we can just not list it anymore.  If they still ask, we can
> > re-added this not-wanted item.
> 
> I'm not so sure...  I expect we're going to get people complaining that
> it doesn't work the way MySQL's does now instead of complaints we don't
> have it.  Not sure what value there is in removing it as a "feature we're
> not going to implement but realize others have"?

Well, as worded, it says we have to group by everything, which is not
true in 9.1, so I figured let's see what feedback we get and we can add
a new one if we want, but our old argument is invalid, since we did
implement part of what we said we wouldn't.  ;-)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] BUG #5608: array_agg() consumes too much memory

2010-08-13 Thread Itagaki Takahiro

2010/8/10 Tom Lane :
> Eventually it might be nice to have some sort of way to specify the
> estimate to use for any aggregate function --- but for a near-term
> fix maybe we should just hard-wire a special case for array_agg in
> count_agg_clauses_walker().

The attached patch is the near-term fix; it adds ALLOCSET_DEFAULT_INITSIZE
bytes to memory assumption.

We might need the same adjustment for string_agg(), that consumes
1024 bytes for the transit condition. array_agg() and string_agg()
are only aggregates that have "internal" for aggtranstype.

-- 
Itagaki Takahiro


array_agg_memcalc.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Python 2.7 deprecated the PyCObject API?

2010-08-13 Thread James William Pye

On Aug 13, 2010, at 5:20 PM, Tom Lane wrote:
> According to a discussion over in Fedora-land, $subject is true:
> http://lists.fedoraproject.org/pipermail/devel/2010-August/140995.html
> 
> I see several calls in plpython.c that seem to refer to PyCObject stuff.
> Anybody have any idea if we need to do something about this?

Well, we should at least be checking for an exception here anyways:

proc->me = PyCObject_FromVoidPtr(proc, NULL);
PyDict_SetItemString(PLy_procedure_cache, key, proc->me);

if (proc->me == NULL) complain();

That is, with those warnings adjustments, proc->me will be NULL and then 
explode in PyDict_SetItemString:

[David Malcolm]
However, if someone overrides the process-wide warnings settings, then
the API can fail altogether, raising a PendingDeprecationWarning
exception (which in CPython terms means setting a thread-specific error
state and returning NULL).
./

AFA a better fix is concerned, the shortest route would seem to be to use the 
new capsule stuff iff Python >= 2.7.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Recognize functional dependency on primary keys.

2010-08-13 Thread Stephen Frost

Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> My guess is our new 9.1 functionality will reduce requests for this
> features, so we can just not list it anymore.  If they still ask, we can
> re-added this not-wanted item.

I'm not so sure...  I expect we're going to get people complaining that
it doesn't work the way MySQL's does now instead of complaints we don't
have it.  Not sure what value there is in removing it as a "feature we're
not going to implement but realize others have"?

Thanks,

Stephen

signature.asc
Description: Digital signature

[HACKERS] Re: [COMMITTERS] pgsql: Recognize functional dependency on primary keys.

2010-08-13 Thread Bruce Momjian

Tom Lane wrote:
> Log Message:
> ---
> Recognize functional dependency on primary keys.  This allows a table's
> other columns to be referenced without listing them in GROUP BY, so long as
> the primary key column(s) are listed in GROUP BY.
> 
> Eventually we should also allow functional dependency on a UNIQUE constraint
> when the columns are marked NOT NULL, but that has to wait until NOT NULL
> constraints are represented in pg_constraint, because we need to have
> pg_constraint OIDs for all the conditions needed to ensure functional
> dependency.
> 
> Peter Eisentraut, reviewed by Alex Hunsaker and Tom Lane

Because of this commit, I am removing this "we do not want" TODO item:

{{TodoItem
|Indeterminate behavior for the GROUP BY clause (not wanted)
|At least one other database product allows specification of a subset of
the result columns which GROUP BY would need to be able to provide
predictable results; the server is free to return any value from the
group.  This is not viewed as a desirable feature.
* [http://archives.postgresql.org/pgsql-hackers/2010-03/msg00297.php
Re: SQL compatibility reminder: MySQL vs PostgreSQL]
}}

My guess is our new 9.1 functionality will reduce requests for this
features, so we can just not list it anymore.  If they still ask, we can
re-added this not-wanted item.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Include the backend ID in the relpath of temporary relations.

2010-08-13 Thread Robert Haas

On Fri, Aug 13, 2010 at 7:23 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I have taken a crack at fixing this but someone who understands DTrace
>> better than I do may want to check and see if the changes look sane.
>> It appears to me that we have no documentation - not even so much as a
>> source code comment - explaining how these probes are supposed to work
>> or what the arguments to each one are intended mean.
>
> http://developer.postgresql.org/pgdocs/postgres/dynamic-trace.html#DTRACE-PROBE-POINT-TABLE
>
> (... which you now need to update ...)
>
> I think your confusion may stem from the fact that the definition of the
> buffer-read-done probe was actually wrong, AFAICS.  The docs say its
> last 3 args were bools, which was reasonable, but the definition said
> int for the first of those.  Which is what you want now ...

No, it was the original patch that mangled that.  I think the real
problem is that (1) I didn't test with dtrace enabled when writing the
patch, or maybe I did somewhere in the middle but not at the end and
(2) I didn't realize there were docs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "micro bucket sort" ...

2010-08-13 Thread Greg Stark

2010/8/11 Hans-Jürgen Schönig :
> now, the problem is: i cannot easily create additional indexes as i have too 
> many possible "second" conditions here.
>

Is it just me or is this description of the problem not very specific?
Can you give more examples of your queries and explain what kind of
plan you're looking for for them?

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Python 2.7 deprecated the PyCObject API?

2010-08-13 Thread Tom Lane

According to a discussion over in Fedora-land, $subject is true:
http://lists.fedoraproject.org/pipermail/devel/2010-August/140995.html

I see several calls in plpython.c that seem to refer to PyCObject stuff.
Anybody have any idea if we need to do something about this?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Include the backend ID in the relpath of temporary relations.

2010-08-13 Thread Tom Lane

Robert Haas  writes:
> I have taken a crack at fixing this but someone who understands DTrace
> better than I do may want to check and see if the changes look sane.
> It appears to me that we have no documentation - not even so much as a
> source code comment - explaining how these probes are supposed to work
> or what the arguments to each one are intended mean.

http://developer.postgresql.org/pgdocs/postgres/dynamic-trace.html#DTRACE-PROBE-POINT-TABLE

(... which you now need to update ...)

I think your confusion may stem from the fact that the definition of the
buffer-read-done probe was actually wrong, AFAICS.  The docs say its
last 3 args were bools, which was reasonable, but the definition said
int for the first of those.  Which is what you want now ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Include the backend ID in the relpath of temporary relations.

2010-08-13 Thread Robert Haas

On Fri, Aug 13, 2010 at 6:38 PM, Robert Haas  wrote:
> On Fri, Aug 13, 2010 at 5:51 PM, Tom Lane  wrote:
>> rh...@postgresql.org (Robert Haas) writes:
>>> Include the backend ID in the relpath of temporary relations.
>>
>> A couple of the buildfarm members don't like this patch.  I think you
>> missed making some edits in some dtrace calls.
>
> Well, I guess that's why we have a buildfarm.  Working on it now...

I have taken a crack at fixing this but someone who understands DTrace
better than I do may want to check and see if the changes look sane.
It appears to me that we have no documentation - not even so much as a
source code comment - explaining how these probes are supposed to work
or what the arguments to each one are intended mean.  That may not be
ideal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Window functions seem to inhibit push-down of quals into views

2010-08-13 Thread Tom Lane

Alvaro Herrera  writes:
> CREATE TABLE foo AS SELECT a, a % 10 AS b FROM generate_series(1, 10) a;
> CREATE INDEX a_b ON foo (b);
> CREATE VIEW bar AS SELECT a, b, lead(a, 1) OVER () FROM foo;

> explain select a, b, lead(a, 1) over () from foo where b = 2;
> explain select * from bar where b = 2;

Those are not equivalent queries.  In the first case b=2 is supposed to be
applied before window function evaluation, in the second case not.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP partial replication patch

2010-08-13 Thread Tom Lane

Josh Berkus  writes:
>> Another rather basic problem is that you've
>> got to pass system catalog updates downstream (in case they affect
>> the tables being replicated) but if you want partial replication then
>> many of those updates will be incorrect for the slave machine.

> Couldn't this be taken care of by replicating the objects but not any
> data for them?  That is, the tables and indexes would exist, but be empty?

Seems a bit pointless.  What exactly is the use-case for a slave whose
system catalogs match the master exactly (as they must) but whose data
does not?

Notice also that you have to shove the entire WAL downstream anyway ---
the proposed patch filters at the point of application, and would have a
hard time doing better because LSNs have to remain consistent.

It would also be rather tricky to identify which objects have to have
updates applied, eg, if you replicate a table you'd damn well better
replicate the data for each and every one of its indexes (which is a
non-constant set in general), because queries on the slave will expect
them all to be valid.  Maybe it's possible to keep track of that, though
I bet things will be tricky when there are uncommitted DDL changes
(consider data changes associated with a CREATE INDEX on a replicated
table).  In any case xlog replay functions are not the place to have
that kind of logic.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Window functions seem to inhibit push-down of quals into views

2010-08-13 Thread Alvaro Herrera

Hi,

I've got a table and view defined like this:

CREATE TABLE foo AS SELECT a, a % 10 AS b FROM generate_series(1, 10) a;
CREATE INDEX a_b ON foo (b);
CREATE VIEW bar AS SELECT a, b, lead(a, 1) OVER () FROM foo;

Now, if I query the table directly instead of going through the view, a
WHERE condition can be pushed down to the table scan:

explain select a, b, lead(a, 1) over () from foo where b = 2;
QUERY PLAN 
---
 WindowAgg  (cost=12.14..488.72 rows=500 width=8)
   ->  Bitmap Heap Scan on foo  (cost=12.14..482.47 rows=500 width=8)
 Recheck Cond: (b = 2)
 ->  Bitmap Index Scan on a_b  (cost=0.00..12.01 rows=500 width=0)
   Index Cond: (b = 2)
(5 filas)

However, if I instead query the view, the qual is applied to a SubqueryScan
instead, and the table is scanned with no qual at all:

alvherre=# explain select * from bar where b = 2;
  QUERY PLAN   
---
 Subquery Scan bar  (cost=0.00..3943.00 rows=500 width=12)
   Filter: (bar.b = 2)
   ->  WindowAgg  (cost=0.00..2693.00 rows=10 width=8)
 ->  Seq Scan on foo  (cost=0.00..1443.00 rows=10 width=8)
(4 filas)

The view is behaving like this:

alvherre=# explain select * from (select a, b, lead(a, 1) over () from foo) b 
where b = 2;
  QUERY PLAN   
---
 Subquery Scan b  (cost=0.00..3943.00 rows=500 width=12)
   Filter: (b.b = 2)
   ->  WindowAgg  (cost=0.00..2693.00 rows=10 width=8)
 ->  Seq Scan on foo  (cost=0.00..1443.00 rows=10 width=8)
(4 filas)



This is a killer for useful views on top of queries with window
functions :-(

Is this a optimizer shortcoming?

-- 
Álvaro Herrera 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread Andrew Dunstan




On 08/13/2010 03:46 PM, Joseph Adams wrote:

On Fri, Aug 13, 2010 at 2:02 PM, David Fetter  wrote:

On Fri, Aug 13, 2010 at 01:33:06PM -0400, Robert Haas wrote:

Maybe so, but it's not clear the interface that Joseph implemented is
the one everyone wants...

Fair enough.  What's the interface now in a nutshell?  Lack of
nutshells might well mean the interface needs more work...

Suppose we have:

-- SQL --
CREATE TYPE colors AS ENUM ('red', 'green', 'blue');

-- C --
enum Colors {RED, GREEN, BLUE, COLOR_COUNT};

In a nutshell:

* Define an enum mapping like so:

static EnumLabel enum_labels[COLOR_COUNT] =
{
{COLOR_RED, "red"},
{COLOR_GREEN, "green"},
{COLOR_BLUE, "blue"}
};

* Get the OIDs of the enum labels:

Oid oids[COLOR_COUNT];
getEnumLabelOids("colors", enum_labels, oids, COLOR_COUNT);

* Convert C enum values to OIDs using the table:

PG_RETURN_OID(oids[COLOR_BLUE]);

A caller would typically cache the Oid table with fn_extra.

Currently, getEnumLabelOids puts InvalidOid for labels that could not
successfully be looked up.  This will probably be changed to
ereport(ERROR) instead.

Also, I'm thinking that getEnumLabelOids should be renamed to
something that's easier to remember.  Maybe enum_oids or
get_enum_oids.






I should point out that I'm hoping to present a patch in a few weeks for 
extensible enums, along the lines previously discussed on this list. I 
have only just noticed this thread or I would have jumped in earlier.


Maybe what I'm doing won't impact much on this - it does cache enum oids 
and their associated sort orders in the function context, but lazily - 
the theory being that a retail comparison should not have to look up the 
whole of a large enum set just to get two sort order values.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP partial replication patch

2010-08-13 Thread Josh Berkus


>  Another rather basic problem is that you've
> got to pass system catalog updates downstream (in case they affect
> the tables being replicated) but if you want partial replication then
> many of those updates will be incorrect for the slave machine.

Couldn't this be taken care of by replicating the objects but not any
data for them?  That is, the tables and indexes would exist, but be empty?

> More generally, though, we are going to have our hands full for the
> foreseeable future trying to get the existing style of replication
> bug-free and performant.  I don't think we want to undertake any large
> expansion of the replication feature set, at least not for some time
> to come.  So you can count on me to vote against committing anything
> like this into core.

I imagine it'll take more than a year to get this to work, if we ever
do.  Probably good to put it on a git branch and that way those who want
to can continue long-term work on it.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] review: psql: edit function, show function commands patch

2010-08-13 Thread Tom Lane

Pavel Stehule  writes:
> attached updated \sf implementation. It is little bit simplyfied with
> support a pager and output forwarding.

The line number argument to this greatly complicates the code but
doesn't appear to me to have much practical use.  Why would you bother
with that?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: utf8_to_unicode (trivial)

2010-08-13 Thread Robert Haas

On Fri, Aug 13, 2010 at 12:11 PM, Alvaro Herrera
 wrote:
> src/include/port.h?

Oh, hey, look at that.  Any thought on what to about the fact that our
two existing copies of utf2ucs() don't match?  (one tests against 0xf8
where the other against 0xf0)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread Joseph Adams

On Fri, Aug 13, 2010 at 2:02 PM, David Fetter  wrote:
> On Fri, Aug 13, 2010 at 01:33:06PM -0400, Robert Haas wrote:
>> Maybe so, but it's not clear the interface that Joseph implemented is
>> the one everyone wants...
>
> Fair enough.  What's the interface now in a nutshell?  Lack of
> nutshells might well mean the interface needs more work...

Suppose we have:

-- SQL --
CREATE TYPE colors AS ENUM ('red', 'green', 'blue');

-- C --
enum Colors {RED, GREEN, BLUE, COLOR_COUNT};

In a nutshell:

* Define an enum mapping like so:

static EnumLabel enum_labels[COLOR_COUNT] =
{
{COLOR_RED, "red"},
{COLOR_GREEN, "green"},
{COLOR_BLUE, "blue"}
};

* Get the OIDs of the enum labels:

Oid oids[COLOR_COUNT];
getEnumLabelOids("colors", enum_labels, oids, COLOR_COUNT);

* Convert C enum values to OIDs using the table:

PG_RETURN_OID(oids[COLOR_BLUE]);

A caller would typically cache the Oid table with fn_extra.

Currently, getEnumLabelOids puts InvalidOid for labels that could not
successfully be looked up.  This will probably be changed to
ereport(ERROR) instead.

Also, I'm thinking that getEnumLabelOids should be renamed to
something that's easier to remember.  Maybe enum_oids or
get_enum_oids.


Joey Adams

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP partial replication patch

2010-08-13 Thread Andres Freund

On Fri, Aug 13, 2010 at 09:36:00PM +0200, Boszormenyi Zoltan wrote:
> Tom Lane írta:
> > Boszormenyi Zoltan  writes:
> >
> >> attached is a WIP patch that will eventually implement
> >> partial replication, with the following syntax:
> > This fundamentally cannot work, as it relies on system catalogs to be
> > valid during recovery.
> Just like Hot Standby, no? What is the difference here?
> Sorry for being ignorant.
In HS you can only connect after youve found a restartpoint - only
after that you know that you have reached a consistent point for the
system.

I think this is fixable by keeping more wal on the standby's but I
need to think more about it.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP partial replication patch

2010-08-13 Thread Boszormenyi Zoltan

Tom Lane írta:
> Boszormenyi Zoltan  writes:
>   
>> attached is a WIP patch that will eventually implement
>> partial replication, with the following syntax:
>> 
>
> This fundamentally cannot work, as it relies on system catalogs to be
> valid during recovery.

Just like Hot Standby, no? What is the difference here?
Sorry for being ignorant.

>   Another rather basic problem is that you've
> got to pass system catalog updates downstream (in case they affect
> the tables being replicated) but if you want partial replication then
> many of those updates will be incorrect for the slave machine.
>   

Yes, it's true. But there's an easy solution to that, querying
such tables can be forbidden, we were talking about truncating
such excluded relations internally. Currently querying exluded
tables are allowed just to be able to see that DML indeed doesn't
modify them. As I said, ATM it's only a proof of concept patch.

> More generally, though, we are going to have our hands full for the
> foreseeable future trying to get the existing style of replication
> bug-free and performant.  I don't think we want to undertake any large
> expansion of the replication feature set, at least not for some time
> to come.  So you can count on me to vote against committing anything
> like this into core.
>   

Understood.

Best regards,
Zoltán Böszörményi


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP partial replication patch

2010-08-13 Thread Tom Lane

Boszormenyi Zoltan  writes:
> attached is a WIP patch that will eventually implement
> partial replication, with the following syntax:

This fundamentally cannot work, as it relies on system catalogs to be
valid during recovery.  Another rather basic problem is that you've
got to pass system catalog updates downstream (in case they affect
the tables being replicated) but if you want partial replication then
many of those updates will be incorrect for the slave machine.

More generally, though, we are going to have our hands full for the
foreseeable future trying to get the existing style of replication
bug-free and performant.  I don't think we want to undertake any large
expansion of the replication feature set, at least not for some time
to come.  So you can count on me to vote against committing anything
like this into core.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] review: xml_is_well_formed

2010-08-13 Thread Tom Lane

Mike Fowler  writes:
> On 11/08/10 21:27, Tom Lane wrote:
>> Yes.  Mike, are you expecting to submit a new version before the end of
>> the week?

> Yes and here it is, apologies for the delay. I have re-implemented 
> xml_is_well_formed such that it is sensitive to the XMLOPTION. The 
> additional _document and _content methods are now present. Tests and 
> documentation adjusted to suit.

Applied with minor cleanups, mostly in the documentation.  The only
thing that seems worth remarking on is that since xml_is_well_formed
now depends on a GUC variable, it *cannot* be marked IMMUTABLE.  The
right marking in such cases is STABLE.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: utf8_to_unicode (trivial)

2010-08-13 Thread Tom Lane

Robert Haas  writes:
> On Fri, Aug 13, 2010 at 1:50 PM, Tom Lane  wrote:
>> AFAICS the version in mbprint.c is flat out wrong, and the only reason
>> nobody's noticed is that it should never get passed a more-than-4-byte
>> sequence anyway.

> Should we fix it, then, and if so how far should we go back?

Yes, I'd vote for back-patching, at least to all versions in which the
wchar.c version looks like that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: utf8_to_unicode (trivial)

2010-08-13 Thread Robert Haas

On Fri, Aug 13, 2010 at 1:50 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Excerpts from Robert Haas's message of vie ago 13 12:50:13 -0400 2010:
>>> Oh, hey, look at that.  Any thought on what to about the fact that our
>>> two existing copies of utf2ucs() don't match?  (one tests against 0xf8
>>> where the other against 0xf0)
>
>> I'm not sure why it's masking 0xf8 instead of 0xf0.
>
> Because it wants to verify that this is in fact a 4-byte UTF8 code.
> Compare the code (and comments) for pg_utf_mblen.
>
> AFAICS the version in mbprint.c is flat out wrong, and the only reason
> nobody's noticed is that it should never get passed a more-than-4-byte
> sequence anyway.

Should we fix it, then, and if so how far should we go back?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread David Fetter

On Fri, Aug 13, 2010 at 01:33:06PM -0400, Robert Haas wrote:
> On Fri, Aug 13, 2010 at 1:05 PM, David Fetter  wrote:
> > On Fri, Aug 13, 2010 at 12:59:48PM -0400, Robert Haas wrote:
> >> On Fri, Aug 13, 2010 at 12:55 PM, Joseph Adams
> >>  wrote:
> >> > Indeed, a built-in JSON data type will certainly not need it.
> >> > However, users may want to return enums from procedures written in
> >> > C, and this function provides a way to do it.
> >>
> >> Yeah, but I can't see accepting it on speculation.  We'll add it if
> >> and when it's clear that it will be generally useful.
> >
> > Please pardon me for jumping in here, but one of the things people
> > really love about PostgreSQL is that when you reach for something,
> > it's frequently "just there."
> >
> > As we're improving enums (allowing them to be altered easily after
> > creation, etc.), it seems reasonable to provide ways to return them
> > from all kinds of PLs, including making this easier in C.
> 
> Maybe so, but it's not clear the interface that Joseph implemented is
> the one everyone wants...

Fair enough.  What's the interface now in a nutshell?  Lack of
nutshells might well mean the interface needs more work...

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: utf8_to_unicode (trivial)

2010-08-13 Thread Tom Lane

Alvaro Herrera  writes:
> Excerpts from Robert Haas's message of vie ago 13 12:50:13 -0400 2010:
>> Oh, hey, look at that.  Any thought on what to about the fact that our
>> two existing copies of utf2ucs() don't match?  (one tests against 0xf8
>> where the other against 0xf0)

> I'm not sure why it's masking 0xf8 instead of 0xf0.

Because it wants to verify that this is in fact a 4-byte UTF8 code.
Compare the code (and comments) for pg_utf_mblen.

AFAICS the version in mbprint.c is flat out wrong, and the only reason
nobody's noticed is that it should never get passed a more-than-4-byte
sequence anyway.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: utf8_to_unicode (trivial)

2010-08-13 Thread Alvaro Herrera

Excerpts from Robert Haas's message of vie ago 13 12:50:13 -0400 2010:
> On Fri, Aug 13, 2010 at 12:11 PM, Alvaro Herrera
>  wrote:
> > src/include/port.h?
> 
> Oh, hey, look at that.  Any thought on what to about the fact that our
> two existing copies of utf2ucs() don't match?  (one tests against 0xf8
> where the other against 0xf0)

I'm not sure why it's masking 0xf8 instead of 0xf0.  It seems like c &
0xf8 == 0xf8 signals start of a 5-byte sequence which is not valid per
RFC 3629, according to wikipedia:
http://en.wikipedia.org/wiki/UTF-8#Description

(Moreover, 0xf5 to 0xf7 signal start of a 4-byte sequence for codepoints
that apparently are not supposed to be valid).

So apparently it's good that the code returns an invalid code in those
cases, i.e. wchar.c is right and mbprint is wrong.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] more numeric stuff

2010-08-13 Thread Robert Haas

On Fri, Aug 13, 2010 at 1:10 PM, Bruce Momjian  wrote:
> Tom Lane wrote:
>> > 3. 64-bit arithmetic.  Right now, mul_var() and div_var() use int for
>> > arithmetic, but haven't we given up on supporting platforms without
>> > long long?  I'm not sure I'm motivated enough to write the patch
>> > myself, but it seems like 64-bit arithmetic would give us a lot more
>> > room to postpone carries.
>>
>> I don't think this would win unless we went to 32-bit NumericDigit,
>> which is a problem from the on-disk-compatibility standpoint, not to
>> mention making the alignment issues even worse.  Postponing carries is
>> good, but we have enough headroom for that already --- I really doubt
>> that making the array elements wider would save anything noticeable
>> unless you increase NBASE.
>
> Should we be collecting pg_upgrade-breaking changes on the TODO list so
> we can implement them in one future release?

Possibly, but I don't think we want to do this one even if we WERE
willing to break pg_upgrade.  Increasing NBASE would be a complete
disaster in terms of Numeric on-disk footprint, which - even with the
changes I just implemented - is already uncomfortably high.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread Robert Haas

On Fri, Aug 13, 2010 at 1:05 PM, David Fetter  wrote:
> On Fri, Aug 13, 2010 at 12:59:48PM -0400, Robert Haas wrote:
>> On Fri, Aug 13, 2010 at 12:55 PM, Joseph Adams
>>  wrote:
>> > Indeed, a built-in JSON data type will certainly not need it.
>> > However, users may want to return enums from procedures written in
>> > C, and this function provides a way to do it.
>>
>> Yeah, but I can't see accepting it on speculation.  We'll add it if
>> and when it's clear that it will be generally useful.
>
> Please pardon me for jumping in here, but one of the things people
> really love about PostgreSQL is that when you reach for something,
> it's frequently "just there."
>
> As we're improving enums (allowing them to be altered easily after
> creation, etc.), it seems reasonable to provide ways to return them
> from all kinds of PLs, including making this easier in C.

Maybe so, but it's not clear the interface that Joseph implemented is
the one everyone wants...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] more numeric stuff

2010-08-13 Thread Bruce Momjian

Tom Lane wrote:
> > 3. 64-bit arithmetic.  Right now, mul_var() and div_var() use int for
> > arithmetic, but haven't we given up on supporting platforms without
> > long long?  I'm not sure I'm motivated enough to write the patch
> > myself, but it seems like 64-bit arithmetic would give us a lot more
> > room to postpone carries.
> 
> I don't think this would win unless we went to 32-bit NumericDigit,
> which is a problem from the on-disk-compatibility standpoint, not to
> mention making the alignment issues even worse.  Postponing carries is
> good, but we have enough headroom for that already --- I really doubt
> that making the array elements wider would save anything noticeable
> unless you increase NBASE.

Should we be collecting pg_upgrade-breaking changes on the TODO list so
we can implement them in one future release?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread David Fetter

On Fri, Aug 13, 2010 at 12:59:48PM -0400, Robert Haas wrote:
> On Fri, Aug 13, 2010 at 12:55 PM, Joseph Adams
>  wrote:
> > Indeed, a built-in JSON data type will certainly not need it.
> > However, users may want to return enums from procedures written in
> > C, and this function provides a way to do it.
> 
> Yeah, but I can't see accepting it on speculation.  We'll add it if
> and when it's clear that it will be generally useful.

Please pardon me for jumping in here, but one of the things people
really love about PostgreSQL is that when you reach for something,
it's frequently "just there."

As we're improving enums (allowing them to be altered easily after
creation, etc.), it seems reasonable to provide ways to return them
from all kinds of PLs, including making this easier in C.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread Tom Lane

Joseph Adams  writes:
> On Fri, Aug 13, 2010 at 10:46 AM, Robert Haas  wrote:
>> +   if (sub_end + len > e)
>> +   {
>> +   Assert(false);  /* Clipped multibyte 
>> character */
>> +   break;
>> +   }

> If I simply say Assert(sub_end + len <= e), the function will yield a
> range hanging off the edge of the input string (out of bounds).  The
> five lines include a safeguard against that when assertion checking is
> off.

If you think it is actually likely to happen in practice, then an Assert
is 100% inappropriate.  Throw an actual error instead.  Code that has
provisions for continuing after an Assert failure is wrong by definition.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread Robert Haas

On Fri, Aug 13, 2010 at 12:55 PM, Joseph Adams
 wrote:
> Indeed, a built-in JSON data type will certainly not need it.
> However, users may want to return enums from procedures written in C,
> and this function provides a way to do it.

Yeah, but I can't see accepting it on speculation.  We'll add it if
and when it's clear that it will be generally useful.

>> Incidentally, if we were going to accept this function, I think we'd
>> need to add some kind of a check to throw an error if any of the
>> labels can't be mapped onto an Oid.  Otherwise, an error in this area
>> might lead to difficult-to-find misbehavior.
>
> I agree.  Perhaps an ereport(ERROR, ...) would be better, since it
> would not be hard for a user to cause an enum mapping error (even in a
> production database) by updating a stored procedure in C but not
> updating the corresponding enum (or vice versa, if enum labels are
> renamed).

Right...

> Fair enough.  Perhaps the comment about FN_EXTRA (which explains
> fn_extra in more detail) could be reworded (to talk about using
> fcinfo->flinfo->fn_extra manually) and included in the documentation
> at xfunc-c.html .  fn_extra currently only gets a passing mention in
> the discussion about set-returning functions.

Feel to propose a patch to that comment.

>>> pg_substring, pg_encoding_substring
>>>  * Useful-ometer: ()---o
>>>  * Rationale: The JSONPath code uses it / will use it for extracting
>>> substrings, which is probably not a very useful feature (but who am I
>>> to say that).  This function could probably benefit the
>>> text_substring() function in varlena.c , but it would take a bit of
>>> work to ensure it continues to comply with standards.
>>
>> I'm more positive about this idea.  If you can make this generally
>> useful, I'd encourage you to do that.  On a random coding style note,
>> I see that you have two copies of the following code, which can fairly
>> obviously be written in a single line rather than five, assuming it's
>> actually safe.
>>
>> +               if (sub_end + len > e)
>> +               {
>> +                       Assert(false);          /* Clipped multibyte
>> character */
>> +                       break;
>> +               }
>
> If I simply say Assert(sub_end + len <= e), the function will yield a
> range hanging off the edge of the input string (out of bounds).  The
> five lines include a safeguard against that when assertion checking is
> off.  This could happen if the input string has a clipped multibyte
> character at the end.  Perhaps I should just drop the assertions and
> make it so if there's a clipped character at the end, it'll be
> ignored, no big deal.

I think maybe what you want is ereport(ERROR).  It should never be
possible for user action to trip an Assert(); Assert() is only to
catch coding mistakes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-13 Thread Tom Lane

Robert Haas  writes:
> On Fri, Aug 13, 2010 at 12:49 PM, Tom Lane  wrote:
>> Also, it'd be better if BBB and FFF were marked up as 
>> rather than , see examples elsewhere in that file.

> I see.  How should I mark tBBB_FFF?

I'd do
tBBB_FFF

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread Joseph Adams

On Fri, Aug 13, 2010 at 10:46 AM, Robert Haas  wrote:
> On Fri, Aug 13, 2010 at 5:45 AM, Joseph Adams
>  wrote:
>> getEnumLabelOids
>>  * Useful-ometer: ()---o
>>  * Rationale: There is currently no streamlined way to return a custom
>> enum value from a PostgreSQL function written in C.  This function
>> performs a batch lookup of enum OIDs, which can then be cached with
>> fn_extra.  This should be reasonably efficient, and it's quite elegant
>> to use.
>
> It's possible that there might be a contrib module out there someplace
> that wants to do this, but it's clearly a waste of time for a core
> datatype.  The OIDs are fixed.  Just get rid of the enum altogether
> and use the OIDs directly wherever you would have used the enum.  Then
> you don't need this.

Indeed, a built-in JSON data type will certainly not need it.
However, users may want to return enums from procedures written in C,
and this function provides a way to do it.

> Incidentally, if we were going to accept this function, I think we'd
> need to add some kind of a check to throw an error if any of the
> labels can't be mapped onto an Oid.  Otherwise, an error in this area
> might lead to difficult-to-find misbehavior.

I agree.  Perhaps an ereport(ERROR, ...) would be better, since it
would not be hard for a user to cause an enum mapping error (even in a
production database) by updating a stored procedure in C but not
updating the corresponding enum (or vice versa, if enum labels are
renamed).

>> FN_EXTRA, FN_EXTRA_ALLOC, FN_MCXT
>>  * Useful-ometer: ()o
>>  * Rationale: Using fcinfo->flinfo->fn_extra takes a lot of
>> boilerplate.  These macros help cut down the boilerplate, and the
>> comment explains what fn_extra is all about.
>
> I think this is not a good idea.  Macros that use local variables
> under the covers make it hard for new contributors to read the code
> and understand what it does.  It's only worth doing if it saves a lot
> of typing, and this doesn't.  If we were going to do this, the right
> thing for your patch to do would be to update every instance of this
> coding pattern in our source base, so that people who copy those
> examples in the future do it the new way.  But I think there's no real
> advantage in that: it would complicate back-patching future bug fixes
> for no particularly compelling reason.

Fair enough.  Perhaps the comment about FN_EXTRA (which explains
fn_extra in more detail) could be reworded (to talk about using
fcinfo->flinfo->fn_extra manually) and included in the documentation
at xfunc-c.html .  fn_extra currently only gets a passing mention in
the discussion about set-returning functions.

>> pg_substring, pg_encoding_substring
>>  * Useful-ometer: ()---o
>>  * Rationale: The JSONPath code uses it / will use it for extracting
>> substrings, which is probably not a very useful feature (but who am I
>> to say that).  This function could probably benefit the
>> text_substring() function in varlena.c , but it would take a bit of
>> work to ensure it continues to comply with standards.
>
> I'm more positive about this idea.  If you can make this generally
> useful, I'd encourage you to do that.  On a random coding style note,
> I see that you have two copies of the following code, which can fairly
> obviously be written in a single line rather than five, assuming it's
> actually safe.
>
> +               if (sub_end + len > e)
> +               {
> +                       Assert(false);          /* Clipped multibyte
> character */
> +                       break;
> +               }

If I simply say Assert(sub_end + len <= e), the function will yield a
range hanging off the edge of the input string (out of bounds).  The
five lines include a safeguard against that when assertion checking is
off.  This could happen if the input string has a clipped multibyte
character at the end.  Perhaps I should just drop the assertions and
make it so if there's a clipped character at the end, it'll be
ignored, no big deal.


Joey Adams

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-13 Thread Robert Haas

On Fri, Aug 13, 2010 at 12:49 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Here is an updated patch.  It's in context-diff format this time,
>
> Thanks, I appreciate that ;-)
>
> This looks committable to me, with a couple of tiny suggestions:

Woo hoo!

> In the text added to storage.sgml, s/temporary relation/temporary relations/.
> Also, it'd be better if BBB and FFF were marked up as 
> rather than , see examples elsewhere in that file.

I see.  How should I mark tBBB_FFF?

I actually didn't like that way of explaining it very much, but I
couldn't think of anything clearer.  Saying "the name will consist of
a lowercase t, followed by the backend ID, followed by an underscore,
followed by the filenode" did not seem better.

> The comment for local_buffer_write_error_callback() probably meant to
> say "during local buffer writes".

No doubt.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-13 Thread Tom Lane

Robert Haas  writes:
> Here is an updated patch.  It's in context-diff format this time,

Thanks, I appreciate that ;-)

This looks committable to me, with a couple of tiny suggestions:

In the text added to storage.sgml, s/temporary relation/temporary relations/.
Also, it'd be better if BBB and FFF were marked up as 
rather than , see examples elsewhere in that file.

The comment for local_buffer_write_error_callback() probably meant to
say "during local buffer writes".

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: utf8_to_unicode (trivial)

2010-08-13 Thread Alvaro Herrera

Excerpts from Robert Haas's message of vie ago 13 12:00:32 -0400 2010:
> On Fri, Aug 13, 2010 at 3:12 AM, Joseph Adams
>  wrote:
> > I've attached another patch that moves utf8_to_unicode to src/port per
> > Robert Haas's suggestion.
> >
> > This patch itself is not quite as elegant as the first one because it
> > puts platform-independent code that "belongs" in wchar.c into src/port
> > .  It also uses unsigned int instead of pg_wchar because the typedef
> > of pg_wchar isn't available to the frontend, if I'm not mistaken.
> 
> Well, right now, in addition to having two copies of utf2ucs(), we
> have two declarations of pg_wchar, one in src/bin/psql/mbprint.c and
> the other in src/include/mb/pg_wchar.h; so both existing copies of the
> function are able to use that typedef.  It seems like we might want to
> move the typedef to the same place as the declaration of the renamed
> utf2ucs(), but I'm not quite sure where that should be.  The only
> header in src/port is pthread-win32.h, and we're sure not going to put
> it there.

src/include/port.h?

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RecordTransactionCommit() and SharedInvalidationMessages

2010-08-13 Thread Robert Haas

On Thu, Aug 12, 2010 at 9:43 PM, Fujii Masao  wrote:
> On Fri, Aug 13, 2010 at 10:24 AM, Robert Haas  wrote:
>> On Thu, Aug 12, 2010 at 12:11 AM, Fujii Masao  wrote:
>>> It appears to me that RecordTransactionCommit() only needs to WAL-log
>>> shared invalidation messages when wal_level is hot_standby, but I
>>> don't see a guard to prevent it from doing it in all cases.
 Should use XLogStandbyInfoActive() macro, for the sake of consistency.
>>> And, RelcacheInitFileInval should be initialized with false just in case.
>>
>> How about this?
>
> Looks good to me.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: utf8_to_unicode (trivial)

2010-08-13 Thread Robert Haas

On Fri, Aug 13, 2010 at 3:12 AM, Joseph Adams
 wrote:
> I've attached another patch that moves utf8_to_unicode to src/port per
> Robert Haas's suggestion.
>
> This patch itself is not quite as elegant as the first one because it
> puts platform-independent code that "belongs" in wchar.c into src/port
> .  It also uses unsigned int instead of pg_wchar because the typedef
> of pg_wchar isn't available to the frontend, if I'm not mistaken.

Well, right now, in addition to having two copies of utf2ucs(), we
have two declarations of pg_wchar, one in src/bin/psql/mbprint.c and
the other in src/include/mb/pg_wchar.h; so both existing copies of the
function are able to use that typedef.  It seems like we might want to
move the typedef to the same place as the declaration of the renamed
utf2ucs(), but I'm not quite sure where that should be.  The only
header in src/port is pthread-win32.h, and we're sure not going to put
it there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [ADMIN] postgres 9.0 crash when bringing up hot standby

2010-08-13 Thread Robert Haas

On Thu, Aug 12, 2010 at 5:54 PM, Tom Lane  wrote:
> I wrote:
>> I wonder if the problem is not so much libpqwalreceiver as the
>> walreceiver process.  Maybe an ordinary backend process does some
>> prerequisite initialization that walreceiver is missing.  Hard to
>> guess what, though ... I can't think of anything dlopen() depends on
>> that should be under our control.
>
> Actually, that idea is easily tested: try doing
>        LOAD 'libpqwalreceiver';
> in a regular backend process.

Alanoly, is this something you can try?

> If that still crashes, it might be useful to truss or strace the backend
> while it runs the command, and compare that to the trace of
>        LOAD 'dblink';

And if necessary, this too?

Thanks for your help debugging this

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] typos in HS source code comment

2010-08-13 Thread Robert Haas

On Thu, Aug 12, 2010 at 9:37 PM, Fujii Masao  wrote:
> Before 9.0, since xact_redo_commit always calls TransactionIdCommitTree,
> TransactionIdSetStatusBit always receives InvalidXLogRecPtr. In 9.0,
> xact_redo_commit calls TransactionIdCommitTree only when hot standby is
> disabled. OTOH, when hot standby is enabled, xact_redo_commit calls
> TransactionIdAsyncCommitTree, so TransactionIdSetStatusBit receives the
> valid lsn.

Hmm.  Maybe that should be spelled out a little more explicitly in the comment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread Robert Haas

On Fri, Aug 13, 2010 at 5:45 AM, Joseph Adams
 wrote:
> getEnumLabelOids
>  * Useful-ometer: ()---o
>  * Rationale: There is currently no streamlined way to return a custom
> enum value from a PostgreSQL function written in C.  This function
> performs a batch lookup of enum OIDs, which can then be cached with
> fn_extra.  This should be reasonably efficient, and it's quite elegant
> to use.

It's possible that there might be a contrib module out there someplace
that wants to do this, but it's clearly a waste of time for a core
datatype.  The OIDs are fixed.  Just get rid of the enum altogether
and use the OIDs directly wherever you would have used the enum.  Then
you don't need this.

Incidentally, if we were going to accept this function, I think we'd
need to add some kind of a check to throw an error if any of the
labels can't be mapped onto an Oid.  Otherwise, an error in this area
might lead to difficult-to-find misbehavior.

> FN_EXTRA, FN_EXTRA_ALLOC, FN_MCXT
>  * Useful-ometer: ()o
>  * Rationale: Using fcinfo->flinfo->fn_extra takes a lot of
> boilerplate.  These macros help cut down the boilerplate, and the
> comment explains what fn_extra is all about.

I think this is not a good idea.  Macros that use local variables
under the covers make it hard for new contributors to read the code
and understand what it does.  It's only worth doing if it saves a lot
of typing, and this doesn't.  If we were going to do this, the right
thing for your patch to do would be to update every instance of this
coding pattern in our source base, so that people who copy those
examples in the future do it the new way.  But I think there's no real
advantage in that: it would complicate back-patching future bug fixes
for no particularly compelling reason.

> getTypeInfo
>  * Useful-ometer: ()---o
>  * Rationale: The get_type_io_data "six-fer" function is very
> cumbersome to use, since one has to declare all the output variables.
> The getTypeInfo puts the results in a structure.  It also performs the
> fmgr_info_cxt step, which is a step done after every usage of
> get_type_io_data in the PostgreSQL code.
>  * Other thoughts: getTypeInfo also retrieves typcategory (and
> typispreferred), which is rather ad-hoc.  This benefits the JSON code
> because to_json() uses the typcategory to figure out what type of JSON
> value to convert something to (for instance, things in category 'A'
> become JSON arrays).  Other data types could care less about the
> typcategory.  Should getTypeInfo leave that step out?

Well, again, you have to decide whether this is a function that you're
adding just for the JSON code or whether it's really a general purpose
utility function.  If you want it to be a general purpose utility
function, you need to change all the callers that could potentially
leverage it.  If it's JSON specific, put it in the JSON code.  It
might be simpler to just declare the variables.

> pg_substring, pg_encoding_substring
>  * Useful-ometer: ()---o
>  * Rationale: The JSONPath code uses it / will use it for extracting
> substrings, which is probably not a very useful feature (but who am I
> to say that).  This function could probably benefit the
> text_substring() function in varlena.c , but it would take a bit of
> work to ensure it continues to comply with standards.

I'm more positive about this idea.  If you can make this generally
useful, I'd encourage you to do that.  On a random coding style note,
I see that you have two copies of the following code, which can fairly
obviously be written in a single line rather than five, assuming it's
actually safe.

+   if (sub_end + len > e)
+   {
+   Assert(false);  /* Clipped multibyte
character */
+   break;
+   }

> server_to_utf8, utf8_to_server, text_to_utf8_cstring,
> utf8_cstring_to_text, utf8_cstring_to_text_with_len
>  * Useful-ometer: ()--o
>  * Rationale:  The JSON data type operates in UTF-8 rather than the
> server encoding because it needs to deal with Unicode escapes, but
> individual Unicode characters can't be converted to/from the server
> encoding simply and efficiently (as far as I know).  These routines
> made the conversions done by the JSON data type vastly simpler, and
> they could simplify other data types in the future (XML does a lot of
> server<->UTF-8 conversions too).

Sounds interesting.  But again, you would need to modify the XML code
to use these also, so that we can clearly see that this is reusable
code, and actually reuse it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "micro bucket sort" ...

2010-08-13 Thread Hitoshi Harada

2010/8/12 PostgreSQL - Hans-Jürgen Schönig :
> as tom pointed out - this is not possible.
> there is no limit 20 in my case - i just used it to indicate that limiting 
> does not make the index scan possible which it does in some other cases.

I came up with this:

explain analyze select * from (select * from t_test order by x limit
all)s order by x, y limit 20;

which uses index scan for column x and top-N heapsort for outer ORDER
BY, though it's slower than "ORDER BY x LIMIT 20" case. If it chooses
external sort for "ORDER BY x, y" LIMIT ALL  likely wins while looses
if quicksort is chosen.


Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Develop item from TODO list

2010-08-13 Thread Robert Haas

On Wed, Aug 4, 2010 at 5:34 PM, Dimitri Fontaine  wrote:
>> On Tue, 2010-08-03 at 11:10 -0400, Robert Haas wrote:
>>> Or vi.
>>
>> cough.
>
> Well, I guess letting newcomers know about tools of choice amongst
> regular contributors is a good idea, but the best editor you can find
> around is the one you master.

I should probably mention that my comment about vi was mostly
tongue-in-cheek, although it is my preferred editor just because I've
been using it so long (i.e. it's the one I've mastered).  I learned
emacs at one point, but it was just too slow on the Sun3 I was using
at the time, and the need to hit the control key constantly was hard
on my hands.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_restore should accept multiple -t switches?

2010-08-13 Thread Dimitri Fontaine

Fujii Masao  writes:
> pg_dump allows us to select multiple target tables by using
> multiple -t switches, but pg_restore does not. So, when
> restoring multiple tables, we have to run pg_restore more
> than once as follows. This is a pain to me.

Use the list facilities, options -l and -L.

> Is it worth allowing pg_restore to accept multiple -t
> switches as well as pg_dump?

I don't think so.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Develop item from TODO list

2010-08-13 Thread Dimitri Fontaine

Viktor Valy  writes:
> Thanks for the advice!Yes, we are new to linux too :)We have chosen
> Eclipse, because we have already experience with it.However, after
> downloading the code from CVS, we can't build it, because of some
> include commands in tutorial / complex.c says "No such file or
> directory".  Does anybody know what the clue is?

Did you try this wiki page yet?

  http://wiki.postgresql.org/wiki/Working_with_Eclipse

> On Tue, 2010-08-03 at 11:10 -0400, Robert Haas wrote:
>> Or vi.
>
> cough.

Well, I guess letting newcomers know about tools of choice amongst
regular contributors is a good idea, but the best editor you can find
around is the one you master.

In all fairness until now I counted a lot of Emacs users, some (g)vim
ones, and I didn't keep track of users of Visual Studio, Eclipse, etc
but you can't pretend they're not there. My bet is that the winner in
term of user count would be Emacs.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread Mike Fowler


On 13/08/10 10:45, Joseph Adams wrote:

This patch doesn't include tests .  How would I go about writing them?

I have made the JSON data type built-in, and I will post that patch
shortly (it depends on this one).  The built-in JSON data type uses
all of these utility functions, and the tests for the JSON data type
pass.
   


Joseph,

Most existing data types have a regression SQL script in 
src/test/regress/sql. Using one of them as an example to draw some 
inspiration from, you should be able to write a 'json.sql' script that 
exercises the data type and it's supporting functions. Full instructions 
can be found at http://wiki.postgresql.org/wiki/Regression_test_authoring


Regards,

--
Mike Fowler
Registered Linux user: 379787


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] patch: General purpose utility functions used by the JSON data type

2010-08-13 Thread Joseph Adams

I factored out the general-purpose utility functions in the JSON data
type code into a patch against HEAD.  I have made a few changes to
them since I posted about them earlier (
http://archives.postgresql.org/pgsql-hackers/2010-08/msg00692.php ).

A summary of the utility functions along with some of my own thoughts
about them:

getEnumLabelOids
 * Useful-ometer: ()---o
 * Rationale: There is currently no streamlined way to return a custom
enum value from a PostgreSQL function written in C.  This function
performs a batch lookup of enum OIDs, which can then be cached with
fn_extra.  This should be reasonably efficient, and it's quite elegant
to use.

FN_EXTRA, FN_EXTRA_ALLOC, FN_MCXT
 * Useful-ometer: ()o
 * Rationale: Using fcinfo->flinfo->fn_extra takes a lot of
boilerplate.  These macros help cut down the boilerplate, and the
comment explains what fn_extra is all about.

getTypeInfo
 * Useful-ometer: ()---o
 * Rationale: The get_type_io_data "six-fer" function is very
cumbersome to use, since one has to declare all the output variables.
The getTypeInfo puts the results in a structure.  It also performs the
fmgr_info_cxt step, which is a step done after every usage of
get_type_io_data in the PostgreSQL code.
 * Other thoughts: getTypeInfo also retrieves typcategory (and
typispreferred), which is rather ad-hoc.  This benefits the JSON code
because to_json() uses the typcategory to figure out what type of JSON
value to convert something to (for instance, things in category 'A'
become JSON arrays).  Other data types could care less about the
typcategory.  Should getTypeInfo leave that step out?

pg_substring, pg_encoding_substring
 * Useful-ometer: ()---o
 * Rationale: The JSONPath code uses it / will use it for extracting
substrings, which is probably not a very useful feature (but who am I
to say that).  This function could probably benefit the
text_substring() function in varlena.c , but it would take a bit of
work to ensure it continues to comply with standards.

server_to_utf8, utf8_to_server, text_to_utf8_cstring,
utf8_cstring_to_text, utf8_cstring_to_text_with_len
 * Useful-ometer: ()--o
 * Rationale:  The JSON data type operates in UTF-8 rather than the
server encoding because it needs to deal with Unicode escapes, but
individual Unicode characters can't be converted to/from the server
encoding simply and efficiently (as far as I know).  These routines
made the conversions done by the JSON data type vastly simpler, and
they could simplify other data types in the future (XML does a lot of
server<->UTF-8 conversions too).

This patch doesn't include tests .  How would I go about writing them?

I have made the JSON data type built-in, and I will post that patch
shortly (it depends on this one).  The built-in JSON data type uses
all of these utility functions, and the tests for the JSON data type
pass.


json-util-01.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] MERGE command for inheritance

2010-08-13 Thread Boxuan Zhai

On Fri, Aug 13, 2010 at 2:33 PM, Heikki Linnakangas <
heikki.linnakan...@enterprisedb.com> wrote:

> On 13/08/10 09:27, Boxuan Zhai wrote:
>
>> I have renewed the merge.sql and merge.out in regress. Please have a look.
>>
>
> Thanks.
>
> Did you change the way DO INSTEAD rules are handled already?
> http://archives.postgresql.org/pgsql-hackers/2010-08/msg00151.php
>
>
Yes. This mistake has been corrected a few editions ago. Now, the actions
replaced by INSTEAD rules will still catch tuples but do nothing for them.

> --
>  Heikki Linnakangas
>
>  EnterpriseDB   http://www.enterprisedb.com
>

Re: [HACKERS] patch: utf8_to_unicode (trivial)

2010-08-13 Thread Joseph Adams

On Tue, Jul 27, 2010 at 1:31 PM, Robert Haas  wrote:
> On Sat, Jul 24, 2010 at 10:34 PM, Joseph Adams
>  wrote:
>> In src/include/mb/pg_wchar.h , there is a function unicode_to_utf8 ,
>> but no corresponding utf8_to_unicode .  However, there is a static
>> function called utf2ucs that does what utf8_to_unicode would do.
>>
>> I'd like this function to be available because the JSON code needs to
>> convert UTF-8 to and from Unicode codepoints, and I'm currently using
>> a separate UTF-8 to codepoint function for that.
>>
>> This patch renames utf2ucs to utf8_to_unicode and makes it public.  It
>> also fixes the version of utf2ucs in  src/bin/psql/mbprint.c so that
>> it's equivalent to the one in wchar.c .
>>
>> This is a patch against CVS HEAD for application.  It compiles and
>> tests successfully.
>>
>> Comments?  Thanks,
>
> I feel obliged to respond this since I'm supposed to be covering your
> GSoC project while Magnus is on vacation, but I actually know very
> little about this topic.  What's undeniable, however, is that the
> coding in the two versions of utf8ucs() in the tree right now don't
> match.  src/backend/utils/mb/wchar.c has:
>
>        else if ((*c & 0xf8) == 0xf0)
>
> while src/bin/psql/mbprint.c, which is otherwise identical, has:
>
>        else if ((*c & 0xf0) == 0xf0)
>
> I'm inclined to believe that your patch is right to think that the
> former version is correct, because it used to match the latter version
> until Tom Lane changed it in 2007, and I suspect he simply failed to
> update both copies.  But I'd like someone who actually understands
> what this code is doing to confirm that.
>
> http://archives.postgresql.org/pgsql-committers/2007-01/msg00293.php
>
> I suspect we need to not only fix this, but back-patch it at least to
> 8.2, which is the first release where there are two copies of this
> function.  I am not sure whether earlier releases need to be changed,
> or not.  But again, someone who understands the issues better than I
> do needs to weigh in here.
>
> In terms of making this function non-static, I'm inclined to think
> that a better approach would be to move it to src/port.  That gets rid
> of the need to have two copies in the first place.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>

I've attached another patch that moves utf8_to_unicode to src/port per
Robert Haas's suggestion.

This patch itself is not quite as elegant as the first one because it
puts platform-independent code that "belongs" in wchar.c into src/port
.  It also uses unsigned int instead of pg_wchar because the typedef
of pg_wchar isn't available to the frontend, if I'm not mistaken.

I'm not sure whether I like the old patch better or the new one.


Joey Adams


utf8-to-unicode-port.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

57 matches

Mail list logo