Re: [HACKERS] List traffic

2010-05-14 Thread Rob Wultsch
> Linux has *as many if not more* ... MySQL, if memory serves, has a half
> dozen or more ... etc ...

MySQL has a bunch of lists, none of which get much traffic. Honestly,
they should probably be combined.

-- 
Rob Wultsch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Sat, 15 May 2010, Jaime Casanova wrote:


On Fri, May 14, 2010 at 8:39 AM, Marc G. Fournier  wrote:


And IMHO, that is as much a fault of the 'old timers' on the lists as the
newbies ... if nobody redirects / loosely enforces the mandates of the
various lists, newbies aren't going to learn to post to more appropriate
ones ...



oh! yeah! that's easy... you say: hey maybe that list is better for
your question... and suddenly you're a piece of crap that should never
answer a mail

most people are not prepared to understand the concept of more than
one list per project...


Apparently you don't use very many large projects ... FreeBSD has 20+ 
lists, dedicated to various aspects of both end users and developers ... I 
imagine Linux has *as many if not more* ... MySQL, if memory serves, has 
a half dozen or more ... etc ...



Marc G. Fournier                      Hub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo: yscrappy    Skype: hub.org    ICQ: 7615664    MSN: scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Japanese translation breaks Solaris build

2010-05-14 Thread Tom Lane
Peter Eisentraut  writes:
> On Fri, 2010-05-14 at 15:04 +0200, Zdenek Kotala wrote:
>>> The problem is that it contains mix of DOS/Unix end of lines.

> I have added a check to the admin scripts to prevent this in the future.

I wonder if we shouldn't be trying to prevent this at the CVS-checkin
level.  We've had problems with people committing DOS-ish newlines
many times before.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Peter Eisentraut
On Sat, 2010-05-15 at 00:23 -0400, Robert Haas wrote:
> It's a commercial distribution of BSD.  I remember it being pretty
> nice when I used it 10+ years ago, but it sounds like it's dead now.

BSDI is the company that produced BSD/OS, which was Bruce's main
development environment at some point, which is why it has left
excruciating traces all over the PostgreSQL source.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Peter Eisentraut
On Sat, 2010-05-15 at 00:15 -0400, Tom Lane wrote:
> I suppose that at least some of the *BSD herd really do predefine some
> of the symbols being attributed to them here, but I would like to see
> something authoritative about which and what.

http://www.freebsd.org/doc/en_US.ISO8859-1/books/porters-handbook/porting-versions.html


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Japanese translation breaks Solaris build

2010-05-14 Thread Peter Eisentraut
On Fri, 2010-05-14 at 15:04 +0200, Zdenek Kotala wrote:
> Takahiro Itagaki wrote on Fri, 14. 05. 2010 at 19:38 +0900:
> > Zdenek Kotala  wrote:
> > 
> > > http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01
> > > The problem is that it contains mix of DOS/Unix end of lines.
> > 
> > I removed two CRs in ja.po.
> 
> Thanks. Gothic moth is green again
> 
> http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=gothic_moth&br=HEAD
> 
> The rest of the Solaris machines will recover overnight.

I have added a check to the admin scripts to prevent this in the future.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication patch built on SR

2010-05-14 Thread Boszormenyi Zoltan
Robert Haas wrote:
> On Fri, May 14, 2010 at 9:33 AM, Boszormenyi Zoltan  wrote:
>   
>> If  min_sync_replication_clients == 0, then the replication is async.
>> If  min_sync_replication_clients == max_wal_senders then the
>> replication is fully synchronous.
>> If 0 < min_sync_replication_clients < max_wal_senders then
>> the replication is partially synchronous, i.e. the master can wait
>> only for say, 50% of the clients to report back before it's considered
>> synchronous and the relevant transactions get released from the wait.
>> 
>
> That's an interesting design and in some ways pretty elegant, but it
> rules out some things that people might easily want to do - for
> example, synchronous replication to the other server in the same data
> center that acts as a backup for the master; and asynchronous
> replication to a reporting server located off-site.
>   

No, it doesn't. :-) You didn't take into account the third knob
usable in recovery.conf:
synchronous_slave = on/off
The off-site reporting server can be an asynchronous standby,
while the on-site backup server can be synchronous. The only thing
you need to take into account is that min_sync_replication_clients
shouldn't ever exceed your actual number of synchronous standbys.
The setup these three knobs provide is pretty flexible I think.
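To make that concrete, here is a rough sketch of the scenario Robert
describes, using the knobs from the proposed patch
(min_sync_replication_clients and synchronous_slave are only proposals
here, not settings that exist in any released PostgreSQL):

On the master, in postgresql.conf:

    max_wal_senders = 2
    min_sync_replication_clients = 1   # wait for one synchronous standby

On the on-site backup standby, in recovery.conf:

    synchronous_slave = on

On the off-site reporting standby, in recovery.conf:

    synchronous_slave = off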

> One of the things that I think we will probably need/want to change
> eventually is the fact that the master has no real knowledge of who
> the replication slaves are.

The changes I made in my patch partly change that:
the server still doesn't know "who" the standbys are,
but there's a call that returns the number of connected
_synchronous_ standbys.

>   That might be something we want to change
> in order to be able to support more configurability.  Inventing syntax
> out of whole cloth and leaving semantics to the imagination of the
> reader:
>
> CREATE REPLICATION SLAVE reporting_server (mode asynchronous, xid_feedback 
> on);
> CREATE REPLICATION SLAVE failover_server (mode synchronous,
> xid_feedback off, break_synchrep_timeout 30);
>
>   


-- 
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Jaime Casanova
On Fri, May 14, 2010 at 8:39 AM, Marc G. Fournier  wrote:
>
> And IMHO, that is as much a fault of the 'old timers' on the lists as the
> newbies ... if nobody redirects / loosely enforces the mandates of the
> various lists, newbies aren't going to learn to post to more appropriate
> ones ...
>

oh! yeah! that's easy... you say: hey maybe that list is better for
your question... and suddenly you're a piece of crap that should never
answer a mail

most people are not prepared to understand the concept of more than
one list per project... What I personally do in the Spanish list is
read (and, when I can, answer) questions that have the fewest or no
answers first, then those that Alvaro has not commented on yet, and
last, if I have time, the other ones. Then I read the subjects of the
threads in the other lists; if something pops up I read the thread and
"mark as read" everything else.

-- 
Jaime Casanova www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Robert Haas
On Sat, May 15, 2010 at 12:37 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sat, May 15, 2010 at 12:15 AM, Tom Lane  wrote:
>>> I'm not even too sure what "bsdi" is, but I'm suspicious of that branch
>>> too.  A search of our code finds
>
>> It's a commercial distribution of BSD.  I remember it being pretty
>> nice when I used it 10+ years ago, but it sounds like it's dead now.
>
> Um ... so do you remember which symbol they predefined?  It's pretty
> lame that we can't even spell it consistently in one source file.
> (Or, if BSDI is so dead that no one remembers, shouldn't we rip out
> those #if branches?)

No clue.  I agree.  (Yes.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Tom Lane
Robert Haas  writes:
> On Sat, May 15, 2010 at 12:15 AM, Tom Lane  wrote:
>> I'm not even too sure what "bsdi" is, but I'm suspicious of that branch
>> too.  A search of our code finds

> It's a commercial distribution of BSD.  I remember it being pretty
> nice when I used it 10+ years ago, but it sounds like it's dead now.

Um ... so do you remember which symbol they predefined?  It's pretty
lame that we can't even spell it consistently in one source file.
(Or, if BSDI is so dead that no one remembers, shouldn't we rip out
those #if branches?)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Robert Haas
On Sat, May 15, 2010 at 12:15 AM, Tom Lane  wrote:
> I'm not even too sure what "bsdi" is, but I'm suspicious of that branch
> too.  A search of our code finds

It's a commercial distribution of BSD.  I remember it being pretty
nice when I used it 10+ years ago, but it sounds like it's dead now.
Too bad.

http://en.wikipedia.org/wiki/Berkeley_Software_Design

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] predefined macros for various BSD-based systems?

2010-05-14 Thread Tom Lane
The recently added contrib/pg_upgrade code contains this bit:

/*
 * scandir() is originally from BSD 4.3, which had the third argument as
 * non-const. Linux and other C libraries have updated it to use a const.
 * http://unix.derkeiler.com/Mailing-Lists/FreeBSD/questions/2005-12/msg00214.html
 *
 * Here we try to guess which libc's need const, and which don't. The net
 * goal here is to try to suppress a compiler warning due to a prototype
 * mismatch of const usage. Ideally we would do this via autoconf, but
 * autoconf doesn't have a suitable builtin test and it seems overkill
 * to add one just to avoid a warning.
 */
#elif defined(freebsd) || defined(bsdi) || defined(__darwin__) || defined(openbsd)
/* no const */
return scandir(dirname, namelist, (int (*) (struct dirent *)) selector, NULL);
#else
/* use const */
return scandir(dirname, namelist, selector, NULL);

This drew my attention a couple days ago because it was picking the
wrong alternative on OS X, which was because the as-committed coding
was "defined(darwin)", which is not how that symbol is spelled.  I fixed
that, but I am now thinking that the other three checks are equally
tin-eared.  In particular, I see that buildfarm members ermine (FreeBSD)
and spoonbill (OpenBSD) are reporting warnings here, which proves that
those two platforms don't predefine "freebsd" or "openbsd" respectively.
Does anyone know if they define "__freebsd__" or "__freebsd" etc?

I'm not even too sure what "bsdi" is, but I'm suspicious of that branch
too.  A search of our code finds

contrib/pg_upgrade/file.c: 248: #elif defined(freebsd) || defined(bsdi) || defined(__darwin__) || defined(openbsd)
src/backend/utils/misc/ps_status.c: 67: #elif (defined(BSD) || defined(__bsdi__) || defined(__hurd__)) && !defined(__darwin__)
src/include/port.h: 355: #if defined(bsdi) || defined(netbsd)
src/port/fseeko.c: 20: #if defined(__bsdi__) || defined(__NetBSD__)
src/port/fseeko.c: 24: #ifdef bsdi
src/port/fseeko.c: 47: #ifdef bsdi
src/port/fseeko.c: 55: #ifdef bsdi
src/port/fseeko.c: 66: #ifdef bsdi
src/port/fseeko.c: 76: #ifdef bsdi
src/port/fseeko.c: 87: #ifdef bsdi

which leaves one with not a lot of warm fuzzies that we know how to
spell the symbol for either bsdi or netbsd.  (Oh, and shouldn't
this pg_upgrade check be looking for netbsd too?)

In the "darwin" case we aren't really making any assumptions,
because we actually define __darwin__ in port/darwin.h.

I suppose that at least some of the *BSD herd really do predefine some
of the symbols being attributed to them here, but I would like to see
something authoritative about which and what.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Pavel Stehule
2010/5/14 Tom Lane :
> Robert Haas  writes:
>> On Thu, May 13, 2010 at 9:47 PM, Joseph Adams wrote:
>>> [] retrieves a value of a JSON array/object by (one-based) index.  In
>>> other words, value[n] is equivalent to selecting the nth row of
>>> json_values(value) (provided value is of type JSON).  Examples:
>>>
>>> SELECT ('[1,2,3,4]'::JSON)[3]
>>> -- Result is '3'::JSON
>>>
>>> SELECT ('{"a": 1, "b": 2, "c": 3, "d": 4}'::JSON)[3]
>>> -- Result is '3'::JSON
>
>> I think some kind of array deference and object deference mechanism is
>> absolutely, positively 100% required.  I don't know whether the
>> particular syntax you've proposed here is best or whether we should
>> pick another syntax or just use function notation, but I think we
>> definitely need *something*.
>
> Trying to use array notation on something that isn't a SQL array type
> is guaranteed to be a mess.  I strongly recommend that you not attempt
> that.  Just define a function for it.

I agree. Maybe you can implement a cast to the hstore datatype.

Regards

Pavel Stehule

>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier


[moved to -chat]

On Fri, 14 May 2010, Kevin Grittner wrote:

I think that's exactly backwards -- we shouldn't have any traffic on 
-general for issues which could reasonably happen in another list. You 
can always configure your email to combine lists into a common folder 
upon receipt.


*Exactly* ... the thought that we should increase the volume on any one of 
the lists seems counter-productive, but, I guess, if the @postgresql.org 
mailing lists are the only ones that someone participates in, maybe they 
have enough time to keep up on *all* of the email ... ?





Marc G. Fournier                      Hub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo: yscrappy    Skype: hub.org    ICQ: 7615664    MSN: scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 10:35 PM, Joseph Adams wrote:
> By the way, I'm considering making it so JSON arrays will be treated
> like objects when it comes to -> and the json_keys function.  Thus,
> json_keys('[1,4,9,16,25]') would yield '{1,2,3,4,5}', and
> ('[1,4,9,16,25]'::JSON) -> 3 would yield the third item.  This would
> obviate the need for an array-only subscript function/operator.
>
> In general, I prefer zero-based counting, but because PostgreSQL's
> array indexing is one-based, one-based array keys would be better for
> the sake of consistency.  Note that if there was a function like this
> in the future:
>
> -- Access a JSON object like you would in JavaScript
> json_path('{"squares": [1,4,9,16,25]}', '.squares[2]')
>
> There could be confusion, as JavaScript uses zero-based indexing.

I think you should take Tom's suggestion and use functional notation
rather than operator notation.  And then I think you should use
0-based counting to match JS.  But I'm game to be outvoted if others
disagree.  Basically, I think you're right: it will be confusing to
have two different notations, and we're certainly going to want a JS
equivalent of XPath at some point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Joseph Adams
On Fri, May 14, 2010 at 11:33 AM, Bruce Momjian  wrote:
> Joseph Adams wrote:
>> == array/object conversion ==
>>
>> The json_object function converts a tuple to a JSON object.  If there
>> are duplicate column names, there will be duplicate keys in the
>> resulting JSON object.
>>
>> json_object([content [AS name] [, ...]]) returns json
>>
>> Likewise, the json_array function converts a tuple to a JSON array.
>> Column names are ignored.
>>
>> json_array([content [AS name] [, ...]]) returns json
>
> Do you see any problems with the fact that JSON arrays can use mixed
> data types, e.g.:
>
>        [ 1, 2, 'hi', false]

I suppose the json_object and json_array functions would determine
which JSON types to employ by looking at the types of arguments given
(TEXT values would become strings, INT/FLOAT/NUMERIC/etc. values would
become numbers, TRUE/FALSE would become true/false, NULLS would just
be null, and JSON values would just be inserted as themselves).  Note
that json_array('"Hello"'::TEXT) would yield '["\"Hello\""]'::JSON,
while json_array('"Hello"'::JSON) would yield '["Hello"]' .

Going the other way around, values pulled out of JSON objects and
arrays would just be of type JSON.  This (revised) function signature
says it all:

json_values(JSON) returns JSON[]

In short, I don't believe mixed data types in arrays will be a
problem.  json_to_* and *_to_json functions would be used for
individual conversions.
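For example, under the proposed signatures above (hypothetical, since none
of these functions exist yet), Bruce's mixed-type array might be produced
with:

    SELECT json_array(1, 2, 'hi'::text, false);
    -- presumably yielding '[1,2,"hi",false]'::json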

On Fri, May 14, 2010 at 1:15 PM, Robert Haas  wrote:
>> json_keys gets the keys of a JSON object as a set.
>>
>> json_keys(json) returns setof text
>
> I would tend to make this return text[] rather than SETOF text.
>
>> json_values gets the values of a JSON object or the items of a JSON
>> array as a set.
>>
>> json_values(json) returns setof json
>
> Similarly I would make this return json[].

Agreed.  For those who want sets, the unnest() function can be used.
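For instance, assuming the proposed json_values() signature above, getting
a set back would be as simple as:

    SELECT unnest(json_values('[1,4,9,16,25]'::json));
    -- one row per array element, each of type json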

>> -> retrieves an item of a JSON object by key.
>>[snip]
>> [] retrieves a value of a JSON array/object by (one-based) index.
>>[snip]
>
> I think some kind of array deference and object deference mechanism is
> absolutely, positively 100% required.  I don't know whether the
> particular syntax you've proposed here is best or whether we should
> pick another syntax or just use function notation, but I think we
> definitely need *something*.

If the dereferencing operations aren't available, one could work
around it by using json_keys/json_values.  Of course, it would be a
really clunky solution, and implementing -> will probably be easy
compared to implementing those functions.

> I also think we need a function called something like json_length()
> which returns the length of a list or the number of keys in an object.

Definitely.

By the way, I'm considering making it so JSON arrays will be treated
like objects when it comes to -> and the json_keys function.  Thus,
json_keys('[1,4,9,16,25]') would yield '{1,2,3,4,5}', and
('[1,4,9,16,25]'::JSON) -> 3 would yield the third item.  This would
obviate the need for an array-only subscript function/operator.

In general, I prefer zero-based counting, but because PostgreSQL's
array indexing is one-based, one-based array keys would be better for
the sake of consistency.  Note that if there was a function like this
in the future:

-- Access a JSON object like you would in JavaScript
json_path('{"squares": [1,4,9,16,25]}', '.squares[2]')

There could be confusion, as JavaScript uses zero-based indexing.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] underscore split to alias

2010-05-14 Thread Tom Lane
"Erik Rijkers"  writes:
> I am not sure this is a bug, but I was surprised by the following behaviour
> in HEAD and 8.4.4 (instances built today, 2010.05.14):

> Invalid (?) values like 123_456 are split before the underscore and 
> interpreted as
> 123 as "_456":

All versions of postgres will parse 123_456 as an integer (123)
immediately followed by an identifier (_456).  In the particular context
that this is all of a top-level SELECT item, the SQL spec requires that
we parse this as an integer and a column alias (with an implied AS).
We failed to do that before 8.4, but now honor the spec requirement
that AS can be omitted.  Personally I think that's one of the stupider,
more error-prone aspects of the spec's syntax, but nonetheless it's
required by the spec.
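To spell out the implied AS: on 8.4 and later the two queries below are
parsed identically, the second merely making the alias explicit.

    SELECT 123_456;
    SELECT 123 AS _456;    -- both return a column named "_456" containing 123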

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] underscore split to alias

2010-05-14 Thread Erik Rijkers
I am not sure this is a bug, but I was surprised by the following behaviour
in HEAD and 8.4.4 (instances built today, 2010.05.14):

Invalid (?) values like 123_456 are split before the underscore and interpreted 
as
123 as "_456":


$ psql -p 6591 -d testdb -c "select 123_456, current_setting('server_version')"
 _456 | current_setting
------+-----------------
  123 | 9.0beta1
(1 row)


$ psql -p 6584 -d testdb -c "select 123_456, current_setting('server_version')"
 _456 | current_setting
------+-----------------
  123 | 8.4.4
(1 row)


Older versions (8.3 thru 7.4) all give an error:


$ psql -p 6583 -d testdb -c "select 123_456, current_setting('server_version')"
ERROR:  syntax error at or near "_456"
LINE 1: select 123_456, current_setting('server_version')
                  ^

The error given by the older servers seems justified.  I assume
this is a bug and not a deliberate change?


thanks,


Erik Rijkers





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parameter oddness; was HS/SR Assert server crash

2010-05-14 Thread Bruce Momjian
Kevin Grittner wrote:
> Bruce Momjian  wrote:
>  
> > One odd thing is we have two parameters that mention hot_standby
> > --- on the master we have to do in postgresql.conf:
> > 
> > wal_level = hot_standby
> > 
> > and on the slave we do in postgresql.conf:
> > 
> > hot_standby = on
> > 
> > That is a little confusing.
>  
> Why?  I read that as saying that the master is writing sufficient
> data into the WAL for it to be usable for hot standby purposes, and
> that the slave machine is going to be used as a hot standby.
> You'd better do the former if you're going to do the latter, but
> there were good reasons not to try to infer one setting from the
> other.

My point was that the string 'hot_standby' is both a variable and a
setting.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 6:11 PM, Tom Lane  wrote:
> Peter Crabtree  writes:
>> On Fri, May 14, 2010 at 5:29 PM, Robert Haas  wrote:
>>> If we do this, I'm inclined to think that the extra argument to
>>> nextval() should be treated as overriding the base increment rather
>>> than specifying a multiplier for it.  Other than that nitpick, it
>>> sounds like a reasonable thing to allow.
>
>> After giving it some thought, that sounds better. You gain some
>> functionality that way (temporarily overriding the interval) and lose
>> none.
>
> Well, what you lose is the previous assurance that values of nextval()
> were always multiples of the increment.  I could see that breaking
> applications that are using non-unity increments.

Err, right.  But those applications presumably will also not be using
this new behavior.  There are no versions of PG that have an extra
argument to nextval but still guarantee that the values of nextval()
are multiples of the increment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 5:51 PM, Josh Berkus  wrote:
>
>>> This would be OK as long as we document it well.  We patched the
>>> shutdown the way we did specifically because Fujii thought it would be
>>> an easy fix; if it's complicated, we should revert it and document the
>>> issue for DBAs.
>>
>> I don't understand this comment.
>
> In other words, I'm saying that it's not critical that we troubleshoot
> this for 9.0.  Reverting Fujii's patch, if it's not working, is an option.

There is no patch which we could revert to fix this, by Fujii Masao or
anyone else.  The patch he proposed has not been committed.  I am
still studying the problem to try to figure out where to go with it.
We could decide to punt the whole thing for 9.1, but I'd like to
understand what the options are before we make that decision.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Tom Lane
Peter Crabtree  writes:
> On Fri, May 14, 2010 at 5:29 PM, Robert Haas  wrote:
>> If we do this, I'm inclined to think that the extra argument to
>> nextval() should be treated as overriding the base increment rather
>> than specifying a multiplier for it.  Other than that nitpick, it
>> sounds like a reasonable thing to allow.

> After giving it some thought, that sounds better. You gain some
> functionality that way (temporarily overriding the interval) and lose
> none.

Well, what you lose is the previous assurance that values of nextval()
were always multiples of the increment.  I could see that breaking
applications that are using non-unity increments.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Peter Crabtree
On Fri, May 14, 2010 at 5:29 PM, Robert Haas  wrote:
> On Fri, May 14, 2010 at 5:04 PM, hubert depesz lubaczewski wrote:
>> On Fri, May 14, 2010 at 02:07:27PM -0500, Kenneth Marshall wrote:
>>> Hi Peter,
>>>
>>> All you need to do is define your own sequence with an
>>> increment of 500. Look at:
>>>
>>> http://www.postgresql.org/docs/8.4/static/sql-createsequence.html
>>
>> This is often not enough. For example - I want standard increment of 1,
>> but right now I'm importing 1 objects, and it would be simpler for
>> me to get 1 ids. Preferably in one block.
>>
>> This is not achievable now. I know I can 'alter sequence set increment
>> by' - but this will also affect concurrent sessions. which might not be
>> a problem, but it's a side effect that I don't want.
>>
>> +1 for original proposition, would love to get it.
>
> If we do this, I'm inclined to think that the extra argument to
> nextval() should be treated as overriding the base increment rather
> than specifying a multiplier for it.  Other than that nitpick, it
> sounds like a reasonable thing to allow.
>

After giving it some thought, that sounds better. You gain some
functionality that way (temporarily overriding the interval) and lose
none.

Peter

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Peter Crabtree
On Fri, May 14, 2010 at 5:27 PM, Tom Lane  wrote:
> Peter Crabtree  writes:
>> Now, I was reminded that I could simply do this:
>
>> SELECT nextval('my_seq') FROM generate_series(1, 500);
>
>> But of course then I would have no guarantee that I would get a
>> contiguous block of ids,
>
> The existing "cache" behavior will already handle that for you,
> I believe.  I don't really see a need for new features here.

I don't see how that works for this case, because the "cache" setting
is "static", and also shared between sessions. So if I have 10 records
one time, and 100 records the next, and 587 the third time, what
should my CACHE be set to for that sequence?

And if I do ALTER SEQUENCE SET CACHE each time, I have either killed
concurrency (because I'm locking other sessions out of using that
sequence until I'm finished with it), or I have a race condition (if
someone else issues an ALTER SEQUENCE before I call nextval()). The
same problem exists with using ALTER SEQUENCE SET INCREMENT BY.
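For reference, that workaround looks roughly like this (sketch only):

    BEGIN;
    ALTER SEQUENCE my_seq INCREMENT BY 500;   -- visible to / blocking other sessions
    SELECT nextval('my_seq');                 -- reserve a block of 500 ids
    ALTER SEQUENCE my_seq INCREMENT BY 1;     -- restore the normal increment
    COMMIT;

and, as described above, it either locks other sessions out of the sequence
for the duration or leaves a window in which another session's ALTER
SEQUENCE or nextval() can interleave.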

Peter

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-14 Thread Josh Berkus

>> This would be OK as long as we document it well.  We patched the
>> shutdown the way we did specifically because Fujii thought it would be
>> an easy fix; if it's complicated, we should revert it and document the
>> issue for DBAs.
> 
> I don't understand this comment.

In other words, I'm saying that it's not critical that we troubleshoot
this for 9.0.  Reverting Fujii's patch, if it's not working, is an option.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_standby_delay considered harmful

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 1:12 PM, Josh Berkus  wrote:
> On 5/12/10 8:07 PM, Robert Haas wrote:
>> I think that would be a good thing to check (it'll confirm whether
>> this is the same bug), but I'm not convinced we should actually fix it
>> that way.  Prior to 8.4, we handled a smart shutdown during recovery
>> at the conclusion of recovery, just prior to entering normal running.
>> I'm wondering if we shouldn't revert to that behavior in both 8.4 and
>> HEAD.
>
> This would be OK as long as we document it well.  We patched the
> shutdown the way we did specifically because Fujii thought it would be
> an easy fix; if it's complicated, we should revert it and document the
> issue for DBAs.

I don't understand this comment.

> Oh, and to confirm: the same issue exists, and has always existed, with
> Warm Standby.

That's what I was thinking, but I hadn't gotten around to testing it.
Thanks for the confirmation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 5:04 PM, hubert depesz lubaczewski wrote:
> On Fri, May 14, 2010 at 02:07:27PM -0500, Kenneth Marshall wrote:
>> Hi Peter,
>>
>> All you need to do is define your own sequence with an
>> increment of 500. Look at:
>>
>> http://www.postgresql.org/docs/8.4/static/sql-createsequence.html
>
> This is often not enough. For example - I want standard increment of 1,
> but right now I'm importing 1 objects, and it would be simpler for
> me to get 1 ids. Preferably in one block.
>
> This is not achievable now. I know I can 'alter sequence set increment
> by' - but this will also affect concurrent sessions. which might not be
> a problem, but it's a side effect that I don't want.
>
> +1 for original proposition, would love to get it.

If we do this, I'm inclined to think that the extra argument to
nextval() should be treated as overriding the base increment rather
than specifying a multiplier for it.  Other than that nitpick, it
sounds like a reasonable thing to allow.
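For illustration, the call being proposed would presumably look something
like this (hypothetical syntax; no released version accepts a second
argument to nextval()):

    SELECT nextval('my_seq', 500);   -- hypothetical: advance the sequence by 500 in one call,
                                     -- reserving a contiguous block of ids for this session

while plain nextval('my_seq') calls elsewhere would keep using the
sequence's normal increment.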

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Tom Lane
Peter Crabtree  writes:
> Now, I was reminded that I could simply do this:

> SELECT nextval('my_seq') FROM generate_series(1, 500);

> But of course then I would have no guarantee that I would get a
> contiguous block of ids,

The existing "cache" behavior will already handle that for you,
I believe.  I don't really see a need for new features here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recovery consistent != hot standby

2010-05-14 Thread Tom Lane
Robert Haas  writes:
> PM_RECOVERY_CONSISTENT -> PM_HOT_STANDBY
> PMSIGNAL_RECOVERY_CONSISTENT -> PMSIGNAL_BEGIN_HOT_STANDBY

+1.  From the point of view of the postmaster, whether the state
transition happens immediately upon reaching consistency, or at a
later time, or perhaps even earlier (if we could make that work)
is not relevant.  What's relevant is that it's allowed to let in
hot-standby backends.  So the current naming overspecifies the
meaning of the state and the transition event.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parameter oddness; was HS/SR Assert server crash

2010-05-14 Thread Kevin Grittner
Bruce Momjian  wrote:
 
> One odd thing is we have two parameters that mention hot_standby
> --- on the master we have to do in postgresql.conf:
> 
>   wal_level = hot_standby
> 
> and on the slave we do in postgresql.conf:
> 
>   hot_standby = on
> 
> That is a little confusing.
 
Why?  I read that as saying that the master is writing sufficient
data into the WAL for it to be usable for hot standby purposes, and
that the slave machine is going to be used as a hot standby.
You'd better do the former if you're going to do the latter, but
there were good reasons not to try to infer one setting from the
other.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parameter oddness; was HS/SR Assert server crash

2010-05-14 Thread Bruce Momjian
bruce wrote:
> > and my slave recovery.conf was:
> > 
> > restore_command = 'cp /u/pg/archive/%f %p'  # e.g. 'cp /mnt/server/archivedir/%f %p'
> > standby_mode = 'on'
> > primary_conninfo = 'host=localhost port=5432'   # e.g. 'host=localhost port=5432'
> > 
> > Let me know what additional information I can supply.
> 
> I saw Simon's commit fixing this bug.  Another good reason we didn't
> bundle 9.0 beta2 yesterday.

I can now confirm that Simon's patch fixes the bug.  (Unfortunately I
had a web cast yesterday about this feature and the bug prevented me
from completing the demonstration.)

FYI, my presentation shows all the steps necessary to setup HS/SR:

http://momjian.us/main/presentations/technical.html#hot_streaming

One odd thing is we have two parameters that mention hot_standby --- on
the master we have to do in postgresql.conf:

wal_level = hot_standby

and on the slave we do in postgresql.conf:

hot_standby = on

That is a little confusing.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Kevin Grittner
Greg Stark  wrote: 
 
> If they're interested in performance topics and they're not
> subscribed to -general then they're missing *most* of what they're
> interested in which doesn't take place on -performance.
 
Well, I for one can't currently suck the end of the fire hose which
is -general, and would be less able to do so should other lists be
folded into it.  So I lurk on -bugs, -performance, -admin, and
others -- not to glean information so much as to attempt to respond
in areas where I feel I might be able to be helpful and, with a bit
of luck, take some of the burden off of those who do the most to
help people on these lists.  Combining lists will only make it
harder for me to attempt to assist in this way.
 
> And most of what's on -performance ends up being non-performance
> related questions anyways.
 
I don't believe you.  Scanning this:
 
http://archives.postgresql.org/pgsql-performance/2010-05/index.php
 
I see a few non-performance questions, but they're clearly a small
fraction of the traffic.
 
> I think what I'm getting at is that we shouldn't have any lists
> for traffic which could reasonably happen on -general.
 
I think that's exactly backwards -- we shouldn't have any traffic on
-general for issues which could reasonably happen in another list. 
You can always configure your email to combine lists into a common
folder upon receipt.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread hubert depesz lubaczewski
On Fri, May 14, 2010 at 02:07:27PM -0500, Kenneth Marshall wrote:
> Hi Peter,
> 
> All you need to do is define your own sequence with an
> increment of 500. Look at:
> 
> http://www.postgresql.org/docs/8.4/static/sql-createsequence.html

This is often not enough. For example - I want standard increment of 1,
but right now I'm importing 1 objects, and it would be simpler for
me to get 1 ids. Preferably in one block.

This is not achievable now. I know I can 'alter sequence set increment
by' - but this will also affect concurrent sessions. which might not be
a problem, but it's a side effect that I don't want.

+1 for original proposition, would love to get it.

depesz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 5:39 PM, Tom Lane  wrote:
> Florian Pflug  writes:
>> All in all, I believe that SHARE and UPDATE row-level locks should be
>> changed to cause concurrent UPDATEs to fail with a serialization
>> error.
>
> I don't see an argument for doing that for FOR SHARE locks, and it
> already happens for FOR UPDATE (at least if the row actually gets
> updated).  AFAICS this proposal mainly breaks things, in pursuit of
> an unnecessary and probably-impossible-anyway goal of making FK locking
> work with only user-level snapshots.

After giving this considerable thought and testing the behavior at
some length, I think the OP has it right.  One thing I sometimes need
to do is denormalize a copy of a field, e.g.

CREATE TABLE parent (id serial, mode integer not null, primary key (id));
CREATE TABLE child (id serial, parent_id integer not null references
parent (id), parent_mode integer not null);

The way I have typically implemented this in the past is:

1. Add a trigger to the parent table so that, whenever the mode column
gets updated, we do an update on the parent_mode of all children.
2. Add a trigger to the child table so that, when a new child is
inserted, it initializes parent_mode from its parent.  I do SELECT
with FOR UPDATE on the parent so the parent can't change under me; though
FOR SHARE ought to be enough also, since we're just trying to lock out
concurrent updates.  (A rough sketch of both triggers follows below.)
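For concreteness, a rough sketch of those two triggers (names and details
are illustrative only, not taken from any actual patch or application):

    CREATE FUNCTION propagate_parent_mode() RETURNS trigger AS $$
    BEGIN
        -- 1. keep the denormalized copy in sync when the parent's mode changes
        IF NEW.mode IS DISTINCT FROM OLD.mode THEN
            UPDATE child SET parent_mode = NEW.mode WHERE parent_id = NEW.id;
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER parent_mode_update AFTER UPDATE ON parent
        FOR EACH ROW EXECUTE PROCEDURE propagate_parent_mode();

    CREATE FUNCTION init_parent_mode() RETURNS trigger AS $$
    BEGIN
        -- 2. copy the parent's mode at insert time, locking the parent row
        SELECT mode INTO NEW.parent_mode
            FROM parent WHERE id = NEW.parent_id FOR UPDATE;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER child_init_parent_mode BEFORE INSERT ON child
        FOR EACH ROW EXECUTE PROCEDURE init_parent_mode();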

Suppose T1 updates the parent's mode while T2 adds a new child; then
both commit.  In read committed mode, this seems to work OK regardless
of the order of T1 and T2.  If T1 grabs the lock first, then T2 sees
the updated version of the row after T1 commits.  If T2 grabs the lock
first, then the update on the parent blocks until the child commits.
Subsequently, when the trigger fires, it apparently uses an up-to-date
snapshot, so the new child is updated also.  In serializable mode,
things are not so good.  If T1 grabs the lock first, the child waits
to see whether it commits or aborts.  On commit, it complains that it
can't serialize and aborts, which is reasonable - transaction aborts
are the price you pay for serializability.  If T2 grabs the lock
first, the update on the parent blocks as before, but now the update
is done with the old snapshot and ignores the new child, so the new
child now has a value for parent_mode that doesn't match the parent's
actual mode.  That is, you get the wrong answer due to a serialization
anomaly that didn't exist at the read committed level.

Increasing the transaction isolation level is supposed to *eliminate*
serialization anomalies, not create them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 11:39 PM, Greg Smith  wrote:
> The only real argument to keep some more targeted lists is for the benefit
> of the people who subscribe to them, not we the faithful, so that they can
> have something that isn't a firehose of messages to sort through.  Is it
> helpful to novices that they can subscribe to a list when they won't be
> overwhelmed by traffic, and can ask questions without being too concerned
> about being harassed for being newbies?  Probably.  Are there enough people
> interested in performance topics alone to justify a list targeted just to
> them?  Certainly; I was only on that list for a long time before joining any
> of the others.  Are the marketing oriented people subscribed only to
> advocacy and maybe announce happy to avoid the rest of the lists?  You bet.
>
> Folding, say, performance or admin into general, one idea that pops up
> sometimes, doesn't help those people--now they can only get the
> firehose--and it doesn't help me, either.  If you can keep up with general,
> whether or not the other lists are also included in that or not doesn't
> really matter.  Ditto for hackers and the things you might try and split out
> of it.  It's just going to end up with more cross posting, and the only
> thing I hate more than a mailbox full of messages is discovering a chunk of
> them are dupes because of that.

+1.

> I might like to see, for example, a user mailing list devoted strictly to
> replication/clustering work with PostgreSQL.  That's another topic I think
> that people are going to want to ask about more in the near future without
> getting overwhelmed.  But, again, that's for their benefit.  I'll have to
> subscribe to that, too, and in reality it will probably increase the amount
> of messages I read, because people will ask stuff there that's already been
> covered on other lists, and vice-versa.

Yep.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Greg Stark wrote:


On Fri, May 14, 2010 at 4:39 AM, Greg Smith  wrote:

Is it
helpful to novices that they can subscribe to a list when they won't be
overwhelmed by traffic, and can ask questions without being too concerned
about being harassed for being newbies?  Probably.


Only if they aren't hoping to get answers... What percentage of the
hackers and experts who trawl -general for questions to answer are
subscribed to -novices?

-general isn't subscriber-only posts is it?


All our lists are, yes ... *but* ... the 'subscriber list' is cross list, 
in that if you are subscribed to one, you can post to all ...



If they're interested in performance topics and they're not subscribed
to -general then they're missing *most* of what they're interested in
which doesn't take place on -performance. And most of what's on
-performance ends up being non-performance related questions anyways.


And IMHO, that is as much a fault of the 'old timers' on the lists as the 
newbies ... if nobody redirects / loosely enforces the mandates of the 
various lists, newbies aren't going to learn to post to more 
appropriate ones ...


Personally, my experience with large lists is that if there is a smaller, 
more focused list, I'll post there first, to avoid being lost in the noise 
... and, I will re-post to a more general list *if* and only if I'm unable 
to get an answer from where I posted my original ...



Marc G. Fournier                      Hub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo: yscrappy    Skype: hub.org    ICQ: 7615664    MSN: scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Greg Stark
On Fri, May 14, 2010 at 4:39 AM, Greg Smith  wrote:
> Is it
> helpful to novices that they can subscribe to a list when they won't be
> overwhelmed by traffic, and can ask questions without being too concerned
> about being harassed for being newbies?  Probably.

Only if they aren't hoping to get answers... What percentage of the
hackers and experts who trawl -general for questions to answer are
subscribed to -novices?

-general isn't subscriber-only posts is it?

> Are there enough people
> interesting in performance topics alone to justify a list targeted just to
> them?  Certainly; I was only on that list for a long time before joining any
> of the others.

If they're interested in performance topics and they're not subscribed
to -general then they're missing *most* of what they're interested in
which doesn't take place on -performance. And most of what's on
-performance ends up being non-performance related questions anyways.

I think what I'm getting at is that we shouldn't have any lists for
traffic which could reasonably happen on -general. If it's a usage
question about postgres then it belongs in the same place regardless
of what feature or aspect of the usage it is -- otherwise it'll always
be some random subset of the relevant messages.

This won't actually cut down on list traffic for me and Simon but it
would help get people answers since they'll be posting to the same
place as everyone else.

> Are the marketing oriented people subscribed only to
> advocacy and maybe announce happy to avoid the rest of the lists?  You bet.

Well yeah. This is an actual discernible distinction. As evidence
I'll note that there is no advocacy traffic on -general or other
mailing lists.



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Greg Smith

Tom Lane wrote:

I can see the need for small tightly-focused special lists.


How about a list devoted to discussions about reorganizing the lists?  
It would get plenty of traffic, and then I could not subscribe to that 
and have that many fewer messages to read.


There is only one viable solution to reduce list traffic:  ban forever 
everyone who top-posts or doesn't trim what they quote.  Maybe some 
other old-school Usenet rules too--can we ban those with incorrectly 
formatted signatures and finally add proper bozo tagging?  Praise Kibo.


Seriously though, I file admin/general/performance into one user 
oriented folder, hackers/committers into a second, and all the non-code 
ones (advocacy, www, docs) into a third.  I don't think there's any way 
to restructure those lists that will improve life for people who try to 
keep up with most of them.  I was traveling yesterday and busy today, 
and now I'm 350 messages behind.  No amount of rejiggering the lists 
will change the fact that there's just that much activity happening 
around PostgreSQL.  You can move the messages around, but the same 
number are going to show up, and people who want to keep up with 
everything will have to cope with that.  The best you can do is get 
better support in your mail program for wiping out whole threads at 
once, once you've realized you're not interested in them.


The only real argument to keep some more targeted lists is for the 
benefit of the people who subscribe to them, not we the faithful, so 
that they can have something that isn't a firehose of messages to sort 
through.  Is it helpful to novices that they can subscribe to a list 
when they won't be overwhelmed by traffic, and can ask questions without 
being too concerned about being harassed for being newbies?  Probably.  
Are there enough people interested in performance topics alone to 
justify a list targeted just to them?  Certainly; I was only on that 
list for a long time before joining any of the others.  Are the 
marketing oriented people subscribed only to advocacy and maybe announce 
happy to avoid the rest of the lists?  You bet.


Folding, say, performance or admin into general, one idea that pops up 
sometimes, doesn't help those people--now they can only get the 
firehose--and it doesn't help me, either.  If you can keep up with 
general, whether or not the other lists are also included in that or not 
doesn't really matter.  Ditto for hackers and the things you might try 
and split out of it.  It's just going to end up with more cross posting, 
and the only thing I hate more than a mailbox full of messages is 
discovering a chunk of them are dupes because of that.


I might like to see, for example, a user mailing list devoted strictly 
to replication/clustering work with PostgreSQL.  That's another topic I 
think that people are going to want to ask about more in the near future 
without getting overwhelmed.  But, again, that's for their benefit.  
I'll have to subscribe to that, too, and in reality it will probably 
increase the amount of messages I read, because people will ask stuff 
there that's already been covered on other lists, and vice-versa.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add SIGCHLD catch to psql

2010-05-14 Thread Bruce Momjian
Stephen Frost wrote:
-- Start of PGP signed section.
> Greetings,
> 
>   Toying around with FETCH_COUNT today, I discovered that it didn't do
>   the #1 thing I really wanted to use it for- query large tables without
>   having to worry about LIMIT to see the first couple hundred records.
>   The reason is simple- psql ignores $PAGER exiting, which means that it
>   will happily continue pulling down the entire large table long after
>   you've stopped caring, which means you still have to wait forever.
> 
>   The attached, admittedly quick hack, fixes this by having psql catch
>   SIGCHLD's using handle_sigint.  I've tested this and it doesn't
>   appear to obviously break other cases where we have children (\!, for
>   example), since we're not going to be running a database query when
>   we're doing those, and if we are, and the child dies, we probably want
>   to *stop* anyway, similar to the $PAGER issue.
> 
>   Another approach that I considered was fixing various things to deal
>   cleanly with write's failing to $PAGER (I presume the writes *were*
>   failing, since less was in a defunct state, but I didn't actually
>   test).  This solution was simpler, faster to code and check, and a lot
>   less invasive (or so it seemed to me at the time).
> 
>   Anyway, this makes FETCH_COUNT a lot more useful, and, in my view, the
>   current behaviour of completely ignoring $PAGER exiting is a bug.

Please add this to the next commit-fest:

https://commitfest.postgresql.org/action/commitfest_view/inprogress

Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Bruce Momjian wrote:

FYI, I usually email new people privately that cross-posting a question 
can cause the question to be ignored.  They usually respond positively 
and avoid it in the future.


We all have our own methods ... for instance, I just CC'd this to -chat 
with a -BCC to -hackers so that follow ups will go over there (since Josh 
is right, this thread doesn't belong on -hackers) ...



Marc G. Fournier                      Hub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo: yscrappy    Skype: hub.org    ICQ: 7615664    MSN: scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Bruce Momjian
Kevin Grittner wrote:
> Tom Lane  wrote:
>  
> > I can't imagine that there's not going to need to be a "catchall"
> > list for problems that don't fit into any of the subcategories.
> > 
> > More generally, we already have most of the lists that you
> > suggest, and we already know that people frequently don't find the
> > most appropriate list for postings.  I don't think getting rid of
> > -general would help that in the least.  The way to cut down on
> > misposted traffic is to make the set of categories smaller and
> > simpler, not to redouble our efforts to persuade people to use the
> > same or even more categories.
>  
> Well, redoubling our current efforts to direct people to more
> specific lists would accomplish nothing, since doubling zero leaves
> you with zero.  The description of -general includes:
>  
> | General discussion area for users. Apart from compile, acceptance
> | test, and bug problems, most new users will probably only be
> | interested in this mailing list
>  
> Given that, the fact that -admin, -novice, -sql, and -performance
> collectively get as many posts as -general suggests that people are,
> in fact, making some effort to find a list which seems a good fit. 
> Perhaps if the description of -general was changed to suggest it
> *was* a catch-all for posts which don't fit the other lists, things
> would improve.

FYI, I usually email new people privately that cross-posting a question
can cause the question to be ignored.  They usually respond positively
and avoid it in the future.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication patch built on SR

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 9:33 AM, Boszormenyi Zoltan  wrote:
> If  min_sync_replication_clients == 0, then the replication is async.
> If  min_sync_replication_clients == max_wal_senders then the
> replication is fully synchronous.
> If 0 < min_sync_replication_clients < max_wal_senders then
> the replication is partially synchronous, i.e. the master can wait
> only for say, 50% of the clients to report back before it's considered
> synchronous and the relevant transactions get released from the wait.

That's an interesting design and in some ways pretty elegant, but it
rules out some things that people might easily want to do - for
example, synchronous replication to the other server in the same data
center that acts as a backup for the master; and asynchronous
replication to a reporting server located off-site.

One of the things that I think we will probably need/want to change
eventually is the fact that the master has no real knowledge of who
the replication slaves are.  That might be something we want to change
in order to be able to support more configurability.  Inventing syntax
out of whole cloth and leaving semantics to the imagination of the
reader:

CREATE REPLICATION SLAVE reporting_server (mode asynchronous, xid_feedback on);
CREATE REPLICATION SLAVE failover_server (mode synchronous,
xid_feedback off, break_synchrep_timeout 30);

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Selena Deckelmann
On Fri, May 14, 2010 at 9:51 AM, Josh Berkus  wrote:

> Second, regarding advocacy: no, absolutely not.  -advocacy is a working list
> and not a virtual water cooler.

+1. I would find it very difficult to manage having -advocacy thrown
into -general.

If folks think that information isn't getting wide enough
distribution, that's one thing. But it is an important working group,
even if there's not a ton of traffic all the time on it.

-selena

-- 
http://chesnok.com/daily - me

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Josh Berkus wrote:

First off, this is absolutely the wrong list to be discussing management of 
PostgreSQL lists.  That belongs on pgsql-www.


Actually, this is as good a list as any ... -www is for WWW-related 
issues, not mailing lists ... it would be as inappropriate there as it 
would be on sysadmins, which also doesn't cover mailing lists ...


Second, regarding advocacy: no, absolutely not.  -advocacy is a working list 
and not a virtual water cooler.


BTW, and even I totally forgot about it, but we do have a virtual water 
cooler already: pgsql-chat ... 224 subscribers currently, just nobody uses 
it ...


In fact, I just removed / changed to BCC -hackers so that all further 
discussions on this part of the thread will be on -chat ...



Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Tom Lane
Robert Haas  writes:
> On Thu, May 13, 2010 at 9:47 PM, Joseph Adams
>  wrote:
>> [] retrieves a value of a JSON array/object by (one-based) index.  In
>> other words, value[n] is equivalent to selecting the nth row of
>> json_values(value) (provided value is of type JSON).  Examples:
>> 
>> SELECT ('[1,2,3,4]'::JSON)[3]
>> -- Result is '3'::JSON
>> 
>> SELECT ('{"a": 1, "b": 2, "c": 3, "d": 4}'::JSON)[3]
>> -- Result is '3'::JSON

> I think some kind of array deference and object deference mechanism is
> absolutely, positively 100% required.  I don't know whether the
> particular syntax you've proposed here is best or whether we should
> pick another syntax or just use function notation, but I think we
> definitely need *something*.

Trying to use array notation on something that isn't a SQL array type
is guaranteed to be a mess.  I strongly recommend that you not attempt
that.  Just define a function for it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Kenneth Marshall
Hi Peter,

All you need to do is define your own sequence with an
increment of 500. Look at:

http://www.postgresql.org/docs/8.4/static/sql-createsequence.html
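
For instance, a minimal sketch of that approach (the sequence name is just
an example):

    CREATE SEQUENCE my_seq INCREMENT BY 500;

    -- each call now hands back the start of a contiguous block of 500 ids,
    -- i.e. the caller may use block_start .. block_start + 499
    SELECT nextval('my_seq') AS block_start;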

Regards,
Ken

On Fri, May 14, 2010 at 02:56:18PM -0400, Peter Crabtree wrote:
> Recently, in preparation for migrating an application to postgres, I
> got to this part of the manual (which is *excellent* so far, by the
> way):
> 
> http://www.postgresql.org/docs/8.4/interactive/functions-sequence.html
> 
> A quick check with the folks on #postgresql confirmed my
> understanding, which was that the locking semantics of setval() and
> nextval() make this unsafe:
> 
> SELECT setval('my_seq', nextval('my_seq') + 500);
> 
> Now, I was reminded that I could simply do this:
> 
> SELECT nextval('my_seq') FROM generate_series(1, 500);
> 
> But of course then I would have no guarantee that I would get a
> contiguous block of ids, which means if I'm using this to do a mass
> insert of records which refer to each others' ids (example: storing a
> directed, linear graph), I either have to do a correlated update on
> the client side, after transferring the keys (consider the cost of
> doing this for a few million records - 4 MB in keys per million
> records, for, in extreme cases, 12 MB of data to be inserted -- 33%
> overhead in the worst case, presuming symmetric bandwidth), or I have
> to insert into a temporary table, then have the db backend do the
> update, then insert from there to the real table. Both are imperfect
> options in terms of performance and complexity.
> 
> Thus, before I start work on it, I propose an extension to the current
> nextval():
> 
> SELECT nextval('my_seq', 500);
> 
> This would increment the my_seq sequence by its interval * 500, and
> return the first valid key. This both makes client code that needs a
> bunch of PKs simpler to implement, and saves in performance, since the
> client can just replace all its PKs (presuming they're currently a
> contiguous block from 1 to n) with my_starting_pk + current_pk, so
> this:
> 
>  pk | next_node
> +---
>   0 | 1
>   1 | 2
>   2 | 0
> 
> can be easily updated like this:
> 
> SELECT nextval('my_seq', (SELECT count(*) FROM my_table));
> UPDATE my_table SET pk = currval('my_seq') + pk, next_node =
> currval('my_seq') + next_node;
> 
> to something like this:
> 
>pk   | next_node
> +--
>  521650 |521651
>  521651 |521652
>  521652 |521650
> 
> This is a net gain of performance and ease of implementation in many
> cases where a large number of ids from a sequence are needed -- with a
> small added benefit of the keys being guaranteed to be contiguous.
> 
> I don't see any technical problems with this; postgres already can
> "pre-allocate" more than one key, but the number is semi-static (the
> CACHE parameter to CREATE SEQUENCE). This might break existing user
> code if they've defined a nextval(regclass, integer), but I don't see
> any way to avoid that.
> 
> Finally, I've checked sequence.c -- this looks pretty straightforward
> to implement, but I figured checking with this list was wise before
> starting work. Apologies if I've been overly wordy.
> 
> Peter
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
> 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fillfactor gets set to zero for toast tables

2010-05-14 Thread Tom Lane
Alvaro Herrera  writes:
> Excerpts from Tom Lane's message of vie may 14 14:19:30 -0400 2010:
>> What seems more rational is to provide toast.fillfactor but give it
>> min/max/default = 100.

> Adding an entry to intRelOpts should be enough to fix this bug, correct?

I think so, but haven't tested.  The docs would need some correction
perhaps.
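
For reference, a rough sketch of the kind of entry being discussed, in
src/backend/access/common/reloptions.c (untested, and assuming the
relopt_int field order of default/min/max):

    /* hypothetical toast.fillfactor entry, pinned to 100 */
    {
        {
            "fillfactor",
            "Packs TOAST table pages only to this percentage",
            RELOPT_KIND_TOAST
        },
        100, 100, 100
    },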

BTW, I notice that the code allows people to fool with
autovacuum_analyze_threshold and related parameters for toast tables,
which seems rather pointless since we never analyze toast tables.
So the fillfactor isn't really the only instance of the issue.

Maybe a better solution is to have some kind of notion of a default-only
entry, which is sufficient to insert the default into the struct but
isn't accepted as a user-settable item.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal

2010-05-14 Thread Peter Crabtree
Recently, in preparation for migrating an application to postgres, I
got to this part of the manual (which is *excellent* so far, by the
way):

http://www.postgresql.org/docs/8.4/interactive/functions-sequence.html

A quick check with the folks on #postgresql confirmed my
understanding, which was that the locking semantics of setval() and
nextval() make this unsafe:

SELECT setval('my_seq', nextval('my_seq') + 500);

Now, I was reminded that I could simply do this:

SELECT nextval('my_seq') FROM generate_series(1, 500);

But of course then I would have no guarantee that I would get a
contiguous block of ids, which means if I'm using this to do a mass
insert of records which refer to each others' ids (example: storing a
directed, linear graph), I either have to do a correlated update on
the client side, after transferring the keys (consider the cost of
doing this for a few million records - 4 MB in keys per million
records, for, in extreme cases, 12 MB of data to be inserted -- 33%
overhead in the worst case, presuming symmetric bandwidth), or I have
to insert into a temporary table, then have the db backend do the
update, then insert from there to the real table. Both are imperfect
options in terms of performance and complexity.

Thus, before I start work on it, I propose an extension to the current
nextval():

SELECT nextval('my_seq', 500);

This would increment the my_seq sequence by its interval * 500, and
return the first valid key. This both makes client code that needs a
bunch of PKs simpler to implement, and saves in performance, since the
client can just replace all its PKs (presuming they're currently a
contiguous block from 1 to n) with my_starting_pk + current_pk, so
this:

 pk | next_node
+---
  0 | 1
  1 | 2
  2 | 0

can be easily updated like this:

SELECT nextval('my_seq', (SELECT count(*) FROM my_table));
UPDATE my_table SET pk = currval('my_seq') + pk, next_node =
currval('my_seq') + next_node;

to something like this:

   pk   | next_node
+--
 521650 |521651
 521651 |521652
 521652 |521650

This is a net gain of performance and ease of implementation in many
cases where a large number of ids from a sequence are needed -- with a
small added benefit of the keys being guaranteed to be contiguous.

I don't see any technical problems with this; postgres already can
"pre-allocate" more than one key, but the number is semi-static (the
CACHE parameter to CREATE SEQUENCE). This might break existing user
code if they've defined a nextval(regclass, integer), but I don't see
any way to avoid that.

Finally, I've checked sequence.c -- this looks pretty straightforward
to implement, but I figured checking with this list was wise before
starting work. Apologies if I've been overly wordy.

Peter

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fillfactor gets set to zero for toast tables

2010-05-14 Thread Alvaro Herrera
Excerpts from Tom Lane's message of vie may 14 14:19:30 -0400 2010:

> The problem is that if any reloption is set for the toast table,
> we build a StdRdOptions struct in which fillfactor is zero, and
> then all the code that actually uses fillfactor honors that.
> And the reason fillfactor gets left as zero is that there is no
> entry for it in the tables for toast-table reloptions.  Clearly,
> this wasn't thought through carefully enough.

Ouch :-( We messed with this stuff after the initial commit because I
didn't get it right the first time either.  I'm surprised that we didn't
find this problem right away.

> I'm inclined to think that we should remove the notion of there
> being separate sets of rdoptions for heaps and toast tables ---
> any case in which someone tries to omit a StdRdOption for toast
> tables is going to fail the same way, unless we are fortunate
> enough that zero is a desirable default.

It doesn't seem like having the distinction has bought us anything.
However, if we do that, we can't add a separate entry to intRelOpts to
fix min/max/default to 100, so I think we should document that options
for toast must match those for heaps.

> What seems more rational is to provide toast.fillfactor but give it
> min/max/default = 100.

Adding an entry to intRelOpts should be enough to fix this bug, correct?
-- 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generalized Inverted Generalized Search Tree

2010-05-14 Thread Robert Haas
On Fri, May 14, 2010 at 12:03 AM, Takahiro Itagaki
 wrote:
> We can index multiple scalar values per row with GIN access method,
> and also can index single vector value per row with GiST AM.
>
> Is it worth having a new AM to index multiple vector values per row?
> It will be an AM for the missing feature in below:
>
>                 | scalar | vector |
> ----------------+--------+--------+
>  single per row | btree  | gist   |
>  multi per row  | gin    | *HERE* |
>
> We can call the new AM "gigist". Or, there might be another idea
> to support expression indexes for SRF functions, like
>  =# CREATE TABLE tbl (c circle[]);
>  =# CREATE INDEX ON tbl USING gist (unnest(c));
>
> Comments and ideas welcome.

I'm not sure I agree with your characterization of the purpose of
GIST.  I think of GIST as a mechanism to support index lookups on
types that are not totally ordered.  Still, I guess I understand the
point - I think what you are trying to do is optimize queries of the
form, e.g.:

SELECT * FROM tbl WHERE [some circle in c overlaps a given box]

I don't personally want to do that :-) but I can easily imagine that
someone else might.  As for adding a new AM, we could certainly do it
that way, but we should at least consider the possibility of trying to
modify the existing GIST code to handle this case in addition to the
things it already does.  If that's not possible, then we should think
about how to minimize code duplication.  I fear that if we're not
careful we might end up having to fix a lot of bugs in two places;
also, any code that is specific to gigist will certainly get less
real-world use than gist itself, so sharing code will help keep our
bug counts down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 1:28 AM, Fujii Masao  wrote:
> On Thu, May 13, 2010 at 12:10 PM, Robert Haas  wrote:
>> Hmm, it seems this is my night to rediscover the wisdom of your
>> previous proposals.  I think that state would only be appropriate when
>> we shutdown after reaching consistency, not any shutdown during
>> recovery.  Do you agree?
>
> No. When shutdown happens before reaching consistency, the database might
> be inconsistent, but which doesn't mean that some data might be corrupted.
> We can get consistent (not corrupted) database by applying the WAL records
> to inconsistent one.
>
>> HINT:  If this has occurred more than once some data might be
>> corrupted and you might need to choose an earlier recovery target.
>
> I think that the hint message indicates the data corruption which prevents
> recovery from completing, rather than the inconsistency of the database.

Hmm, OK, I think that makes sense.  Would you care to propose a patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] fillfactor gets set to zero for toast tables

2010-05-14 Thread Tom Lane
I've been able to reproduce the problem described here:
http://archives.postgresql.org/pgsql-bugs/2010-05/msg00100.php
Do this:

create table foo(f1 text);
alter table foo set (toast.autovacuum_enabled = false);
insert into foo values(repeat('xyzzy',10));
vacuum verbose foo;

Notice that the vacuum output shows there's only one toast tuple per
toast table page.

The problem is that if any reloption is set for the toast table,
we build a StdRdOptions struct in which fillfactor is zero, and
then all the code that actually uses fillfactor honors that.
And the reason fillfactor gets left as zero is that there is no
entry for it in the tables for toast-table reloptions.  Clearly,
this wasn't thought through carefully enough.

I'm inclined to think that we should remove the notion of there
being separate sets of rdoptions for heaps and toast tables ---
any case in which someone tries to omit a StdRdOption for toast
tables is going to fail the same way, unless we are fortunate
enough that zero is a desirable default.  What seems more rational
is to provide toast.fillfactor but give it min/max/default = 100.
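
As a sketch of what that would mean for users (assuming the pinned
min/max/default of 100), explicitly setting the option would be either a
no-op or an error:

    alter table foo set (toast.fillfactor = 100);   -- accepted, no effect
    alter table foo set (toast.fillfactor = 50);    -- rejected as out of range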

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] recovery consistent != hot standby

2010-05-14 Thread Robert Haas
While looking through postmaster.c and xlog.c I discovered that we're
being a little bit loose about our use of terminology.  Maybe this was
right when committed (I think, at that point, Hot Standby was always
on) but it's not right any more.  It appears that we only enter the
PM_RECOVERY_CONSISTENT when Hot Standby is enabled; otherwise, we
remain in PM_RECOVERY even after reaching consistency.  I think, then,
that the state, and the signal which triggers it are misnamed.  For
the avoidance of confusion, I'd like to propose that we rename as
follows:

PM_RECOVERY_CONSISTENT -> PM_HOT_STANDBY
PMSIGNAL_RECOVERY_CONSISTENT -> PMSIGNAL_BEGIN_HOT_STANDBY

Objections?  Better ideas?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Mike Rylander
On Fri, May 14, 2010 at 1:15 PM, Robert Haas  wrote:
> On Thu, May 13, 2010 at 9:47 PM, Joseph Adams
>  wrote:

[snip]

>> == array/object conversion ==
>>
>> The json_object function converts a tuple to a JSON object.  If there
>> are duplicate column names, there will be duplicate keys in the
>> resulting JSON object.
>>
>> json_object([content [AS name] [, ...]]) returns json
>
> Seems good.
>
>> Likewise, the json_array function converts a tuple to a JSON array.
>> Column names are ignored.
>>
>> json_array([content [AS name] [, ...]]) returns json
>
> I think this is pointless and should be omitted.
>

(I'm going to avoid the use of the term "object" here to reduce confusion.)

I disagree with the assertion that it's pointless, and I have a
specific use-case in mind for this function.  I have a system that
uses JSON arrays on the wire to encapsulate data, and both ends
understand the positional semantics of the elements.  Using JSON
arrays instead of JSON objects reduces the transfer size by 40-80%,
depending on how "full" the rows (or class instances) are and the data
types of the elements, simply by removing the redundant object keys.
This function would be extremely useful to me when creating or
persisting raw class instances of these sorts.
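
For a made-up row, the two encodings look like this (both ends must agree
that position 1 is id, 2 is name, 3 is active):

    {"id": 42, "name": "widget", "active": true}    <- object form
    [42, "widget", true]                            <- array form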

-- 
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@esilibrary.com
 | web:  http://www.esilibrary.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Kevin Grittner wrote:


Well, redoubling our current efforts to direct people to more
specific lists would accomplish nothing, since doubling zero leaves
you with zero.  The description of -general includes:


Agreed ...

Given that, the fact that -admin, -novice, -sql, and -performance 
collectively get as many posts as -general suggests that people are, in 
fact, making some effort to find a list which seems a good fit. Perhaps 
if the description of -general was changed to suggest it *was* a 
catch-all for posts which don't fit the other lists, things would 
improve.


Can you offer improved / stronger wording on this ... ?

Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 9:47 PM, Joseph Adams
 wrote:
> The following function returns the type of any JSON value.
>
> json_type as enum ('null', 'string', 'number', 'boolean', 'object', 'array')
> json_type(json) returns json_type

Seems reasonable.

> Would it be a bad idea to give an enum and a function the same name
> (which appears to be allowed by PostgreSQL) ?  If so, json_type(json)
> could be json_typeof(json) or something instead.

No, I think that's a fine idea.

> I thought about having predicates like IS STRING and IS NUMBER,
> similar to the IS DOCUMENT predicate used for XML.  However, a major
> problem with that approach is that it could lead to confusion
> involving IS NULL.  By my understanding, the JSON datatype will just
> be a specialization of TEXT (it just validates the input).  Like TEXT,
> a JSON string can be 'null'.  'null'::JSON is not NULL.  Bear in mind
> that json_to_*('null') is NULL, though.

Even aside from the possible semantic confusion, I don't think that we
should make any changes to our core grammar (gram.y) to support JSON.
It's not really necessary and it's better not to add extra stuff to
the grammar unless we really need it.

> I also thought about having a series of json_is_* functions.  I don't
> think it's a bad idea, but I think json_type is a better solution.

I agree.

> == text/number/boolean conversion ==
>
> These functions each convert a non-compound JSON value to its
> respective return type.  Run-time type checking is performed; a
> conversion will throw an error if the input JSON is not the correct
> type.  If the JSON value is 'null', then the return value will be
> NULL.
>
> json_to_text(json) returns text
> json_to_number(json) returns numeric
> json_to_bool(json) returns boolean

Can '3' be converted to a number, or only if it's written without the quotes?

> These functions convert values to JSON.  Passing NULL to any of the
> functions below will return 'null':
>
> text_to_json(text) returns json
> number_to_json(numeric) returns json
> bool_to_json(boolean) returns json
>
> There could be generic value_to_json(any), but not a
> json_to_value(json) function.  See
> http://archives.postgresql.org/pgsql-hackers/2010-04/msg00321.php for
> more details.

Seems OK.

> Conversion to/from number or boolean can also be achieved with
> casting.  Note well that '"string"'::JSON::TEXT is '"string"', not the
> string's actual value.  json_to_text is needed for this conversion.
> For this reason, casting JSON might seem like something to recommend
> against.  However, IMHO, casting numbers and booleans to/from JSON is
> fine and dandy; the paragraphs below give some weight to this.
>
> I originally considered making json_to_number and number_to_json work
> with TEXT instead of NUMERIC.  However, as Tom Lane pointed out in the
> above link, "Forcing people to insert explicit coercions from text
> isn't going to be particularly convenient to use.".  Nevertheless,
> NUMERIC introduces a problem.  For instance, if you say:
>
> SELECT '-1e-38'::NUMERIC;
>
> This conversion knocks out the scientific notation and produces a
> 41-character string.  I seriously doubt that all outside applications
> will handle 41-character numbers correctly.

Maybe not, but I don't think it's your problem to fix it if they
don't.  If people want to have fine-grained control over the JSON that
gets generated, they can always generate the value as text and cast it
to JSON.

> Perhaps there should be individual functions for specific data types,
> or maybe just a handful for particular cases.  There might be
> json_to_int, json_to_float, and json_to_numeric.  In any case,
> converting to/from number types can be achieved quite easily with
> casting.

Personally I'd go with just json_to_numeric for starters.  We can
always add the others if and when it's clear that they are useful.

> == array/object conversion ==
>
> The json_object function converts a tuple to a JSON object.  If there
> are duplicate column names, there will be duplicate keys in the
> resulting JSON object.
>
> json_object([content [AS name] [, ...]]) returns json

Seems good.

> Likewise, the json_array function converts a tuple to a JSON array.
> Column names are ignored.
>
> json_array([content [AS name] [, ...]]) returns json

I think this is pointless and should be omitted.

> The json_agg function reduces a set of JSON values to a single array
> containing those values.
>
> aggregate json_agg(json) returns json

Very useful, I like that.

> json_object and json_agg can be used together to convert an entire
> result set to one JSON array:
>
> SELECT json_agg(json_object(*)) FROM tablename;

Spiffy.
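
For a made-up two-row table t(a int, b text), and assuming the proposed
functions behave as described above, that query would yield something like:

    SELECT json_agg(json_object(*)) FROM t;
    -- [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]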

> json_keys gets the keys of a JSON object as a set.
>
> json_keys(json) returns setof text

I would tend to make this return text[] rather than SETOF text.

> json_values gets the values of a JSON object or the items of a JSON
> array as a set.
>
> json_values(json) returns setof json

Similarly I would make this return json[] rather than SETOF json.

Re: [HACKERS] List traffic

2010-05-14 Thread Yeb Havinga

Marc G. Fournier wrote:

On Fri, 14 May 2010, Yeb Havinga wrote:


Marc G. Fournier wrote:

On Thu, 13 May 2010, Alvaro Herrera wrote:


Excerpts from Yeb Havinga's message of jue may 13 15:06:53 -0400 2010:

My $0.02 - I like the whole 'don't sort, search' (or how did they call
it?) just let the inbox fill up, google is fast enough. What would be
really interesting is to have some extra 'tags/headers' added to the
emails (document classification with e.g. self organizing map/kohonen),
so my local filters could make labels based on that, instead of perhaps
badly spelled keywords in subjects or message body.


I missed this when I read it the first time .. all list email does 
have an X-Mailing-List header added so that you can label based on 
list itself ... is that what you mean, or are you thinking of 
something else entirely?
Something else: if automatic classification of articles was in place, 
there would be need of fewer mailing lists, depending on the quality 
of the classification.


You've mentioned this several times, but what *is* "auto-classification 
of articles"?  Or is this something you do on the gmail side of things?
I meant classification in the sense of automated as opposed to manual 
classification by author or subscriber, in the general sense, not linked 
to any mail client or server. Example: junk mail detection by mail client.


After sending my previous mail this morning I looked a bit more into 
(the faq of) carrot2, which links to LingPipe as a solution for people 
that like pre-defined classes. LingPipe in fact has a tutorial where 
they classify a dataset of newsgroups articles, see e.g. 
http://alias-i.com/lingpipe/demos/tutorial/classify/read-me.html. I 
suppose it would be interesting to see what could be done with the pg 
archives. If the archive database itself is publicly available, or 
could be made available, I'd be willing to put some time into this 
(solely on the basis that I'm interested in the outcome, not that I'm 
pushing for it to be used by the pg project, though that'd of course be 
cool if it turned out that way in the end).


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Josh Berkus



There is no reason why advocacy can't happen on general. Theoretically
www could be on hackers (although I do see the point of a separate
list).


First off, this is absolutely the wrong list to be discussing management 
of PostgreSQL lists.  That belongs on pgsql-www.  And, I'll point out, 
that this completely pointless discussion on the wrong list has been 1/6 
of the traffic on -hackers for the last two days.  Also, I really don't 
see what problem people think they're addressing with these bimonthly 
calls for list consolidation.  It seems like a solution in search of a 
problem.


So it's an exercise in ironic wankitude.   Can we please stop it now?

Second, regarding advocacy: no, absolutely not.  -advocacy is a working 
list and not a virtual water cooler.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Kevin Grittner
Tom Lane  wrote:
 
> I can't imagine that there's not going to need to be a "catchall"
> list for problems that don't fit into any of the subcategories.
> 
> More generally, we already have most of the lists that you
> suggest, and we already know that people frequently don't find the
> most appropriate list for postings.  I don't think getting rid of
> -general would help that in the least.  The way to cut down on
> misposted traffic is to make the set of categories smaller and
> simpler, not to redouble our efforts to persuade people to use the
> same or even more categories.
 
Well, redoubling our current efforts to direct people to more
specific lists would accomplish nothing, since doubling zero leaves
you with zero.  The description of -general includes:
 
| General discussion area for users. Apart from compile, acceptance
| test, and bug problems, most new users will probably only be
| interested in this mailing list
 
Given that, the fact that -admin, -novice, -sql, and -performance
collectively get as many posts as -general suggests that people are,
in fact, making some effort to find a list which seems a good fit. 
Perhaps if the description of -general was changed to suggest it
*was* a catch-all for posts which don't fit the other lists, things
would improve.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Pavel Stehule
2010/5/14 Bruce Momjian :
> Joseph Adams wrote:
>> == array/object conversion ==
>>
>> The json_object function converts a tuple to a JSON object.  If there
>> are duplicate column names, there will be duplicate keys in the
>> resulting JSON object.
>>
>> json_object([content [AS name] [, ...]]) returns json
>>
>> Likewise, the json_array function converts a tuple to a JSON array.
>> Column names are ignored.
>>
>> json_array([content [AS name] [, ...]]) returns json
>
> Do you see any problems with the fact that JSON arrays can use mixed
> data types, e.g.:
>
>        [ 1, 2, "hi", false]

it should not be a problem

regards
Pavel
>
> --
>  Bruce Momjian          http://momjian.us
>  EnterpriseDB                             http://enterprisedb.com
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Add SIGCHLD catch to psql

2010-05-14 Thread Stephen Frost
Greetings,

  Toying around with FETCH_COUNT today, I discovered that it didn't do
  the #1 thing I really wanted to use it for- query large tables without
  having to worry about LIMIT to see the first couple hundred records.
  The reason is simple- psql ignores $PAGER exiting, which means that it
  will happily continue pulling down the entire large table long after
  you've stopped caring, which means you still have to wait forever.

  The attached, admittedly quick hack, fixes this by having psql catch
  SIGCHLD's using handle_sigint.  I've tested this and it doesn't
  appear to obviously break other cases where we have children (\!, for
  example), since we're not going to be running a database query when
  we're doing those, and if we are, and the child dies, we probably want
  to *stop* anyway, similar to the $PAGER issue.

  Another approach that I considered was fixing various things to deal
  cleanly with write's failing to $PAGER (I presume the writes *were*
  failing, since less was in a defunct state, but I didn't actually
  test).  This solution was simpler, faster to code and check, and alot
  less invasive (or so it seemed to me at the time).

  Anyway, this makes FETCH_COUNT alot more useful, and, in my view, the
  current behaviour of completely ignoring $PAGER exiting is a bug.

Thanks,

Stephen
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index f605c97..dcab436 100644
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*** NoticeProcessor(void *arg, const char *m
*** 188,194 
  /*
   * Code to support query cancellation
   *
!  * Before we start a query, we enable the SIGINT signal catcher to send a
   * cancel request to the backend. Note that sending the cancel directly from
   * the signal handler is safe because PQcancel() is written to make it
   * so. We use write() to report to stderr because it's better to use simple
--- 188,194 
  /*
   * Code to support query cancellation
   *
!  * Before we start a query, we enable SIGINT and SIGCHLD signals to send a
   * cancel request to the backend. Note that sending the cancel directly from
   * the signal handler is safe because PQcancel() is written to make it
   * so. We use write() to report to stderr because it's better to use simple
*** NoticeProcessor(void *arg, const char *m
*** 208,213 
--- 208,218 
   * catcher to longjmp through sigint_interrupt_jmp.  We assume readline and
   * fgets are coded to handle possible interruption.  (XXX currently this does
   * not work on win32, so control-C is less useful there)
+  *
+  * SIGCHLD is also caught and handled the same to deal with cases where a user's
+  * PAGER or other child process exits.  Otherwise, we would just keep sending
+  * data to a dead/zombied process.  This won't typically matter except when
+  * FETCH_COUNT is used.
   */
  volatile bool sigint_interrupt_enabled = false;
  
*** void
*** 259,264 
--- 264,272 
  setup_cancel_handler(void)
  {
  	pqsignal(SIGINT, handle_sigint);
+ 
+ 	/* Also send SIGCHLD signals, to catch cases where the user exits PAGER */
+ 	pqsignal(SIGCHLD, handle_sigint);
  }
  #else			/* WIN32 */
  




Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Yeb Havinga wrote:


Marc G. Fournier wrote:

On Thu, 13 May 2010, Alvaro Herrera wrote:


Excerpts from Yeb Havinga's message of jue may 13 15:06:53 -0400 2010:


My $0.02 - I like the whole 'don't sort, search' (or how did they call
it?) just let the inbox fill up, google is fast enough. What would be
really interesting is to have some extra 'tags/headers' added to the
emails (document classification with e.g. self organizing map/kohonen),
so my local filters could make labels based on that, instead of perhaps
badly spelled keywords in subjects or message body.


I missed this when I read it the first time .. all list email does have an 
X-Mailing-List header added so that you can label based on list itself ... 
is that what you mean, or are you thinking of something else entirely?
Something else: if automatic classification of articles was in place, there 
would be need of fewer mailing lists, depending on the quality of the 
classification.


You've mentioned this several times, but what *is* "auto-classification 
of articles"?  Or is this something you do on the gmail side of things?




Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON manipulation functions

2010-05-14 Thread Bruce Momjian
Joseph Adams wrote:
> == array/object conversion ==
> 
> The json_object function converts a tuple to a JSON object.  If there
> are duplicate column names, there will be duplicate keys in the
> resulting JSON object.
> 
> json_object([content [AS name] [, ...]]) returns json
> 
> Likewise, the json_array function converts a tuple to a JSON array.
> Column names are ignored.
> 
> json_array([content [AS name] [, ...]]) returns json

Do you see any problems with the fact that JSON arrays can use mixed
data types, e.g.:

	[ 1, 2, "hi", false]

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] HS/SR Assert server crash

2010-05-14 Thread Bruce Momjian
Bruce Momjian wrote:
> I was able to easily crash the standby server today just by starting it
> and connecting to it via psql.  The master was idle.  The failure was:
> 
>   LOG:  streaming replication successfully connected to primary
>   TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "procarray.c", Line: 1211)
>   LOG:  server process (PID 12761) was terminated by signal 6: Abort trap
>   LOG:  terminating any other active server processes
> 
> My master postgresql.conf was:
> 
>   wal_level = hot_standby            # minimal, archive, or hot_standby
>   archive_mode = on                  # allows archiving to be done
>   archive_command = 'cp -i %p /u/pg/archive/%f < /dev/null '  # command to use to archive a logfile segment
>   max_wal_senders = 1                # max number of walsender processes
> 
> My slave postgresql.conf was:
> 
>   port = 5433                        # (change requires restart)
>   wal_level = hot_standby            # minimal, archive, or hot_standby
>   archive_mode = off                 # allows archiving to be done
>   archive_command = 'cp -i %p /u/pg/archive/%f < /dev/null '  # command to use to archive a logfile segment
>   hot_standby = on                   # allows queries during recovery
>   max_wal_senders = 1                # max number of walsender processes
> 
> and my slave recovery.conf was:
> 
>   restore_command = 'cp /u/pg/archive/%f %p'  # e.g. 'cp /mnt/server/archivedir/%f %p'
>   standby_mode = 'on'
>   primary_conninfo = 'host=localhost port=5432'   # e.g. 'host=localhost port=5432'
> 
> Let me know what additional information I can supply.

I saw Simon's commit fixing this bug.  Another good reason we didn't
bundle 9.0 beta2 yesterday.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Tom Lane
"Marc G. Fournier"  writes:
> why not close down -general so that ppl *have* to better pick where to 
> post their question ...

I can't imagine that there's not going to need to be a "catchall" list
for problems that don't fit into any of the subcategories.

More generally, we already have most of the lists that you suggest, and
we already know that people frequently don't find the most appropriate
list for postings.  I don't think getting rid of -general would help
that in the least.  The way to cut down on misposted traffic is to make
the set of categories smaller and simpler, not to redouble our efforts
to persuade people to use the same or even more categories.

BTW, as far as "no crossposting" goes: usually when I find myself doing
that, it's to redirect a thread that started on -bugs or -general into
-hackers.  I don't see the need for that going away.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Kevin Grittner
"Marc G. Fournier"  wrote:
 
> -sql : how to write a query
> -performance : how to improve performance of my queries
> -admin : how to admin the server
> -novice : I'm a new user
> -odbc : how to use ...
> -php : php related interface questions
> -interfaces : more general than -odbc
> 
> why not close down -general so that ppl *have* to better pick
> where to post their question ...
 
That's a change I could support.  I even think the descriptions are
pretty close to what should show.
 
In trying to think what might be missing, I wonder whether we could
decrease inappropriate traffic on the -bugs list if we had a
"feature request" list, for end users not prepared to discuss things
at the level appropriate for -hackers, but who think that PostgreSQL
should support some feature they don't see.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Kevin Grittner wrote:


"Greg Sabino Mullane"  wrote:


Would anyone argue against rolling those two (sql and admin) into
-general as a first step?


At the risk of repeating myself, I won't be able to keep up with the
traffic of the combined list; so rather than read 100% of the
messages from a smaller set, I'll need to pick and choose based on
subject line or some such.  I get the impression that other people,
who read different subsets of the lists, will be forced to a similar
change.  That may result in either some posts "slipping through the
cracks" or in increasing the burden of responding to the posts for
those brave few who wade through them all.


That's what I find with the freebsd-questions list ... there is so much 
noise in there that I tend to avoid posting to it for fear that my email 
will just get skipped over ...


I am definitely against *merging* lists ... getting rid of the 'meta list' 
makes more sense, so as to force ppl to *use* the smaller lists, than 
merging smaller lists and *increasing* the noise on one of them ...



Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Greg Sabino Mullane wrote:






There is no reason why advocacy can't happen on general. Theoretically
www could be on hackers (although I do see the point of a separate
list).


I don't feel as strongly about -advocacy being removed, but we certainly
can fold in -sql and -admin. Would anyone argue against rolling those
two (sql and admin) into -general as a first step?


Question ... we have, right now:

-sql : how to write a query
-performance : how to improve performance of my queries
-admin : how to admin the server
-novice : I'm a new user
-odbc : how to use ...
-php : php related interface questions
-interfaces : more general than -odbc

why not close down -general so that ppl *have* to better pick where to 
post their question ...




Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Marc G. Fournier

On Fri, 14 May 2010, Greg Sabino Mullane wrote:






... is there a reason why, other than the fact that we don't do it now, we
can't just put in a restriction against cross-posting altogether?


Because that would be shooting ourselves in the foot. Cross-posting
is often desirable. If we had a clearer distinction of list topics, I
might support such a move, but we don't, so I can't.


But, it's the cross-posting, IMHO, that reduces the distinction ...


... and, for those that have been here awhile, who "should know better",
why isn't there any self-management of this sort of stuff in the first
place?


What would you have us do?


Redirect users ... if a user sends a query-performance-related question to 
-general, respond back with -general as the CC, To as -performance and a 
Reply-To header of -performance ... that way those on -general know that 
it's been redirected, but *hopefully* users replying will honor the 
-performance redirect ...



Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Kevin Grittner
Florian Pflug  wrote:
 
> I must admit that I wasn't able to find an explicit reference to
> Oracle's behavior in their docs, so I had to resort to
> experiments. They do have examples showing how to do FK-like
> constraints with triggers, and those don't contain any warning
> whatsoever about problems in SERIALIZABLE mode, though.  But
> still, if there is word on this from Oracle somewhere, I'd love to
> hear about it.
 
I suspect that in trying to emulate Oracle on this, you may run into
an issue which posed challenges for the SSI implementation which
didn't come up in the Cahill prototype implementations: Oracle, and
all other MVCC databases I've read about outside of PostgreSQL, use
an "update in place with a rollback log" technique.  Access to any
version of a given row or index entry goes through a single
location, with possible backtracking through the log after that,
which simplifies management of certain concurrency issues.  Do they
perhaps use an in-RAM lock table, pointing to the "base" location of
the row for these SELECT FOR UPDATE locks?  (Just guessing; I've
never used Oracle, myself.)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Kevin Grittner
"Greg Sabino Mullane"  wrote:
 
> Would anyone argue against rolling those two (sql and admin) into
> -general as a first step?
 
At the risk of repeating myself, I won't be able to keep up with the
traffic of the combined list; so rather than read 100% of the
messages from a smaller set, I'll need to pick and choose based on
subject line or some such.  I get the impression that other people,
who read different subsets of the lists, will be forced to a similar
change.  That may result in either some posts "slipping through the
cracks" or in increasing the burden of responding to the posts for
those brave few who wade through them all.
 
Personally, I'm not convinced that merging current lists will solve
more problems than it will create.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Florian Pflug
On May 14, 2010, at 15:54 , Kevin Grittner wrote:
> Florian Pflug  wrote:
>> On May 14, 2010, at 12:56 , Kevin Grittner wrote:
>> unless your patch completely removes support for snapshot
>> isolation (what is current called SERIALIZABLE)
> 
> Both SERIALIZABLE and REPEATABLE READ currently map to snapshot
> isolation.  We're leaving REPEATABLE READ alone.

Ah, yeah, that makes a lot of sense. I kinda had forgotten about REPEATABLE 
READ...

>> my proposal still eliminates the situation that user-level
>> constraints are correct in READ COMMITTED and (true) SERIALIZABLE
>> isolation but not in snapshot isolation.
> 
> Agreed.  If someone wants to enforce user-level constraints using
> SSI, they will somehow need to ensure that less strict isolation
> levels are never used to modify data.  Your approach lifts that
> burden.
> 
> By the way, if you can make this behave in a similar way to Oracle,
> especially if the syntax is compatible, I'm sure it will help
> promote PostgreSQL adoption.  At PostgreSQL Conference U.S. East
> 2010, I talked briefly with a couple guys from an Oracle shop who
> were looking at converting to PostgreSQL, and were very concerned
> about not having what you describe.  The techniques required to
> ensure integrity in PostgreSQL were not, to put it mildly, appealing
> to them.  I suspect that they would be satisfied with *either* SSI
> or the change you describe.

My proposal would make SELECT ... FOR UPDATE behave like Oracle does with 
regard to serialization conflicts. SELECT ... FOR SHARE doesn't seem to exist 
on Oracle at all - at least I couldn't find a reference to it in the docs.

The syntax isn't 100% compatible because Oracle seems to expect a list of 
columns after the FOR UPDATE clause, while postgres expects a list of tables.

I must admit that I wasn't able to find an explicit reference to Oracle's 
behavior in their docs, so I had to resort to experiments. They do have 
examples showing how to do FK-like constraints with triggers, and those don't 
contain any warning whatsoever about problems in SERIALIZABLE mode, though. But 
still, if there is word on this from Oracle somewhere, I'd love to hear about 
it.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Greg Sabino Mullane



> There is no reason why advocacy can't happen on general. Theoretically
> www could be on hackers (although I do see the point of a separate
> list).

I don't feel as strongly about -advocacy being removed, but we certainly 
can fold in -sql and -admin. Would anyone argue against rolling those 
two (sql and admin) into -general as a first step?

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005141009
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Greg Sabino Mullane



> ... is there a reason why, other than the fact that we don't do it now, we 
> can't just put in a restriction against cross-posting altogether?

Because that would be shooting ourselves in the foot. Cross-posting 
is often desirable. If we had a clearer distinction of list topics, I 
might support such a move, but we don't, so I can't.

> ... and, for those that have been here awhile, who "should know better", 
> why isn't there any self-management of this sort of stuff in the first 
> place?

What would you have us do?

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005141005
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Kevin Grittner
Florian Pflug  wrote:
> On May 14, 2010, at 12:56 , Kevin Grittner wrote:
 
>> I think that SIREAD locks will generally be cheaper than SELECT
>> FOR UPDATE, since the former don't require any disk I/O and the
>> latter do.
 
> I can see how a single SIREAD lock can potentially be cheaper than
> a FOR SHARE or FOR UPDATE lock. But the number of SIREAD locks
> would exceed the number of FOR SHARE / FOR UPDATE locks by a few
> orders of magnitude, I'd think - at least if you ran every
> transaction under true serializable isolation.
> 
> I don't quite understand how SIREAD locks work if they don't
> involve any disk IO, since shared memory isn't resizable.
 
We use a well-worn technique used by many (most?) database products
-- granularity promotion.  This is one of the things which could
cause enough false positives under some loads to cause your
technique to perform better than SSI for those loads.
 
> unless your patch completely removes support for snapshot
> isolation (what is current called SERIALIZABLE)
 
Both SERIALIZABLE and REPEATABLE READ currently map to snapshot
isolation.  We're leaving REPEATABLE READ alone.
 
> my proposal still eliminates the situation that user-level
> constraints are correct in READ COMMITTED and (true) SERIALIZABLE
> isolation but not in snapshot isolation.
 
Agreed.  If someone wants to enforce user-level constraints using
SSI, they will somehow need to ensure that less strict isolation
levels are never used to modify data.  Your approach lifts that
burden.
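
A minimal sketch of the kind of anomaly being discussed: a user-level
"no double booking" check that two concurrent snapshot-isolation
transactions can defeat (table and rule are made up):

    -- T1: SELECT count(*) FROM bookings WHERE room = 1;  -- sees 0, check passes
    -- T2: SELECT count(*) FROM bookings WHERE room = 1;  -- also sees 0
    -- T1: INSERT INTO bookings VALUES (1);  COMMIT;
    -- T2: INSERT INTO bookings VALUES (1);  COMMIT;
    -- both commit, violating the rule; under truly serializable execution
    -- one of the two would have been rolled back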
 
By the way, if you can make this behave in a similar way to Oracle,
especially if the syntax is compatible, I'm sure it will help
promote PostgreSQL adoption.  At PostgreSQL Conference U.S. East
2010, I talked briefly with a couple guys from an Oracle shop who
were looking at converting to PostgreSQL, and were very concerned
about not having what you describe.  The techniques required to
ensure integrity in PostgreSQL were not, to put it mildly, appealing
to them.  I suspect that they would be satisfied with *either* SSI
or the change you describe.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication patch built on SR

2010-05-14 Thread Boszormenyi Zoltan
Fujii Masao wrote:
> 2010/4/29 Boszormenyi Zoltan :
>   
>> attached is a patch that does $SUBJECT, we are submitting it for 9.1.
>> I have updated it to today's CVS after the "wal_level" GUC went in.
>> 
>
> I'm planning to create the synchronous replication patch for 9.0, too.
> My design is outlined in the wiki. Let's work together to do the design
> of it.
> http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability
>
> The log-shipping replication has some synchronization levels as follows.
> Which are you going to work on?
>
> The transaction commit on the master
> #1 doesn't wait for replication (already supported in 9.0)
> #2 waits for WAL to be received by the standby
> #3 waits for WAL to be received and flushed by the standby
> #4 waits for WAL to be received, flushed and replayed by the standby
> ..etc?
>
> I'm planning to add #2 and #3 into 9.1. #4 is useful but is outside
> the scope of my development for at least 9.1. In #4, a read-only query
> can easily block recovery through a lock conflict and make the
> transaction commit on the master get stuck. This problem is difficult
> to address within 9.1, I think. But the design and implementation
> of #2 and #3 need to be easily extensible to #4.
>
>   
>> How does it work?
>>
>> First, the walreceiver and the walsender are now able to communicate
>> in a duplex way on the same connection, so while COPY OUT is
>> in progress from the primary server, the standby server is able to
>> issue PQputCopyData() to pass the transaction IDs that were seen
>> with XLOG_XACT_COMMIT or XLOG_XACT_PREPARE
>> signatures. I did this by adding a new protocol message type, with letter
>> 'x' that's only acknowledged by the walsender process. The regular
>> backend was intentionally unchanged so an SQL client gets a protocol
>> error. I added a new libpq call, PQsetDuplexCopy(), which sends this
>> new message before sending START_REPLICATION. The primary
>> makes a note of it in the walsender process' entry.
>>
>> I had to move the TransactionIdLatest(xid, nchildren, children) call
>> that computes latestXid earlier in RecordTransactionCommit(), so
>> it's in the critical section now, just before the
>> XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT, rdata)
>> call. Otherwise, there was a race condition between the primary
>> and the standby server, where the standby server might have seen
>> the XLOG_XACT_COMMIT record for some XIDs before the
>> transaction in the primary server marked itself waiting for this XID,
>> resulting in stuck transactions.
>> 
>
> You seem to have chosen #4 as synchronization level. Right?
>   

Yes.

> In your design, the transaction commit on the master waits for its XID
> to be read from the XLOG_XACT_COMMIT record and replied by the standby.
> Right? This design seems not to be extensible to #2 and #3 since
> walreceiver cannot read XID from the XLOG_XACT_COMMIT record.

Yes, this was my problem, too. I would have had to
implement a custom interpreter in the walreceiver to
process the WAL records and extract the XIDs.

But at least the supporting details, i.e. not opening another
connection and instead being able to do duplex COPY operations in
a server-acknowledged way, are acceptable, no? :-)

>  How about
> using LSN instead of XID? That is, the transaction commit waits until
> the standby has reached its LSN. An LSN is easier to use for the walreceiver
> and startup process, I think.
>   

Indeed, using the LSN seems to be more appropriate for
the walreceiver, but how would you extract the information
that a certain LSN means a COMMITted transaction? Or
could we release a waiting transaction when the master receives
an LSN greater than or equal to the transaction's own commit LSN?

Sending back all the LSNs in case of long transactions would
increase the network traffic compared to sending back only the
XIDs, but the amount is not clear to me. What I am more
worried about is the contention on the ProcArrayLock.
XIDs are rarer than LSNs, no?

> What if the "synchronous" standby starts up from a very old backup?
> The transaction on the master needs to wait until a large amount of
> outstanding WAL has been applied? I think that synchronous replication
> should start with *asynchronous* replication, and should switch to the
> sync level after the gap between the servers has become small enough.
> What's your opinion?
>   

It's certainly one option, which I think is partly addressed
by the "strict_sync_replication" knob below.
If strict_sync_replication = off, then the master doesn't make
its transactions wait for the synchronous reports, and the client(s)
can work through their WALs. IIRC, the walreceiver connects
to the master only very late in the recovery process, no?

It would be nicer if it could be made automatic. I simply thought
that there may be situations where the "strict" behaviour is
desired. I was thinking about the transactions executed on the
master between the standby starting up and the walreceiver connecting.

Re: [HACKERS] Japanies translation breaks solaris build

2010-05-14 Thread Zdenek Kotala
Takahiro Itagaki wrote on Fri, 14 May 2010 at 19:38 +0900:
> Zdenek Kotala  wrote:
> 
> > http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01
> > The problem is that it contains mix of DOS/Unix end of lines.
> 
> I removed two CRs in ja.po.

Thanks. Gothic moth is green again.

http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=gothic_moth&br=HEAD

The rest of the Solaris machines will recover during the night.

Zdenek


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Florian Pflug
On May 14, 2010, at 12:56 , Kevin Grittner wrote:
>> True serializable transaction are much more powerful than what I
>> proposed, but at a much higher price too, due to the necessity of
>> SIREAD locks.
> 
> I think that SIREAD locks will generally be cheaper than SELECT FOR
> UPDATE, since the former don't require any disk I/O and the latter
> do.  I only have one benchmark so far (more on the way), but it
> attempts to isolate the cost of acquiring the SIREAD locks by using
> a read-only load against a fully cached database.  Benchmarks so far
> show the new version of the SERIALIZABLE level as supporting 1.8%
> fewer TPS than REPEATABLE READ (the existing snapshot isolation
> level) in that environment.  That will probably disappear into the
> noise for any load involving disk I/O.

I can see how a single SIREAD lock can potentially be cheaper than a FOR SHARE 
or FOR UPDATE lock. But the number of SIREAD locks would exceed the number of 
FOR SHARE / FOR UPDATE locks by a few orders of magnitude, I'd think - at least 
if you ran every transaction under true serializable isolation.

I don't quite understand how SIREAD locks work if they don't involve any disk 
IO, since shared memory isn't resizable. But I guess I'll find out once you 
post the patch ;-)

> I guess what I'm suggesting is that unless you have a very small
> database with a very large number of connections in a high
> contention workload, or you can't require SERIALIZABLE transaction
> isolation level, SSI might actually perform better than what you're
> proposing.

That is entirely possible. However, unless your patch completely removes 
support for snapshot isolation (what is currently called SERIALIZABLE), my 
proposal still eliminates the situation where user-level constraints are correct 
in READ COMMITTED and (true) SERIALIZABLE isolation but not in snapshot 
isolation.

Btw, the only users of FOR SHARE locks inside postgres proper are the RI 
triggers, and those do that special crosscheck when called within a 
SERIALIZABLE transaction. I take this as evidence that the current behavior 
might not be all that useful with serializable transactions...

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to know killed by pg_terminate_backend

2010-05-14 Thread Robert Haas
On Thu, May 13, 2010 at 8:20 PM, Tatsuo Ishii  wrote:
>> > Maybe we could make PostgreSQL a little bit smarter so that it returns
>> > a different code than 57P01 when killed by pg_terminate_backend().
>>
>> Seems reasonable. Does the victim backend currently know why it has been
>> killed?
>
> I don't think so.
>
> One idea is that the postmaster sets a flag in the shared memory area
> indicating it received SIGTERM before forwarding the signal to
> backends.
>
> The backend checks the flag, and if it's not set, it knows that the signal
> has been sent by pg_terminate_backend(), not the postmaster.

Or it could also be sent by some other user process, like the user
running "kill" from the shell.
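
For context, the victim's client currently sees exactly the same thing in
all of these cases (a sketch; the pid is made up):

  -- from an administrative session:
  SELECT pg_terminate_backend(12345);
  -- the terminated session's client then reports:
  --   FATAL:  terminating connection due to administrator command
  -- with SQLSTATE 57P01, just as it would for a plain SIGTERM.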

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous replication patch built on SR

2010-05-14 Thread Fujii Masao
2010/4/29 Boszormenyi Zoltan :
> attached is a patch that does $SUBJECT, we are submitting it for 9.1.
> I have updated it to today's CVS after the "wal_level" GUC went in.

I'm planning to create the synchronous replication patch for 9.0, too.
My design is outlined in the wiki. Let's work together to do the design
of it.
http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability

The log-shipping replication has some synchronization levels as follows.
Which are you going to work on?

The transaction commit on the master
#1 doesn't wait for replication (already supported in 9.0)
#2 waits for WAL to be received by the standby
#3 waits for WAL to be received and flushed by the standby
#4 waits for WAL to be received, flushed and replayed by the standby
..etc?

I'm planning to add #2 and #3 into 9.1. #4 is useful but is outside
the scope of my development for at least 9.1. In #4, a read-only query
can easily block recovery through a lock conflict and make the
transaction commit on the master get stuck. This problem is difficult
to address within 9.1, I think. But the design and implementation
of #2 and #3 need to be easily extensible to #4.

> How does it work?
>
> First, the walreceiver and the walsender are now able to communicate
> in a duplex way on the same connection, so while COPY OUT is
> in progress from the primary server, the standby server is able to
> issue PQputCopyData() to pass the transaction IDs that were seen
> with XLOG_XACT_COMMIT or XLOG_XACT_PREPARE
> signatures. I did this by adding a new protocol message type, with letter
> 'x' that's only acknowledged by the walsender process. The regular
> backend was intentionally unchanged so an SQL client gets a protocol
> error. I added a new libpq call, PQsetDuplexCopy(), which sends this
> new message before sending START_REPLICATION. The primary
> makes a note of it in the walsender process' entry.
>
> I had to move the TransactionIdLatest(xid, nchildren, children) call
> that computes latestXid earlier in RecordTransactionCommit(), so
> it's in the critical section now, just before the
> XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT, rdata)
> call. Otherwise, there was a race condition between the primary
> and the standby server, where the standby server might have seen
> the XLOG_XACT_COMMIT record for some XIDs before the
> transaction in the primary server marked itself waiting for this XID,
> resulting in stuck transactions.

You seem to have chosen #4 as synchronization level. Right?

In your design, the transaction commit on the master waits for its XID
to be read from the XLOG_XACT_COMMIT record and replied by the standby.
Right? This design seems not to be extensible to #2 and #3 since
walreceiver cannot read XID from the XLOG_XACT_COMMIT record. How about
using LSN instead of XID? That is, the transaction commit waits until
the standby has reached its LSN. An LSN is easier to use for the walreceiver
and startup process, I think.
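
As a rough illustration of how LSNs can already be compared from SQL (a
sketch using the existing 9.0 monitoring functions, not the proposed
protocol):

  -- on the master:
  SELECT pg_current_xlog_location();
  -- on the standby:
  SELECT pg_last_xlog_receive_location(), pg_last_xlog_replay_location();
  -- a waiting commit could be released once the location reported back by
  -- the standby is greater than or equal to the LSN of its commit record.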

What if the "synchronous" standby starts up from a very old backup?
The transaction on the master needs to wait until a large amount of
outstanding WAL has been applied? I think that synchronous replication
should start with *asynchronous* replication, and should switch to the
sync level after the gap between the servers has become small enough.
What's your opinion?

> I have added 3 new options, two GUCs in postgresql.conf and one
> setting in recovery.conf. These options are:
>
> 1. min_sync_replication_clients = N
>
> where N is the number of reports for a given transaction before it's
> released as committed synchronously. 0 means completely asynchronous,
> the value is maximized by the value of max_wal_senders. Anything
> in between 0 and max_wal_senders means different levels of partially
> synchronous replication.
>
> 2. strict_sync_replication = boolean
>
> where the expected number of synchronous reports from standby
> servers is further limited to the actual number of connected synchronous
> standby servers if the value of this GUC is false. This means that if
> no standby servers are connected yet then the replication is asynchronous
> and transactions are allowed to finish without waiting for synchronous
> reports. If the value of this GUC is true, then transactions wait until
> enough synchronous standbys connect and report back.

Why are these options necessary?

Can these options cover more than three synchronization levels?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Kevin Grittner
[slight rearrangement]
 
Florian Pflug  wrote:
 
> I'm very exited about the work you're doing
 
Always nice to hear.  :-)
 
> I view my proposal as pretty orthogonal to that work.
 
> My proposal allows for simple FK-like constraints to be
> implemented at user-level that are correct for all isolation
> levels.
 
OK, I can see the attraction in that.
 
> True serializable transaction are much more powerful than what I
> proposed, but at a much higher price too, due to the necessity of
> SIREAD locks.
 
I think that SIREAD locks will generally be cheaper than SELECT FOR
UPDATE, since the former don't require any disk I/O and the latter
do.  I only have one benchmark so far (more on the way), but it
attempts to isolate the cost of acquiring the SIREAD locks by using
a read-only load against a fully cached database.  Benchmarks so far
show the new version of the SERIALIZABLE level as supporting 1.8%
fewer TPS than REPEATABLE READ (the existing snapshot isolation
level) in that environment.  That will probably disappear into the
noise for any load involving disk I/O.
 
Now *rollbacks*, particularly those due to false positives, might
become a more serious issue in some pessimal loads, but I'm still
working on developing meaningful benchmarks for that.
 
I guess what I'm suggesting is that unless you have a very small
database with a very large number of connections in a high
contention workload, or you can't require SERIALIZABLE transaction
isolation level, SSI might actually perform better than what you're
proposing.  Of course, that's all conjecture until there are
benchmarks; but I'd be very interested in getting any and all
alternative solutions like this worked into a benchmark -- where I
can pull out the FOR UPDATE and FOR SHARE clauses, any redundant
updates or denormalizations added just for concurrency issues, and
all explicit locking -- and compare that under SERIALIZABLE to the
original performance.
 
-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Japanies translation breaks solaris build

2010-05-14 Thread Takahiro Itagaki

Zdenek Kotala  wrote:

> http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=comet_moth&dt=2010-05-13%2021:06:01
> The problem is that it contains mix of DOS/Unix end of lines.

I removed two CRs in ja.po.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] nvarchar notation accepted?

2010-05-14 Thread Florian Pflug
On May 14, 2010, at 5:56 , Jaime Casanova wrote:
> On Thu, May 13, 2010 at 10:13 PM, Takahiro Itagaki
>  wrote:
>> 
>> Jaime Casanova  wrote:
>> 
>>> I migrated an MS SQL Server database to Postgres and was trying some
>>> queries from the application to check whether everything works right...
>>> when I was looking at those queries I found some that have a notation
>>> for nvarchar (e.g.: campo = N'sometext')
>> 
>> Do you have documentation for the N'...' literal in SQL Server?
>> Does it mean a Unicode literal? What is the difference from the U& literal?
>> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html
>> 
> 
> Nope, the only thing I found is about NVARCHAR:
> http://msdn.microsoft.com/en-us/library/ms186939.aspx but it has no
> examples of the N'' notation, although you can find examples of its
> use here: http://msdn.microsoft.com/en-us/library/dd776381.aspx#BasicSyntax

Without using the N-prefixed versions of CHAR, VARCHAR and string literals, MS 
SQL Server refuses to process characters other than those in the database's 
character set. It will replace all those characters with '?'.

Note that this is not an encoding issue - it will even do so with protocol 
versions (everything >= 7.0, I think) that use UTF-16 on the wire, where those 
characters can be transmitted just fine.
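
For reference, a quick sketch of how the two notations look on the
PostgreSQL side (PostgreSQL accepts the N'' prefix and treats it much like
a plain character literal; the U& form is the one from the lexical docs
linked above and requires standard_conforming_strings to be on):

  -- SQL-standard national character string literal, as emitted by the
  -- migrated application:
  SELECT N'sometext';

  -- PostgreSQL's Unicode escape form:
  SELECT U&'d\0061t\+000061';   -- yields 'data'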

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Florian Pflug

On May 14, 2010, at 2:37 , Greg Stark wrote:

> On Thu, May 13, 2010 at 10:25 PM, Florian Pflug  wrote:
>> C1: BEGIN
>> C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
>> C2: BEGIN
>> C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
>> C2: SELECT * FROM t -- Take snapshot before C1 commits
>> C1: COMMIT
>> C2: DELETE FROM t WHERE id = 1
>> C2: COMMIT
>> 
> 
> Can you give an actual realistic example -- ie, not doing a select for
> update and then never updating the row or with an explanation of what
> the programmer is attempting to accomplish with such an unusual
> sequence? The rest of the post talks about FKs but I don't see any
> here...

The table "t" is supposed to represent the parent table of a FK constraint. The 
SELECT FOR UPDATE is done upon an INSERT to the child table to protect the 
parent row against concurrent deletion. I've used FOR UPDATE instead of FOR 
SHARE because I also tested this against Oracle, and Oracle does not support 
FOR SHARE. 

Here's a full example of a pair of FK triggers in PL/PGSQL that work correctly 
in READ COMMITTED mode but fail to enforce the constraint in SERIALIZABLE mode 
as the following sequence of commands shows. With my proposal, the DELETE would 
again raise a serialization error and hence keep the constraint satisfied.

C1: BEGIN
C1: INSERT INTO child (parent_id) VALUES (1) -- Locks the parent row FOR UPDATE
C2: BEGIN
C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
C2: SELECT TRUE -- Take snapshot *before* C1 commits
C1: COMMIT
C2: DELETE FROM parent WHERE parent_id = 1 -- Succeeds
C2: COMMIT

--
CREATE TABLE parent (parent_id SERIAL NOT NULL PRIMARY KEY);
CREATE TABLE child (child_id SERIAL NOT NULL PRIMARY KEY,
                    parent_id INTEGER NOT NULL);

CREATE FUNCTION ri_parent() RETURNS TRIGGER AS $body$
BEGIN
  PERFORM TRUE FROM child WHERE parent_id = OLD.parent_id;
  IF FOUND THEN
    RAISE SQLSTATE '23503' USING MESSAGE =
      'Parent ' || OLD.parent_id || ' still referenced during ' || TG_OP;
  END IF;
  RETURN NULL;
END;
$body$ LANGUAGE PLPGSQL VOLATILE;

CREATE TRIGGER ri_parent AFTER UPDATE OR DELETE ON parent
  FOR EACH ROW EXECUTE PROCEDURE ri_parent();

CREATE FUNCTION ri_child() RETURNS TRIGGER AS $body$
BEGIN
  PERFORM TRUE FROM parent WHERE parent_id = NEW.parent_id FOR UPDATE OF parent;
  IF NOT FOUND THEN
    RAISE SQLSTATE '23503' USING MESSAGE =
      'Parent ' || NEW.parent_id || ' does not exist during ' || TG_OP;
  END IF;
  RETURN NULL;
END;
$body$ LANGUAGE PLPGSQL VOLATILE;

CREATE TRIGGER ri_child AFTER INSERT OR UPDATE ON child
  FOR EACH ROW EXECUTE PROCEDURE ri_child();
--

best regards,

Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-level Locks & SERIALIZABLE transactions, postgres vs. Oracle

2010-05-14 Thread Nicolas Barbier
2010/5/14 Greg Stark :

> On Thu, May 13, 2010 at 10:25 PM, Florian Pflug  wrote:
>
>> C1: BEGIN
>> C1: SELECT * FROM t WHERE id = 1 FOR UPDATE
>> C2: BEGIN
>> C2: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
>> C2: SELECT * FROM t -- Take snapshot before C1 commits
>> C1: COMMIT
>> C2: DELETE FROM t WHERE id = 1
>> C2: COMMIT
>
> Can you give an actual realistic example -- ie, not doing a select for
> update and then never updating the row or with an explanation of what
> the programmer is attempting to accomplish with such an unusual
> sequence? The rest of the post talks about FKs but I don't see any
> here...

The link with FKs is as follows:

* The example does not use a real FK, because the whole purpose is to
do the same as FKs while not using the FK machinery.
* The example uses only one table, because that is enough to
illustrate the problem (see next items).
* C1 locks a row, supposedly because it wants to create a reference to
it in a non-mentioned table, and wants to prevent the row from being
deleted under it.
* C2 deletes that row (supposedly after it verified that there are no
references to it; it would indeed not be able to see the reference
that C1 created/would create), and C1 fails to detect that.
* C2 also fails to detect the problem, because the lock that C1 held
is being released after C1 commits, and C2 can happily go on deleting
the row.
* The end result is that the hypothetical reference is created,
although the referent is gone.

Nicolas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] List traffic

2010-05-14 Thread Yeb Havinga

Marc G. Fournier wrote:
> On Thu, 13 May 2010, Alvaro Herrera wrote:
>
>> Excerpts from Yeb Havinga's message of Thu May 13 15:06:53 -0400 2010:
>>
>>> My $0.02 - I like the whole 'don't sort, search' (or how did they call
>>> it?) just let the inbox fill up, google is fast enough. What would be
>>> really interesting is to have some extra 'tags/headers' added to the
>>> emails (document classification with e.g. self organizing map/kohonen),
>>> so my local filters could make labels based on that, instead of perhaps
>>> badly spelled keywords in subjects or message body.
>
> I missed this when I read it the first time .. all list email does
> have an X-Mailing-List header added so that you can label based on
> list itself ... is that what you mean, or are you thinking of
> something else entirely?

Something else: if automatic classification of articles was in place,
there would be a need for fewer mailing lists, depending on the quality
of the classification.


IMHO the problem of handling the big volume of the lists is not solved 
by splitting them into more lists, since that does not decrease the number 
of posts that are interesting from the subscriber's perspective. It would 
only mean that posters are more likely to make mistakes, a possible increase 
in crossposts or 'my question was not answered there so now I try here' 
on the sender's part, and, on the subscriber's side, a bigger chance to miss 
interesting articles. That my current mailing list setup works for me 
supports this claim; I did not subscribe to fewer lists, but managed to 
decrease the ms spent on 'handling' to an amount that became workable.


Though I do not believe algorithmic article classification/ranking can 
provide a 100% foolproof filter, it might help decrease the "ms spent 
per article" further. Take a look at how "carrot2" clusters results from 
the query "postgresql prepared transactions site:postgresql.org" - 
http://search.carrot2.org/stable/search?source=web&view=tree&skin=fancy-compact&query=postgresql+prepared+transactions+site%3Apostgresql.org&results=100&algorithm=lingo&EToolsDocumentSource.country=ALL&EToolsDocumentSource.language=ENGLISH&EToolsDocumentSource.safeSearch=false


I wonder if a clustering algorithm could tag articles with (multiple) 
keywords, e.g. 'hackers', 'prepared transaction', 'dba', etc. I could 
then make filters or rankings on: hackers AND optimizer -> +10.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers