date:20110421

Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread tomas


On Wed, Apr 20, 2011 at 11:39:47AM -0700, Josh Berkus wrote:

[...]

 Review of design concepts and WIP patches has *always* been a problem
 for this project [...]

 We tell people to submit a design concept, but then such submissions are
 often ignored.  When they're not ignored, they often are subject to
 either extreme bikeshedding or a lot of negativity around things the
 author hasn't implemented yet ... even if the author warns that they're
 not implemented.

I'm not a committer. So take this data point for what it's worth. But I
have been following this list for quite a while, and I must say: I (very
respectfully!) disagree. Having watched mailing lists for other
projects, I find the quality of the answers one gets here outstanding. The
tone might sometimes be a bit terse (but never disrespectful or
flaming), but seriously: what do I get out of a friendly answer if there is
no content?

The same goes for -GENERAL. I've always got answers to my (sometimes, in
hindsight, quite stupid) questions which actually *helped* to solve my
problem.

It's OK to strive to improve the process, but I think you all are quite
good.

[...]

 So in the spirit of NOT reinventing the wheel: ReviewBoard.  Yes,
 really [...]
 [...]  But I think it's time to try something else, maybe
 several other things.

Maybe. But I *do* understand the unwillingness to change that. I've
contributed (tiny) patches to more than one project, and it's
frustrating to fight the bug-tracker-du-jour system. This one won't talk
to me unless my browser talks Javascript. That one... (you get the
idea). I strongly appreciate the free-flowing mailing list style here
(maybe it's just an age problem ;-)

Regards
-- tomás

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread Peter Eisentraut

On Wed, 2011-04-20 at 21:09 -0400, Robert Haas wrote:
 But
 even then I think we'd have this problem of people being unwilling to
 give up on jamming stuff into a release, regardless of the scheduling
 impact of doing so.  I actually think the problem of getting releases
 out on time is a *much* bigger problem for us than how long or short
 CommitFests are.

I think to really address that problem, you need to think about shorter
release cycles overall, like every 6 months.  Otherwise, the current 12
to 14 month horizon is just too long psychologically.



[HACKERS] fsync reliability

2011-04-21 Thread Simon Riggs

Daniel Farina points out to me that the Linux man page for fsync() says:

"Calling fsync() does not necessarily ensure that the entry in the
directory containing the file has also reached disk. For that an
explicit fsync() on a file descriptor for the directory is also needed."
http://www.kernel.org/doc/man-pages/online/pages/man2/fsync.2.html

That phrase does not exist here
http://pubs.opengroup.org/onlinepubs/007908799/xsh/fsync.html

This point appears to have been discussed before
http://postgresql.1045698.n5.nabble.com/ALTER-DATABASE-SET-TABLESPACE-vs-crash-safety-td1995703.html

Tom said:
"We don't try to fsync the directory after a normal table create for
instance"

which is fine because we don't need to. In the event of a crash a
missing table would be recreated during crash recovery.

However, that raises the question of what happens with WAL. At present,
we do nothing to ensure that "the entry in the directory containing
the file has also reached disk".

ISTM that we can easily do this, since we preallocate WAL files during
RemoveOldXlogFiles() and rarely extend the number of files.
So it seems easily possible to fsync the pg_xlog directory at the end
of RemoveOldXlogFiles(), which is mostly performed by the bgwriter
anyway.
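
As a rough illustration of what the man page describes (not the proposed patch itself), fsyncing a directory on Linux amounts to opening the directory read-only and calling fsync() on that descriptor. The helper name fsync_directory is hypothetical; only open(), fsync(), and close() are real calls here.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Flush a directory's own entries (the file names it contains) to disk.
 * Hypothetical helper for illustration only.
 * Returns 0 on success, -1 on failure with errno set by the failing call.
 */
int fsync_directory(const char *dir_path)
{
    /* A directory can be opened read-only and still be fsync'ed on Linux. */
    int fd = open(dir_path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (fsync(fd) != 0) {
        int saved_errno = errno;    /* close() may overwrite errno */
        close(fd);
        errno = saved_errno;
        return -1;
    }
    return close(fd);
}
```

Under the proposal above, a call such as fsync_directory("pg_xlog") would presumably run once after the WAL segment files have been created or renamed, so the cost is one extra fsync per cleanup pass rather than one per file.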

It was also noted that "we've always expected the filesystem to take
care of its own metadata", which isn't actually stated anywhere in the
docs, AFAIK.

Perhaps this is an irrelevant problem these days, but would it hurt to fix?

Happy to do the patch if we agree.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services


[HACKERS] Re: database system identifier differs between the primary and standby

2011-04-21 Thread rajibdk

What does that database system identifier mean? Is it related to DB
transactions, or unique to a version?

Rajib Deka
SIEMENS Ltd.
Robert V Chandran Tower, First Floor, West Wing,
#149, Velechery Tambaram Main Road, Pallikaranai, Chennai-100, INDIA.
www.siemens.com

Mob: +91-9176780669 | E-Mail: rajib.d...@siemens.com

From: Robert Haas [via PostgreSQL] 
[mailto:ml-node+4326869-1711138747-200...@n5.nabble.com]
Sent: Wednesday, April 20, 2011 7:20 PM
To: Deka, Rajib IN MAA SL
Subject: Re: database system identifier differs between the primary and standby

On Wed, Apr 20, 2011 at 9:28 AM, rajibdk [hidden email] wrote:
 We are getting the following log while configuring hot standby,

 2011-04-20 17:34:40 ETC/GMT FATAL:  the database system is starting up
 2011-04-20 17:34:41 ETC/GMT FATAL:  database system identifier differs
 between the primary and standby
 2011-04-20 17:34:41 ETC/GMT DETAIL:  The primary's identifier is
 5592072752411433371, the standby's identifier is 5597615802844953578.

 PostgreSQL Version: 9.0.2

You need to initialize the slave using a hot backup taken on the master.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: database system identifier differs between the primary and standby

2011-04-21 Thread Heikki Linnakangas


On 21.04.2011 12:31, rajibdk wrote:

What does that database system identifier mean? Is it related to DB
transactions, or unique to a version?


It's an identifier unique to each PostgreSQL database cluster. It's 
generated when you run initdb.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Re: database system identifier differs between the primary and standby

2011-04-21 Thread Simon Riggs

On Thu, Apr 21, 2011 at 10:31 AM, rajibdk rajib.d...@siemens.com wrote:

 What does that database system identifier mean? Is it related to DB
 transactions, or unique to a version?

Regrettably, it means you didn't follow the documented procedure.

It isn't possible to do it any other way, so those questions are a distraction.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


[HACKERS] hot backups: am I doing it wrong, or do we have a problem with pg_clog?

2011-04-21 Thread Daniel Farina

To start at the end of this story: DETAIL:  Could not read from file
pg_clog/007D at offset 65536: Success.

This is a message we received on a standby that we were bringing
online as part of a test.  The clog file was present, but apparently
too small for Postgres (or at least I think this is what the message
meant), so one could stub in another clog file and then continue
recovery successfully (modulo the voodoo of stubbing in clog files in
general).  I am unsure if this is due to an interesting race condition
in Postgres or a result of my somewhat-interesting hot-backup
protocol, which is slightly more involved than the norm.  I will
describe what it does here:

1) Call pg_start_backup()
2) Crawl the entire postgres cluster directory structure, except
pg_xlog, taking note of the size of every file present
3) Begin writing TAR files, but *only up to the size noted during the
original crawl of the cluster directory,* so if a file grows
between the original snapshot and the subsequent read() call
on the file, those extra bytes will not be added to the TAR.
  3a) If a file has been partially truncated, I add \0 bytes to pad the
tarfile member up to the size sampled in step 2, as I am streaming the
tar file and cannot go back in the stream and adjust the tarfile
member size
4) Call pg_stop_backup()
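
Steps 3 and 3a can be sketched roughly as follows. This is only a guess at the shape of the copy loop, not the actual backup tool: the function name copy_fixed_size and the 8192-byte buffer are assumptions made for illustration. It writes exactly the size recorded during the crawl, ignoring bytes added since and zero-filling whatever a truncated file can no longer supply.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy exactly `want` bytes from `in` to `out` (hypothetical helper).
 * - If the file grew since it was measured, the extra bytes are ignored.
 * - If the file shrank, the missing tail is padded with \0 bytes.
 * Returns 0 on success, -1 on a read or write error.
 */
int copy_fixed_size(FILE *in, FILE *out, size_t want)
{
    char buf[8192];
    size_t remaining = want;

    while (remaining > 0) {
        size_t chunk = remaining < sizeof(buf) ? remaining : sizeof(buf);
        size_t got = fread(buf, 1, chunk, in);

        if (got < chunk) {
            if (ferror(in))
                return -1;
            /* Short read at end of file: zero-fill the rest of the chunk. */
            memset(buf + got, 0, chunk - got);
        }
        if (fwrite(buf, 1, chunk, out) != chunk)
            return -1;
        remaining -= chunk;
    }
    return 0;
}
```

Note that with this approach a file truncated during the backup ends up in the archive at its original length with a zeroed tail, which is worth keeping in mind when a restored file looks valid in size but not in content.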

The reason I go to this trouble is that I use many completely
disjoint tar files to do parallel compression, decompression,
uploading, and downloading of the base backup of the database, and I
want to be able to control the size of these files up-front.  The
requirement of stubbing in \0 is because of a limitation of the tar
format when dealing with streaming archives, and the requirement to
truncate the files to the size snapshotted in step 2 is to enable
splitting up the files between volumes even in the presence of
possible concurrent growth while I'm performing the hot backup. (ex: a
handful of nearly-empty heap files can rapidly grow due to a
concurrent bulk load if I get unlucky, which I do not intend to allow
myself to be).

Any ideas?  Or does it sound like I'm making some bookkeeping errors
and should review my code again?  It does work most of the time.  I
have not gotten a sense of how often this reproduces just yet.
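
One plausible reading of the trailing "Success" in the message above, assuming the server prints the errno text after a read that returned fewer bytes than it asked for: a short read at end of file is not an error, so errno is still 0. The sketch below uses a hypothetical helper, read_at_offset, to show that reading past the end of a too-small file yields 0 bytes and leaves errno untouched.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to `len` bytes at `offset` (hypothetical helper).
 * Returns whatever pread() returned and stores the errno value observed
 * immediately after the call in *errno_after.
 */
ssize_t read_at_offset(int fd, off_t offset, void *buf, size_t len,
                       int *errno_after)
{
    errno = 0;                  /* start clean so a stale value cannot leak in */
    ssize_t n = pread(fd, buf, len, offset);
    *errno_after = errno;       /* remains 0 when the read merely hit EOF */
    return n;
}
```

On glibc, strerror(0) is the string "Success", which would match the message and support the guess that the file was simply shorter than the offset being read.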

-- 
fdr


[HACKERS] Defining input function for new datatype

2011-04-21 Thread Nick Raj

Hi,
I am defining a new data type called mpoint
i.e.
typedef struct mpoint
{
    Point p;
    Timestamp t;
} mpoint;

For defining input/output function

 1  Datum mpoint_in(PG_FUNCTION_ARGS)
 2  {
 3
 4      mpoint *result;
 5      char *pnt=(char *)malloc (sizeof (20));
 6      char *ts=(char *)malloc (sizeof (20));
 7      result= (mpoint *) palloc(sizeof(mpoint));
 8      char *st = PG_GETARG_CSTRING(0);
 9      mpoint_decode(st,pnt,ts);
        // st breaks down into pnt that corresponds to Point and ts corresponds to Timestamp
10
11      result->p = point_in(PointerGetDatum(pnt));    // point_in (input function for point that assigns x, y into point)
12      result->t = timestamp_in(PointerGetDatum(ts)); // similar for timestamp
13
14      PG_RETURN_MPOINT_P(result);
15  }

line no 11 warning: passing argument 1 of ‘point_in’ makes pointer from
integer without a cast
 ../../../include/utils/geo_decls.h:191: note: expected
‘FunctionCallInfo’ but argument is of type ‘unsigned int’
line no 11 error: incompatible types when assigning to type ‘Point’ from
type ‘Datum’
line no 12 warning: passing argument 1 of ‘timestamp_in’ makes pointer from
integer without a cast
 ../../../include/utils/timestamp.h:205: note: expected
‘FunctionCallInfo’ but argument is of type ‘unsigned int’

Can anybody figure out what mistake I am making?
Also, why is it related to 'FunctionCallInfo'?

Thanks
Nick

Re: [HACKERS] Defining input function for new datatype

2011-04-21 Thread Pavel Stehule

Hello

2011/4/21 Nick Raj nickrajj...@gmail.com:
 Hi,
 I am defining a new data type called mpoint
 i.e.
 typedef struct mpoint
 {
     Point p;
     Timestamp t;
 } mpoint;

 For defining input/output function

 1 Datum mpoint_in(PG_FUNCTION_ARGS)
 2 {
 3
 4    mpoint *result;
 5    char *pnt=(char *)malloc (sizeof (20));
 6    char *ts=(char *)malloc (sizeof (20));
 7    result= (mpoint *) palloc(sizeof(mpoint));
 8    char *st = PG_GETARG_CSTRING(0);
 9    mpoint_decode(st,pnt,ts);
 // st breaks down into pnt that corresponds to Point and ts corresponds to
 Timestamp
 10
 11  result->p = point_in(PointerGetDatum(pnt));    //
 point_in (input function for point that assigns x, y into point)
 12  result->t = timestamp_in(PointerGetDatum(ts)); // similar
 for timestamp
 13
 14  PG_RETURN_MPOINT_P(result);
 15   }

 line no 11 warning: passing argument 1 of ‘point_in’ makes pointer from
 integer without a cast
  ../../../include/utils/geo_decls.h:191: note: expected
 ‘FunctionCallInfo’ but argument is of type ‘unsigned int’
 line no 11 error: incompatible types when assigning to type ‘Point’ from
 type ‘Datum’
 line no 12 warning: passing argument 1 of ‘timestamp_in’ makes pointer from
 integer without a cast
  ../../../include/utils/timestamp.h:205: note: expected
 ‘FunctionCallInfo’ but argument is of type ‘unsigned int’


You are missing some important header files.

 Can anybody figure out what kind of mistake i am doing?
 Also, why it got related to 'FunctionCallInfo' ?

See the definition of the PG_FUNCTION_ARGS macro.

Regards

Pavel Stehule


 Thanks
 Nick



Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread Simon Riggs

On Wed, Apr 20, 2011 at 8:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Eisentraut pete...@gmx.net writes:
 I would imagine one commit fest per month, but
 it's only a week long.

 BTW, just as a thought experiment: what about a one-day CF once a week?
 Patch Tuesdays, if you will.  Spend all day reviewing/committing,
 bounce back whatever is not ready, patch authors try again next week.

 Really large patches are not going to fit into that paradigm, probably,
 but an awful lot of stuff would --- and it might help encourage more
 incremental development of the big ones, too.

I'm responding to this post with mostly general comments, not directed
specifically at Tom.

Speeding up the process means that people with more time get a bigger
say and people with less time get a smaller input than before. I'm
already concerned that the gap between patch submission and patch
commit is so short it effectively means feedback is impossible.

The more frequently we do integration, the greater proportion of our
time is spent doing that.

My concern is that there are a relatively small number of people working on
features that lots of people care about. Senior time should not be
wasted on endless integration.

We should be encouraging people to spend more time on more useful
features, not an endless stream of trivial patches, integration and
release processes. None of our users give a flying, err, squirrel,
about our small patch review process. Especially when it's absolutely
brilliant already.

My model of contributing to this project has always been to spend time
with customers, understanding solutions and problems, then bringing
that back to the community. That has brought both the funding to allow
me to contribute and a stream of ideas with a clear focus. I encourage
others to do the same. I don't think we should be working on an
interrupt-driven model; we should be planning our contributions and
making sure we make the biggest impact possible with real code, not
just twittering about it constantly. If we spend too much time with
each other we will be exactly like the larger commercial development
groups who never meet users, only each other. Even the General list
isn't fully representative of the actual/potential user base.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


Re: [HACKERS] fsync reliability

2011-04-21 Thread Alvaro Herrera

Excerpts from Simon Riggs's message of jue abr 21 05:26:06 -0300 2011:

 ISTM that we can easily do this, since we preallocate WAL files during
 RemoveOldXlogFiles() and rarely extend the number of files.
 So it seems easily possible to fsync the pg_xlog directory at the end
 of RemoveOldXlogFiles(), which is mostly performed by the bgwriter
 anyway.
 
 It was also noted that we've always expected the filesystem to take
 care of its own metadata
 which isn't actually stated anywhere in the docs, AFAIK.
 
 Perhaps this is an irrelevant problem these days, but would it hurt to fix?

I don't think it's irrelevant (yet).  Even Greg Smith's book suggests
using ext2 for the WAL partition in extreme cases.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread Kevin Grittner

Peter Eisentraut pete...@gmx.net wrote:
 
 you need to think about shorter release cycles overall, like every
 6 months.
 
With the current time between feature freeze and release, that
wouldn't leave a lot of time for development.
 
-Kevin


Re: [HACKERS] hot backups: am I doing it wrong, or do we have a problem with pg_clog?

2011-04-21 Thread Merlin Moncure

On Thu, Apr 21, 2011 at 6:15 AM, Daniel Farina dan...@heroku.com wrote:
 To start at the end of this story: DETAIL:  Could not read from file
 pg_clog/007D at offset 65536: Success.

 This is a message we received on a standby that we were bringing
 online as part of a test.  The clog file was present, but apparently
 too small for Postgres (or at least I think this is what the message
 meant), so one could stub in another clog file and then continue
 recovery successfully (modulo the voodoo of stubbing in clog files in
 general).  I am unsure if this is due to an interesting race condition
 in Postgres or a result of my somewhat-interesting hot-backup
 protocol, which is slightly more involved than the norm.  I will
 describe what it does here:

 1) Call pg_start_backup()
 2) crawl the entire postgres cluster directory structure, except
 pg_xlog, taking notes of the size of every file present
 3) begin writing TAR files, but *only up to the size noted during the
 original crawling of the cluster directory,* so if the file grows
 between the original snapshot and subsequently actually calling read()
 on the file those extra bytes will not be added to the TAR.
  3a) If a file is truncated partially, I add \0 bytes to pad the
 tarfile member up to the size sampled in step 2, as I am streaming the
 tar file and cannot go back in the stream and adjust the tarfile
 member size
 4) Call pg_stop_backup()

 The reason I go to this trouble is because I use many completely
 disjoint tar files to do parallel compression, decompression,
 uploading, and downloading of the base backup of the database, and I
 want to be able to control the size of these files up-front.  The
 requirement of stubbing in \0 is because of a limitation of the tar
 format when dealing with streaming archives and the requirement to
 truncate the files to the size snapshotted in the step 2 is to enable
 splitting up the files between volumes even in the presence of
 possible concurrent growth while I'm performing the hot backup. (ex: a
 handful of nearly-empty heap files can rapidly grow due to a
 concurrent bulk load if I get unlucky, which I do not intend to allow
 myself to be).

 Any ideas?  Or does it sound like I'm making some bookkeeping errors
 and should review my code again?  It does work most of the time.  I
 have not gotten a sense how often this reproduces just yet.

Everyone here is going to assume the problem is in your (too?) fancy
tar/diff delta archiving approach because we can't see that code and
it just sounds suspicious.  A busted clog file is of course very
noteworthy but to eliminate your stuff you should try reproducing
using a more standard method of grabbing the base backup.

Have you considered using rsync instead?

merlin


Re: [HACKERS] smallserial / serial2

2011-04-21 Thread Tom Lane

Mike Pultz m...@mikepultz.com writes:
 I use tables all the time that have sequences on smallint's; 
 I'd like to simplify my create files by not having to create the sequence
 first, but I also don't want to give up those 2 bytes per column!

A sequence that can only go to 32K doesn't seem all that generally
useful ...

Are you certain that you're really saving anything?  More likely than
not, the saved 2 bytes are going to disappear into alignment padding
of a later column or of the whole tuple.  Even if it really does help
for your case, that's another reason to doubt that it's generally
useful.
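
The padding effect can be seen with plain C structs, which follow similar alignment rules. This is only an analogy, since PostgreSQL lays out tuples itself rather than as C structs, and the struct and function names below are made up for the illustration. When a 2-byte key is followed by an 8-byte column, the compiler inserts padding, and the row ends up the same size as one with a 4-byte key.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row layouts: a small key followed by an 8-byte column. */
struct RowWithSmallKey { int16_t id; int64_t payload; };
struct RowWithIntKey   { int32_t id; int64_t payload; };

/* Total size of each layout, including any alignment padding. */
size_t small_key_row_size(void) { return sizeof(struct RowWithSmallKey); }
size_t int_key_row_size(void)   { return sizeof(struct RowWithIntKey); }

/* Bytes of padding inserted between the 2-byte key and the payload. */
size_t small_key_padding(void)
{
    return offsetof(struct RowWithSmallKey, payload) - sizeof(int16_t);
}
```

On common platforms both sizes come out equal, so the 2 bytes saved by the narrower key are spent on padding instead. The saving only materializes when the neighbouring columns are narrow enough to pack next to it.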

regards, tom lane


Re: [HACKERS] hot backups: am I doing it wrong, or do we have a problem with pg_clog?

2011-04-21 Thread Andres Freund

Hi,

On Thursday, April 21, 2011 01:15:48 PM Daniel Farina wrote:
 Any ideas?  Or does it sound like I'm making some bookkeeping errors
 and should review my code again?  It does work most of the time.  I
 have not gotten a sense how often this reproduces just yet.
I would suggest taking both your backup and a simpler version for now. When
the error occurs again, you can compare...

Andres


Re: [HACKERS] [GENERAL] Defining input function for new datatype

2011-04-21 Thread Tom Lane

Nick Raj nickrajj...@gmail.com writes:
 1 Datum mpoint_in(PG_FUNCTION_ARGS)
 2 {
 3
 4    mpoint *result;
 5    char *pnt=(char *)malloc (sizeof (20));
 6    char *ts=(char *)malloc (sizeof (20));

(1) You should *not* use malloc here.  There is seldom any reason to use
malloc directly at all in functions coded for Postgres.  Use palloc,
or expect memory leaks.

(2) sizeof(20) almost certainly doesn't mean what you want.  It's most
likely 4 ...
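
To make point (2) concrete: sizeof applied to the literal 20 yields the size of its type (int), not 20. A minimal sketch, with function names invented for the illustration:

```c
#include <stddef.h>

/* What malloc(sizeof(20)) actually requests: the size of an int. */
size_t bytes_from_sizeof_literal(void)
{
    return sizeof(20);
}

/* What was presumably intended: room for 20 characters. */
size_t bytes_for_twenty_chars(void)
{
    return 20 * sizeof(char);
}
```

So each of the two buffers in the quoted code is only sizeof(int) bytes long, and decoding a point or timestamp string into them would overrun the allocation.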

 11  result->p = point_in(PointerGetDatum(pnt));    //
 point_in (input function for point that assigns x, y into point)

You need to use DirectFunctionCallN when trying to call a function that
obeys the PG_FUNCTION_ARGS convention, as point_in does.  And the result
is a Datum, which means you're going to need to apply a DatumGetWhatever
macro to get a bare Point or Timestamp from these functions.

Look around in the PG sources for examples.
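
Why the compiler mentions FunctionCallInfo can be shown with a toy model of the calling convention. Everything below is a simplified stand-in with hypothetical names (ToyDatum, ToyFunctionCallInfo, toy_point_in, toy_direct_call1); it is not PostgreSQL's real definition of these things. The point is only that such functions take a single "call info" argument, so a caller holding a bare value has to go through a wrapper that packs it first, and then has to convert the generic result back.

```c
#include <stdint.h>
#include <stdio.h>

/* Toy stand-ins, NOT the real PostgreSQL types. */
typedef uintptr_t ToyDatum;

typedef struct ToyFunctionCallInfoData {
    int      nargs;
    ToyDatum args[4];
} ToyFunctionCallInfoData, *ToyFunctionCallInfo;

typedef ToyDatum (*ToyFunction)(ToyFunctionCallInfo fcinfo);

typedef struct ToyPoint { double x, y; } ToyPoint;

/* An input function in the "one call-info argument" style. */
ToyDatum toy_point_in(ToyFunctionCallInfo fcinfo)
{
    static ToyPoint result;     /* static only to keep the sketch allocation-free */
    const char *str = (const char *) fcinfo->args[0];

    result.x = result.y = 0.0;
    sscanf(str, "(%lf,%lf)", &result.x, &result.y);
    return (ToyDatum) &result;
}

/* Wrapper in the spirit of a direct-call helper: pack the argument, call. */
ToyDatum toy_direct_call1(ToyFunction fn, ToyDatum arg1)
{
    ToyFunctionCallInfoData fcinfo = { 1, { arg1 } };
    return fn(&fcinfo);
}

/* Caller side: wrap the C string, call through the helper, unwrap the result. */
ToyPoint *toy_parse_point(const char *text)
{
    ToyDatum d = toy_direct_call1(toy_point_in, (ToyDatum) text);
    return (ToyPoint *) d;
}
```

Calling toy_point_in((ToyDatum) text) directly would hand an integer to a parameter that expects a pointer to the call-info struct, which is the same kind of mismatch the warnings on lines 11 and 12 report. In real extension code the DirectFunctionCallN macros and DatumGetWhatever conversions that Tom mentions play the roles of the wrapper and the final cast.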

regards, tom lane


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread Peter Eisentraut

On Thu, 2011-04-21 at 14:01 +0100, Simon Riggs wrote:
 We should be encouraging people to spend more time on more useful
 features, not an endless stream of trivial patches, integration and
 release processes.

Hence the proposal to cut that time down and make it count better.

Which direction were you thinking?




Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread Peter Eisentraut

On Thu, 2011-04-21 at 08:42 -0500, Kevin Grittner wrote:
  you need to think about shorter release cycles overall, like every
  6 months.
  
 With the current time between feature freeze and release, that
 wouldn't leave a lot of time for development.

Presumably, one would aim to cut all the other things in half as well.



Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread Robert Haas

On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut pete...@gmx.net wrote:
 On Wed, 2011-04-20 at 21:09 -0400, Robert Haas wrote:
 But
 even then I think we'd have this problem of people being unwilling to
 give up on jamming stuff into a release, regardless of the scheduling
 impact of doing so.  I actually think the problem of getting releases
 out on time is a *much* bigger problem for us than how long or short
 CommitFests are.

 I think to really address that problem, you need to think about shorter
 release cycles overall, like every 6 months.  Otherwise, the current 12
 to 14 month horizon is just too long psychologically.

I agree.  I am in favor of a shorter release cycle.  But I think that
a shorter release cycle won't work well if there is still a
four-month-long integration period at the end of each series of
CommitFests.  The
problem is a bit circular here: because release cycles are long,
people really, really want to slip as much as possible in at the end.
But being under time pressure to get things committed results in a
higher bug count, which means more things that have to be fixed after
feature freeze, which translates into a long release cycle.

I think that it's not too bad if the process of a release getting out
the door results in effectively missing one CommitFest.  For example,
if we imagine one-month CommitFests starting every two months, and we
had a CommitFest starting on January 15th, it wouldn't be too painful
if we skipped a hypothetical March 15th CommitFest to get the release
done, and then started up the process again on May 15th.  However, in
practice, what happens is we miss *two* CommitFests: the expectation
is that the next CommitFest will be on the order of July 15th, which
is just too long.  Similarly, if we did shorter CommitFests and
shorter releases - say, five one-week-a-month CommitFests in July,
August, September, October, and November, I'd want to kick a release
out in December and reopen for development in January, not get stuck
with the same six-month feature freeze we have now, or even a
four-month feature freeze.  But that isn't going to work if people do
the same sort of throwing everything into the kitchen sink at the last
minute that we have been doing for at least the last couple of
releases.

In fact, I don't believe that the current CF cycle really forces a
huge amount of waiting-for-feedback.  It's true that if you submit a
patch at a randomly chosen time, you will have to wait up to two
months for a CommitFest to start, and then you might not get a review
until late in the CommitFest, so it could take you up to three months
to get a review.  In practice, patches are not submitted at random
times - in fact, probably 50% of the patches come in during the last
week before the CF starts, and typically perhaps 50% of the patches
get a review in the first week, and maybe 80% within the first two
weeks.   Some patches also get an initial review between CommitFests,
which further improves the average.  Overall, I bet the average time
between patch submission and first review is 3 weeks.  You can
typically get 2 or 3 followup reviews during the same cycle with only
a few days' latency for each.  Even though it would be nice to do
better, for an all-volunteer project, I think it's respectable.   I
can't say the same thing about our process for getting from feature
freeze to release.  It's really long, and it's nearly all fixing bugs
in code that was committed in the last CF, and the last CF produces
exponentially more bugs than the earlier ones, and it's often the case
that people don't fix their own bugs and someone else has to jump in
to pick up the slack.  Meanwhile, the regular flow of reviewing and
committing patches is completely disrupted; and once in a while
someone gets flamed for so much as bringing up a new feature that
they're interested in working on for the next release (which I think
is totally unwarranted; now is the PERFECT time to begin roughing out
plans for 9.2 work... but I digress).

So while I'm mildly interested in the idea of shifting the CF cycle
around to provide more timely review, I can't really get that excited
about it, especially if there's any risk that we are just shifting
more of the work from the CommitFest cycle to the
end-of-release-interminable-integration-period.  However, if there's
some way of avoiding the phenomenon where all hell breaks loose
because people jam four major new features into the tree in as many
weeks, sign me up.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Typed table DDL loose ends

2011-04-21 Thread Peter Eisentraut

On Wed, 2011-04-20 at 10:44 -0400, Noah Misch wrote:
 If we add that ownership check, we'll protect some operations on the
 type.  The cost is localized divergence from our principle that types
 have no usage restrictions.  I'm of the opinion that it's not worth
 introducing that policy exception to block just some of these avenues
 of attack.  I would not object to it, though.

So that means we should leave it as is for now?  Fine with me.



Re: [HACKERS] Re: database system identifier differs between the primary and standby

2011-04-21 Thread Robert Haas

On Thu, Apr 21, 2011 at 6:38 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Thu, Apr 21, 2011 at 10:31 AM, rajibdk rajib.d...@siemens.com wrote:
 What does that database system identifier mean? Is it related to DB
 transactions, or unique to a version?

 Regrettably, it means you didn't follow the documented procedure.

 It isn't possible to do it any other way, so those questions are a 
 distraction.

I think they are perfectly good questions.  If someone is trying to
understand how our product works, we should encourage that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread Tom Lane

Robert Haas robertmh...@gmail.com writes:
 On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut pete...@gmx.net wrote:
 I think to really address that problem, you need to think about shorter
 release cycles overall, like every 6 months.  Otherwise, the current 12
 to 14 month horizon is just too long psychologically.

 I agree.  I am in favor of a shorter release cycle.

I'm not.  I don't think there is any demand among *users* (as opposed to
developers) for more than one major PG release a year.  It's hard enough
to get people to migrate that often.

Another problem is that if you halve the release interval, you either
double the amount of work spent on maintaining back branches, or halve
the support lifetime of a branch.  Neither of those is attractive.
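
As a rough illustration of that trade-off: the number of branches under
maintenance at any one time is approximately the support lifetime divided by
the release interval.  The figures below (a 5-year support lifetime, yearly
versus half-yearly releases) are assumptions chosen only to make the
arithmetic concrete; they are not taken from this thread.

```python
def supported_branches(support_years, release_interval_years):
    """Approximate number of back branches maintained at any one time."""
    return support_years / release_interval_years


# Assumed figures, for illustration only.
yearly = supported_branches(5, 1.0)           # one release a year
half_yearly = supported_branches(5, 0.5)      # same lifetime, twice the branches
short_lived = supported_branches(2.5, 0.5)    # same branch count, half the lifetime
```

Under these assumptions, halving the interval either doubles the branch count
or keeps it constant at the price of halving each branch's lifetime.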

Now, it certainly would be nice to spend less time in beta mode as
opposed to development, and I think most of the points being made here
are really about how to cut that.  But reducing the release interval is
not going to reduce the total amount of time we spend in beta mode;
in fact I'd expect it to increase.  Halving the amount of development
time per release doesn't mean that you can cut beta time proportionally.
It just takes time to cut a release, and time for testers to try it.

regards, tom lane


Re: [HACKERS] hot backups: am I doing it wrong, or do we have a problem with pg_clog?

2011-04-21 Thread Robert Haas

On Thu, Apr 21, 2011 at 7:15 AM, Daniel Farina dan...@heroku.com wrote:
 To start at the end of this story: DETAIL:  Could not read from file
 pg_clog/007D at offset 65536: Success.

 This is a message we received on a standby that we were bringing
 online as part of a test.  The clog file was present, but apparently
 too small for Postgres (or at least I think this is what the message
 meant), so one could stub in another clog file and then continue
 recovery successfully (modulo the voodoo of stubbing in clog files in
 general).  I am unsure if this is due to an interesting race condition
 in Postgres or a result of my somewhat-interesting hot-backup
 protocol, which is slightly more involved than the norm.  I will
 describe what it does here:

 1) Call pg_start_backup
 2) crawl the entire postgres cluster directory structure, except
 pg_xlog, taking notes of the size of every file present
 3) begin writing TAR files, but *only up to the size noted during the
 original crawling of the cluster directory,* so if the file grows
 between the original snapshot and subsequently actually calling read()
 on the file those extra bytes will not be added to the TAR.
  3a) If a file is truncated partially, I add \0 bytes to pad the
 tarfile member up to the size sampled in step 2, as I am streaming the
 tar file and cannot go back in the stream and adjust the tarfile
 member size
 4) call pg_stop_backup

In theory I would expect any defects introduced by the, ahem,
exciting, procedure described in steps 3 and 3a to be corrected by
recovery automatically when you start the new cluster.  It shouldn't
matter exactly when you read the file, and recovery for unrelated
blocks ought to proceed totally independently, and an all-zeros block
should be treated the same way as one that isn't allocated yet, so it
seems like it ought to work.  But you may be stressing some paths in
the recovery code that don't get regular exercise, since the manner in
which you are taking the backup can produce backups that are different
from any backup that could be taken by the normal method, and those
paths might have bugs.  It's also possible, as others have said, that
you've botched it.  :-)
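
For readers trying to picture the procedure quoted above, steps 2, 3 and 3a
might look roughly like the following Python sketch.  It is an illustration
only, not Daniel's actual tooling: the names note_sizes, FixedSizeReader and
write_tar are hypothetical, the calls that start and stop the backup (steps 1
and 4) are left out, and files that vanish between the crawl and the read are
not handled.

```python
import os
import tarfile


class FixedSizeReader:
    """Yield exactly `size` bytes from `raw`.

    Bytes beyond `size` (file grew) are never read; if the file turns out
    to be shorter (file was truncated), the shortfall is filled with \\0.
    """

    def __init__(self, raw, size):
        self.raw = raw
        self.remaining = size

    def read(self, n=-1):
        if self.remaining <= 0:
            return b""
        if n is None or n < 0 or n > self.remaining:
            n = self.remaining
        chunk = self.raw.read(n)
        if len(chunk) < n:
            # Step 3a: pad a truncated file up to the noted size.
            chunk += b"\0" * (n - len(chunk))
        self.remaining -= len(chunk)
        return chunk


def note_sizes(root, skip=("pg_xlog",)):
    """Step 2: crawl the cluster directory and record every file's size."""
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            path = os.path.join(dirpath, name)
            sizes[path] = os.path.getsize(path)
    return sizes


def write_tar(sizes, root, out):
    """Step 3: stream a tar to `out`, using only the sizes noted earlier."""
    with tarfile.open(fileobj=out, mode="w|") as tar:
        for path, size in sorted(sizes.items()):
            info = tarfile.TarInfo(os.path.relpath(path, root))
            info.size = size  # fixed up front; a stream cannot be rewound
            with open(path, "rb") as f:
                tar.addfile(info, FixedSizeReader(f, size))
```

Because the member size is written into the tar header before the data, the
reader has to deliver exactly that many bytes, which is why growth is cut off
and truncation is padded rather than reported.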

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


[HACKERS] stored procedures

2011-04-21 Thread Peter Eisentraut

So the topic of real stored procedures came up again.  Meaning a
function-like object that executes outside of a regular transaction,
with the ability to start and stop SQL transactions itself.

I would like to collect some specs on this feature.  So does anyone have
links to documentation of existing implementations, or their own spec
writeup?  A lot of people appear to have a very clear idea of this
concept in their own head, so let's start collecting those.




Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread Andrew Dunstan




On 04/21/2011 11:16 AM, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut pete...@gmx.net wrote:
 I think to really address that problem, you need to think about shorter
 release cycles overall, like every 6 months.  Otherwise, the current 12
 to 14 month horizon is just too long psychologically.

 I agree.  I am in favor of a shorter release cycle.

 I'm not.  I don't think there is any demand among *users* (as opposed to
 developers) for more than one major PG release a year.  It's hard enough
 to get people to migrate that often.

I agree.

 Another problem is that if you halve the release interval, you either
 double the amount of work spent on maintaining back branches, or halve
 the support lifetime of a branch.  Neither of those is attractive.

I *really* *really* agree.


cheers

andrew


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread Ross J. Reedstrom

On Thu, Apr 21, 2011 at 11:16:45AM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut pete...@gmx.net wrote:
  I think to really address that problem, you need to think about shorter
  release cycles overall, like every 6 months.  Otherwise, the current 12
  to 14 month horizon is just too long psychologically.
 
  I agree.  I am in favor of a shorter release cycle.
 
 I'm not.  I don't think there is any demand among *users* (as opposed to
 developers) for more than one major PG release a year.  It's hard enough
 to get people to migrate that often.

In fact, I predict that the observed behavior would be for even more end
users to start skipping releases. Some already do - it's common not to
upgrade unless there's a feature you really need, but for those who do
stay on the 'current' upgrade path, you'll lose some who can't afford to
spend more than one integration-testing round a year.

Ross
-- 
Ross Reedstrom, Ph.D. reeds...@rice.edu
Systems Engineer & Admin, Research Scientist        phone: 713-348-6166
Connexions                  http://cnx.org            fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE


Re: [HACKERS] getting to beta

2011-04-21 Thread Peter Eisentraut

On Tue, 2011-04-19 at 18:18 -0400, Robert Haas wrote:
 2. The typed tables stuff vs. pg_upgrade still needs work.  I would be
 just as happy if Tom or Peter wanted to fix this, mostly for fear of
 getting flak over the details of the fixes, but if not I will do it.

Noah Misch is hot on the trail of that one.

 - There is an outstanding bug-fix patch for PL/python tracebacks,

That has been addressed.




Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-21 Thread Robert Haas

On Thu, Apr 21, 2011 at 11:16 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut pete...@gmx.net wrote:
 I think to really address that problem, you need to think about shorter
 release cycles overall, like every 6 months.  Otherwise, the current 12
 to 14 month horizon is just too long psychologically.

 I agree.  I am in favor of a shorter release cycle.

 I'm not.  I don't think there is any demand among *users* (as opposed to
 developers) for more than one major PG release a year.  It's hard enough
 to get people to migrate that often.

I agree there's probably little user demand, and back-branch
maintenance is an issue, but I think if it removed the temptation to
cram major new features into the tree at the last minute, it might be
worth it.  However, a possibly more likely outcome is that we'd still
have that temptation, just more frequently; and end up with even less
of the year open to new patches than is currently the case.

 Another problem is that if you halve the release interval, you either
 double the amount of work spent on maintaining back branches, or halve
 the support lifetime of a branch.  Neither of those is attractive.

 Now, it certainly would be nice to spend less time in beta mode as
 opposed to development, and I think most of the points being made here
 are really about how to cut that.  But reducing the release interval is
 not going to reduce the total amount of time we spend in beta mode;
 in fact I'd expect it to increase.  Halving the amount of development
 time per release doesn't mean that you can cut beta time proportionally.
 It just takes time to cut a release, and time for testers to try it.

I believe that the problem is much more related to the fact that we
commit things at the end of the cycle that aren't really done than it
is to the amount of time beta testers need to try things.  If we were
only waiting on testing, we could branch the tree and call the release
du jour beta for another N months, then release, meanwhile continuing
development.  In fact, you and I and three or four other people have
spent most of our visible PG time over the last 2 months fixing MANY
bugs, mostly in the six or so major features committed between
February 7th and March 6th.  (By way of comparison, notice how few
bugs there have been in the major patches from CF3 - because those
things were actually pretty much working *when they were committed*.)

Now, we're getting to the point where that might actually be a
reasonable way to go.  It wouldn't bother me a bit to branch the tree
just after beta1 and start a new cycle of CommitFests on May 15th, and
we could begin integrating some of the big stuff that didn't make it
into 9.1: key locks, range types, additional sync rep modes, snapshot
cloning, parallel pg_dump, etc.  It would be great to start working on
that stuff while it's still mildly fresh in people's minds, and at the
*beginning* of the release cycle.  We're probably doomed to another
fall release at this point anyway, so it's not clear to me that the
inevitable loss of focus that will ensue is really costing anything.
Had we gotten to beta1 on March 1st, I'd probably be in favor of going
all in to get the release out in June or maybe on July 1, but at this
point that seems unlikely to be realistic.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

