Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-02-05 Thread Joshua Berkus

> We may have a good idea of how to define a custom language, still we
> are going to need to design a clean interface at catalog level more or
> less close to what is written here. If we can get a clean interface,
> the custom language implemented, and TAP tests that take advantage of
> this user interface to check the node/group statuses, I guess that we
> would be in good shape for this patch.
> 
> Anyway that's not a small project, and perhaps I am over-complicating
> the whole thing.

Yes.  The more I look at this, the worse the idea of custom syntax looks.  Yes, 
I realize there are drawbacks to using JSON, but this is worse.

Further, there's a lot of horse-cart inversion here.  This proposal involves 
letting the syntax for sync_list configuration determine the feature set for 
N-sync.  That's backwards; we should decide the total list of features we want 
to support, and then adopt a syntax which will make it possible to have them.

-- 
Josh Berkus
Red Hat OSAS
(opinions are my own)




Re: [HACKERS] Releasing in September

2016-01-25 Thread Joshua Berkus
All,

So the proximate causes of late releases are the following:

1. patches are getting kicked down the road from one CF to another, creating a 
big pileup in the final CF.  This is exactly like the huge pile of unreviewed 
patches we used to have before the CF system.

2. Once the last CF is closed, everyone goes off and starts working on the next 
version, leaving a few people to handle getting the production release 
integrated and debugged.  This is partly because nobody likes doing that work, 
and partly because most hackers aren't clear how they can help.

So, I have a modest suggestion on how to change all this:

Let's do quarterly development releases and supported production releases every 
18 months.

A 3-month release cycle would let people see their code go into the wild a lot 
faster; basically we'd do a CF then a development release.  That would limit 
pileup issues around people losing interest/moving on, forgetting what was 
going on, and conflicts between various features in development.  A shorter 
release/dev cycle is more manageable. By having supported production releases 
50% less often, we could keep the overall number of versions we need to patch 
the same as it is now.

The alternative to this is an aggressive recruitment and mentorship program to 
create more major contributors who can do deep review of patches.  But that 
doesn't seem to have happened in the last 5 years, and even if we started it 
now, it would be 2 years before it paid off.

-- 
Josh Berkus
Red Hat OSAS
(opinions are my own)




Re: [HACKERS] [9.4 CF 1] The Commitfest Slacker List

2013-07-06 Thread Joshua Berkus


- Original Message -
> * Josh Berkus (j...@agliodbs.com) wrote:
> > Is there anyone else on the committer list with similar circumstances?
> 
> I'll just flip it around and offer to be publically flogged whenever I'm
> not helping out with a commitfest. :)  Perhaps this should be more
> "opt-in" than "opt-out", wrt committers anyway.

Can we flog you even if you *are* helping?  I just wanna see the YouTube video, 
either way.  ;-)



Re: [HACKERS] Cascading replication: should we detect/prevent cycles?

2013-01-05 Thread Joshua Berkus
Robert,

> I'm sure it's possible; I don't *think* it's terribly easy.  The
> usual
> algorithm for cycle detection is to have each node send to the next
> node the path that the data has taken.  But, there's no unique
> identifier for each slave that I know of - you could use IP address,
> but that's not really unique.  And, if the WAL passes through an
> archive, how do you deal with that?  

Not that I know how to do this, but it seems like a more direct approach is to 
check whether there's a master anywhere up the line.  Hmmm.  Still sounds 
fairly difficult.

> I'm sure somebody could figure
> all of this stuff out, but it seems fairly complicated for the
> benefit
> we'd get.  I just don't think this is going to be a terribly common
> problem; if it turns out I'm wrong, I may revise my opinion.  :-)

I don't think it'll be that common either.  The problem is that when it does 
happen, it'll be very hard for the hapless sysadmin involved to troubleshoot.

> To me, it seems that lag monitoring between master and standby is
> something that anyone running a complex replication configuration
> should be doing - and yeah, I think anything involving four standbys
> (or cascading) qualifies as complex.  If you're doing that, you
> should
> notice pretty quickly that your replication lag is increasing
> steadily.  

There are many reasons why replication lag would increase steadily.

> You might also check pg_stat_replication on the master and
> notice that there are no connections there any more. 

Well, if you've created a true cycle, every server has one or more replicas.  
The original case I presented was the most probable cause of accidental cycles: 
the original master dies, and the on-call sysadmin accidentally connects the 
first replica to the last replica while trying to recover the cluster.

AFAICT, the only way to troubleshoot a cycle is to test every server in the 
network to see if it's a master and has replicas, and if no server is a master 
with replicas, it's a cycle.  Again, not fast or intuitive.

> Could someone
> miss those tell-tale signs?  Sure.  But they could also set
> autovacuum_naptime to an hour and then file a support ticket
> complaining about table bloat - and they do.  Personally, as
> user
> screw-ups go, I'd consider that scenario (and its fourteen cousins,
> twenty-seven second cousins, and three hundred and ninety two other
> extended family members) as higher-priority and lower effort to fix
> than this particular thing.

I agree that this isn't a particularly high-priority issue.  I do think it 
should go on the TODO list, though, just in case we get a GSOC student or other 
new contributor who wants to tackle it.

--Josh






Re: [HACKERS] Feature Request: pg_replication_master()

2012-12-20 Thread Joshua Berkus
Andreas,
 
> Do you want the node one step up or the top-level in the chain?
> Because
> I don't think we can do the latter without complicating the
> replication
> protocol noticeably.

Well, clearly a whole chain would be nice for the user.  But even just one step 
up would be very useful.

--Josh




Re: [HACKERS] Cascading replication: should we detect/prevent cycles?

2012-12-20 Thread Joshua Berkus
Robert,

> > What would such a test look like?  It's not obvious to me that
> > there's any rapid way for a user to detect this situation, without
> > checking each server individually.
> 
> Change something on the master and observe that none of the supposed
> standbys notice?

That doesn't sound like an infallible test, or a 60-second one.

My point is that in a complex situation (imagine a shop with 9 replicated 
servers in 3 different cascaded groups, immediately after a failover of the 
original master), it would be easy for a sysadmin, responding to a 
middle-of-the-night page, to accidentally fat-finger an IP address and create 
a cycle instead of a new master.  And once he's done that, it's a longish 
troubleshooting process to figure out what's wrong and why writes aren't 
working, especially if he goes to bed and some other sysadmin picks up the 
"Writes failing to PostgreSQL" ticket.

*if* it's relatively easy for us to detect cycles (that's a big if, I'm not 
sure how we'd do it), then it would help a lot for us to at least emit a 
WARNING.  That would short-cut a lot of troubleshooting.

--Josh Berkus




Re: [HACKERS] Switching timeline over streaming replication

2012-12-20 Thread Joshua Berkus


> I just committed a patch that should make the "requested WAL segment
> 00020003 has already been removed" errors go away.
> The
> trick was for walsenders to not switch to the new timeline until at
> least one record has been replayed on it. That closes the window
> where
> the walsender already considers the new timeline to be the latest,
> but
> the WAL file has not been created yet.

OK, I'll download the snapshot in a couple of days and make sure this didn't 
break something else.

--Josh




Re: [HACKERS] Feature Request: pg_replication_master()

2012-12-20 Thread Joshua Berkus

> As ever, we spent much energy on debating backwards compatibility
> rather than just solving the problem it posed, which is fairly easy
> to
> solve.

Well, IIRC, the debate was primarily of *your* making.  Almost everyone else on 
the thread was fine with the original patch, and it was nearly done for 9.2 
before you stepped in.  I can't find anyone else on that thread who thought 
that backwards compatibility was more important than fixing the API.

--Josh




Re: [HACKERS] Feature Request: pg_replication_master()

2012-12-19 Thread Joshua Berkus

> This sounds like my previous suggestion of returning the primary
> conninfo value, but with just ip. That one came with a pretty bad
> patch, and was later postponed until we folded recovery.conf into
> the main configuration file parsing. I'm not really sure what
> happened to that project? (the configuration file one)

Hmmm, good point.  Just having primary_conninfo in pg_settings would help a 
lot.
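
Then finding your upstream would be a one-liner, something like this
(hypothetical, assuming the recovery.conf settings eventually become
ordinary GUCs):

select setting from pg_settings where name = 'primary_conninfo';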

--Josh 




Re: [HACKERS] Feature Request: pg_replication_master()

2012-12-19 Thread Joshua Berkus

> It stalled because the patch author decided not to implement the
> request to detect recovery.conf in data directory, which allows
> backwards compatibility.

Well, I don't think we had agreement on how important backwards compatibility 
for recovery.conf was, particularly not on the whole 
recovery.conf/recovery.done functionality and the weird formatting of 
recovery.conf.

However, with "include_if_exists" directives in postgresql.conf, or 
"include_dir", that would be easy to work around.  Don't we have something like 
that planned for SET PERSISTENT?

--Josh Berkus





Re: [HACKERS] Cascading replication: should we detect/prevent cycles?

2012-12-19 Thread Joshua Berkus
Simon,

> My logic is that if you make a 1 minute test you will notice your
> mistake, which is glaringly obvious. That is sufficient to prevent
> that mistake, IMHO.

What would such a test look like?  It's not obvious to me that there's any 
rapid way for a user to detect this situation, without checking each server 
individually.

If there's a quick and easy way to test for cycles from the user side, we 
should put it in documentation somewhere.

--Josh




Re: [HACKERS] Switching timeline over streaming replication

2012-12-19 Thread Joshua Berkus
Heikki,

The next time I get the issue, and I'm not paying for 5 cloud servers by the 
hour, I'll give you a login.

--Josh

- Original Message -
> On 19.12.2012 17:27, Heikki Linnakangas wrote:
> > On 19.12.2012 15:55, Heikki Linnakangas wrote:
> >> On 19.12.2012 04:57, Josh Berkus wrote:
> >>> Heikki,
> >>>
> >>> I ran into an unexpected issue while testing. I just wanted to
> >>> fire up
> >>> a chain of 5 replicas to see if I could connect them in a loop.
> >>> However, I ran into a weird issue when starting up "r3": it
> >>> refused to
> >>> come out of "the database is starting up" mode until I did a
> >>> write on
> >>> the master. Then it came up fine.
> >>>
> >>> master-->r1-->r2-->r3-->r4
> >>>
> >>> I tried doing the full replication sequence (basebackup, startup,
> >>> test)
> >>> with it twice and got the exact same results each time.
> >>>
> >>> This is very strange because I did not encounter the same issues
> >>> with r2
> >>> or r4. Nor have I seen this before in my tests.
> >>
> >> Ok.. I'm going to need some more details on how to reproduce this,
> >> I'm
> >> not seeing that when I set up four standbys.
> >
> > Ok, I managed to reproduce this now.
> 
> Hmph, no I didn't, I replied to wrong email. The problem I managed to
> reproduce was the one where you get "requested WAL
> segment 00020003 has already been removed" errors,
> reported by Thom.
> 
> - Heikki
> 




Re: [HACKERS] Switching timeline over streaming replication

2012-12-19 Thread Joshua Berkus
Heikki,

> The problem goes away after some time, after the 1st standby has
> streamed the contents of 00020003 and written it to
> disk, and the cascaded standby reconnects. But it would be nice to
> avoid
> that situation. I'm not sure how to do that yet, we might need to
> track
> the timeline we're currently receiving/sending more carefully. Or
> perhaps we need to copy the previous WAL segment to the new name when
> switching recovery target timeline, like we do when a server is
> promoted. I'll try to come up with something...

Would it be accurate to say that this issue only happens when all of the 
replicated servers have no traffic?

--Josh




Re: [HACKERS] Potential autovacuum optimization: new tables

2012-10-13 Thread Joshua Berkus

> Ah.  Okay, maybe we can agree that that wasn't a good idea.

Oh, I'd say there's no question it was a mistake.  We just didn't have the data 
at the time to realize it.

> I don't really see that we need to bend over backwards to exactly
> match
> some data points that you made up out of thin air.  How about
> ceil(sqrt(N)) to start with?

We can start with anything, including Jeff Janes's equation (for my part, I 
think sqrt(N) will result in analyzing very large tables a bit too often).  The 
tough part will be coming up with some way to test it.

--Josh




Re: [HACKERS] Deprecating RULES

2012-10-13 Thread Joshua Berkus
Simon,

> I think its sad we can't even attempt a technical conversation
> without
> you making snide ad hominem attacks that aren't even close to being
> true on a personal level, nor accurate in a technical sense.

I would prefer it if you actually addressed my substantive arguments, which, so 
far, you haven't.

--Josh Berkus




Re: [HACKERS] Potential autovacuum optimization: new tables

2012-10-13 Thread Joshua Berkus

> For my part, while that's certainly an interesting idea, it's far
> more
> complicated than even providing GUCs and the idea is to make PG just
> "do
> it right", not to offer the user more ways to get it wrong...

Yes, please let's not replace the existing too-simplistic knobs with giant 
complicated gadgets nobody, including us, understands.

For my part, over the last 3 years of consulting and dealing with 
postgresql.conf settings for more than 140 clients:

* only 10% of them ever touched the autoanalyze settings at all
* of the ~~ 14 who did:
   * 1 improved the tuning of their database
   * 3 of them messed up autoanalyze, causing stats and vacuum issues
   * ~~ 10 had no measurable effect

... so you'll understand when I say that I don't think ease of knob-twiddling 
is a priority for autoanalyze design.  In fact, I'd say that removing the knobs 
entirely is a design goal.

I've been going over the notes and email archives from the period where Matt 
O'Connor and I arrived at the current settings.  All of our testing was devoted 
to autovacuum, not autoanalyze.  The threshold+scale_factor design works pretty 
well for autovacuum; it prevents us from constantly vacuuming small tables, or 
large tables with less than 20% dead rows.  And I did extensive testing using 
DBT2 on OSDL to set the current defaults.

Our mistake was assuming that the same formula which worked well for vacuum 
would work well for analyze.  And since the DBT2 database has entirely 
medium-sized tables full of random data, no shortcomings in this thinking 
showed up in the tests.  Since the only counterproposal at the time was to have 
a flat percentage without a threshold, we got the current defaults.

So, problem #1 is coming up with a mathematical formula.  My initial target 
values are in terms of # of rows in the table vs. # of writes before analyze is 
triggered:

1 : 3
10 : 5
100 : 10
1000 : 100
10000 : 2000
100000 : 5000
1000000 : 25000
10000000 : 100000

 etc.  So problem #1 is a mathematical formula which gives this kind of 
curve.  I've tried some solution-seeking software, but I don't know how to use 
it well enough to get something useful.
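
For comparison purposes, here's a quick-and-dirty query showing how the
current defaults (autovacuum_analyze_threshold = 50,
autovacuum_analyze_scale_factor = 0.1) and the ceil(sqrt(N)) suggestion from
earlier in the thread stack up against tables in that size range -- a sanity
check, not a proposal:

select n as table_rows,
       50 + 0.1 * n  as current_default_trigger,
       ceil(sqrt(n)) as sqrt_trigger
from (values (1), (10), (100), (1000), (10000),
             (100000), (1000000), (10000000)) as t(n);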

Second problem is actually testing the result.  At this point, we don't have 
any performance tests which create anything other than fairly randomly 
distributed data, which doesn't tend to show up any issues in analyze.  We 
really need a performance test where new data is skewed and unbalanced, 
including tables of radically different sizes, and where we're set up to 
measure the level of inaccuracy in query statistics.  

Hmmm.  Actually, for measuring the inaccuracy, I have some tools thanks to 
David Wheeler.  But not to generate the test in the first place.
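
A crude starting point for the generation side might be something like this
(a sketch only; the table and column names are made up):

-- most of the writes pile up on a handful of ids, unlike pgbench/DBT2 data
create table analyze_test as
select (floor(random() ^ 8 * 100000))::int as customer_id,
       md5(random()::text)                 as payload
from generate_series(1, 1000000);

That only covers the data generation, of course; turning it into a repeatable
benchmark is the real work.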

--Josh Berkus




Re: [HACKERS] 9.2 final

2012-06-11 Thread Joshua Berkus
Robert,

Hmmm.  I was assuming September, given how late the beta came out, and that 
nobody has previously talked seriously about a June release.  I'll also point 
out that while there's a beta2 tarball, there was no announcement and no 
packages for it.

If we decide to do June, then PR will be minimal because I was assuming I had 
another 7 weeks to prepare it.  Not that that should be the deciding factor (it 
would be great to get out an early release and get it out of the way) but it 
should be taken into consideration.

- Original Message -
> So, when are we thinking we might release 9.2.0?
> 
> We've done a fall release the last two years, but it's not obvious to
> me that we have a whole lot of blockers left.  In fact, the only
> blocker for which we have nothing that looks like a fix at present
> seems to be this:
> 
> http://archives.postgresql.org/message-id/17129.1331607...@sss.pgh.pa.us
> 
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
> 



[HACKERS] Streaming-only Remastering

2012-06-10 Thread Joshua Berkus
So currently we have a major limitation in binary replication: it is not 
possible to "remaster" your system (that is, designate the most caught-up 
standby as the new master) based on streaming replication only.  This is a 
serious limitation, because the requirement to copy physical logs via scp (or 
similar methods), and to manage and expire them, more than doubles the 
administrative overhead of managing replication.  This becomes even more of a 
problem if you're doing cascading replication.

Therefore I think this is a high priority for 9.3.

As far as I can tell, the change required for remastering over streaming is 
relatively small; we just need to add a new record type to the streaming 
protocol, and then start writing the timeline change to that.  Are there other 
steps required which I'm not seeing?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco



Re: [HACKERS] Strange issues with 9.2 pg_basebackup & replication

2012-05-18 Thread Joshua Berkus
Fujii,
> 
> You mean that "remaster" is, after promoting one of standby servers,
> to make
> remaining standby servers reconnect to new master and resolve the
> timeline
> gap without the shared archive? Yep, that's one of my TODO items, but
> I'm not
> sure if I have enough time to implement that for 9.3

Well, the inability to remaster from the stream is the single largest 
usability obstacle for streaming replication, and it severely limits the 
utility of cascading replication.  Is there any way you could get it done for 
9.3?  I'm happy to spend lots of time testing it, if necessary.

--Josh Berkus



Re: [HACKERS] Strange issues with 9.2 pg_basebackup & replication

2012-05-18 Thread Joshua Berkus

> It might be easy to detect the situation where the standby has
> connected to itself,
> e.g., by assigning ID for each instance and checking whether IDs of
> two servers
> are the same. But it seems not easy to detect the
> circularly-connected
> two or more
> standbys.

Well, I think it would be fine not to worry about circles for now.  



Re: [HACKERS] Strange issues with 9.2 pg_basebackup & replication

2012-05-17 Thread Joshua Berkus
Yeah, I don't know how I produced the crash in the first place: of course the 
self-replica should block all writes, and on retesting I can't get it to 
accept a write at all.

So the bug is just that you can connect a server to itself as its own replica.  
Since I can't think of any good reason to do this, we should simply error out 
on startup if someone sets things up that way.  How can we detect that we've 
connected streaming replication to the same server?

- Original Message -
> On Thu, May 17, 2012 at 10:42 PM, Ants Aasma 
> wrote:
> > On Thu, May 17, 2012 at 3:42 PM, Joshua Berkus 
> > wrote:
> >> Even more fun:
> >>
> >> 1) Set up a server as a cascading replica (e.g. max_wal_senders =
> >> 3, standby_mode = on )
> >>
> >> 2) Connect the server to *itself* as a replica.
> >>
> >> 3) This will work and report success, up until you do your first
> >> write.
> >>
> >> 4) Then ... segfault!
> >
> > I cannot reproduce this.
> 
> Me, neither.
> 
> Josh, could you show me the more detail procedure to reproduce the
> problem?
> 
> Regards,
> 
> --
> Fujii Masao
> 



Re: [HACKERS] Why is indexonlyscan so darned slow?

2012-05-17 Thread Joshua Berkus
Jeff,

That's in-RAM speed ... I ran the query twice to make sure the index was 
cached, and it didn't get any better.  And I meant 5X per byte rather than 5X 
per tuple.

I talked this over with Haas, and his opinion is that we have a LOT of 
overhead in the way we traverse indexes, especially lookups which happen once 
per leaf node instead of in bulk.  Certainly the performance I'm seeing would 
be consistent with that idea.

I'll try some multi-column covering indexes next to see how it looks. 
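
Roughly along these lines, unless someone has a better idea (sketch only):

-- a "covering" index for queries that only touch aid and abalance
create index pgbench_accounts_aid_abalance
    on pgbench_accounts (aid, abalance);
vacuum analyze pgbench_accounts;   -- keep the visibility map current
explain ( analyze on, buffers on )
  select sum(abalance) from pgbench_accounts;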

- Original Message -
> On Thu, May 17, 2012 at 5:22 AM, Joshua Berkus 
> wrote:
> > Ants,
> >
> > Well, that's somewhat better, but again hardly the gain in
> > performance I'd expect to see ... especially since this is ideal
> > circumstances for index-only scan.
> >
> > bench2=# select count(*) from pgbench_accounts;
> >  count
> > --
> >  2000
> > (1 row)
> >
> > Time: 3827.508 ms
> >
> > bench2=# set enable_indexonlyscan=off;
> > SET
> > Time: 0.241 ms
> > bench2=# select count(*) from pgbench_accounts;
> >  count
> > --
> >  2000
> > (1 row)
> >
> > Time: 16012.444 ms
> >
> > For some reason counting tuples in an index takes 5X as long (per
> > tuple) as counting them in a table.  Why?
> >
> 
> It looks like the IOS is taking 4x less time, not more time.
> 
> Anyway, the IOS follows the index logical structure, not the physical
> structure, so if the index is not in RAM it will really be hurt by
> the
> lack of sequential reads.
> 
> Cheers,
> 
> Jeff
> 



Re: [HACKERS] Strange issues with 9.2 pg_basebackup & replication

2012-05-17 Thread Joshua Berkus
Jim, Fujii,

Even more fun:

1) Set up a server as a cascading replica (e.g. max_wal_senders = 3, 
standby_mode = on )

2) Connect the server to *itself* as a replica.

3) This will work and report success, up until you do your first write.

4) Then ... segfault!  
 


- Original Message -
> On 5/16/12 10:53 AM, Fujii Masao wrote:
> > On Wed, May 16, 2012 at 3:43 AM, Joshua Berkus
> >  wrote:
> >>
> >>> Before restarting it, you need to do pg_basebackup and make a
> >>> base
> >>> backup
> >>> onto the standby again. Since you started the standby without
> >>> recovery.conf,
> >>> a series of WAL in the standby has gotten inconsistent with that
> >>> in
> >>> the master.
> >>> So you need a fresh backup to restart the standby.
> >>
> >> You're not understanding the bug.  The problem is that the standby
> >> came up and reported that it was replicating OK, when clearly it
> >> wasn't.
> >
> >> 8. Got this fatal error on the standby server:
> >>
> >> LOG:  record with incorrect prev-link 0/7B8 at 0/7E0
> >> LOG:  record with incorrect prev-link 0/7B8 at 0/7E0
> >>
> >> ... this error message repeated every 5s.
> >
> > According to your first report, ISTM you got error messages.
> 
> Only *after* it was correctly setup.
> 
> Josh's point is that if you flub the configuration, you should get an
> error, which is not what's happening now. Right now it just comes up
> and acts as if nothing's wrong.
> --
> Jim C. Nasby, Database Architect   j...@nasby.net
> 512.569.9461 (cell) http://jim.nasby.net
> 



Re: [HACKERS] master and sync-replica diverging

2012-05-17 Thread Joshua Berkus
Erik,

Are you taking the counts *while* the table is loading?  In sync replication, 
it's possible for the counts to differ for a short time due to one of three 
things:

* transaction has been saved to the replica and confirm message hasn't reached 
the master yet
* replica has synched the transaction to the WAL log, but due to wal_delay 
settings hasn't yet applied it to the tables in memory.
* updating the master with synchronous_commit = local.
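
A quick way to see which of those you're hitting (assuming the 9.2 column
names) is to watch the master while the load is running:

-- on the master: per-standby positions, and whether it's really sync
select application_name, state, sync_state,
       pg_current_xlog_location() as master_lsn,
       flush_location, replay_location
from pg_stat_replication;

If flush_location keeps up but replay_location lags behind, you're most
likely looking at the second case.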

- Original Message -
> AMD FX 8120 / centos 6.2 / latest source (git head)
> 
> 
> It seems to be quite easy to force a 'sync' replica to not be equal
> to master by
> recreating+loading a table in a while loop.
> 
> 
> For this test I compiled+checked+installed three separate instances
> on the same machine.  The
> replica application_name are names 'wal_receiver_$copy' where $copy
> is 01, resp. 02.
> 
> $ ./sync_state.sh
>   pid  | application_name |   state   | sync_state
> ---+--+---+
>  19520 | wal_receiver_01  | streaming | sync
>  19567 | wal_receiver_02  | streaming | async
> (2 rows)
> 
>  port | synchronous_commit | synchronous_standby_names
> --++---
>  6564 | on | wal_receiver_01
> (1 row)
> 
>  port | synchronous_commit | synchronous_standby_names
> --++---
>  6565 | off|
> (1 row)
> 
>  port | synchronous_commit | synchronous_standby_names
> --++---
>  6566 | off|
> (1 row)
> 
> 
> 
> The test consists of creating a table and loading tab-separated data
> from file with COPY and then
> taking the rowcount of that table (13 MB, almost 200k rows) in all
> three instances:
> 
> 
> # wget
> http://flybase.org/static_pages/downloads/FB2012_03/genes/fbgn_annotation_ID_fb_2012_03.tsv.gz
> 
> slurp_file=fbgn_annotation_ID_fb_2012_03.tsv.gz
> 
> zcat $slurp_file \
>  | grep -v '^#' \
>  | grep -Ev '^[[:space:]]*$' \
>  | psql -c "
> drop table if exists $table cascade;
> create table $table (
>  gene_symbol  text
> ,primary_fbgn text
> ,secondary_fbgns  text
> ,annotation_idtext
> ,secondary_annotation_ids text
> );
> copy $table from stdin csv delimiter E'\t';
>  ";
> 
> # count on master:
> echo "select current_setting('port') port,count(*) from $table"|psql
> -qtXp 6564
> 
> # count on wal_receiver_01 (sync replica):
> echo "select current_setting('port') port,count(*) from $table"|psql
> -qtXp 6565
> 
> # count on wal_receiver_02 (async replica):
> echo "select current_setting('port') port,count(*) from $table"|psql
> -qtXp 6566
> 
> 
> 
> I expected the rowcounts from master and sync replica to always be
> the same.
> 
> Initially this seemed to be the case, but when I run the above
> sequence in a while loop for a few
> minutes about 10% of rowcounts from the sync-replica are not equal to
> the master.
> 
> Perhaps not a likely scenario, but surely such a deviating rowcount
> on a sync replica should not
> be possible?
> 
> 
> thank you,
> 
> 
> Erik Rijkers
> 
> 
> 
> 
> 



Re: [HACKERS] Why is indexonlyscan so darned slow?

2012-05-17 Thread Joshua Berkus
Ants,

Well, that's somewhat better, but again hardly the gain in performance I'd 
expect to see ... especially since these are ideal circumstances for index-only 
scan.  

bench2=# select count(*) from pgbench_accounts;
  count   
--
 2000
(1 row)

Time: 3827.508 ms

bench2=# set enable_indexonlyscan=off;
SET
Time: 0.241 ms
bench2=# select count(*) from pgbench_accounts;
  count   
--
 2000
(1 row)

Time: 16012.444 ms

For some reason counting tuples in an index takes 5X as long (per tuple) as 
counting them in a table.  Why?

- Original Message -
> On Thu, May 17, 2012 at 6:08 AM, Joshua Berkus 
> wrote:
> > As you can see, the indexonlyscan version of the query spends 5% as
> > much time reading the data as the seq scan version, and doesn't
> > have to read the heap at all.  Yet it spends 20 seconds doing ...
> > what, exactly?
> >
> > BTW, kudos on the new explain analyze reporting ... works great!
> 
> Looks like timing overhead. Timing is called twice per tuple which
> gives around 950ns per timing call for your index only result. This
> is
> around what is expected of hpet based timing. If you are on Linux you
> can check what clocksource you are using by running cat
> /sys/devices/system/clocksource/clocksource0/current_clocksource
> 
> You can verify that it is due to timing overhead by adding timing off
> to the explain clause. Or use the pg_test_timing utility to check the
> timing overhead on your system. With hpet based timing I'm seeing
> 660ns timing overhead and 26.5s execution for your query, with timing
> off execution time falls to 2.1s. For reference, tsc based timing
> gives 19.2ns overhead and 2.3s execution time with timing.
> 
> Ants Aasma
> --
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26
> A-2700 Wiener Neustadt
> Web: http://www.postgresql-support.de
> 



[HACKERS] Why is indexonlyscan so darned slow?

2012-05-16 Thread Joshua Berkus
So, I set up a test which should have been an ideal setup for index-only scan.  
The index was 1/10 the size of the table, and fit in RAM (1GB), which the table 
does not:

bench2=# select pg_size_pretty(pg_relation_size('pgbench_accounts_pkey'));
 pg_size_pretty
----------------
 428 MB
(1 row)

bench2=# select pg_size_pretty(pg_relation_size('pgbench_accounts'));
 pg_size_pretty
----------------
 5768 MB
(1 row)

The table was just VACUUM ANALYZED and had no subsequent updates.  So, what's 
going on here?

bench2=# explain ( analyze on, buffers on ) select count(*) from pgbench_accounts;
                                                       QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=855069.99..855070.00 rows=1 width=0) (actual time=64014.573..64014.574 rows=1 loops=1)
   Buffers: shared hit=33 read=738289
   I/O Timings: read=27691.314
   ->  Seq Scan on pgbench_accounts  (cost=0.00..831720.39 rows=9339839 width=0) (actual time=6790.669..46530.408 rows=2000 loops=1)
         Buffers: shared hit=33 read=738289
         I/O Timings: read=27691.314
 Total runtime: 64014.626 ms
(7 rows)

bench2=# explain ( analyze on, buffers on ) select count(*) from pgbench_accounts;
                                                       QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=382829.37..382829.38 rows=1 width=0) (actual time=38325.026..38325.027 rows=1 loops=1)
   Buffers: shared hit=1 read=54653
   I/O Timings: read=907.202
   ->  Index Only Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.00..359479.77 rows=9339839 width=0) (actual time=33.459..20110.908 rows=2000 loops=1)
         Heap Fetches: 0
         Buffers: shared hit=1 read=54653
         I/O Timings: read=907.202
 Total runtime: 38333.536 ms


As you can see, the indexonlyscan version of the query spends 5% as much time 
reading the data as the seq scan version, and doesn't have to read the heap at 
all.  Yet it spends 20 seconds doing ... what, exactly?  

BTW, kudos on the new explain analyze reporting ... works great!

--Josh

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco



Re: [HACKERS] Strange issues with 9.2 pg_basebackup & replication

2012-05-16 Thread Joshua Berkus

> > And: if we still have to ship logs, what's the point in even having
> > cascading replication?
> 
> At least cascading replication (1) allows you to adopt more flexible
> configuration of servers,

I'm just pretty shocked.  The last time we talked about this, at the end of 
the 9.1 development cycle, you almost had remastering using streaming-only 
replication working; you just ran out of time.  Now it appears that you've 
abandoned working on that completely.  What's going on?






Re: [HACKERS] Strange issues with 9.2 pg_basebackup & replication

2012-05-15 Thread Joshua Berkus

> Before restarting it, you need to do pg_basebackup and make a base
> backup
> onto the standby again. Since you started the standby without
> recovery.conf,
> a series of WAL in the standby has gotten inconsistent with that in
> the master.
> So you need a fresh backup to restart the standby.

You're not understanding the bug.  The problem is that the standby came up and 
reported that it was replicating OK, when clearly it wasn't.

--Josh



Re: [HACKERS] Strange issues with 9.2 pg_basebackup & replication

2012-05-15 Thread Joshua Berkus
Fujii,

Wait, are you telling me that we *still* can't remaster from streaming 
replication?  Why wasn't that fixed in 9.2?

And: if we still have to ship logs, what's the point in even having cascading 
replication?

- Original Message -
> On Wed, May 16, 2012 at 1:36 AM, Thom Brown  wrote:
> > However, this isn't true when I restart the standby.  I've been
> > informed that this should work fine if a WAL archive has been
> > configured (which should be used anyway).
> 
> The WAL archive should be shared by master-replica and
> replica-replica,
> and recovery_target_timeline should be set to latest in
> replica-replica.
> If you configure that way, replica-replica would successfully
> reconnect to
> master-replica with no need to restart it.
> 
> > But one new problem I appear to have is that once I set up
> > archiving
> > and restart, then try pg_basebackup, it gets stuck and never shows
> > any
> > progress.  If I terminate pg_basebackup in this state and attempt
> > to
> > restart it more times than max_wal_senders, it can no longer run,
> > as
> > pg_basebackup didn't disconnect the stream, so ends up using all
> > senders.  And these show up in pg_stat_replication.  I have a
> > theory
> > that if archiving is enabled, restart postgres then generate some
> > WAL
> > to the point there is a file or two in the archive, pg_basebackup
> > can't stream anything.  Once I restart the server, it's fine and
> > continues as normal.  This has the same symptoms of the
> > "pg_basebackup
> > from running standby with streaming" issue.
> 
> This seems to be caused by spread checkpoint which is requested by
> pg_basebackup. IOW, this looks a normal behavior rather than a bug
> or an issue. What if you specify "-c fast" option in pg_basebackup?
> 
> Regards,
> 
> --
> Fujii Masao
> 



Re: [HACKERS] Strange issues with 9.2 pg_basebackup & replication

2012-05-15 Thread Joshua Berkus
Jim,

I didn't get as far as running any tests, actually.  All I did was try to set 
up 3 servers in cascading replication.  Then I tried shutting down 
master-master and promoting master-replica.  That's it.

- Original Message -
> On May 13, 2012, at 3:08 PM, Josh Berkus wrote:
> > More issues: promoting intermediate standby breaks replication.
> > 
> > To be a bit blunt here, has anyone tested cascading replication *at
> > all*
> > before this?
> 
> Josh, do you have scripts that you're using to do this testing? If so
> can you post them somewhere?
> 
> AFAIK we don't have any regression tests for all this replication
> stuff, but ISTM that we need some...
> --
> Jim C. Nasby, Database Architect   j...@nasby.net
> 512.569.9461 (cell) http://jim.nasby.net
> 
> 



Re: [HACKERS] Last gasp

2012-04-12 Thread Joshua Berkus

> I think the big take-away, education-wise, is that for our project,
> committer == grunt work.  Remember, I used to be the big committer of
> non-committer patches --- need I say more.  ;-)  LOL

Well, promoting several people to committer specifically and publicly because 
of their review work would send that message a lot more strongly than your blog 
would.   It would also provide an incentive for a few of our major contributors 
to do more review work, if it got them to committer.

--Josh Berkus



Re: [HACKERS] Last gasp

2012-04-12 Thread Joshua Berkus


> If we were actually using git branches for it, the CF app could
> automatically close entries when they were committed. But that
> requires them to be committed *unmodified*, and I'm not sure that's
> reasonable. I also think requiring a git branch for the *simple*
> changes is adding more tooling and not less, and thus fails on that
> suggestion.

Well actually, the other advantage of using branches is that it would encourage 
committers to bounce a patch back to the submitter for modification *instead 
of* doing it themselves.  This would both have the advantage of saving time for 
the committer, and doing a better job of teaching submitters how to craft 
patches which don't need to be modified.  Ultimately, we need to train new 
major contributors in order to get past the current bottleneck.

Of course, this doesn't work as well for contributors who *can't* improve their 
patches, such as folks who have a language barrier with the comments.  But it's 
something to think about.

--Josh 



Re: [HACKERS] Last gasp

2012-04-11 Thread Joshua Berkus

> Ultimately, we're herding cats here.  I don't think you're going to
> get
> the community to suddenly be willing to march in lockstep instead.

If you, Peter, Simon, Robert, Heikki, Magnus, Peter G., Greg, Bruce and Andrew 
agreed on a calendar-driven, mostly unambiguous process and adhered to that 
process, then the one or two people who didn't follow along wouldn't matter.  
Everyone else would follow you.  The reason things are chaotic now is that our 
lead committers do not have consensus and are even inconsistent from CF to CF 
individually.

In other words: the problem is only unsolvable because *you* think it's 
unsolvable.   If you decide the problem is solvable, you already have the means 
to solve it.

--Josh Berkus



Re: [HACKERS] Last gasp

2012-04-11 Thread Joshua Berkus
All,

From my observation, the CF process ... in fact, all development processes 
we've had in Postgres ... have suffered from only one problem: lack of 
consensus on how the process should work.  For example, we've *never* had 
consensus around the criteria for kicking a patch out of a commitfest.  This 
lack of consensus has resulted in disorganization, ennui towards the process, 
deadline overruns, and a lot of general unhappiness.  People have stopped 
believing in the CF system because we've stopped running it.

I'm encouraged that, having seen where this lack of consensus can lead us, 
maybe at this point we're willing to set aside individual differences of 
opinion on what the criteria should be (especially when it comes to the 
patches we each individually care about) in service of a smoother-running 
process.  Some suggestions:

- for the first 2 weeks of each CF, there should be a *total* moratorium on 
discussing any features not in the current CF on -hackers.
- the CF manager should have unquestioned authority to kick patches.  As in, no 
arguing.
- we should have simple rules for the CF manager for kicking patches, as in:
   * no response from author in 5 days
   * judged as needing substantial work by reviewer
   * feature needs spec discussion

However, the real criteria don't matter as much as coming up with a set of 
criteria we're all willing to obey, whatever they are.

We also need better tools for the CF, but frankly better tools are a minor issue 
and easily solved if we have a consensus which people are willing to obey.  For 
that matter, if we have a smooth and impartial process, we can do other things, 
including: training new reviewers, promoting new committers, changing the 
length of the CF cycle, or changing the PostgreSQL release cycle (yes, really). 
 While our review and commit process is completely subjective and inconsistent, 
though, we can't do any of these things.

--Josh Berkus




Re: [HACKERS] query cache

2012-03-24 Thread Joshua Berkus
Billy,

> I've done a brief search of the postgresql mail archives, and I've
> noticed a few projects for adding query caches to postgresql, (for
> example, Masanori Yamazaki's query cache proposal for GSOC 2011),

... which was completed, btw.  Take a look at the current release of pgPool.

Are you proposing this for GSOC2012, or is this just a general idea?

> I'm wondering if anyone would be interested in a query cache as a
> backend to postgresql? I've been playing around with the postgresql
> code, and if I'm understanding the code, I believe this is possible.

Well, you'd have to start by demonstrating the benefit of it.  The advantage of 
query caches in proxies and clients is well-known, because you can offload some 
of the work of the database onto other servers, thus increasing capacity.  
Adding a query cache to the database server would require the "query identity 
recognition" of the cache to be far cheaper (as in 10X cheaper) than planning 
and running the query, which seems unlikely at best.

There are a number of proven caching models which PostgreSQL does not yet 
implement.  I'd think it would be more profitable to pursue one of those, 
such as:

* parse caching in the client (JDBC has this, but libpq does not).
* shared cached plans between sessions (snapshot issues here could be nasty)
* fully automated materialized views
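
On the first of those: the per-session building block is already there at the
SQL level; what's missing is a libpq-side layer that manages it automatically
the way the JDBC driver does.  For example (hypothetical table):

prepare find_account (int) as
    select * from accounts where account_id = $1;
execute find_account(42);
execute find_account(43);   -- re-uses the statement prepared above, no re-parse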

If you want to do something radical and new, then come up with a way for a 
client to request and then reuse a complete query plan by passing it to the 
server.  That would pave the way for client-side plan caching (and plan 
manipulation) code written in a variety of languages, and thus further 
innovation through creative algorithms and other ideas.

--Josh Berkus






Re: [HACKERS] Gsoc2012 Idea --- Social Network database schema

2012-03-24 Thread Joshua Berkus
Qi,

Yeah, I can see that.  That's a sign that you had a good idea for a project, 
actually: your idea is interesting enough that people want to debate it.  Make 
a proposal on Monday and our potential mentors will help you refine the idea.
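
Since you asked what the idea actually is: the "dollar unit sampling" being
discussed in the quoted thread below, written out as plain SQL over a
hypothetical disbursements(id, amount) table, is roughly this (illustration
only, not a design):

-- pick one row for every $3125 of cumulative disbursement value;
-- note that this necessarily scans the whole table
with ordered as (
    select id, amount,
           sum(amount) over (order by id) as running_total
    from disbursements
)
select id, amount
from ordered
where floor(running_total / 3125) > floor((running_total - amount) / 3125);

A single large disbursement can cross several $3125 boundaries, which is the
"covers multiple samples" point made below.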

- Original Message -
> 
> 
> 
> 
> > Date: Thu, 22 Mar 2012 13:17:01 -0400
> > Subject: Re: [HACKERS] Gsoc2012 Idea --- Social Network database
> > schema
> > From: cbbro...@gmail.com
> > To: kevin.gritt...@wicourts.gov
> > CC: pgsql-hackers@postgresql.org
> > 
> > On Thu, Mar 22, 2012 at 12:38 PM, Kevin Grittner
> >  wrote:
> > > Tom Lane  wrote:
> > >> Robert Haas  writes:
> > >>> Well, the standard syntax apparently aims to reduce the number
> > >>> of
> > >>> returned rows, which ORDER BY does not. Maybe you could do it
> > >>> with ORDER BY .. LIMIT, but the idea here I think is that we'd
> > >>> like to sample the table without reading all of it first, so
> > >>> that
> > >>> seems to miss the point.
> > >> 
> > >> I think actually the traditional locution is more like
> > >> WHERE random() < constant
> > >> where the constant is the fraction of the table you want. And
> > >> yeah, the presumption is that you'd like it to not actually read
> > >> every row. (Though unless the sampling density is quite a bit
> > >> less than 1 row per page, it's not clear how much you're really
> > >> going to win.)
> > > 
> > > It's all going to depend on the use cases, which I don't think
> > > I've
> > > heard described very well yet.
> > > 
> > > I've had to pick random rows from, for example, a table of
> > > disbursements to support a financial audit. In those cases it has
> > > been the sample size that mattered, and order didn't. One
> > > interesting twist there is that for some of these financial
> > > audits
> > > they wanted the probability of a row being selected to be
> > > proportional to the dollar amount of the disbursement. I don't
> > > think you can do this without a first pass across the whole data
> > > set.
> > 
> > This one was commonly called "Dollar Unit Sampling," though the
> > terminology has gradually gotten internationalized.
> > http://www.dummies.com/how-to/content/how-does-monetary-unit-sampling-work.html
> > 
> > What the article doesn't mention is that some particularly large
> > items
> > might wind up covering multiple samples. In the example, they're
> > looking for a sample every $3125 down the list. If there was a
> > single
> > transaction valued at $3, that (roughly) covers 10 of the
> > desired
> > samples.
> > 
> > It isn't possible to do this without scanning across the entire
> > table.
> > 
> > If you want repeatability, you probably want to instantiate a copy
> > of
> > enough information to indicate the ordering chosen. That's probably
> > something that needs to be captured as part of the work of the
> > audit,
> > so not only does it need to involve a pass across the data, it
> > probably requires capturing a fair bit of data for posterity.
> > --
> > When confronted by a difficult problem, solve it by reducing it to
> > the
> > question, "How would the Lone Ranger handle this?"
> 
> 
> 
> 
> 
> 
> The discussion till now has gone far beyond my understanding.
> Could anyone explain briefly what is the idea for now?
> The designing detail for me is still unfamiliar. I can only take time
> to understand while possible after being selected and put time on it
> to read relevant material.
> For now, I'm still curious why Neil's implementation is no longer
> working? The Postgres has been patched a lot, but the general idea
> behind Neil's implementation should still work, isn't it?
> Besides, whether this query is needed is still not decided. Seems
> this is another hard to decide point. Is it that this topic is still
> not so prepared for th e Gsoc yet? If really so, I think I still
> have time to switch to other topics. Any suggestion?
> 
> 
> Thanks.
> 
> Best Regards and Thanks
> Huang Qi Victor
> Computer Science of National University of Singapore



[HACKERS] 3rd Cluster Hackers Summit, May 15th in Ottawa

2012-02-12 Thread Joshua Berkus
Hackers, 

NTT Open Source has requested that I convene the 3rd Cluster Hackers summit at 
pgCon this year.  As last year, it will be held on Tuesday (May 15th) during 
tutorials (and not conflicting with the Developer Summit).

If you are a contributor to any of PostgreSQL's various replication, 
clustering, or virtualization and multiserver management tools, you are invited 
to attend.  Please RSVP (see below).

Draft agenda follows.  Please let me know of any contributions/changes to the 
agenda you have:

= Project Reports: 5 minutes from each project
   * Hot Standby/Binary Replication
   * pgPoolII
   * PostgresXC
   * Your Project Here

= Technical Issues of common interest
   * SSI in cluster/replication
   * Parser export
   * Managing consistent views of data
   * Fault detection and handling
   * Node addition/removal
   * Configuration and operation
   * Cursor in replication/multi master
   * Your Issue Here

The Cluster Summit will be from 10am to 5pm, with a break for lunch, which will 
be provided, sponsored by NTT.

If you will be able to attend, please respond (offlist) to this email with the 
following:

Your Name
Project(s) you work on
If you will be giving a Project Report
If you have additions to the agenda
Special dietary needs for lunch, if any
If you need travel assistance

Note that the availability of travel funding is not guaranteed; I can just 
agree to request it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco



[HACKERS] Review of patch renaming constraints

2012-01-12 Thread Joshua Berkus

Compiling on Ubuntu 10.04 LTS AMD64, on a GoGrid virtual machine, from a 
2012-01-12 git checkout.

Patch applied fine.

Docs are present, build, look good and are clear.

Changes to gram.y required Bison 2.5 to compile.  Are we requiring Bison 2.5 
now?  There's no configure check for it, so it took me quite a while to figure 
out what was wrong.

Make check passed.  Patch has tests for rename constraint.

Most normal uses of alter table ... rename constraint ... worked normally.  
However, the patch does not deal correctly with constraints which are not 
inherited, such as primary key constraints:

create table master ( category text not null, status int not null, value text );

alter table master add constraint master_key primary key ( category, status );

alter table master rename constraint master_key to master_primary_key;

create table partition_1 () inherits ( master );

create table partition_2 () inherits ( master );

alter table master rename constraint master_primary_key to master_key;

postgres=# alter table master rename constraint master_primary_key to 
master_key;
ERROR:  constraint "master_primary_key" for table "partition_1" does not exist
STATEMENT:  alter table master rename constraint master_primary_key to 
master_key;
ERROR:  constraint "master_primary_key" for table "partition_1" does not exist


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco



Re: [HACKERS] ISN was: Core Extensions relocation

2011-11-21 Thread Joshua Berkus

Bruce,

> I don't see any of this reaching the level that it needs to be
> backpatched, so I think we have to accept that this will be 9.2-only
> change.

Agreed.  If users encounter issues with the prefix in the field, it will be 
easy enough for them to back-patch.  But we don't want to be responsible for it 
as a project.

--Josh



Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-11-15 Thread Joshua Berkus


> > Here is a patch for that for pg_dump. The sections provided for are
> > pre-data, data and post-data, as discussed elsewhere. I still feel that
> > anything finer grained should be handled via pg_restore's --use-list
> > functionality. I'll provide a patch to do the same switch for pg_restore
> > shortly.
> >
> > Adding to the commitfest.
> >
> 
> 
> Updated version with pg_restore included is attached.

Functionality review:

I have tested the backported version of this patch using a 500GB production 
database with over 200 objects and it worked as specified. 

This functionality is extremely useful for a variety of selective 
database-copying tasks, including creating shrunken test instances, ad-hoc 
parallel dumps, differently indexed copies, and sanitized copies of sensitive 
data, and even bringing the database up for usage while the indexes are still 
building.

Note that this feature has the odd effect that some constraints are loaded at 
the same time as the tables and some are loaded with the post-data.  This is 
consistent with how text-mode pg_dump has always worked, but will seem odd to 
the user.  This also raises the possibility of a future pg_dump/pg_restore 
optimization.
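
For anyone who wants to try the workflow this enables, here is a rough sketch 
(assuming the switch is spelled --section= as in the patch, and using made-up 
database names):

pg_dump -Fc proddb > proddb.dump
pg_restore --section=pre-data -d testdb proddb.dump    # table/type definitions
pg_restore --section=data -d testdb proddb.dump        # table contents
# the database is usable here, before any indexes or constraints exist
pg_restore --section=post-data -d testdb proddb.dump   # indexes, constraints, triggers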

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ISN was: Core Extensions relocation

2011-11-15 Thread Joshua Berkus
All,

> I agree.  The argument that this code is useful as example code has
> been offered before, but the justification is pretty thin when the
> example code is an example of a horrible design that no one should
> ever copy.

People are already using ISN (or at least ISBN) in production.  It's been 
around for 12 years.  So any step we take with contrib/ISN needs to take that 
into account -- just as we have with Tsearch2 and XML2.

One can certainly argue that some of the stuff in /contrib would be better on 
PGXN.  But in that case, it's not limited to ISN; there are several modules of 
insufficient quality (including intarray and ltree) or legacy nature which 
ought to be pushed out.  Probably most of them.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core Extensions relocation

2011-11-15 Thread Joshua Berkus
Peter,

> I consider contrib/isn to be quite broken. It hard codes ISBN
> prefixes
> for the purposes of sanitising ISBNs, even though their assignment is
> actually controlled by a decentralised body of regional authorities.
> I'd vote for kicking it out of contrib.

Submit a patch to fix it then.  

I use ISBN in 2 projects, and it's working fine for me.  I'll strongly resist 
any attempt to "kick it out".

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core Extensions relocation

2011-11-15 Thread Joshua Berkus
Greg,

> I'm not attached to the name, which I just pulled out of the air for
> the
> documentation.  Could just as easily call them built-in modules or
> extensions.  If the objection is that "extensions" isn't technically
> correct for auto-explain, you might call them core add-ons instead.
>  My
> thinking was that the one exception didn't make it worth the trouble
> to
> introduce a new term altogether here.  There's already too many terms
> used for talking about this sort of thing, the confusion from using a
> word other than "extensions" seemed larger than the confusion sown by
> auto-explain not fitting perfectly.

Well, I do think it should be *something* Extensions.  But Core Extensions 
implies that the other stuff is just random code, and makes the user wonder why 
it's included at all.  If we're going to rename some of the extensions, then we 
really need to rename them all, or it will look like the rest are being 
deprecated.

Maybe:

Core Management Extensions
Core Development Extensions
Additional Database Tools
Code Examples
Legacy Modules

I think that covers everything we have in contrib.

Given discussion, is there any point in reporting on the actual patch yet?

--Josh Berkus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-11-01 Thread Joshua Berkus
Robert,

> In most cases we either break backwards compatibility or require some
> type of switch to turn on backwards compatibility for those who want
> it. While the above plan tries to do one better, it leaves me feeling
> that the thing I don't like about this is that it sounds like you are
> forcing backwards compatibility on people who would much rather just
> do things the new way. Given that, I foresee a whole new generation
> of
> confused users who end up setting their configs one way only to have
> someone else set the same config in the other file, or some tool dump
> out some config file, overriding what was really intended. This will
> also make things *harder* for those tool providers you are trying to
> help, as they will be forced to support the behavior *both ways*. I'd
> much rather see some type of switch which turns on the old behavior
> for those who really want it, because while you can teach the new
> behavior, if you can't prevent the old behavior, you're creating
> operational headaches for yourself.

This is a good point.  There's also a second drawback, which I believe Tom Lane 
has brought up before: code complexity.  Having two separate-but-equal paths 
for configuration is liable to lead to a lot of bugs.

So, we have four potential paths regarding recovery.conf:

1) Break backwards compatibility entirely, and stop supporting recovery.conf as 
a trigger file at all.

2) Offer backwards compatibility only if "recovery_conf='filename'" is set in 
postgresql.conf, then behave like Simon's compromise.

3) Simon's compromise.

4) Don't ever change how recovery.conf works.

The only two of the above I see as being real options are (1) and (2).  (3) 
would, as Robert points out, cause DBAs to have unpleasant surprises when some 
third-party tool creates a recovery.conf they weren't expecting. So:

(1) pros:
   * new, clean API
   * makes everyone update their tools
   * no confusion on "how to do failover"
   * code simplicity
 cons:
   * breaks a bunch of 3rd-party tools
   * or forces them to maintain separate 9.1 and 9.2 branches

(2) pros:
   * allows people to use only new API if they want
   * allows gradual update of tools
   * can also lump in relocatable recovery.conf as feature
  cons:
   * puts off the day when vendors pay attention to the new API
 (and even more kicking & screaming when that day comes)
   * confusion about "how to do failover"
   * code complexity

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-25 Thread Joshua Berkus


> I rather like Tom's suggestion of include_if_exists.

include_if_exists certainly solves the recovery.conf/recovery.done problem.  We 
can even phase it out, like this:

9.2: include_if_exists = 'recovery.conf' in the default postgresql.conf file.
9.3: include_if_exists = 'recovery.conf' commented out by default
9.4: renaming recovery.conf to recovery.done by core PG code removed.

This gives users/vendors 3 years to update their scripts to remove dependence 
on recovery.conf.  I'm afraid that I agree with Simon that there's already a 
whole buncha 3rd-party code out there to support the current system.

--Josh Berkus 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PATCH] Caching for stable expressions with constant arguments v3

2011-09-25 Thread Joshua Berkus
All,

I'd love to see someone evaluate the impact of Marti's patch on JDBC 
applications which use named prepared statements.   Anyone have a benchmark 
handy?

--Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - symmetric

2011-09-25 Thread Joshua Berkus

> > Reminder:  BETWEEEN supports the SYMMETRIC keyword, so there is
> > a precedent for this.
> 
> And I don't see it as valuable enough to justify changing the
> grammar.

I agree that we should leave symmetry until 9.3.

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-25 Thread Joshua Berkus


> There might be a use case for a separate directive include_if_exists,
> or some such name.  But I think the user should have to tell us very
> clearly that it's okay for the file to not be found.

Better to go back to include_directory, then.

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-25 Thread Joshua Berkus
Folks,

What happens currently if we have an include directive in postgresql.conf for a 
file which doesn't exist?  Is it ignored, or do we error out?

If it could just be ignored, maybe with a note in the logs, then we could be a 
lot more flexible.

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Satisfy extension dependency by one of multiple extensions

2011-09-24 Thread Joshua Berkus
All,

> >> We might want to have a system where an extension can declare that
> >> it
> >> "provides" capabilites, and then have another extension "require"
> >> those
> >> capabilities. That would be a neater solution to the case that
> >> there are
> >> multiple extensions that all provide the same capability.
> 
> +1

As a warning, this is the sort of thing which DEB and RPM have spent years 
implementing ... and still have problems with.  Not that we shouldn't do it, 
but we should be prepared for the amount of troubleshooting involved, which 
will be considerable.

--Josh Berkus



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-24 Thread Joshua Berkus

> Since we haven't yet come up with a reasonable way of machine-editing
> postgresql.conf, this seems like a fairly serious objection to
> getting
> rid of recovery.conf.  I wonder if there's a way we can work around
> that...

Well, we *did* actually come up with a reasonable way, but it died under an 
avalanche of bikeshedding and 
"we-must-do-everything-the-way-we-always-have-done".  I refer, of course, to 
the "configuration directory" patch, which was a fine solution, and would 
indeed take care of the recovery.conf issues as well had we implemented it.  We 
can *still* implement it, for 9.2.
 
> pg_ctl start -c work_mem=8MB -c recovery_target_time='...'

This wouldn't survive a restart, and isn't compatible with init scripts.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-16 Thread Joshua Berkus

> I'm in favor of defining a separate, content-free trigger file to
> enable
> archive recovery.  Not sure about the name "recovery.ready", though
> ---
> that makes it look like one of the WAL archive transfer trigger
> files,
> which does not seem like a great analogy.  The pg_standby
> documentation
> suggests names like "foo.trigger" for failover triggers, which is a
> bit
> better analogy because something external to the database creates the
> file.  What about "recovery.trigger"?

Do we want a trigger file to enable recovery, or one to *disable* recovery?  Or 
both?

Also, I might point out that we're really confusing our users by talking about 
"recovery" all the time, if they're just using streaming replication.  Just 
sayin'

> * will seeing these values present in pg_settings confuse anybody?

No.  pg_settings already has a couple dozen "developer" parameters which nobody 
not on this mailing list understands.  Adding the recovery parameters to it 
wouldn't confuse anyone further, and would have the advantage of making the 
recovery parameters available by monitoring query on a hot standby.

For that matter, I'd suggest that we add a read-only setting called in_recovery.
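
To illustrate what that would buy us: if the recovery parameters showed up in 
pg_settings, a monitoring script could do something like the following against 
a hot standby (hypothetical today, since the proposal isn't implemented; 
pg_is_in_recovery() already covers part of the in_recovery idea, and "standby1" 
is just a placeholder host):

psql -h standby1 -At -c "SELECT pg_is_in_recovery();"
psql -h standby1 -At -c "SELECT name, setting FROM pg_settings WHERE name IN ('primary_conninfo', 'trigger_file', 'recovery_target_time');"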

> * can the values be changed when not in recovery, if so what happens,
>   and again will that confuse anybody?

Yes, and no.

> * is there any security hazard from ordinary users being able to see
>   what settings had been used?

primary_conninfo could be a problem, since it's possible to set a password 
there.

--Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Alpha 1 for 9.2

2011-09-10 Thread Joshua Berkus

> Download numbers for the installers were bordering on noise compared
> to the GA builds last time I looked, double figures iirc. I don't
> know about the tarballs offhand and can't check ATM.

Can you check when you get a chance?   I know that the DL numbers for the first 
alphas were very low, but I'm wondering about Alpha 3, 4 and 5.

The main value of the alphas is for our Windows users, who aren't going to do 
any testing that requires compiling from source.  But if they're not doing any 
testing anyway, then there's no real point.

There's PR value in doing the alphas, but not enough to justify the effort 
involved. 

If we're not going to do regular alphas, I would push to do one special alpha 
release which includes all of the locking code improvements and similar 
features added to date.  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Alpha 1 for 9.2

2011-09-10 Thread Joshua Berkus

> That's not my recollection.  Obviously, it's hard to measure this one
> way or the other, but I don't recall there being a lot of test
> reports
> from people who are not already contributors and could have used some
> other way to get the code.

Do we have download stats for the alphas?   Dave?

--Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] RC1 / Beta4?

2011-07-29 Thread Joshua Berkus
All,

Where are we on RC1 or Beta4 for PostgreSQL 9.1?  

While I know we're not going to do a final release in August because of the 
Europeans, it would be nice to move things along before then.  There don't seem 
to be any blockers open.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RRR] 9.2 CF2: 20 days in

2011-07-07 Thread Joshua Berkus
Robert,
 
> We need to start marking the patches that are "Waiting on Author" as
> "Returned with Feedback", ideally after checking that the status in
> the CF application is in fact up to date.   With a week left in the
> CommitFest at this point, anything that has been reviewed and still
> has issues is pretty much going to have to wait for the next round.
> We need to focus on (a) reviewing the patches that haven't been
> reviewed yet and (b) getting the stuff that is basically in good
> shape
> committed.  Otherwise, we're still going to be doing this CommitFest
> in September

Sure, I only want to do that for ones which have been waiting on author for 
more than a couple days though.  Working on that.

> I have been attempting to keep somewhat on top of the stuff that has
> become Ready for Committer, but there is too much of it for me to
> handle by myself.

Yeah, given that we're still in beta, I expected committing to be a problem.  
Not a surprise.

--Josh 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-08 Thread Joshua Berkus
Simon,

> The point I have made is that I disagree with a feature freeze date
> fixed ahead of time without regard to the content of the forthcoming
> release. I've not said I disagree with feature freezes altogether,
> which would be utterly ridiculous. Fixed dates are IMHO much less
> important than a sensible and useful feature set for our users.

This is such a non-argument it's silly.  We have so many new major features for 
9.1 that I'm having trouble writing sensible press releases which don't sound 
like a laundry list.

> MySQL
> repeatedly delivered releases with half-finished features and earned
> much disrespect. We have never done that previously and I am against
> doing so in the future.

This is also total BS.  I worked on the MySQL team.  Before Sun/Oracle, MySQL 
specifically had feature-driven releases, where Marketing decided what features 
5.0, 5.1 and 5.2 would have.  They also accepted new features during beta if 
Marketing liked them enough.  This resulted in the 5.1 release being *three 
years late*, and 5.3 being cancelled altogether.  And let's talk about the 
legendary instability of 5.0, because they decided that they couldn't cancel 
partitioning and stored procedures, whether they were ready for prime time or 
not and because they kept changing the API during beta.

MySQL never had time-based releases before Oracle took them over.  And Oracle 
has been having feature-free releases because they're trying to work through 
MySQL's list of thousands of unfixed bugs which dates back to 2003.

An argument for feature-driven releases is in fact an argument for the MySQL AB 
development model.  And that's not a company I want to emulate.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Joshua Berkus
Robert,

> Oh, I get that. I'm just dismayed that we can't have a discussion
> about the patch without getting sidetracked into a conversation about
> whether we should throw feature freeze out the window. 

That's not something you can change.  Whatever the patch is, even if it's a 
psql improvement, *someone* will argue that it's super-critical to shoehorn it 
into the release at the last minute.  It's a truism of human nature to 
rationalize exceptions where your own interest is concerned.

As long as we have solidarity of the committers that this is not allowed, 
however, this is not a real problem.  And it appears that we do.  In the 
future, it shouldn't even be necessary to discuss it.

For my part, I'm excited that we seem to be getting some big hairy important 
patches into CF1, which means that those patches will be well-tested by the 
time 9.2 reaches beta.  Especially getting Robert's patch and Simon's 
WALInsertLock work into CF1 means that we'll have 7 months to find serious bugs 
before beta starts.  So I'd really like to carry on with the current 
development schedule.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-07 Thread Joshua Berkus
> iew. The
> reason we usually skip the summer isn't actually a wholesale lack of
> people - it's because it's not so good from a publicity perspective,
> and it's hard to get all the packagers around at the same time.

Actually, the summer is *excellent* from a publicity perspective ... at least, 
June and July are.  Both of those months are full of US conferences whose PR we 
can piggyback on to make a splash.

August is really the only "bad" month from a PR perspective, because we lose a 
lot of our European RCs, and there's no bandwagons to jump on.  But even August 
has the advantage of having no major US or Christian holidays to interfere with 
release dates.

However, we're more likely to have an issue with *packager* availability in 
August.  Besides, isn't this a little premature?  Last I looked, we still have 
some big nasty open items.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] eviscerating the parser

2011-05-22 Thread Joshua Berkus
Robert,

> Another point is that parsing overhead is quite obviously not the
> reason for the massive performance gap between one core running simple
> selects on PostgreSQL and one core running simple selects on MySQL.
> Even if I had (further) eviscerated the parser to cover only the
> syntax those queries actually use, it wasn't going to buy more than a
> couple points.

I don't know if you saw Jignesh's presentation, but there seems to be a lot of 
reason to believe that we are lock-bound on large numbers of concurrent 
read-only queries.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Do you recommend 8.4 or 9.0 for basic usage?

2011-05-22 Thread Joshua Berkus
MauMau,

> Could you give me your frank opinions about which of 8.4 or 9.0 you
> recommend to ISVs who embed PostgreSQL?

So, first of all, you posted this question to the wrong list.  pgsql-general or 
pgsql-admin would have been more appropriate for this question.

That being said, I find your statistics on bug fixes interesting, so thank you 
for collecting them.

However, at this time there have already been four update releases for 9.0, so 
you can be fairly assured that any major bugs have been fixed.  9.0 definitely 
needed more patch releases (and a longer beta) than 8.4, specifically because 
of streaming replication & hot standby.  Those are major, complex features 
which offer the opportunity for issues which only occur in multi-server 
configurations and are thus hard to test for.

Our company has multiple ISV clients who are deploying products built on 9.0.X, 
and to date have had no special issues.

As an ISV, though, you need to devise a plan whereby you can apply update 
releases to your clients' machines if they are connected to the internet.  One 
of the primary reasons for update releases is closing security holes, which 
means that you need to have a way to upgrade your customers.  Some of the 
biggest issues we've seen with our clients stem from an inability to apply 
in-the-field updates, which leaves customers hitting bugs that have long been 
fixed in PostgreSQL releases.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-05-09 Thread Joshua Berkus
All,

> I agree that we should not reduce the support window. The fact that we
> can do in place upgrades of the data only addresses one pain point in
> upgrading. Large legacy apps require large retesting efforts when
> upgrading, often followed by lots more work renovating the code for
> backwards incompatibilities.

Definitely.  Heck, I can't get half our clients to apply *update* releases 
because they have a required QA process which takes a month.  And a lot of 
companies are just now deploying the 8.4 versions of their products. 

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [pgsql-advocacy] New Canadian nonprofit for trademark, postgresql.org domain, etc.

2011-05-07 Thread Joshua Berkus
Chris,

> Not totally Idle thought: it would be nice if the "holding
> corporation" doesn't need a bank account, as they impose burdens of
> fees (not huge, but not providing us notable value), and more
> importantly, impose administrative burdens. Our banks like to impose
> holds on accounts any time they are left inactive for ~six months,
> which is definitely a pain. It's a pain for my local LUG, which
> normally has financial activity only about once a year.

I'm glad you're on the board of the new NPO.  I wouldn't have known that ... US 
Banks are different, they *like* inactivity.

Anyway, something to discuss at the first board meeting of the new NPO.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the continuation-line prompt in psql?

2011-05-01 Thread Joshua Berkus
Dimitri,

> > I'll bet someone a fancy drink at a conference that this thread goes
> > to at least 100 posts.
> 
> Of course, if we all are to argue about this bet… :)

Darn!  You've uncovered by sinister plan.  Foiled again!

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] increasing collapse_limits?

2011-05-01 Thread Joshua Berkus

Pavel,

> Actually we had to solve a issue with slow SELECT. The problem was in
> low value of JOIN_COLLAPSE_LIMITS. Can we increase a default of this
> value. I checked some complex query, and planner needed about 200ms
> for JOIN_COLLAPSE_LIMIT = 16. So some around 12 can be well.

I'm not comfortable with increasing the default, yet.  While folks on dedicated 
good hardware can handle a collapse of 10-12 joins, a lot of people are running 
PostgreSQL on VMs these days whose real CPU power is no better than a Pentium 
IV.  Also, if you're doing OLTP queries on small tables, spending 20ms planning 
a query is unreasonably slow in a way it is not for a DW query.

It does make a reasonable piece of advice for those tuning for DW, though.  
I'll add it to my list.

Speaking of which, what happened to replacing GEQO with Simulated Annealing?  
Where did that project go?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] branching for 9.2devel

2011-05-01 Thread Joshua Berkus


> I think the main thing we have to think about before choosing is
> whether
> we believe that we can shorten the CFs at all. Josh's proposal had
> 3-week CFs after the first one, which makes it a lot easier to have a
> fest in November or December, but only if you really can end it on
> time.

I think that 3 weeks is doable.  Generally, by the last week of every CF except 
the final one, we're largely waiting on either (a) authors who are slow to 
respond, (b) patches which are really hard to review, or (c) arguing out spec 
stuff on -hackers.  Generally the last week only has 1-3 patches open, and any 
of these things could be grounds for booting a patch to the next CF anyway, or 
for working on it outside the CF.  Really hard patches (like Synch Rep) don't 
fit into the CF cycle anyway.

I'm not convinced that shorter than 3 weeks is doable, at least not without 
changing to a model of binary accept-or-reject.  Communication is too slow, and 
reviewers' availability is too unpredictable.

> In addition to the fun of working around the holiday season, perhaps
> we should also consider how much work we're likely to get out of
> people
> in the summer. Is it going to be useful to schedule a fest in either
> July or August? Will one month be better than the other?

Doesn't make a difference, both are equally bad.  However, if we're short on 
European reviewers, at least we'll be able to punt European patches immediately 
because the authors won't be answering their e-mail.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] branching for 9.2devel

2011-04-30 Thread Joshua Berkus

> > If CF1 is June1, though, when will CF4 be? Having a CF start Dec. 1
> > is probably a bad idea.
> 
> Well, I made a suggestion on this topic in my previous email on the
> subject...

I just searched backwards on this thread and I can't find it.  There's been a 
lot of posts.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changing the continuation-line prompt in psql?

2011-04-30 Thread Joshua Berkus
I'll bet someone a fancy drink at a conference that this thread goes to at 
least 100 posts.

Let the bikeshedding begin!

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] branching for 9.2devel

2011-04-30 Thread Joshua Berkus
Robert,

> Tom and I were talking about starting maybe June 1, rather than July
> 1. You seem opposed but I'm not sure why.

Because I think -- strictly based on history and the complexity of the new 
features -- we'll still be fixing major issues with the beta in June, which was 
what Tom said as well the last time he posted about it on this thread.  

If CF1 is June1, though, when will CF4 be?  Having a CF start Dec. 1 is 
probably a bad idea.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] branching for 9.2devel

2011-04-29 Thread Joshua Berkus
All,
 
> +1 from me for keeping it as-is as well.

So it sounds like most committers want to keep the CFs on their existing 
schedule for another year.  Also that we don't want to branch until RC1.  While 
it would be nice to get some feedback from those who had bad experiences with 
the CF cycle, I don't know how to get it ... and the complaints I've received 
from submitters are NOT about the CF cycle.

What it sounds like we do have consensus on, though, is:
a) improving pg_indent so that it can be run portably, easily, and repeatably
b) greatly improving the "so you want to submit a patch" documentation
c) making CFs a little shorter (3 weeks instead of 4?)

I'll also add one of my own: developing some kind of dependable mentoring 
system for first-time patch submitters.

Beyond that, are we ready to set the schedule for 9.2 yet?  I'd tend to say 
that:

CF1: July 1-30
CF2: Sept 1-21
CF3: November 1-21
CF4: January 3-31

Realistically, given that we usually seem to still be hacking in March, we 
could have a 5th CF which would be exclusively for patches already reviewed in 
CF4 and "tiny" patches.  *however*, we've historically been extremely poor in 
enforcing gatekeeping rules on what's accepted to a CF, so I'm not sure that's 
a good idea.

Oh, and just so Robert will get off my back, I volunteer to run the 9.2 CF1, 
since I'm a better administrator than a reviewer.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] improvements to pgtune

2011-04-27 Thread Joshua Berkus

> Every time I've gotten pulled into discussions of setting parameters 
> based on live monitoring, it's turned into a giant black hole--absorbs a 
> lot of energy, nothing useful escapes from it.  I credit completely 
> ignoring that idea altogether, and using the simplest possible static 
> settings instead, as one reason I managed to ship code here that people 
> find useful.  I'm not closed to the idea, just not optimistic it will 
> lead anywhere useful.  That makes it hard to work on when there are so 
> many obvious things guaranteed to improve the program that could be done 
> instead.

What would you list as the main things pgtune doesn't cover right now?  I have 
my own list, but I suspect that yours is somewhat different.

I do think that autotuning based on interrogating the database is possible.  
However, I think the way to make it not be a tar baby is to tackle it one 
setting at a time, and start with ones we have the most information for.  One 
of the real challenges there is that some data can be gleaned from pg_* views, 
but a *lot* of useful performance data only shows up in the activity log, and 
then only if certain settings are enabled.
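
Just to make the "one setting at a time" idea concrete, here is the kind of 
probe such a tool could start from (the views and columns below exist today; 
which settings to derive from them is exactly the open question):

psql -At -c "SELECT name, setting, unit FROM pg_settings WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size');"
psql -At -c "SELECT sum(blks_hit)::float / nullif(sum(blks_hit) + sum(blks_read), 0) AS cache_hit_ratio FROM pg_stat_database;"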

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] "stored procedures"

2011-04-22 Thread Joshua Berkus
Tom,

> >> I'd like a pony, too. Let's be perfectly clear about this: there is
> >> no
> >> part of plpgsql that can run outside a transaction today, and
> >> probably
> >> no part of the other PLs either, and changing that "without major
> >> changes" is wishful thinking of the first order.

I always thought that it was pretty clear that autonomous transactions were a 
major feature, and very difficult to implement.  Otherwise we'd have done SPs 
back in 7.4 when we first had this discussion.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-18 Thread Joshua Berkus
Robert, Tom,

> Hm ... there are people out there who think *I* get high off rejecting
> patches. I have a t-shirt to prove it. But I seem to be pretty
> ineffective at it too, judging from these numbers.

It's a question of how we reject patches, especially first-time patches.   We 
can reject them in a way which makes the submitter more likely to fix them 
and/or work on something else, or we can reject them in a way which discourages 
people from submitting to PostgreSQL at all.

For example, the emails to Radoslaw mentioned nothing about pgindent, 
documented spacing requirements, accidental inclusion of files he didn't mean 
to touch, etc.  Instead, a couple of people told him he should abandon his 
chosen development IDE in favor of emacs or vim.  Radoslaw happens to be 
thick-skinned and persistent, but other first-time submitters would have given 
up at that point and run off to a more welcoming project.

Mind, even better would be to get our "so you're submitting a patch" 
documentation and tools into shape; that way, all we need to do is send the 
first-time submitter a link.  Will work on that between testing ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MMAP Buffers

2011-04-17 Thread Joshua Berkus
Robert,

> Actually, I'd walk through fire for a 10% performance improvement if
> it meant only a *risk* to stability.

Depends on the degree of risk.  MMAP has the potential to introduce instability 
into areas of the code which have been completely reliable for years.  Adding 
20 new coredump cases with data loss for a 10% improvement seems like a poor 
bargain to me.  It doesn't help that the only DB to rely heavily on MMAP 
(MongoDB) is OSSDB's paragon of data loss.

However, in the case where the database is larger than RAM ... or better, 90% 
of RAM ... MMAP has the theoretical potential to improve performance quite a 
bit more than 10% ... try up to 900% on some queries.  However, I'd like to 
prove that in a test before we bother even debating the fundamental obstacles 
to using MMAP.  It's possible that these theoretical performance benefits will 
not materialize, even without data safeguards.
 
> The problem is that this is
> likely unfixably broken. In particular, I think the first sentence of
> Tom's response hit it right on the nose, and mirrors my own thoughts
> on the subject. To have any chance of working, you'd need to track
> buffer pins and shared/exclusive content locks for the pages that were
> being accessed outside of shared buffers; otherwise someone might be
> looking at a stale copy of the page.

Nothing is unfixable.  The question is whether it's worth the cost.  Let me see 
if I can build a tree with Radoslaw's patch, and do some real performance tests.

I, for one, am glad he did this work.  We've discussed MMAP in the code off and 
on for years, but nobody wanted to do the work to test it.  Now someone has, 
and we can decide whether it's worth pursuing based on the numbers.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MMAP Buffers

2011-04-16 Thread Joshua Berkus
Radoslaw,

> I think 10% is quite good, as my stand-alone test of mmap vs. read
> shown that
> speed up of copying 100MB data to mem may be from ~20ms to ~100ms
> (depends on
> destination address). Of course deeper, system test simulating real
> usage will
> say more. In any case after good deals with writes, I will speed up
> reads. I
> think to bypass smgr/md much more and to expose shared id's (1,2,3...)
> for
> each file segment.

Well, given the risks to durability and stability associated with using MMAP, I 
doubt anyone would even consider it for a 10% throughput improvement.  However, 
I don't think the test you used demonstrates the best case for MMAP as a 
performance improvement.

> In attachment I sent test-scripts which I used to fill data, nothing
> complex
> (left from 2nd level caches).
> 
> Query I've used to measure was SELECT count(substr(content, 1, 1))
> FROM
> testcase1 WHERE multi_id > 5;
> 
> Timings ware taken from psql.
> 
> I didn't made load (I have about 2GB of free sapce at /home, and 4GB
> RAM) and
> stress (I'm not quite ready to try concurrent updates of same page -
> may fail,
> notice is and place to fix is in code) tests yet.

Yes, but this test case doesn't offer much advantage to MMAP.  Where I expect 
it would shine would be cases where the database is almost as big as, or much 
bigger than RAM ... where the extra data copying by current code is both 
frequent and wastes buffer space we need to use.  As well as concurrent reads 
from the same rows.

You can write a relatively simple custom script using pgBench to test this; you 
don't need a big complicated benchmark.  Once we get over the patch cleanup 
issues, I might be able to help with this. 
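
Something along these lines would do as a starting point (a sketch only: the 
scale factor has to be picked so the data set lands near or well above RAM on 
the test box, and "bench" is just a placeholder database):

createdb bench
pgbench -i -s 1000 bench                 # roughly 15GB of data at scale 1000
cat > read_accounts.sql <<'EOF'
\setrandom aid 1 100000000
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
EOF
pgbench -n -f read_accounts.sql -c 8 -T 300 bench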

> Netbeans is quite good, of course it depends who likes what. Just try
> 7.0 RC
> 2.

I don't know if you've followed the formatting discussion, but apparently 
there's an issue with Netbeans re-indenting lines you didn't even edit.  It 
makes your patch hard to read or apply.  I expect that Netbeans has some method 
to reconfigure indenting, etc.; do you think you could configure it to 
PostgresQL standards so that this doesn't get in the way of evaluation of your 
ideas?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers

2011-04-16 Thread Joshua Berkus
All,

> Never, and that's not true. Heikki was being nice; I wouldn't have
> even
> slogged through it long enough to ask the questions he did before
> kicking it back as unusable. A badly formatted patch makes it
> impossible to evaluate whether the changes from a submission are
> reasonable or not without the reviewer fixing it first.

Then you can say that politely and firmly with direct reference to the problem, 
rather than making the submitter feel bad.

"Thank you for taking on testing an idea we've talked about on this list for a 
long time and not had the energy to test.  However, I'm having a hard time 
evaluating your patch for a few reasons ...(give reasons).  Would it be 
possible for you to resolve these and resubmit so that I can give the patch a 
good evaluation?"

... and once *one* person on this list has made such a comment, there is no 
need for two other hackers to pile on the reformat-your-patch bandwagon.

Our project has an earned reputation for being rejection-happy curmudgeons.  
This is something I heard more than once at MySQLConf, including from one 
student who chose to work on Drizzle instead of PostgreSQL for that reason.  I 
think that we could stand to go out of our way to be helpful to first-time 
submitters.

That doesn't mean that we have to accept patches mangled by using an IDE 
designed for Java, and which lack test cases.  However, we can be nice about it.

-- 
Josh Berkus
Niceness Nazi

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Single client performance on trivial SELECTs

2011-04-15 Thread Joshua Berkus
All,

While it would be nice to improve our performance on this workload, let me 
point out that it's not a very important workload from the point of view of 
real performance challenges.  Yes, there are folks out there with 100MB 
databases who only run one-liner select queries.  But frankly, 
Valemort/Redis/Mongo are going to whip both our and MySQL's butts on that kind 
of workload anyway.

Certainly any sacrifice of functionality in order to be faster at that kind of 
trivial workload would be foolhardy.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MMAP Buffers

2011-04-15 Thread Joshua Berkus
Radoslaw,

10% improvement isn't very impressive from a switch to mmap.  What workload did 
you test with?  What I'd really like to see is testing with databases which are 
50%, 90% and 200% the size of RAM ... that's where I'd expect the greatest gain 
from limiting copying. 

> Netbeans is possibly not very well suited to working on postgres code. 
> AFAIK emacs and/or vi(m) are used by almost all the major developers.

Guys, can we *please* focus on the patch for now, rather than the formatting, 
which is fixable with sed?
-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Feature request: pg_basebackup --force

2011-04-10 Thread Joshua Berkus
Magnus,

> That could certainly be useful, yes. But I have a feeling whomever
> tries to get that into 9.1 will be killed - but it's certainly good to
> put ont he list of things for 9.2.

Oh, no question.   At some point in 9.2 we should also discuss how basebackup 
considers "emtpy" directories.  Because the other thing I find myself 
constantly scripting is replacing the conf files on the replica after the base 
backup sync.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Feature request: pg_basebackup --force

2011-04-09 Thread Joshua Berkus
Magnus, all:

It seems a bit annoying to have to do an rm -rf $PGDATA/* before resynching a 
standby using pg_basebackup.  This means that I still need to wrap basebackup 
in a shell script, instead of having it do everything for me ... especially if 
I have multiple tablespaces.

Couldn't we have a --force option which would clear all data and tablespace 
directories before resynching?
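
For reference, what I end up scripting today looks roughly like this (paths are 
illustrative only, and every tablespace directory has to be cleared by hand as 
well):

pg_ctl -D $PGDATA stop -m fast
rm -rf $PGDATA/*
rm -rf /srv/tblspc_archive/*             # repeat for each tablespace
pg_basebackup -x -v -P -h master1 -U replication -D $PGDATA

A --force switch would simply fold the rm -rf steps into pg_basebackup itself.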

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Bug in pg_hba.conf or pg_basebackup concerning replication connections

2011-04-09 Thread Joshua Berkus
All,

If I have the following line in pg_hba.conf:

host    replication     replication     all     md5

pg_basebackup -x -v -P -h master1 -U replication -D $PGDATA
pg_basebackup: could not connect to server: FATAL:  no pg_hba.conf entry for 
replication connection from host "216.121.61.233", user "replication"

But, if I change it to "all" users, replication succeeds:

host    replication     all             all     md5

... even if the user "postgres" (the only other user in this test) is declared 
"with noreplication".

I can't figure out what's going wrong here; either HBA is broken and won't 
accept a replication line unless user is "all", or pg_basebackup is doing 
something to test a connection as "postgres", even though no such connection 
attempt shows up in the logs.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_hba.conf needs sample replication line, replication user

2011-04-09 Thread Joshua Berkus
All,

We left this out of 9.0; let's not leave it out of 9.1.  We need an example 
"replication" line in pg_hba.conf, commented out.  e.g.

# host   replication   all samenet   md5

Also, what happened to having a "replication" user defined by default?  We 
talked this to death last year, I thought that's what we'd decided to do?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should psql support URI syntax?

2011-03-31 Thread Joshua Berkus

> I would think it would be purely syntatic sugar really, which does
> incorporate a familiar interface for those who are working in
> different
> worlds (.Net/Drupal/JAVA) etc...

I wouldn't mind having something more standard supported; I'm always looking up 
the conninfo for the options I don't use frequently.

However, is there any standard for database URIs?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Alpha 5 is now available for testing

2011-03-29 Thread Joshua Berkus
The fifth alpha release for PostgreSQL version 9.1, 9.1alpha5, is now 
available.  There are no new major features in this alpha release as compared 
to 9.1alpha4, but there are many minor bug fixes and improvements to features 
added in 9.1alpha4 and earlier alpha releases.  It is expected that no major 
new features will be added before final release; this is likely to be the final 
alpha release for PostgreSQL 9.1.  

Please download, install, and test Alpha5.  We depend on your bug reports and 
feedback in order to proceed to 9.1beta and to final release.  The more testing 
you do, the sooner 9.1 will be available. Thank you to the many users who 
reported bugs in earlier alphas.

Most of the 148 changes and fixes between Alpha4 and Alpha5 were around two 
major features, per-column collations and synchronous replication.  Work on 
per-column collations included some major refactoring, adding support for it in 
all features of PostgreSQL, and changes to the column collations API and system 
catalogs.  Multiple reported bugs were fixed in synchronous replication 
including lockups, issues with recovery mode, and replication being very slow 
with fsync = off.  If you tested either of these features, please retest as the 
code has changed significantly since Alpha4.

Other changes included:
* add post-creation hook for extensions
* numerous additions and corrections to documentation and release notes
* allow valid-on-creation foreign keys as column constraints
* refactor of min/max aggregate optimization
* fix potential race condition with pg_basebackup
* fix PL/Python array memory leak
* raise maximum value for many timeout configuration settings
* fix handling of "unknown" literals in UNION queries
* fix some division-by-zero issues in the code
* cleanup some variable handling in ECPG
* fix some makefile problems introduced in Alpha4
* make permissions for COMMENT ON ROLE consistent

The new features which are expected to be available in PostgreSQL 9.1 are 
documented in the release notes, at 
http://developer.postgresql.org/pgdocs/postgres/release-9-1.html. If you are 
able to help with organized alpha testing, please see the Alpha/Beta testing 
page: http://wiki.postgresql.org/wiki/HowToBetaTest

Alpha releases are not stable and should never be used in production; they are 
for testing new features only.  There is no guarantee that any features or APIs 
present in the alphas will be present, or the
same, in the final release.

Alpha release information page: http://www.postgresql.org/developer/alpha

Download the alpha release here:
http://www.postgresql.org/ftp/source/v9.1alpha5/

Alpha releases are primarily made in source code form only.  Binary packages 
for some operating systems will be prepared in the coming days.

-- 
Josh Berkus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name

2011-03-25 Thread Joshua Berkus
Tom,

> Personally I'd vote for *not* having any such dangerous semantics as
> that. We should have learned better by now from plpgsql experience.
> I think the best idea is to throw error for ambiguous references,
> period. 

As a likely heavy user of this feature, I agree with Tom here.  I really don't 
want the column being silently preferred in SQL functions, when PL/pgSQL 
functions are throwing an error.  I'd end up spending hours debugging this.

Also, I don't understand why this would be a dump/reload issue if $1 and $2 
continue to work.
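
To make the ambiguity concrete, here is a minimal invented example of the case 
under discussion (the names are made up; today's SQL functions would simply 
read "balance" as the column):

psql <<'EOF'
CREATE TABLE accounts (owner text, balance numeric);
CREATE FUNCTION add_bonus(balance numeric) RETURNS SETOF numeric
LANGUAGE sql AS $$ SELECT balance + 100 FROM accounts $$;
EOF

With the patch, "balance" inside the function body could mean either the 
parameter or accounts.balance; throwing an error there, as PL/pgSQL does, is 
the behavior I want.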

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2011 - Mentors? Projects?

2011-03-25 Thread Joshua Berkus
Tomas,

> I spoke to a teacher from a local university last week, mainly as we
> were looking for a place where a local PUG could meet regularly. I
> realized this could be a good opportunity to head-hunt some students
> to
> participate in this GSoC. Are we still interested in new students?

Yes, please!   We have had students from Charles University several times 
before, and would be glad to have more.  The wiki page has links to the 
information about the program.  Talk to Zdenek if you have more questions.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transactional DDL, but not Serializable

2011-03-25 Thread Joshua Berkus

> Making DDL serializable is *not* simple, and half-baked hacks won't
> make that situation better ...

That seemed unnecessary.  Whether or not you approve of Stephen's solution, he 
is dealing with a real issue.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers