date:20100907

Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Fujii Masao

On Tue, Sep 7, 2010 at 6:02 AM, Simon Riggs  wrote:
> On Mon, 2010-09-06 at 22:32 +0200, Boszormenyi Zoltan wrote:
>> (in commit)
>> write wal record
>> release locks/etc   > wait for sync ack
>>
>> In the first case, the contention is obviously increased.
>> With this, we are creating more idle time in the server
>> instead of letting other transactions do their jobs as soon
>> as possible. The second method was implemented in my
>> patch. Are there any drawbacks with this?
>
> Then I respectfully suggest that you're releasing locks too early.
>
> Your proposal would allow a 2nd user to see the results of the 1st
> user's transaction before the 1st user knew about whether it had
> committed or not.
>
> I know why you want that, but I don't think it's right.

Agreed. That's why I put the wait before ProcArrayEndTransaction()
is called.
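
A rough sketch of that ordering, in Python rather than the backend's C,
purely to illustrate the sequence. The function name, the
wait_for_sync_ack callback and the step strings are hypothetical; only
ProcArrayEndTransaction() is named in this thread, and placing lock
release after it is an assumption about the commit path.

```python
def commit_with_sync_wait(wait_for_sync_ack):
    """Return the ordered steps of a commit that waits before becoming visible.

    wait_for_sync_ack is a hypothetical callback that blocks until the
    standby has acknowledged the commit record.
    """
    steps = ["write WAL commit record"]
    # Block here: nobody else can see the transaction's results yet.
    wait_for_sync_ack()
    steps.append("sync ack received")
    # Only now is the transaction removed from the set of running ones,
    # which is what makes its results visible to other sessions.
    steps.append("ProcArrayEndTransaction")
    steps.append("release locks")
    return steps


if __name__ == "__main__":
    for step in commit_with_sync_wait(lambda: None):
        print(step)
```

The point of the ordering is that a second user cannot observe the
first user's changes before the first user's commit has been
acknowledged.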

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Max Bowsher  writes:
> On 08/09/10 00:37, Robert Haas wrote:
>> but if our CVS repository is busted maybe
>> we should be looking to fix that rather than complaining about
>> cvs2git.

> A possibility. We'd need a tool which would insert an extra node into
> the history graph of an RCS file. Unless we can bodge it by using
> x.y.z.0 as a revision id, it would also need to renumber all the
> revisions on the branch. Still, cvs2git has code to parse the RCS
> format, so it's probably achievable without too much work.

I did some experimentation with manual surgery on (a copy of ;-))
it.po,v and found that x.y.z.0 does seem to work; at least CVS isn't
obviously unhappy with it.  So transformations as simple as illustrated
below might be enough to fix this.  I do not have a copy of cvs2git
at hand to see what it does with this, though.

regards, tom lane


*** ./it.po,v~  Tue Sep  7 22:56:48 2010
--- ./it.po,v   Tue Sep  7 23:01:47 2010
***************
*** 173,179 ****
  1.7
  date  2010.02.19.00.40.04;author petere;  state Exp;
  branches
!   1.7.6.1;
  next  1.6;
  
  1.6
--- 173,179 ----
  1.7
  date  2010.02.19.00.40.04;author petere;  state Exp;
  branches
!   1.7.6.0;
  next  1.6;
  
  1.6
***************
*** 206,211 ****
--- 206,216 ----
  branches;
  next  ;
  
+ 1.7.6.0
+ date  2010.02.19.00.40.04;author petere;  state dead;
+ branches;
+ next  1.7.6.1;
+ 
  1.7.6.1
  date  2010.05.13.10.50.03;author petere;  state Exp;
  branches;
***************
*** 3636,3641 ****
--- 3641,3654 ----
  @
  
  
+ 1.7.6.0
+ log
+ @log addition on branch
+ @
+ text
+ @@
+ 
+ 
  1.7.6.1
  log
  @Translation update
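
For anyone wanting to script the hand edit above instead of repeating
it per file, here is a minimal sketch of generating the extra admin
node. It is Python, not part of cvs2git, and dead_placeholder and its
parameters are hypothetical names. It only builds the text of the new
node; splicing it into the ,v file, adding the matching empty log/text
entry, and changing the parent's "branches" line are left to the
caller. The tab separators are an assumption about RCS layout. Whether
cvs2git accepts the result is as untested here as it was for the
manual edit.

```python
def dead_placeholder(branch_point, branch_number, date, author):
    """Build a dead x.y.z.0 admin node that precedes x.y.z.1.

    branch_point  -- revision the branch sprouts from, e.g. "1.7"
    branch_number -- branch component, e.g. 6 for 1.7.6.*
    date          -- RCS date string, copied from the branch point
    author        -- author recorded on the placeholder
    Returns (placeholder_revision, admin_text).
    """
    placeholder = "{}.{}.0".format(branch_point, branch_number)
    first_real = "{}.{}.1".format(branch_point, branch_number)
    admin_text = "\n".join([
        placeholder,
        "date\t{};\tauthor {};\tstate dead;".format(date, author),
        "branches;",
        # Chain the placeholder to the first real revision on the branch.
        "next\t{};".format(first_real),
        "",
    ])
    return placeholder, admin_text


if __name__ == "__main__":
    rev, text = dead_placeholder("1.7", 6, "2010.02.19.00.40.04", "petere")
    print(rev)
    print(text)
```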


Re: [HACKERS] Streaming a base backup from master

2010-09-07 Thread Bruce Momjian

Greg Stark wrote:
> The industry standard solution that we're missing that we *should* be
> figuring out how to implement is incremental backups.
> 
> I've actually been thinking about this recently and I think we could
> do it fairly easily with our existing infrastructure. I was planning
> on doing it as an external utility but it would be tempting to be able
> to request an external backup via the streaming protocol so maybe it
> would be better a bit more integrated.
> 
> The way I see it there are two alternatives. You need to start by
> figuring out which blocks have been modified since the last backup (or
> selected reference point). You can do this either by scanning every
> data file and picking every block with an LSN > the reference LSN. Or
> you can do it by scanning the WAL since that point and accumulating a
> list of block numbers.

That's what pgrman does already:

http://code.google.com/p/pg-rman/

Are you saying you want to do that over the libpq connection?
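
To make the first alternative Greg describes concrete (scan every data
file and pick blocks whose LSN is newer than the reference), here is a
minimal Python sketch. It assumes 8192-byte blocks and that each page
begins with its LSN stored as two native-endian 32-bit words (xlogid,
xrecoff). page_lsn and modified_blocks are made-up names, and nothing
here is taken from pg_rman.

```python
import struct

BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size


def page_lsn(page):
    """Return the page's LSN as one 64-bit integer.

    Assumes the page header starts with two native-endian uint32
    values, (xlogid, xrecoff).
    """
    xlogid, xrecoff = struct.unpack_from("=II", page, 0)
    return (xlogid << 32) | xrecoff


def modified_blocks(data, reference_lsn):
    """Return block numbers in `data` whose LSN is > reference_lsn."""
    changed = []
    for offset in range(0, len(data), BLOCK_SIZE):
        page = data[offset:offset + BLOCK_SIZE]
        if len(page) < 8:
            break  # truncated trailing page, no header to read
        if page_lsn(page) > reference_lsn:
            changed.append(offset // BLOCK_SIZE)
    return changed


if __name__ == "__main__":
    def fake_page(lsn):
        # Synthetic page: an LSN header followed by zero padding.
        header = struct.pack("=II", lsn >> 32, lsn & 0xFFFFFFFF)
        return header + bytes(BLOCK_SIZE - len(header))

    relation = fake_page(100) + fake_page(300) + fake_page(200)
    print(modified_blocks(relation, 150))
```

The WAL-scanning alternative would produce the same kind of block list
without reading every data file, at the cost of having to keep and
parse all WAL since the reference point.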

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 8:54 PM, Tom Lane  wrote:
> Max Bowsher  writes:
>> On 08/09/10 00:37, Robert Haas wrote:
>>> Well, if Max is correct that this bug is fixed in CVS 1.11.18 (I don't
>>> see it in the NEWS file) and that a checkout-by-date shows the file
>>> present during the time cvs2git claims it is present, then a less
>>> surprising translation wouldn't be a faithful representation of the
>>> contents of our CVS repository.
>
>> Correct. You'll have to decide whether you wish to represent your
>> current cvs repository, or attempt to doctor things to fix the insanity
>> CVS introduced.
>
> Well, even if the goal is to faithfully represent the bogus history
> shown by CVS, cvs2git isn't doing a good job of it.  In the case of
> src/bin/pg_dump/po/it.po, the CVS history claims that the version
> added to REL8_4_STABLE on 2010-05-13 is a child of the mainline
> version 1.7 committed on 2010-02-19.  Therefore, according to CVS
> the file existed on the branch from 2010-02-19, not 2010-02-28
> as claimed by the cvs2git translation.  I did some "cvs co" operations
> to check this and cvs does indeed retrieve the file between 02-19 and
> 02-28, but not before 02-19.  So I don't think you can defend the
> cvs2git behavior by claiming that it's an exact translation.
>
> Right at the moment, though, I'm more interested in the idea of
> patching the CVS repository to make the problem go away.

If we decide we're actually going to fix this problem, then I think
the definition of "fixed" should be that every tag of the form
RELx_y_z is an ancestor of the branch RELx_y_STABLE.  Maybe it would
be worth writing a sanity check along those lines.
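
A minimal sketch of such a check, in Python, run against the converted
git repository. The function names are hypothetical, and it assumes
the branches exist as local refs named exactly RELx_y_STABLE. It
relies on "git merge-base --is-ancestor", which exits 0 when the first
commit is an ancestor of the second. Tags that do not look like
RELx_y_z (betas, release candidates) are skipped, and a missing branch
is reported the same way as a failed ancestry test.

```python
import re
import subprocess

TAG_RE = re.compile(r"^REL(\d+)_(\d+)_(\d+)$")


def stable_branch_for(tag):
    """Map a RELx_y_z tag to RELx_y_STABLE, or None if it doesn't match."""
    m = TAG_RE.match(tag)
    if m is None:
        return None
    return "REL{}_{}_STABLE".format(m.group(1), m.group(2))


def is_ancestor(repo, ancestor, descendant):
    """True if `ancestor` is reachable from `descendant` in `repo`."""
    result = subprocess.run(
        ["git", "-C", repo, "merge-base", "--is-ancestor",
         ancestor, descendant])
    return result.returncode == 0


def misplaced_tags(repo):
    """Return (tag, branch) pairs where the tag is not on its branch."""
    out = subprocess.run(
        ["git", "-C", repo, "tag", "--list"],
        capture_output=True, text=True, check=True).stdout
    bad = []
    for tag in out.split():
        branch = stable_branch_for(tag)
        if branch is not None and not is_ancestor(repo, tag, branch):
            bad.append((tag, branch))
    return bad
```

Calling misplaced_tags("/path/to/converted/repo") on a conversion that
meets the definition above should return an empty list.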

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Max Bowsher  writes:
> On 08/09/10 00:37, Robert Haas wrote:
>> Well, if Max is correct that this bug is fixed in CVS 1.11.18 (I don't
>> see it in the NEWS file) and that a checkout-by-date shows the file
>> present during the time cvs2git claims it is present, then a less
>> surprising translation wouldn't be a faithful representation of the
>> contents of our CVS repository.

> Correct. You'll have to decide whether you wish to represent your
> current cvs repository, or attempt to doctor things to fix the insanity
> CVS introduced.

Well, even if the goal is to faithfully represent the bogus history
shown by CVS, cvs2git isn't doing a good job of it.  In the case of
src/bin/pg_dump/po/it.po, the CVS history claims that the version
added to REL8_4_STABLE on 2010-05-13 is a child of the mainline
version 1.7 committed on 2010-02-19.  Therefore, according to CVS
the file existed on the branch from 2010-02-19, not 2010-02-28
as claimed by the cvs2git translation.  I did some "cvs co" operations
to check this and cvs does indeed retrieve the file between 02-19 and
02-28, but not before 02-19.  So I don't think you can defend the
cvs2git behavior by claiming that it's an exact translation.

Right at the moment, though, I'm more interested in the idea of
patching the CVS repository to make the problem go away.

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Max Bowsher  writes:
> On 08/09/10 00:47, Tom Lane wrote:
>> Max Bowsher  writes:
>>> And, I've just tracked down that this bug was apparently fixed in CVS
>>> 1.11.18, released November 2004.
>> 
>> Hrm, what bug exactly?  As far as I've gathered from the discussion,
>> this is a fundamental design limitation of CVS, not a fixable bug.

> The bug that CVS represented addition to a branch in a way which didn't
> record when it occurred.

> The way in which it was bludgeoned into the RCS file format was somewhat
> hacky, but was a successful fix.

Well, good for them.  But even if we had updated our server to this
version of CVS instantly upon its release, we'd still be looking for
a workaround for the problem in cvs2git, because at least half of the
instances of this problem in our project history predate November 2004.

Do you happen to know details of the format change?  Because one
possible solution path seems to be to manually patch the desired
information into the CVS repository before we run cvs2git.

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Max Bowsher

On 08/09/10 00:37, Robert Haas wrote:
> On Tue, Sep 7, 2010 at 7:18 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> Well, as Max says downthread, cvs -r REL8_4_STABLE -D
>>> INTERMEDIATE_DATE apparently shows the file as being there, which is a
>>> fairly good argument for his position.
>>
>> I haven't tested, but if I understand what Max and Michael are saying
>> about CVS, that operation would probably show the file as being there
>> on *every* date between REL8_4_STABLE splitting off and the actual
>> addition of it.po to the branch.  Because CVS isn't paying attention to
>> the evidence of the intermediate tags not being there, either.
>>
>> Nonetheless, having the file pop into being and then disappear again
>> between two observable points seems way too much like quantum physics
>> for my taste.  I think it has to be possible for cvs2git to produce a
>> less surprising translation.
> 
> Well, if Max is correct that this bug is fixed in CVS 1.11.18 (I don't
> see it in the NEWS file) and that a checkout-by-date shows the file
> present during the time cvs2git claims it is present, then a less
> surprising translation wouldn't be a faithful representation of the
> contents of our CVS repository.

Correct. You'll have to decide whether you wish to represent your
current cvs repository, or attempt to doctor things to fix the insanity
CVS introduced.

> One thing I'm not quite clear on is
> how cvs2git thinks CVS "should" look given what we actually did vs.
> how it actually does look,

CVS from 1.11.18 kludges things to work right by inserting a file
revision on the branch in the dead (deleted) state with the same date as
the revision it branched from. This marks identifiably that it didn't
exist on the branch to start with. Then, a non-dead revision marks the
true addition of the file to the branch. I'm attaching a sample RCS file.
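
To illustrate why the dead placeholder matters for checkout-by-date,
here is a deliberately simplified Python model. It is not CVS's actual
algorithm; visible_on_branch and its arguments are hypothetical. It
assumes the branch-point revision itself is live, and that a file with
no branch revision yet shows through from the branch point.

```python
from datetime import date


def visible_on_branch(branch_point_date, branch_revs, when):
    """Model whether a file shows up on a branch checkout dated `when`.

    branch_point_date -- date of the trunk revision the branch sprouts from
    branch_revs       -- ordered list of (date, state) on the branch,
                         state being "Exp" or "dead"
    """
    if when < branch_point_date:
        return False  # the branch-point revision does not exist yet
    state = "Exp"  # no branch revision yet: branch point shows through
    for rev_date, rev_state in branch_revs:
        if rev_date <= when:
            state = rev_state
    return state != "dead"


if __name__ == "__main__":
    # Dates for it.po as discussed in this thread.
    branch_point = date(2010, 2, 19)
    added = date(2010, 5, 13)
    old_layout = [(added, "Exp")]
    new_layout = [(branch_point, "dead"), (added, "Exp")]
    probe = date(2010, 4, 1)
    print(visible_on_branch(branch_point, old_layout, probe))
    print(visible_on_branch(branch_point, new_layout, probe))
```

With the old layout the file appears on the branch from the
branch-point date onward. With the placeholder it stays absent until
the first live revision.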

> but if our CVS repository is busted maybe
> we should be looking to fix that rather than complaining about
> cvs2git.

A possibility. We'd need a tool which would insert an extra node into
the history graph of an RCS file. Unless we can bodge it by using
x.y.z.0 as a revision id, it would also need to renumber all the
revisions on the branch. Still, cvs2git has code to parse the RCS
format, so it's probably achievable without too much work.

Max.
head 1.1;
access;
symbols
 b1:1.1.0.2;
locks; strict;
comment @# @;


1.1
date 2010.09.08.00.33.01; author maxb; state Exp;
branches
 1.1.2.1;
next ;
commitid lO0BL09PCcYPwINu;

1.1.2.1
date 2010.09.08.00.33.01; author maxb; state dead;
branches;
next 1.1.2.2;
commitid FuoVc28H18LVwINu;

1.1.2.2
date 2010.09.08.00.33.17; author maxb; state Exp;
branches;
next ;
commitid FuoVc28H18LVwINu;


desc
@@


1.1
log
@Foo2.
@
text
@@


1.1.2.1
log
@file b was added on branch b1 on 2010-09-08 00:33:17 +
@
text
@@


1.1.2.2
log
@Merge.
@
text
@@




signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] git: uh-oh

2010-09-07 Thread Max Bowsher

On 08/09/10 00:47, Tom Lane wrote:
> Max Bowsher  writes:
>> And, I've just tracked down that this bug was apparently fixed in CVS
>> 1.11.18, released November 2004.
> 
> Hrm, what bug exactly?  As far as I've gathered from the discussion,
> this is a fundamental design limitation of CVS, not a fixable bug.

The bug that CVS represented addition to a branch in a way which didn't
record when it occurred.

The way in which it was bludgeoned into the RCS file format was somewhat
hacky, but was a successful fix.

Max.


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Max Bowsher  writes:
> And, I've just tracked down that this bug was apparently fixed in CVS
> 1.11.18, released November 2004.

Hrm, what bug exactly?  As far as I've gathered from the discussion,
this is a fundamental design limitation of CVS, not a fixable bug.

regards, tom lane


Re: [HACKERS] "Freezing" per-role settings

2010-09-07 Thread Tom Lane

Jeff Davis  writes:
> On Tue, 2010-09-07 at 13:30 -0700, David Fetter wrote:
>> Offhand, I'm not thinking of past examples of mutating/disappearing
>> GUC that people would want to freeze, nor of a new GUC that would
>> negate or substantially alter such freezing.  What have I missed?

> It just seems like the wrong mechanism.

Yeah, it seems like an ugly and probably basically-wrong solution
to the given problem.  And there are a ton of corner cases.  For
example, if I "freeze" a user's search_path, what happens if the user
tries to call a function that has a search_path property attached?  Does
it matter whether the function is owned by some other userid that maybe
doesn't have a freeze for that value?  Similarly, if the user calls a
function that is SECURITY DEFINER to some other role that hasn't got the
freeze flag set, should that function be allowed to change the setting
internally, and if not why not?

For that matter, if user A owns a SECURITY DEFINER function that doesn't
try to set search_path, should a "freeze search_path" applied to user A
somehow result in implicit switches of search_path when that function is
invoked by user B?  (Good luck making that one happen without
catastrophic performance degradation, because it would mean looking into
the system catalogs on every function call to see if this
action-at-a-distance should affect this function call.)

And none of this seems to have a lot to do with the original goal,
which IIUC was to make a session read-only, not the activities blamable
on a particular user identity.

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 7:18 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Well, as Max says downthread, cvs -r REL8_4_STABLE -D
>> INTERMEDIATE_DATE apparently shows the file as being there, which is a
>> fairly good argument for his position.
>
> I haven't tested, but if I understand what Max and Michael are saying
> about CVS, that operation would probably show the file as being there
> on *every* date between REL8_4_STABLE splitting off and the actual
> addition of it.po to the branch.  Because CVS isn't paying attention to
> the evidence of the intermediate tags not being there, either.
>
> Nonetheless, having the file pop into being and then disappear again
> between two observable points seems way too much like quantum physics
> for my taste.  I think it has to be possible for cvs2git to produce a
> less surprising translation.

Well, if Max is correct that this bug is fixed in CVS 1.11.18 (I don't
see it in the NEWS file) and that a checkout-by-date shows the file
present during the time cvs2git claims it is present, then a less
surprising translation wouldn't be a faithful representation of the
contents of our CVS repository.  One thing I'm not quite clear on is
how cvs2git thinks CVS "should" look given what we actually did vs.
how it actually does look, but if our CVS repository is busted maybe
we should be looking to fix that rather than complaining about
cvs2git.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Robert Haas  writes:
> Well, as Max says downthread, cvs -r REL8_4_STABLE -D
> INTERMEDIATE_DATE apparently shows the file as being there, which is a
> fairly good argument for his position.

I haven't tested, but if I understand what Max and Michael are saying
about CVS, that operation would probably show the file as being there
on *every* date between REL8_4_STABLE splitting off and the actual
addition of it.po to the branch.  Because CVS isn't paying attention to
the evidence of the intermediate tags not being there, either.

Nonetheless, having the file pop into being and then disappear again
between two observable points seems way too much like quantum physics
for my taste.  I think it has to be possible for cvs2git to produce a
less surprising translation.

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 6:34 PM, Tom Lane  wrote:
> Hmm.  Some further looking in the git log output shows that that
> "manufactured commit" is actually the ONLY commit shown as being a
> predecessor of REL8_4_3.  Everything else after 8.4.2 was tagged is
> shown as reached from refs/tags/REL8_4_4.  This is at the least pretty
> weird, and I have to suppose it's the manufactured commit causing it.
> It does appear to agree with your explanation: the "8.4.3" state is
> not part of the branch's main evolution, but is a little side branch
> all by itself.

Yep, that's what it is.

>> The effect of all of this is that if someone checks out a git commit
>> between 2010-02-28 and 2010-05-13, it.po will be there, even though
>> the file didn't exist on that CVS branch at that time.
>
> Yeah, that's what it's doing for me.
>
>> Max's contention
>> seems to be that this is a CVS problem rather than a cvs2git problem.
>
> No doubt.  However, the facts on the ground are that it.po is provably
> not there in REL8_4_0, REL8_4_1, REL8_4_2, or REL8_4_3, and is there in
> REL8_4_4, and that no commit on the branch touched it before 2010-05-13
> (just before 8.4.4).  I will be interested to see the argument why
> cvs2git should consider the sanest translation of these facts to involve
> adding it.po to the branch after 8.4.2 and removing it again before
> 8.4.3.

Well, as Max says downthread, cvs -r REL8_4_STABLE -D
INTERMEDIATE_DATE apparently shows the file as being there, which is a
fairly good argument for his position.  I think it's pretty amusing
that on this of all projects, where we regularly complain to people
about not updating to the latest minor release, we are six minor
releases out of date

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Max Bowsher  writes:
> Hmm. Now I'm speculating vaguely about how the cycle breaker could be
> convinced to break branch update commits into as many pieces as
> possible, instead of as few.

That same thought occurred to me.  If it simply didn't aggregate, but
treated each such file separately, would we end up with a saner history?
We would have more individual manufactured commits, but I think they
might be less surprising.

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Max Bowsher

On 07/09/10 23:34, Tom Lane wrote:
> No doubt.  However, the facts on the ground are that it.po is provably
> not there in REL8_4_0, REL8_4_1, REL8_4_2, or REL8_4_3, and is there in
> REL8_4_4, and that no commit on the branch touched it before 2010-05-13
> (just before 8.4.4).  I will be interested to see the argument why
> cvs2git should consider the sanest translation of these facts to involve
> adding it.po to the branch after 8.4.2 and removing it again before
> 8.4.3.

Only that cvs2git isn't quite so smart as to take tags present on a
branch as a guideline of when to introduce files that sprung into
existence on a branch at an uncertain point. It merely operates by
breaking cyclic dependencies between the various events it observes in
the CVS repository. In this case, the "create branch REL8_4_STABLE"
operation gets broken into several pieces to fit around the actual
revisions involved.

Hmm. Now I'm speculating vaguely about how the cycle breaker could be
convinced to break branch update commits into as many pieces as
possible, instead of as few.

Max.


Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-07 Thread Bruce Momjian

Robert Haas wrote:
> On Tue, Sep 7, 2010 at 11:59 AM, Simon Riggs  wrote:
> >> What I *think* you're saying is that the slave doesn't send per-commit
> >> messages, but instead processes the WAL as it's received and then sends
> >> a heres-where-I-am status message back upstream immediately before going
> >> to sleep waiting for the next chunk.  That's fine as far as the protocol
> >> goes, but I'm not convinced that it really does all that much in terms
> >> of improving performance.  You still have the problem that the master
> >> has to fsync its WAL before it can send it to the slave.  Also, the
> >> slave won't know whether it ought to fsync its own WAL before replying.
> >
> > Yes, apart from last sentence. Please wait for the code.
> 
> So, we're going around and around in circles here because you're
> repeatedly refusing to explain how the slave will know WHEN to send
> acknowledgments back to the master without knowing which sync rep
> level is in use.  It seems to be perfectly evident to everyone else
> here that there are only two ways for this to work: either the value
> is configured on the standby, or there's a registration system on the
> master and the master tells the standby its wishes.  Instead of asking
> the entire community to wait for an unspecified period of time for you
> to write code that will handle this in an unspecified way, how about
> answering the question?  We've wasted far too much time arguing about
> this already.

Ideally I would like the sync method to be set on each slave, and have
some method for the master to query the sync mode of all the slaves, e.g.
appname.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

I wrote:
> Hmm.  Some further looking in the git log output shows that that
> "manufactured commit" is actually the ONLY commit shown as being a
> predecessor of REL8_4_3.  Everything else after 8.4.2 was tagged is
> shown as reached from refs/tags/REL8_4_4.  This is at the least pretty
> weird, and I have to suppose it's the manufactured commit causing it.
> It does appear to agree with your explanation: the "8.4.3" state is
> not part of the branch's main evolution, but is a little side branch
> all by itself.

This same pattern can be found repeated in at least ten earlier places
in our project history, btw --- just look for commits using the phrase
"manufactured by cvs2svn to create tag" instead of "to create branch".
The worst example is probably the one for tag REL7_1_BETA, which deletes
70-odd files.

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Max Bowsher

On 07/09/10 23:20, Max Bowsher wrote:
> On 07/09/10 23:15, Robert Haas wrote:
>> On Tue, Sep 7, 2010 at 5:47 PM, Tom Lane  wrote:
>>> BTW, why is this commit shown as being a predecessor of refs/tags/REL8_4_4
>>> and not refs/tags/REL8_4_3?  That's nothing to do with it.po, perhaps,
>>> but it sure looks wrong.  (Magnus, did you check against the 8.4.3 tarball?)
>>
>> I think this is another result of the same basic problem.  Since
>> cvs2git thinks it.po was added to REL8_4_STABLE on 2010-02-28 rather
> >> than 2010-05-13, the REL8_4_STABLE version that existed on
>> 2010-03-12, when 8.4.3 was tagged, includes that file.  But cvs2git
>> also knows that 8.4.3 does NOT include that file, so it picks the
>> commit on the 8.4.3 branch that most closely matches the contents of
>> the tag (namely, Marc's "tag 8.4.3" commit) and then shoves a
>> manufactured commit on top of that to make the contents of the 8.4.3
>> tag match what actually got tagged.  But that manufactured commit is
>> only there to make the tag contents match; it's not actually part of
>> the branch.  If the conversion correctly made it.po get added on
>> 2010-05-13 rather than 2010-02-28 then Marc's "tag 8.4.3" commit would
>> match the tag contents exactly and no manufactured commit would be
>> created.
> 
> Yes, this is the correct analysis.
> 
>> The effect of all of this is that if someone checks out a git commit
>> between 2010-02-28 and 2010-05-13, it.po will be there, even though
> >> the file didn't exist on that CVS branch at that time.  Max's contention
> >> seems to be that this is a CVS problem rather than a cvs2git problem.
> >> Perhaps we can do something like cvs update -r REL8_4_STABLE -D
>> SOME_INTERMEDIATE_DATE and see whether that file is there or not.
> 
> $ cvs co -r REL8_4_STABLE -D "2010-04-01" pgsql
> ...
> $ ls -la pgsql/src/bin/pg_dump/po/it.po
> -rw-r--r-- 1 maxb maxb 67871 2010-02-19 00:40 pgsql/src/bin/pg_dump/po/it.po
> 
> It's there.


And, I've just tracked down that this bug was apparently fixed in CVS
1.11.18, released November 2004.

Max.




Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Robert Haas  writes:
> On Tue, Sep 7, 2010 at 5:47 PM, Tom Lane  wrote:
>> BTW, why is this commit shown as being a predecessor of refs/tags/REL8_4_4
>> and not refs/tags/REL8_4_3?

> I think this is another result of the same basic problem.  Since
> cvs2git thinks it.po was added to REL8_4_STABLE on 2010-02-28 rather
> than 2010-05-13, the REL8_4_STABLE version that existed on
> 2010-03-12, when 8.4.3 was tagged, includes that file.  But cvs2git
> also knows that 8.4.3 does NOT include that file, so it picks the
> commit on the 8.4.3 branch that most closely matches the contents of
> the tag (namely, Marc's "tag 8.4.3" commit) and then shoves a
> manufactured commit on top of that to make the contents of the 8.4.3
> tag match what actually got tagged.  But that manufactured commit is
> only there to make the tag contents match; it's not actually part of
> the branch.  If the conversion correctly made it.po get added on
> 2010-05-13 rather than 2010-02-28 then Marc's "tag 8.4.3" commit would
> match the tag contents exactly and no manufactured commit would be
> created.

Hmm.  Some further looking in the git log output shows that that
"manufactured commit" is actually the ONLY commit shown as being a
predecessor of REL8_4_3.  Everything else after 8.4.2 was tagged is
shown as reached from refs/tags/REL8_4_4.  This is at the least pretty
weird, and I have to suppose it's the manufactured commit causing it.
It does appear to agree with your explanation: the "8.4.3" state is
not part of the branch's main evolution, but is a little side branch
all by itself.

> The effect of all of this is that if someone checks out a git commit
> between 2010-02-28 and 2010-05-13, it.po will be there, even though
> the file didn't exist on that CVS branch at that time.

Yeah, that's what it's doing for me.

> Max's contention
> seems to be that this is a CVS problem rather than a cvs2git problem.

No doubt.  However, the facts on the ground are that it.po is provably
not there in REL8_4_0, REL8_4_1, REL8_4_2, or REL8_4_3, and is there in
REL8_4_4, and that no commit on the branch touched it before 2010-05-13
(just before 8.4.4).  I will be interested to see the argument why
cvs2git should consider the sanest translation of these facts to involve
adding it.po to the branch after 8.4.2 and removing it again before
8.4.3.

regards, tom lane


Re: [HACKERS] "Freezing" per-role settings

2010-09-07 Thread Jeff Davis

On Tue, 2010-09-07 at 14:49 -0700, David Fetter wrote:
> There are two problems at hand here, as I see it: the more general
> problem of "freezing" settings for a given role, and the very specific
> capability of guaranteeing read-only-ness, which could have large
> implications in, for example, data warehousing and replication
> systems.
> 
> Should we just call them separate problems, look into how to approach
> the latter one, and table the former?

That sounds like a good plan. Right now, the only solid use case we have
is "read only role", and it's difficult to build a (good) general
mechanism from a single use case.

Regards,
Jeff Davis



Re: [HACKERS] git: uh-oh

2010-09-07 Thread Max Bowsher

On 07/09/10 23:15, Robert Haas wrote:
> On Tue, Sep 7, 2010 at 5:47 PM, Tom Lane  wrote:
>> BTW, why is this commit shown as being a predecessor of refs/tags/REL8_4_4
>> and not refs/tags/REL8_4_3?  That's nothing to do with it.po, perhaps,
>> but it sure looks wrong.  (Magnus, did you check against the 8.4.3 tarball?)
> 
> I think this is another result of the same basic problem.  Since
> cvs2git thinks it.po was added to REL8_4_STABLE on 2010-02-28 rather
> than 2010-05-13, the REL8_4_STABLE version that existed on
> 2010-03-12, when 8.4.3 was tagged, includes that file.  But cvs2git
> also knows that 8.4.3 does NOT include that file, so it picks the
> commit on the 8.4.3 branch that most closely matches the contents of
> the tag (namely, Marc's "tag 8.4.3" commit) and then shoves a
> manufactured commit on top of that to make the contents of the 8.4.3
> tag match what actually got tagged.  But that manufactured commit is
> only there to make the tag contents match; it's not actually part of
> the branch.  If the conversion correctly made it.po get added on
> 2010-05-13 rather than 2010-02-28 then Marc's "tag 8.4.3" commit would
> match the tag contents exactly and no manufactured commit would be
> created.

Yes, this is the correct analysis.

> The effect of all of this is that if someone checks out a git commit
> between 2010-02-28 and 2010-05-13, it.po will be there, even though
> the file didn't exist on that CVS branch at that time.  Max's contention
> seems to be that this is a CVS problem rather than a cvs2git problem.
> Perhaps we can do something like cvs update -r REL8_4_STABLE -D
> SOME_INTERMEDIATE_DATE and see whether that file is there or not.

$ cvs co -r REL8_4_STABLE -D "2010-04-01" pgsql
...
$ ls -la pgsql/src/bin/pg_dump/po/it.po
-rw-r--r-- 1 maxb maxb 67871 2010-02-19 00:40 pgsql/src/bin/pg_dump/po/it.po

It's there.

Max.




Re: [HACKERS] git: uh-oh

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 5:47 PM, Tom Lane  wrote:
> BTW, why is this commit shown as being a predecessor of refs/tags/REL8_4_4
> and not refs/tags/REL8_4_3?  That's nothing to do with it.po, perhaps,
> but it sure looks wrong.  (Magnus, did you check against the 8.4.3 tarball?)

I think this is another result of the same basic problem.  Since
cvs2git thinks it.po was added to REL8_4_STABLE on 2010-02-28 rather
than 2010-05-13, the REL8_4_STABLE version that existed on
2010-03-12, when 8.4.3 was tagged, includes that file.  But cvs2git
also knows that 8.4.3 does NOT include that file, so it picks the
commit on the 8.4.3 branch that most closely matches the contents of
the tag (namely, Marc's "tag 8.4.3" commit) and then shoves a
manufactured commit on top of that to make the contents of the 8.4.3
tag match what actually got tagged.  But that manufactured commit is
only there to make the tag contents match; it's not actually part of
the branch.  If the conversion correctly made it.po get added on
2010-05-13 rather than 2010-02-28 then Marc's "tag 8.4.3" commit would
match the tag contents exactly and no manufactured commit would be
created.

The effect of all of this is that if someone checks out a git commit
between 2010-02-28 and 2010-05-13, it.po will be there, even though the
file didn't exist on that CVS branch at that time.  Max's contention
seems to be that this is a CVS problem rather than a cvs2git problem.
Perhaps we can do something like cvs update -r REL8_4_STABLE -D
SOME_INTERMEDIATE_DATE and see whether that file is there or not.
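
The tag fix-up described above amounts to a set difference between the
branch's file set at tagging time and the files the tag actually carries.
A minimal sketch in Python, as a toy model and not cvs2git's real code; the
function name tag_fixup and the shortened file lists are assumptions made
purely for illustration:

```python
def tag_fixup(branch_files, tag_files):
    """Return (to_delete, to_add): the changes a manufactured commit must
    make so that the branch snapshot matches the tagged file set."""
    branch, tag = set(branch_files), set(tag_files)
    return sorted(branch - tag), sorted(tag - branch)


# Branch snapshot if it.po is (wrongly) dated 2010-02-28 on REL8_4_STABLE.
branch_wrong = ["configure", "src/bin/pg_dump/po/it.po"]
# Branch snapshot if it.po is dated 2010-05-13, after 8.4.3 was tagged.
branch_right = ["configure"]
# Files present in the REL8_4_3 tag (illustrative subset only).
tag_8_4_3 = ["configure"]

print(tag_fixup(branch_wrong, tag_8_4_3))  # a delete of it.po is required
print(tag_fixup(branch_right, tag_8_4_3))  # nothing to fix up
```

With the wrong date the difference is non-empty, which is why a
manufactured commit appears; with the right date it would be empty.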

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Max Bowsher

On 07/09/10 21:25, Magnus Hagander wrote:
> On Tue, Sep 7, 2010 at 22:06, Robert Haas  wrote:
>> On Tue, Sep 7, 2010 at 10:08 AM, Robert Haas  wrote:
>>> On Tue, Sep 7, 2010 at 9:56 AM, Magnus Hagander  wrote:
>>>> You're saying you don't "require" a fix on the latest issue here? Or
>>>> should we spend some time trying to figure out if we can fix it with
>>>> git-filter-branch?
>>>
>>> I think that "the latest issue here" is the issue of how files get
>>> added to branches, which we discussed before with pretty much the same
>>> set of conclusions.  I'm not wild about the way that's getting
>>> converted, but I'm not sure I care enough about it to argue with Tom.
>>> However, I want to convince myself that the deletes we've done over
>>> the years have been properly handled.  I need to look at Max's latest
>>> conversion and I'll look at yours as well.
>>
>> Magnus -
>>
>> I just looked at your latest conversion (based on what Max did) and it
>> looks a lot better.  I think, though, that we should re-remove these
>> branches:
>>
>>  origin/unlabeled-1.44.2
>>  origin/unlabeled-1.51.2
>>  origin/unlabeled-1.59.2
>>  origin/unlabeled-1.87.2
>>  origin/unlabeled-1.90.2
> 
> Oh yeah, I did the push before I ran that step of my script. Oops, sorry.
> 

Speaking of which, could you update the public copy of all the
conversion documentation / machinery?

Thanks,
Max.




Re: [HACKERS] "Freezing" per-role settings

2010-09-07 Thread David Fetter

On Tue, Sep 07, 2010 at 02:43:12PM -0700, Jeff Davis wrote:
> On Tue, 2010-09-07 at 13:30 -0700, David Fetter wrote:
> > Offhand, I'm not thinking of past examples of mutating/disappearing
> > GUC that people would want to freeze, nor of a new GUC that would
> > negate or substantially alter such freezing.  What have I missed?
> 
> If you'll allow me to change my argument slightly, it just seems
> chaotic. We'd be introducing the 100+ GUCs all as potential security
> features, and it would (presumably) be up to the user whether they
> considered it a "security feature" or not. I think, in practice, that
> would confuse users about the security of the system, and we'd be more
> reluctant to change GUC behavior because someone, somewhere, might have
> considered it a part of their system's security.
> 
> Perhaps someone will assume that they can prevent a user from performing
> joins by disabling and freezing enable_hashjoin/nestloop/mergejoin. Or
> perhaps someone will try to contain a user to a few schemas by freezing
> the search_path. Maybe this is a little far-fetched, but the point is
> that we are quite a ways away from blessing all GUCs with a word like
> "security".
> 
> It just seems like the wrong mechanism.

OK :)

> > > It makes more sense to tie it to the role directly, so DDL.
> > 
> > There are still arguments for making it DCL-ish, in the sense that
> > it is, at least in this case, viewable as a data control issue.
> 
> I would be more open to it if it didn't rely on GUCs at all.

There are two problems at hand here, as I see it: the more general
problem of "freezing" settings for a given role, and the very specific
capability of guaranteeing read-only-ness, which could have large
implications in, for example, data warehousing and replication
systems.

Should we just call them separate problems, look into how to approach
the latter one, and table the former?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Max Bowsher  writes:
> It wouldn't - except for the fact that cvs2git batches such manufactured
> commits such that there is no guarantee that a single manufactured
> commit pertains only to files in the commit immediately afterwards. For
> example, consider the it.po file in the commit referenced in this thread
> yesterday:

OK, I looked at this example, and I'm confused again.  The actual 8.4
history of src/bin/pg_dump/po/it.po is that it was removed from HEAD
on 2009-06-26, before the 8.4 branch was split off; and then re-added to
the 8.4 branch on 2010-05-13, just before 8.4.4 was tagged.  See
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/bin/pg_dump/po/it.po

Looking at Max's conversion with git log --all --source --name-status,
this file is shown as modified in the latter commit:

commit 575981a2fd6da5ccbf75c57580bf2d98b41f936e refs/tags/REL8_4_4
Author: Peter Eisentraut 
Date:   Thu May 13 10:50:20 2010 +

Translation update

...
M   src/bin/pg_dump/po/it.po
...

The deletion is correctly shown here:

commit 4ade8dc6f7030b14306916b787fa8f75e4d49b2e refs/tags/REL8_4_0
Author: Peter Eisentraut 
Date:   Fri Jun 26 19:33:52 2009 +

Translation updates for 8.4 release.

File that are translated less than 80% have been removed, as per new
translation team policy.

...
D   src/bin/pg_dump/po/it.po
...

Now I can find two intermediate commits that touched this file:

commit b78e79ec74fd4fac0c24753bbf8fa69fe7e6feb9 refs/tags/REL8_4_3
Author: PostgreSQL Daemon 
Date:   Fri Mar 12 03:23:24 2010 +

This commit was manufactured by cvs2svn to create tag 'REL8_4_3'.

Sprout from REL8_4_STABLE 2010-03-12 03:23:23 UTC Marc G. Fournier 
 ''
Delete:
src/bin/pg_dump/po/it.po

D   src/bin/pg_dump/po/it.po

commit b36518cb880bb236496ec3e505ede4001ce56157 refs/tags/REL8_4_4
Author: PostgreSQL Daemon 
Date:   Sun Feb 28 21:32:02 2010 +

This commit was manufactured by cvs2svn to create branch 'REL8_4_STABLE'.

Cherrypick from master 2010-02-28 21:31:57 UTC Tom Lane 
 'Fix up memory management problems in contrib/xml2.':
contrib/xml2/expected/xml2.out
contrib/xml2/sql/xml2.sql
src/bin/pg_dump/po/it.po

A   contrib/xml2/expected/xml2.out
A   contrib/xml2/sql/xml2.sql
A   src/bin/pg_dump/po/it.po

Now it seems to me that this is just totally wacko.  In the first place,
the commit "manufactured by cvs2svn to create tag 'REL8_4_3'" postdates
the commit where Marc actually tagged 8.4.3:

commit 3aa54912637319c516f59d3a0265cb7826ed125f refs/tags/REL8_4_4
Author: Marc G. Fournier 
Date:   Fri Mar 12 03:23:23 2010 +

tag 8.4.3

M   configure
M   configure.in
M   doc/bug.template
M   src/include/pg_config.h.win32
M   src/interfaces/libpq/libpq.rc.in
M   src/port/win32ver.rc

BTW, why is this commit shown as being a predecessor of refs/tags/REL8_4_4
and not refs/tags/REL8_4_3?  That's nothing to do with it.po, perhaps,
but it sure looks wrong.  (Magnus, did you check against the 8.4.3 tarball?)

But the main gripe is: how can it be claimed to be sane to represent the
revision history as being that it.po was added to 8.4.4 two weeks before
it was deleted from 8.4.3?

There is definitely *something* not kosher about the manufactured-commit
logic.

regards, tom lane


Re: [HACKERS] "Freezing" per-role settings

2010-09-07 Thread Jeff Davis

On Tue, 2010-09-07 at 13:30 -0700, David Fetter wrote:
> Offhand, I'm not thinking of past examples of mutating/disappearing
> GUC that people would want to freeze, nor of a new GUC that would
> negate or substantially alter such freezing.  What have I missed?

If you'll allow me to change my argument slightly, it just seems
chaotic. We'd be introducing the 100+ GUCs all as potential security
features, and it would (presumably) be up to the user whether they
considered it a "security feature" or not. I think, in practice, that
would confuse users about the security of the system, and we'd be more
reluctant to change GUC behavior because someone, somewhere, might have
considered it a part of their system's security.

Perhaps someone will assume that they can prevent a user from performing
joins by disabling and freezing enable_hashjoin/nestloop/mergejoin. Or
perhaps someone will try to contain a user to a few schemas by freezing
the search_path. Maybe this is a little far-fetched, but the point is
that we are quite a ways away from blessing all GUCs with a word like
"security".

It just seems like the wrong mechanism.

> > It makes more sense to tie it to the role directly, so DDL.
> 
> There are still arguments for making it DCL-ish, in the sense that it
> is, at least in this case, viewable as a data control issue.

I would be more open to it if it didn't rely on GUCs at all.

Regards,
Jeff Davis


Re: [HACKERS] function_name.parameter_name

2010-09-07 Thread David E. Wheeler

I think so. Try it!

David

On Sep 7, 2010, at 11:39 AM, Sergey Konoplev wrote:

> Hi,
> 
> On 7 September 2010 20:35, Tom Lane  wrote:
>> How does $subject differ from what we already do?  See
>> http://www.postgresql.org/docs/9.0/static/plpgsql-structure.html
> 
> So will it be possible to do things like this?
> 
> 1.
> CREATE FUNCTION func_name(arg_name text) RETURNS integer AS $$
> BEGIN
> RAISE INFO '%', func_name.arg_name;
> ...
> 
> 2.
> CREATE FUNCTION func_name() RETURNS integer AS $$
> DECLARE
> var_name text := 'bla';
> BEGIN
> RAISE INFO '%', func_name.var_name;
> ...
> 
> 3.
> CREATE FUNCTION func_very_very_very_very_long_name() RETURNS integer AS $$
> << func_alias >>
> DECLARE
> var_name text := 'bla';
> BEGIN
> RAISE INFO '%', func_alias.var_name;





Re: [HACKERS] git: uh-oh

2010-09-07 Thread Magnus Hagander

On Tue, Sep 7, 2010 at 22:16, Tom Lane  wrote:
> Robert Haas  writes:
>> I just looked at your latest conversion (based on what Max did) and it
>> looks a lot better.  I think, though, that we should re-remove these
>> branches:
>
>>   origin/unlabeled-1.44.2
>>   origin/unlabeled-1.51.2
>>   origin/unlabeled-1.59.2
>>   origin/unlabeled-1.87.2
>>   origin/unlabeled-1.90.2
>
> I haven't looked at Magnus' latest iteration, but in Max's version
> this was showing as a branch:
>
>  remotes/origin/REL8_0_0
>
> AFAIK that was simply a mistake: it was intended to be a tag not a
> branch.  If it's feasible to downgrade it to a tag during the
> conversion, that would be a good thing to do.

Should be doable with a simple:
git tag REL8_0_0 REL8_0_0
git branch -D REL8_0_0

I'll try that and re-run my content-verification script on top of that.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] "Freezing" per-role settings

2010-09-07 Thread David Fetter

On Tue, Sep 07, 2010 at 12:41:51PM -0700, Jeff Davis wrote:
> On Tue, 2010-09-07 at 11:39 -0700, David Fetter wrote:
> > We'd like to create a role called read_only, with eponymous
> > capability.
> 
> Seems useful.

Great to hear :)

> > If so, is it more
> > DCL-ish, or more DDL-ish?
> 
> I don't like the idea of a security model relying on the ability (or
> lack thereof) to set GUCs. Imagine the effects of adding new GUCs,
> removing old ones, changing a GUC name, or tweaking the behavior
> slightly.

Offhand, I'm not thinking of past examples of mutating/disappearing
GUC that people would want to freeze, nor of a new GUC that would
negate or substantially alter such freezing.  What have I missed?

> It makes more sense to tie it to the role directly, so DDL.

There are still arguments for making it DCL-ish, in the sense that it
is, at least in this case, viewable as a data control issue.

> Also, you should put this in the context of previous discussions, which
> led to the "ON ALL TABLES IN SCHEMA" feature in 9.0. In particular,
> that feature only affects existing objects, and you are trying to create
> some kind of permissions mask which will affect new objects, as well.

I guess I can see a case for making "read-only" non-global, but I
think a good first try at it would be to make such "freezes" global.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Magnus Hagander

On Tue, Sep 7, 2010 at 22:06, Robert Haas  wrote:
> On Tue, Sep 7, 2010 at 10:08 AM, Robert Haas  wrote:
>> On Tue, Sep 7, 2010 at 9:56 AM, Magnus Hagander  wrote:
>>> You're saying you don't "require" a fix on the latest issue here? Or
>>> should we spend some time trying to figure out if we can fix it with
>>> git-filter-branch?
>>
>> I think that "the latest issue here" is the issue of how files get
>> added to branches, which we discussed before with pretty much the same
>> set of conclusions.  I'm not wild about the way that's getting
>> converted, but I'm not sure I care enough about it to argue with Tom.
>> However, I want to convince myself that the deletes we've done over
>> the years have been properly handled.  I need to look at Max's latest
>> conversion and I'll look at yours as well.
>
> Magnus -
>
> I just looked at your latest conversion (based on what Max did) and it
> looks a lot better.  I think, though, that we should re-remove these
> branches:
>
>  origin/unlabeled-1.44.2
>  origin/unlabeled-1.51.2
>  origin/unlabeled-1.59.2
>  origin/unlabeled-1.87.2
>  origin/unlabeled-1.90.2

Oh yeah, I did the push before I ran that step of my script. Oops, sorry.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Robert Haas  writes:
> I just looked at your latest conversion (based on what Max did) and it
> looks a lot better.  I think, though, that we should re-remove these
> branches:

>   origin/unlabeled-1.44.2
>   origin/unlabeled-1.51.2
>   origin/unlabeled-1.59.2
>   origin/unlabeled-1.87.2
>   origin/unlabeled-1.90.2

I haven't looked at Magnus' latest iteration, but in Max's version
this was showing as a branch:

  remotes/origin/REL8_0_0

AFAIK that was simply a mistake: it was intended to be a tag not a
branch.  If it's feasible to downgrade it to a tag during the
conversion, that would be a good thing to do.

regards, tom lane


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 4:06 PM, marcin mank  wrote:
> On Tue, Sep 7, 2010 at 5:17 PM, Tom Lane  wrote:
>> We can *not* allow the slave to replay WAL ahead of what is known
>> committed to disk on the master.  The only way to make that safe
>> is the compare-notes-and-ship-WAL-back approach that Robert mentioned.
>>
>> If you feel that decoupling WAL application is absolutely essential
>> to have a credible feature, then you'd better bite the bullet and
>> start working on the ship-WAL-back code.
>>
>
> In the mode where it is not required that the WAL is applied (only
> sent to the slave / synced to slave disk) one alternative is to have a
> separate pointer to the last WAL record that can be safely applied on
> the slave. Then you can send the un-synced WAL to the slave (while
> concurrently syncing it on the master). When both the slave and the
> master sync complete, one can give the client a commit notification,
> increase the pointer, and send it to the slave (it would be a separate
> WAL record type I guess).
>
> In case of master failure, the slave can discard the un-applied WAL
> after the pointer.

But the pointer on the slave has to be fsync'd to make it persistent,
which likely takes roughly the same amount of time as fsync-ing the
WAL itself.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 10:08 AM, Robert Haas  wrote:
> On Tue, Sep 7, 2010 at 9:56 AM, Magnus Hagander  wrote:
>> You're saying you don't "require" a fix on the latest issue here? Or
>> should we spend some time trying to figure out if we can fix it with
>> git-filter-branch?
>
> I think that "the latest issue here" is the issue of how files get
> added to branches, which we discussed before with pretty much the same
> set of conclusions.  I'm not wild about the way that's getting
> converted, but I'm not sure I care enough about it to argue with Tom.
> However, I want to convince myself that the deletes we've done over
> the years have been properly handled.  I need to look at Max's latest
> conversion and I'll look at yours as well.

Magnus -

I just looked at your latest conversion (based on what Max did) and it
looks a lot better.  I think, though, that we should re-remove these
branches:

  origin/unlabeled-1.44.2
  origin/unlabeled-1.51.2
  origin/unlabeled-1.59.2
  origin/unlabeled-1.87.2
  origin/unlabeled-1.90.2

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread marcin mank

On Tue, Sep 7, 2010 at 5:17 PM, Tom Lane  wrote:
> We can *not* allow the slave to replay WAL ahead of what is known
> committed to disk on the master.  The only way to make that safe
> is the compare-notes-and-ship-WAL-back approach that Robert mentioned.
>
> If you feel that decoupling WAL application is absolutely essential
> to have a credible feature, then you'd better bite the bullet and
> start working on the ship-WAL-back code.
>

In the mode where it is not required that the WAL is applied (only
sent to the slave / synced to slave disk) one alternative is to have a
separate pointer to the last WAL record that can be safely applied on
the slave. Then you can send the un-synced WAL to the slave (while
concurrently syncing it on the master). When both the slave and the
master sync complete, one can give the client a commit notification,
increase the pointer, and send it to the slave (it would be a separate
WAL record type I guess).

In case of master failure, the slave can discard the un-applied WAL
after the pointer.
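
The scheme described above can be sketched as a toy model in Python. This is
only an illustration under stated assumptions: the class Standby, its method
names and the string "records" are all hypothetical, and the pointer is kept
in memory only, so the sketch says nothing about how to make it durable on
the slave.

```python
class Standby:
    """Toy standby: receives WAL early, applies only up to a safe pointer."""

    def __init__(self):
        self.received = []   # WAL records in arrival order
        self.safe_upto = 0   # number of records known synced on both sides
        self.applied = 0     # number of records already replayed

    def receive(self, record):
        # Un-synced WAL may arrive before the master has fsync'd it.
        self.received.append(record)

    def advance_safe_pointer(self, upto):
        # Sent by the master once both syncs for these records completed.
        self.safe_upto = max(self.safe_upto, min(upto, len(self.received)))
        self.applied = self.safe_upto   # replay everything now known safe

    def on_master_failure(self):
        # Discard WAL beyond the safe pointer; it may never have reached
        # the master's disk.
        del self.received[self.safe_upto:]
        return list(self.received)


standby = Standby()
standby.receive("commit A")
standby.receive("commit B")
standby.advance_safe_pointer(1)   # master reports: A is synced on both sides
survivors = standby.on_master_failure()
print(standby.applied, survivors)
```

Only "commit A" survives the simulated failure, because "commit B" was
received but never covered by the pointer.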

Greetings
marcin


Re: [HACKERS] "Freezing" per-role settings

2010-09-07 Thread Jeff Davis

On Tue, 2010-09-07 at 11:39 -0700, David Fetter wrote:
> We'd like to create a role called read_only, with eponymous
> capability.

Seems useful.

> If so, is it more
> DCL-ish, or more DDL-ish?

I don't like the idea of a security model relying on the ability (or
lack thereof) to set GUCs. Imagine the effects of adding new GUCs,
removing old ones, changing a GUC name, or tweaking the behavior
slightly. It makes more sense to tie it to the role directly, so DDL.

Also, you should put this in the context of previous discussions, which
led to the "ON ALL TABLES IN SCHEMA" feature in 9.0. In particular,
that feature only affects existing objects, and you are trying to create
some kind of permissions mask which will affect new objects, as well.

Regards,
Jeff Davis


Re: [HACKERS] UTF16 surrogate pairs in UTF8 encoding

2010-09-07 Thread Peter Eisentraut

On Sun, 2010-08-22 at 15:15 -0400, Tom Lane wrote:
> > We combine the surrogate pair components to a single code point and
> > encode that in UTF-8.  We don't encode the components separately;
> > that would be wrong.
> 
> Oh, OK.  Should the docs make that a bit clearer?

Done.
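
For reference, the combining step quoted above follows the standard UTF-16
arithmetic. A minimal sketch in Python, not the PostgreSQL implementation;
the function name combine_surrogates and the sample pair are chosen here for
illustration:

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into a single Unicode code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid UTF-16 surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


code_point = combine_surrogates(0xD83D, 0xDE00)   # U+1F600
utf8 = chr(code_point).encode("utf-8")            # one 4-byte sequence
print(hex(code_point), utf8.hex())
```

The pair becomes one code point and therefore one four-byte UTF-8 sequence,
rather than two separately encoded three-byte sequences.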



Re: [HACKERS] function_name.parameter_name

2010-09-07 Thread Sergey Konoplev

Hi,

On 7 September 2010 20:35, Tom Lane  wrote:
> How does $subject differ from what we already do?  See
> http://www.postgresql.org/docs/9.0/static/plpgsql-structure.html

So will it be possible to do things like this?

1.
CREATE FUNCTION func_name(arg_name text) RETURNS integer AS $$
BEGIN
RAISE INFO '%', func_name.arg_name;
...

2.
CREATE FUNCTION func_name() RETURNS integer AS $$
DECLARE
var_name text := 'bla';
BEGIN
RAISE INFO '%', func_name.var_name;
...

3.
CREATE FUNCTION func_very_very_very_very_long_name() RETURNS integer AS $$
<< func_alias >>
DECLARE
var_name text := 'bla';
BEGIN
RAISE INFO '%', func_alias.var_name;
...


-- 
Sergey Konoplev

Blog: http://gray-hemp.blogspot.com /
Linkedin: http://ru.linkedin.com/in/grayhemp /
JID/GTalk: gray...@gmail.com / Skype: gray-hemp / ICQ: 29353802


[HACKERS] "Freezing" per-role settings

2010-09-07 Thread David Fetter

Folks,

I noticed a little unimplemented feature which I suspect a lot of
people would find useful, namely the ability to "freeze" certain
settings for a role.

Example: We'd like to create a role called read_only, with eponymous
capability.  At the moment, we can't do what's below, but I'd like to
be able to make it possible.  First, we'd issue the following, which
doesn't work yet:

ALTER ROLE read_only SET transaction_isolation read_only;

Then, there's one way via DCL (Data Control Language):

REVOKE SET transaction_isolation FROM read_only;

Another would be via DDL:

ALTER ROLE read_only FREEZE transaction_isolation;

I'd think of the reverse of each of these as GRANT and ALTER ...  THAW,
respectively.
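
The intended semantics can be modelled outside the server as a small sketch
in Python. Everything here is hypothetical: the class RoleSettings, its
methods and the value "read_write" are invented for illustration, and nothing
below corresponds to existing PostgreSQL syntax or behavior.

```python
class RoleSettings:
    """Toy model of per-role settings that can be frozen and thawed."""

    def __init__(self):
        self.values = {}
        self.frozen = set()

    def set(self, name, value):
        # A frozen setting may not be changed until it is thawed.
        if name in self.frozen:
            raise PermissionError(f"setting {name!r} is frozen for this role")
        self.values[name] = value

    def freeze(self, name):
        self.frozen.add(name)

    def thaw(self, name):
        self.frozen.discard(name)


read_only = RoleSettings()
read_only.set("transaction_isolation", "read_only")
read_only.freeze("transaction_isolation")
try:
    read_only.set("transaction_isolation", "read_write")
    changed_while_frozen = True
except PermissionError:
    changed_while_frozen = False
read_only.thaw("transaction_isolation")
read_only.set("transaction_isolation", "read_write")  # allowed after thaw
```

In this model the DCL and DDL spellings would differ only in how freeze and
thaw are invoked, not in what they do.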

Is anyone else interested in such a feature?  If so, is it more
DCL-ish, or more DDL-ish?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Max Bowsher  writes:
> On 07/09/10 18:16, Tom Lane wrote:
>> Hmm, I see.  This depends on the fact that git commits reference
>> filesystem states and not deltas, correct?  So it does actually make
>> sense to just delete that commit from the history.  I was concerned
>> that it'd invalidate later commits, but I guess it doesn't.

> It wouldn't - except for the fact that cvs2git batches such manufactured
> commits such that there is no guarantee that a single manufactured
> commit pertains only to files in the commit immediately afterwards.

Hmm ... so the consequence of that would be that (in this example) it.po
would show up as being part of the REL8_4_STABLE file set as of that
commit, rather than as of the later commit where it really got added.
That's kind of annoying, but it is not a showstopper I think.  Recall
that the goals we set for this conversion in the first place were
(1) duplicate the file set as of any back release tag and (2) duplicate
the CVS log history as nearly as practical.  We know we have met (1),
because Magnus explicitly tested that.  IMO we have met (2) adequately
as well, with or without any fix for the manufactured-commit issue.

On reflection it might be better to leave well enough alone, though.
Anybody looking at the "real commit" in future might be confused by
the fact that it added a seemingly unrelated file.  It would be less
confusing to have an obviously made-up commit adding some files,
probably.

A compromise might be to excise only those manufactured commits that
added files directly related to the following real commit.  I haven't
looked to see how many there are that grouped unrelated files.

regards, tom lane


Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Pavel Stehule

2010/9/7 Robert Haas :
> On Tue, Sep 7, 2010 at 12:44 PM, Pavel Stehule  wrote:
>>> I don't see how you could do anything with this that you can't do with
>>> the existing implementation.  It's not as if you can store pointers
>>> into an mmap'd block and then count on them being valid the next time
>>> you map the file...  it might not end up at the same offset.
>>
>> you can, but you have to do preallocation and you have to use a FIXED flag.
>
> MAP_FIXED?  As TFM says: "Because requiring a fixed address for a
> mapping is less portable, the use of this option  is  discouraged."

yes, I know. This will be used for proprietary Czech language support - 95% of
postgresql installations are on Linux, 10% on MS Windows (in the Czech
Republic)

I don't plan to try to move this module to core. It would be useless there -
other languages don't have our problems.

Regards

Pavel

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>


Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 12:44 PM, Pavel Stehule  wrote:
>> I don't see how you could do anything with this that you can't do with
>> the existing implementation.  It's not as if you can store pointers
>> into an mmap'd block and then count on them being valid the next time
>> you map the file...  it might not end up at the same offset.
>
> you can, but you have to do preallocation and you have to use a FIXED flag.

MAP_FIXED?  As TFM says: "Because requiring a fixed address for a
mapping is less portable, the use of this option  is  discouraged."
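
A common portable alternative to a fixed mapping address (not something
proposed in this thread) is to store offsets relative to the start of the
mapping instead of raw pointers. A minimal sketch in Python, assuming a
scratch file from tempfile; the header layout and the helper names
write_block and read_block are invented for illustration:

```python
import mmap
import os
import struct
import tempfile

SIZE = 4096


def write_block(path):
    # Preallocate the file, then map it and store a payload plus a header.
    with open(path, "wb") as f:
        f.truncate(SIZE)
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), SIZE) as m:
        payload = b"hello"
        payload_off = 64   # offset from the start of the mapping
        m[payload_off:payload_off + len(payload)] = payload
        # Header holds relative values only: payload offset and length.
        struct.pack_into("<II", m, 0, payload_off, len(payload))
        m.flush()


def read_block(path):
    # A fresh mapping may land at a different address; offsets still work.
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), SIZE) as m:
        off, length = struct.unpack_from("<II", m, 0)
        return bytes(m[off:off + length])


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    write_block(path)
    result = read_block(path)
finally:
    os.remove(path)
print(result)
```

The trade-off is an extra addition on every dereference, in exchange for not
depending on where the operating system places the mapping.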

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 2:15 PM, Simon Riggs  wrote:
> Every time I explain anything, I get someone run around shouting "but
> that can't work!". I'm sorry, but again your logic is poor and the bias
> against properly considering viable alternatives is the only thing
> perfectly evident. So yes, I agree, it is a waste of time discussing it
> until I show working code.

Obviously you don't "agree", because that's the exact opposite of what
I just said.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Max Bowsher

On 07/09/10 18:16, Tom Lane wrote:
> Michael Haggerty  writes:
>> Tom Lane wrote:
>>> What I'd like is for those commits to vanish from the git log entirely.
> 
>> It seems to me that in your case such commits could be "grafted over":
> 
>> *---*---*---*
>>  \
>>   A---B---C---D
> 
>> E.g., if "C" is one of these special manufactured commits, then you
>> could use git grafts to change the parent of "D" from "C" to "B", then
>> bake in the change with "git filter-branch".  This would make C
>> inaccessible and subject to garbage collection.
> 
> Hmm, I see.  This depends on the fact that git commits reference
> filesystem states and not deltas, correct?  So it does actually make
> sense to just delete that commit from the history.  I was concerned
> that it'd invalidate later commits, but I guess it doesn't.

It wouldn't - except for the fact that cvs2git batches such manufactured
commits such that there is no guarantee that a single manufactured
commit pertains only to files in the commit immediately afterwards. For
example, consider the it.po file in the commit referenced in this thread
yesterday:

commit b36518cb880bb236496ec3e505ede4001ce56157
Author: PostgreSQL Daemon 
Date:   Sun Feb 28 21:32:02 2010 +

This commit was manufactured by cvs2svn to create branch
'REL8_4_STABLE'.

Cherrypick from master 2010-02-28 21:31:57 UTC Tom Lane
 'Fix up memory management problems in contrib/xml2.':
contrib/xml2/expected/xml2.out
contrib/xml2/sql/xml2.sql
src/bin/pg_dump/po/it.po


Max.




Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-07 Thread Simon Riggs

On Tue, 2010-09-07 at 12:07 -0400, Robert Haas wrote:
> On Tue, Sep 7, 2010 at 11:59 AM, Simon Riggs  wrote:
> >> What I *think* you're saying is that the slave doesn't send per-commit
> >> messages, but instead processes the WAL as it's received and then sends
> >> a heres-where-I-am status message back upstream immediately before going
> >> to sleep waiting for the next chunk.  That's fine as far as the protocol
> >> goes, but I'm not convinced that it really does all that much in terms
> >> of improving performance.  You still have the problem that the master
> >> has to fsync its WAL before it can send it to the slave.  Also, the
> >> slave won't know whether it ought to fsync its own WAL before replying.
> >
> > Yes, apart from last sentence. Please wait for the code.
> 
> So, we're going around and around in circles here because you're
> repeatedly refusing to explain how the slave will know WHEN to send
> acknowledgments back to the master without knowing which sync rep
> level is in use.  It seems to be perfectly evident to everyone else
> here that there are only two ways for this to work: either the value
> is configured on the standby, or there's a registration system on the
> master and the master tells the standby its wishes.  Instead of asking
> the entire community to wait for an unspecified period of time for you
> to write code that will handle this in an unspecified way, how about
> answering the question?  We've wasted far too much time arguing about
> this already.

Every time I explain anything, I get someone run around shouting "but
that can't work!". I'm sorry, but again your logic is poor and the bias
against properly considering viable alternatives is the only thing
perfectly evident. So yes, I agree, it is a waste of time discussing it
until I show working code.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Markus Wanner


On 09/07/2010 05:55 PM, Markus Wanner wrote:
> Robert's argument

Sorry, I meant Ron.

Regards

Markus Wanner


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Markus Wanner


Hi,

On 09/07/2010 06:00 PM, Robert Haas wrote:
> People who are more concerned about performance than robustness aren't
> going to use sync rep in the first place.

I've been advocating sync (or eager, FWIW) replication for years now. One of
the hardest preconceptions I'm always confronted with is: this must
perform poorly!

Whether or not that's true depends, but my point is: people who need
that level of robustness certainly care about performance as well.
Telling them to use async replication instead is not an option. (The
ability to mix sync and async replication per transaction is one, BTW).

> They're going to run it in
> async, which will improve performance by FAR more than you'll ever be
> able to manage by deciding that you don't care about handling some of
> the failure cases correctly.


Running in async and then trying to achieve the required level of 
robustness in the application layer pretty certainly performs worse than 
a good sync replication implementation. Async only wins if you really 
don't care about the loss of transactions in the case of a failure. In 
every other case, robustness is better taken care of by the database 
system itself, IMO.


That being said, I certainly agree to do things step by step. And the 
ability to write to WAL and wait for ack from a standby concurrently can 
(and probably should) be considered an optimization, yes.


Regards

Markus Wanner


Re: [HACKERS] patch: tsearch - some memory diet

2010-09-07 Thread Tom Lane

Heikki Linnakangas  writes:
> A more general solution would be to have a new MemoryContext 
> implementation that does the same your patch does. Ie. instead of 
> tracking each allocation, just allocate a big chunk, and have palloc() 
> return the next n free bytes from it, like a stack. pfree() would 
> obviously not work, but wholesale MemoryContextDelete of the whole 
> memory context would.

The trick with that is to not crash horribly if pfree or
GetMemoryChunkSpace or GetMemoryChunkContext is applied to such a chunk.
Perhaps we can live without that requirement, but it greatly limits the
safe usage of such a context type.
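
As a rough illustration of the stack-style context discussed above, here is a minimal Python sketch. It is not PostgreSQL code: `BumpContext`, its methods, and the 8-byte alignment are hypothetical choices made for the example. It hands out offsets from one preallocated block, keeps no per-chunk header, and refuses an individual free with an error rather than misbehaving silently, which is the hazard raised in the paragraph above.

```python
class BumpContext:
    """Toy bump ("stack") allocator over one preallocated block.

    Hypothetical illustration only; this is not the PostgreSQL
    MemoryContext API.
    """

    ALIGN = 8  # assumed alignment, chosen for the example

    def __init__(self, size):
        self.block = bytearray(size)
        self.next_free = 0

    def alloc(self, nbytes):
        # Round the request up so every chunk starts on an aligned offset.
        rounded = (nbytes + self.ALIGN - 1) // self.ALIGN * self.ALIGN
        if self.next_free + rounded > len(self.block):
            raise MemoryError("block exhausted")
        offset = self.next_free
        self.next_free += rounded
        return offset  # no per-chunk header is stored

    def free(self, offset):
        # Without a chunk header there is nothing to look up, so refuse loudly.
        raise NotImplementedError("individual free is not supported")

    def reset(self):
        # Wholesale release, analogous to deleting the whole context.
        self.next_free = 0


ctx = BumpContext(64)
first = ctx.alloc(5)
second = ctx.alloc(10)
```

Because nothing is recorded per chunk, the space saving comes for free, but so does the restriction: anything that needs to ask "which context owns this chunk, and how big is it?" has no data to answer with.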

In the particular case here, the dictionary structures could probably
safely use such a context type, but I'm not sure it's worth bothering
if the long-term plan is to implement a precompiler.  There would be
no need for this after the precompiled representation is installed,
because that'd just be one big hunk of memory anyway.

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Michael Haggerty  writes:
> Tom Lane wrote:
>> What I'd like is for those commits to vanish from the git log entirely.

> It seems to me that in your case such commits could be "grafted over":

> *---*---*---*
>  \
>   A---B---C---D

> E.g., if "C" is one of these special manufactured commits, then you
> could use git grafts to change the parent of "D" from "C" to "B", then
> bake in the change with "git filter-branch".  This would make C
> inaccessible and subject to garbage collection.

Hmm, I see.  This depends on the fact that git commits reference
filesystem states and not deltas, correct?  So it does actually make
sense to just delete that commit from the history.  I was concerned
that it'd invalidate later commits, but I guess it doesn't.

> But please check by hand to make sure that this makes sense; for
> example, it could be that other branches in the neighborhood make the
> excision impossible.

Since we weren't doing merging, nor branching off from back branches,
I'm having a hard time seeing how there'd be any risk there.  Is there
a case I'm missing?

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Max Bowsher  writes:
> On 07/09/10 16:47, Tom Lane wrote:
>> Max Bowsher  writes:
>>> ... Just as soon as I can figure out how
>>> to cleanly fit that into cvs2git's structure, I want it to change the
>>> word "create" to "update" in most of those commits.

>> I thought all of those message texts were taken from the configuration
>> file.

> Yes, but currently these two cases both reference the same entry in the
> configuration file.

Oh, I misunderstood the "most" bit ;-)

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Michael Haggerty

Tom Lane wrote:
> Magnus Hagander  writes:
>> On Tue, Sep 7, 2010 at 17:07, Tom Lane  wrote:
>>> Look for
>>>This commit was manufactured by cvs2svn to create branch ...
> 
>> Ok, found a bunch of those (78 to be exact). And the issue with them
>> is we want to change the commit author on them to be whomever made the
>> first commit on the branch *after* that?
> 
> What I'd like is for those commits to vanish from the git log entirely.
> 
> In a practical sense, what you should probably do is for each file
> mentioned in such a commit, cause the file's addition to the branch to
> become part of the first regular commit on the branch that touched that
> file.  In the CVS history, at least, there always is such a commit
> (since we never did the cvs tag -b thing).  I am not sure though whether
> the converted git history includes a touch of the file in that commit,
> if the version committed into the branch is identical to what was on
> HEAD.  Michael, can you comment on that point?

If the situation is a file that had a branch tag added to it after the
branch was first created, then there is a git commit corresponding to
that event that consists of the addition of that file with no history.
This commit might also include the addition of other files to the
branch, but should not include any file content changes.

It seems to me that in your case such commits could be "grafted over":

*---*---*---*
 \
  A---B---C---D

E.g., if "C" is one of these special manufactured commits, then you
could use git grafts to change the parent of "D" from "C" to "B", then
bake in the change with "git filter-branch".  This would make C
inaccessible and subject to garbage collection.
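
To see why dropping "C" leaves "D" intact, it may help to model commits as snapshots. The following Python sketch is a toy model, not git itself: the commit names, file names, and helper functions are all made up for the example. It re-parents D from C to B and checks what changes.

```python
# Each commit records a full snapshot (tree) plus a parent, not a delta.
commits = {
    "A": {"parent": None, "tree": {"a.c": "v1"}},
    "B": {"parent": "A", "tree": {"a.c": "v2"}},
    # C stands in for a manufactured commit that only adds a file.
    "C": {"parent": "B", "tree": {"a.c": "v2", "it.po": "v1"}},
    "D": {"parent": "C", "tree": {"a.c": "v3", "it.po": "v1"}},
}


def changed_files(commits, name):
    """Files whose content differs between a commit and its parent."""
    tree = commits[name]["tree"]
    parent = commits[name]["parent"]
    parent_tree = commits[parent]["tree"] if parent else {}
    paths = set(tree) | set(parent_tree)
    return sorted(p for p in paths if tree.get(p) != parent_tree.get(p))


def reachable(commits, tip):
    """Commit names reachable from tip by following parent links."""
    seen = []
    while tip is not None:
        seen.append(tip)
        tip = commits[tip]["parent"]
    return seen


before = changed_files(commits, "D")
tree_before = dict(commits["D"]["tree"])
commits["D"]["parent"] = "B"  # the graft: D now points past C
after = changed_files(commits, "D")
```

In this model D's snapshot is untouched and C is no longer reachable from D. What does change is the apparent diff of D: the file that C added now shows up as added by D. The model leaves out hashing; in real git a commit's ID covers its parent, so baking in the graft gives D and its descendants new IDs.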

But please check by hand to make sure that this makes sense; for
example, it could be that other branches in the neighborhood make the
excision impossible.

Michael


Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Pavel Stehule

2010/9/7 Robert Haas :
> On Tue, Sep 7, 2010 at 9:27 AM, Pavel Stehule  wrote:
>> 2010/9/7 Robert Haas :
>>> On Tue, Sep 7, 2010 at 4:53 AM, Pavel Stehule  wrote:
>>>> I would like to use a special memory context for shared data (based on
>>>> mmap), and I like the implementation of aset. There is only one
>>>> difference: aset is based on malloc and I would like to use mmap.
>>>>
>>>> malloc() is used in AllocSetContextCreate and AllocSetAlloc. These
>>>> procedures would have to be replaced, but the other code and data
>>>> structures can be reused. This step could be useful for the earlier
>>>> discussion about more comfortable maintenance of shared memory.
>>>>
>>>> What do you think?
>>>
>>> What would this be good for?
>>>
>>
>> I am trying to solve performance problems with Czech tsearch. I tried
>> serialization and deserialization, but that only reduces load time to
>> 100ms (from 500), which is still too much for us. After some experiments
>> with mmap, I think there is a chance to preallocate mmap memory and then
>> use a special memory context based on mmap instead of malloc.
>> Theoretically I can copy the aset interface (this module will probably
>> never be in core, since the problem is probably local to Czech), but
>> that isn't nice. So I am asking.
>
> I don't see how you could do anything with this that you can't do with
> the existing implementation.  It's not as if you can store pointers
> into an mmap'd block and then count on them being valid the next time
> you map the file...  it might not end up at the same offset.

You can, but you have to preallocate the address range and map it with the
MAP_FIXED flag.

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>


Re: [HACKERS] proposal: tsearch dictionary initialization hook

2010-09-07 Thread Pavel Stehule

Hello

2010/9/7 Teodor Sigaev :
> Hm, what is aim of this hook? It looks like a wrapper of dictionary init
> method.

If I use mmap for a shared dictionary, then I have to preallocate and
maybe preread the dictionary; that can be done in an external module. But
I then have to attach the preloaded dictionary to the requested
dictionary. This hook allows that link, and it is general: I don't need
any special support in the ispell dictionary.

Regards

Pavel


>
>> I propose a new hook type - that helps with controlling a life cycle
>> of some tsearch dictionaries. This hook has minimal impact on
>> performance - it's called once per session for one tsearch
>> configuration.
>
> --
> Teodor Sigaev                                   E-mail: teo...@sigaev.ru
>                                                   WWW: http://www.sigaev.ru/
>


Re: [HACKERS] patch: tsearch - some memory diet

2010-09-07 Thread Teodor Sigaev


> A more general solution would be to have a new MemoryContext
> implementation that does the same your patch does. Ie. instead of
> tracking each allocation, just allocate a big chunk, and have palloc()
> return the next n free bytes from it, like a stack. pfree() would
> obviously not work, but wholesale MemoryContextDelete of the whole
> memory context would.

repalloc() would not work either. Such an implementation should make it
possible to debug memory allocation/management using some kind of
red zones or CLOBBER_FREED_MEMORY/MEMORY_CONTEXT_CHECKING.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


Re: [HACKERS] patch: tsearch - some memory diet

2010-09-07 Thread Pavel Stehule

2010/9/7 Heikki Linnakangas :
> On 07/09/10 19:27, Teodor Sigaev wrote:
>>>
>>> on 32bit from 27MB (3399 blocks) to 13MB (1564 blocks)
>>> on 64bit from 55MB to cca 27MB.
>>
>> Good results. But, I think, there are more places in ispell to use
>> hold_memory():
>> - affixes and affix tree
>> - regis (REGex for ISpell, regis.c)
>
> A more general solution would be to have a new MemoryContext implementation
> that does the same your patch does. Ie. instead of tracking each allocation,
> just allocate a big chunk, and have palloc() return the next n free bytes
> from it, like a stack. pfree() would obviously not work, but wholesale
> MemoryContextDelete of the whole memory context would.
>
> I remember I actually tried this years ago, trying to reduce the overhead of
> parsing IIRC. The parser also does a lot of small allocations that are not
> individually pfree'd. And I think it helped a tiny bit, but I didn't pursue
> it further. But if there's many places where it would help, then it might
> well be worth it.

I sent a patch last year (simpleAllocator), and the idea was rejected.
I now dislike the idea too: it is unclear, and a forgotten "free" call
can cause problems. The current patch is simpler, more readable and
safer.

Regards

Pavel Stehule

>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Max Bowsher

On 07/09/10 16:47, Tom Lane wrote:
> Max Bowsher  writes:
>> Personally, the idea of trying to use git-filter-branch to make what
>> cvs2git currently gives you more sensible scares me silly.
> 
> I'm not excited about it either --- but if Magnus wants to experiment,
> no harm trying.
> 
>> Another glitch that might be worth fixing before you convert is the way
>> that cvs2git says "This commit was manufactured by cvs2svn to create
>> branch", when it actually means "manufactured to incrementally create
>> the branch state as it appears in CVS" - i.e. many of these commits
>> actually update an existing branch. Just as soon as I can figure out how
>> to cleanly fit that into cvs2git's structure, I want it to change the
>> word "create" to "update" in most of those commits.
> 
> I thought all of those message texts were taken from the configuration
> file.

Yes, but currently these two cases both reference the same entry in the
configuration file.

Max.




Re: [HACKERS] patch: tsearch - some memory diet

2010-09-07 Thread Pavel Stehule

2010/9/7 Teodor Sigaev :
>> on 32bit from 27MB (3399 blocks) to 13MB (1564 blocks)
>> on 64bit from 55MB to cca 27MB.
>
> Good results. But, I think, there are more places in ispell to use
> hold_memory():
> - affixes and affix tree
> - regis (REGex for ISpell, regis.c)

Yes, but at least for the Czech dictionary the other places are not
important: they save only about 1MB more.

Last month I moved all of these unreleased parts and it made little
difference. It could be interesting to check other languages, though.

Regards

Pavel Stehule

>
>
> --
> Teodor Sigaev                                   E-mail: teo...@sigaev.ru
>                                                   WWW: http://www.sigaev.ru/
>


Re: [HACKERS] patch: tsearch - some memory diet

2010-09-07 Thread Heikki Linnakangas


On 07/09/10 19:27, Teodor Sigaev wrote:
>> on 32bit from 27MB (3399 blocks) to 13MB (1564 blocks)
>> on 64bit from 55MB to cca 27MB.
>
> Good results. But, I think, there are more places in ispell to use
> hold_memory():
> - affixes and affix tree
> - regis (REGex for ISpell, regis.c)

A more general solution would be to have a new MemoryContext 
implementation that does the same your patch does. Ie. instead of 
tracking each allocation, just allocate a big chunk, and have palloc() 
return the next n free bytes from it, like a stack. pfree() would 
obviously not work, but wholesale MemoryContextDelete of the whole 
memory context would.


I remember I actually tried this years ago, trying to reduce the 
overhead of parsing IIRC. The parser also does a lot of small 
allocations that are not individually pfree'd. And I think it helped a 
tiny bit, but I didn't pursue it further. But if there's many places 
where it would help, then it might well be worth it.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] function_name.parameter_name

2010-09-07 Thread David E. Wheeler

On Sep 7, 2010, at 9:35 AM, Tom Lane wrote:

> How does $subject differ from what we already do?  See
> http://www.postgresql.org/docs/9.0/static/plpgsql-structure.html
> particularly this:
> 
>   Note: There is actually a hidden "outer block" surrounding the
>   body of any PL/pgSQL function. This block provides the
>   declarations of the function's parameters (if any), as well as
>   some special variables such as FOUND (see Section 39.5.5). The
>   outer block is labeled with the function's name, meaning that
>   parameters and special variables can be qualified with the
>   function's name.

Well I'll be damned. I never knew about this! So I can get rid of those aliases!

  
http://github.com/theory/pgxn-manager/commit/e5add190ff5358a0b2ede64b62616491be454c50

Thanks Tom, I had *no idea* about this.

Best,

David



Re: [HACKERS] function_name.parameter_name

2010-09-07 Thread Tom Lane

"David E. Wheeler"  writes:
> Anyone ever thought to try to add $subject to PL/pgSQL?

How does $subject differ from what we already do?  See
http://www.postgresql.org/docs/9.0/static/plpgsql-structure.html
particularly this:

Note: There is actually a hidden "outer block" surrounding the
body of any PL/pgSQL function. This block provides the
declarations of the function's parameters (if any), as well as
some special variables such as FOUND (see Section 39.5.5). The
outer block is labeled with the function's name, meaning that
parameters and special variables can be qualified with the
function's name.


regards, tom lane


Re: [HACKERS] patch: tsearch - some memory diet

2010-09-07 Thread Teodor Sigaev


> on 32bit from 27MB (3399 blocks) to 13MB (1564 blocks)
> on 64bit from 55MB to cca 27MB.


Good results. But, I think, there are more places in ispell to use 
hold_memory():
- affixes and affix tree
- regis (REGex for ISpell, regis.c)


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


[HACKERS] function_name.parameter_name

2010-09-07 Thread David E. Wheeler

Howdy,

Anyone ever thought to try to add $subject to PL/pgSQL? Someone left a
[comment][] on the PGXN blog about how this is a supported syntax for using
named parameters on Oracle. The context is to avoid conflicts between variable
names and column names by function-qualifying the former and table-qualifying
the latter.

[comment]: 
http://blog.pgxn.org/post/1053165383/alias-in-vogue#dsq-comment-75687336

Would this be do-able in PL/pgSQL?

Best,

David



Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 11:59 AM, Simon Riggs  wrote:
>> What I *think* you're saying is that the slave doesn't send per-commit
>> messages, but instead processes the WAL as it's received and then sends
>> a heres-where-I-am status message back upstream immediately before going
>> to sleep waiting for the next chunk.  That's fine as far as the protocol
>> goes, but I'm not convinced that it really does all that much in terms
>> of improving performance.  You still have the problem that the master
>> has to fsync its WAL before it can send it to the slave.  Also, the
>> slave won't know whether it ought to fsync its own WAL before replying.
>
> Yes, apart from last sentence. Please wait for the code.

So, we're going around and around in circles here because you're
repeatedly refusing to explain how the slave will know WHEN to send
acknowledgments back to the master without knowing which sync rep
level is in use.  It seems to be perfectly evident to everyone else
here that there are only two ways for this to work: either the value
is configured on the standby, or there's a registration system on the
master and the master tells the standby its wishes.  Instead of asking
the entire community to wait for an unspecified period of time for you
to write code that will handle this in an unspecified way, how about
answering the question?  We've wasted far too much time arguing about
this already.
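
The two options named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the design of any patch in this thread: the level names (`RECV`, `FSYNC`, `APPLY`), the assumption that replaying implies a prior fsync, and both function names are hypothetical.

```python
from enum import Enum


class SyncLevel(Enum):
    # Hypothetical level names; the thread does not fix a final set.
    RECV = 1
    FSYNC = 2
    APPLY = 3


def standby_steps(level):
    """Work the standby must finish before acknowledging a WAL chunk."""
    steps = ["receive"]
    if level in (SyncLevel.FSYNC, SyncLevel.APPLY):
        steps.append("fsync")
    if level is SyncLevel.APPLY:
        steps.append("replay")
    steps.append("ack")
    return steps


def effective_level(standby_setting=None, master_request=None):
    """Pick the level from whichever side supplied it.

    Option 1: the value is configured on the standby.
    Option 2: the master registers the standby and sends its wishes.
    """
    level = master_request if master_request is not None else standby_setting
    if level is None:
        raise ValueError("standby cannot decide when to ack")
    return level
```

The `ValueError` branch is the point of the objection: if neither side tells the standby which level applies, it has no basis for choosing between acknowledging on receipt, after fsync, or after replay.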

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] gincostestimate

2010-09-07 Thread Teodor Sigaev


> I also dropped the use of rd_amcache, instead having ginGetStats()

OK, I agree.

> I didn't do anything about the questionable equations in
> gincostestimate.  Those need to either be fixed, or documented as
> to why they're correct.  Other than that I think this could be
> committed.


Fixed, and slightly reworked to be more clear.
Attached patch is based on your patch.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


gincostestimate-0.24.gz
Description: Unix tar archive


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 11:06 AM, Simon Riggs  wrote:
> On Tue, 2010-09-07 at 16:31 +0200, Markus Wanner wrote:
>> On 09/07/2010 04:15 PM, Robert Haas wrote:
>> > In theory, that's true, but if we do that, then there's an even bigger
>> > problem: the slave might have replayed WAL ahead of the master
>> > location; therefore the slave is now corrupt and a new base backup
>> > must be taken.
>>
>> The slave isn't corrupt. It would suffice to "late abort" committed
>> transactions the master doesn't know about.
>
> The slave *might* be ahead of the master. And if it is, the case we're
> discussing is where the master just crashed and *might* not even be
> coming back at all, at least for a while. The standby does differ from
> master, but with the master down I don't regard that as a useful
> statement.
>
> If we wait for fsync on master and then transfer to standby the times
> are additive. If we do them concurrently the response times will be the
> maximum response time of fsync/transfer, as Markus observes.
>
> ISTM that most people would be more interested in reducing response
> times by ~50% rather than in being exactly correct in an edge case. So
> we should be planning that as a robustness option, not "it cannot be
> done", which seems to be echoing around too much for my liking.

People who are more concerned about performance than robustness aren't
going to use sync rep in the first place.  They're going to run it in
async, which will improve performance by FAR more than you'll ever be
able to manage by deciding that you don't care about handling some of
the failure cases correctly.
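
For reference, the "additive versus maximum" argument quoted above reduces to simple arithmetic. The Python sketch below uses made-up function names and illustrative numbers, not measurements.

```python
def sequential_latency(fsync_ms, transfer_ms):
    """Master fsyncs first, then ships WAL: the delays add up."""
    return fsync_ms + transfer_ms


def concurrent_latency(fsync_ms, transfer_ms):
    """Fsync and transfer overlap: the slower one dominates."""
    return max(fsync_ms, transfer_ms)


def saving(fsync_ms, transfer_ms):
    """Fractional reduction in commit wait from overlapping the two."""
    seq = sequential_latency(fsync_ms, transfer_ms)
    return 1 - concurrent_latency(fsync_ms, transfer_ms) / seq


# Illustrative numbers only, not measurements.
equal_case = saving(5.0, 5.0)
skewed_case = saving(1.0, 9.0)
```

The saving reaches one half only when the two delays are about equal, and shrinks as one of them dominates, so "~50%" is the best case of this model rather than a typical figure.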

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-07 Thread Simon Riggs

On Tue, 2010-09-07 at 11:41 -0400, Tom Lane wrote:
> Simon Riggs  writes:
> > On Tue, 2010-09-07 at 10:47 -0400, Tom Lane wrote:
> >> Simon Riggs  writes:
> >>> The WAL is sent from master to standby in 8192 byte chunks, frequently
> >>> including multiple commits. From standby, one reply per chunk. If we
> >>> need to wait for apply while nothing else is received, we do. 
> >> 
> >> That premise is completely false.  SR does not send WAL in page units.
> >> If it did, it would have the same performance problems as the old
> >> WAL-file-at-a-time implementation, just with slightly smaller
> >> granularity.
> 
> > There's no dependence on pages in that proposal, so don't understand.
> 
> Oh, well you certainly didn't explain it well then.
> 
> What I *think* you're saying is that the slave doesn't send per-commit
> messages, but instead processes the WAL as it's received and then sends
> a heres-where-I-am status message back upstream immediately before going
> to sleep waiting for the next chunk.  That's fine as far as the protocol
> goes, but I'm not convinced that it really does all that much in terms
> of improving performance.  You still have the problem that the master
> has to fsync its WAL before it can send it to the slave.  Also, the
> slave won't know whether it ought to fsync its own WAL before replying.

Yes, apart from last sentence. Please wait for the code.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Markus Wanner


Hi,

On 09/07/2010 05:17 PM, Tom Lane wrote:
> Oh yes it is.  If the slave replays WAL that didn't happen on the
> master, it might for instance have heap tuples in TID slots that are
> empty on the master, or index pages laid out differently from the
> master.  Trying to apply additional WAL from the master will fail badly.

Sure. Reverting to the master's state would be required to be able to
safely proceed. Granted, that's far from simple.

Robert's argument about read queries on the standby convinced me that
you always need to recover to the node with the newest transactions
applied (i.e. better advance rather than revert). Making sure the
standby can't ever be ahead of the master node certainly is the simplest
way to guarantee that. It comes at a cost for normal operation, though.

How about a master failure which leads to a fail-over, immediately
followed by a failure of that former standby (and now a master)? The old
master might then be in the very same situation: having WAL applied that
the new master doesn't. Do we require former masters to fetch a base
backup? How does it know the difference, once it gets back up?

> We can *not* allow the slave to replay WAL ahead of what is known
> committed to disk on the master.  The only way to make that safe
> is the compare-notes-and-ship-WAL-back approach that Robert mentioned.

Agreed.

(And it's worth pointing out that this approach has a pretty nasty
requirement for a full-cluster crash: all nodes that were synchronously
replicated to need to come back up after such a crash, so as to be able
to reliably determine which has the newest transaction).

> If you feel that decoupling WAL application is absolutely essential
> to have a credible feature, then you'd better bite the bullet and
> start working on the ship-WAL-back code.

My feeling is that WAL is the wrong format to do replication. But that's
another story.


Regards

Markus Wanner


Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 11:41 AM, Tom Lane  wrote:
> Oh, well you certainly didn't explain it well then.
>
> What I *think* you're saying is that the slave doesn't send per-commit
> messages, but instead processes the WAL as it's received and then sends
> a heres-where-I-am status message back upstream immediately before going
> to sleep waiting for the next chunk.  That's fine as far as the protocol
> goes, but I'm not convinced that it really does all that much in terms
> of improving performance.  You still have the problem that the master
> has to fsync its WAL before it can send it to the slave.

We have that problem in all of these proposals, don't we?  We
certainly have no infrastructure to handle the slave getting ahead of
the master in the WAL stream.

> Also, the
> slave won't know whether it ought to fsync its own WAL before replying.

Right.  And whether it ought to replay it before replying.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

I wrote:
> Magnus Hagander  writes:
>> Ok, found a bunch of those (78 to be exact).

> What I'd like is for those commits to vanish from the git log entirely.

> In a practical sense, what you should probably do is for each file
> mentioned in such a commit, cause the file's addition to the branch to
> become part of the first regular commit on the branch that touched that
> file.  In the CVS history, at least, there always is such a commit
> (since we never did the cvs tag -b thing).  I am not sure though whether
> the converted git history includes a touch of the file in that commit,

Given that there are only 78 such commits, it would not take too long to
manually prepare a list of which commit each file addition should get
moved into.  Would that be a more sensible approach than trying to
extract the information from the git log?

regards, tom lane


Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 11:18 AM, Alvaro Herrera
 wrote:
> Excerpts from Robert Haas's message of mar sep 07 10:13:12 -0400 2010:
>
>> > I am trying to solve performance problems with Czech tsearch. I checked
>> > serialization and deserialization, but this decreases load time only to
>> > 100ms (from 500), which is still too much for us. After some experiments
>> > with mmap I think there is some chance to preallocate mmap memory, and
>> > then use a special memory context based on mmap instead of malloc.
>> > Theoretically I can copy the aset interface - this module will probably
>> > never be in core (this problem is probably local - only Czech), but it
>> > isn't nice. So I am asking.
>>
>> I don't see how you could do anything with this that you can't do with
>> the existing implementation.  It's not as if you can store pointers
>> into an mmap'd block and then count on them being valid the next time
>> you map the file...  it might not end up at the same offset.
>
> Hmm, surely you could store offsets instead of absolute pointers.

Surely you could.  But then where does palloc come in?  As Tom said
upthread, the right thing to do here is to create a pre-compiler that
outputs a pointer-free representation which you can then mmap().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Max Bowsher  writes:
> Personally, the idea of trying to use git-filter-branch to make what
> cvs2git currently gives you more sensible scares me silly.

I'm not excited about it either --- but if Magnus wants to experiment,
no harm trying.

> Another glitch that might be worth fixing before you convert is the way
> that cvs2git says "This commit was manufactured by cvs2svn to create
> branch", when it actually means "manufactured to incrementally create
> the branch state as it appears in CVS" - i.e. many of these commits
> actually update an existing branch. Just as soon as I can figure out how
> to cleanly fit that into cvs2git's structure, I want it to change the
> word "create" to "update" in most of those commits.

I thought all of those message texts were taken from the configuration
file.

regards, tom lane


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Tom Lane

Simon Riggs  writes:
> On Tue, 2010-09-07 at 11:17 -0400, Tom Lane wrote:
>> We can *not* allow the slave to replay WAL ahead of what is known
>> committed to disk on the master.  The only way to make that safe
>> is the compare-notes-and-ship-WAL-back approach that Robert mentioned.
>> 
>> If you feel that decoupling WAL application is absolutely essential
>> to have a credible feature, then you'd better bite the bullet and
>> start working on the ship-WAL-back code.

> Why not just failover? 

Guaranteed failover is another large piece we don't have.

regards, tom lane


Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-07 Thread Tom Lane

Simon Riggs  writes:
> On Tue, 2010-09-07 at 10:47 -0400, Tom Lane wrote:
>> Simon Riggs  writes:
>>> The WAL is sent from master to standby in 8192 byte chunks, frequently
>>> including multiple commits. From standby, one reply per chunk. If we
>>> need to wait for apply while nothing else is received, we do. 
>> 
>> That premise is completely false.  SR does not send WAL in page units.
>> If it did, it would have the same performance problems as the old
>> WAL-file-at-a-time implementation, just with slightly smaller
>> granularity.

> There's no dependence on pages in that proposal, so don't understand.

Oh, well you certainly didn't explain it well then.

What I *think* you're saying is that the slave doesn't send per-commit
messages, but instead processes the WAL as it's received and then sends
a heres-where-I-am status message back upstream immediately before going
to sleep waiting for the next chunk.  That's fine as far as the protocol
goes, but I'm not convinced that it really does all that much in terms
of improving performance.  You still have the problem that the master
has to fsync its WAL before it can send it to the slave.  Also, the
slave won't know whether it ought to fsync its own WAL before replying.

regards, tom lane


[HACKERS] PgWest 2010 CFP extended (one week)

2010-09-07 Thread Joshua D. Drake

For one week guys... 

https://www.postgresqlconference.org/2010/west/cfp/

-- 
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Max Bowsher

On 07/09/10 16:21, Magnus Hagander wrote:
> On Tue, Sep 7, 2010 at 17:07, Tom Lane  wrote:
>> Magnus Hagander  writes:
>>> On Tue, Sep 7, 2010 at 16:16, Tom Lane  wrote:
>>>> If you want to try, and it doesn't take much time, go for it.  I was
>>>> just saying I wouldn't complain if we decide to live with it as-is.
>>
>>> Ok. Do we have a way of identifying them - e.g. is it all the commits
>>> with a certain commit msg?
>>
>> Look for
>>This commit was manufactured by cvs2svn to create branch ...
> 
> Ok, found a bunch of those (78 to be exact). And the issue with them
> is we want to change the commit author on them to be whomever made the
> first commit on the branch *after* that?

I would say you emphatically don't want to do that, because they can
contain more changes that were unrelated to that author.

The logic, as I understand it from Michael's explanation of cvs2git's
guts, is to flush out any pending "add to branch because of implicit
appearance of a branch tag" operations when some other change is
about to occur on the destination branch. So unrelated stuff can get
batched together.

Personally, the idea of trying to use git-filter-branch to make what
cvs2git currently gives you more sensible scares me silly. I think the
approach should be to use it as is, or improve cvs2git.

Another glitch that might be worth fixing before you convert is the way
that cvs2git says "This commit was manufactured by cvs2svn to create
branch", when it actually means "manufactured to incrementally create
the branch state as it appears in CVS" - i.e. many of these commits
actually update an existing branch. Just as soon as I can figure out how
to cleanly fit that into cvs2git's structure, I want it to change the
word "create" to "update" in most of those commits.

Max.


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Markus Wanner


On 09/07/2010 04:47 PM, Ron Mayer wrote:
> In that situation, wouldn't it be possible that a different client
> queried the slave and already saw the result of that transaction
> which would later be rolled back?

Good point, yes.

Regards

Markus Wanner


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Magnus Hagander  writes:
> On Tue, Sep 7, 2010 at 17:07, Tom Lane  wrote:
>> Look for
>>        This commit was manufactured by cvs2svn to create branch ...

> Ok, found a bunch of those (78 to be exact). And the issue with them
> is we want to change the commit author on them to be whomever made the
> first commit on the branch *after* that?

What I'd like is for those commits to vanish from the git log entirely.

In a practical sense, what you should probably do is for each file
mentioned in such a commit, cause the file's addition to the branch to
become part of the first regular commit on the branch that touched that
file.  In the CVS history, at least, there always is such a commit
(since we never did the cvs tag -b thing).  I am not sure though whether
the converted git history includes a touch of the file in that commit,
if the version committed into the branch is identical to what was on
HEAD.  Michael, can you comment on that point?

regards, tom lane


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Simon Riggs

On Tue, 2010-09-07 at 11:17 -0400, Tom Lane wrote:
> Markus Wanner  writes:
> > On 09/07/2010 04:15 PM, Robert Haas wrote:
> >> In theory, that's true, but if we do that, then there's an even bigger
> >> problem: the slave might have replayed WAL ahead of the master
> >> location; therefore the slave is now corrupt and a new base backup
> >> must be taken.
> 
> > The slave isn't corrupt. It would suffice to "late abort" committed 
> > transactions the master doesn't know about.
> 
> Oh yes it is.  If the slave replays WAL that didn't happen on the
> master, it might for instance have heap tuples in TID slots that are
> empty on the master, or index pages laid out differently from the
> master.  Trying to apply additional WAL from the master will fail badly.
> 
> We can *not* allow the slave to replay WAL ahead of what is known
> committed to disk on the master.  The only way to make that safe
> is the compare-notes-and-ship-WAL-back approach that Robert mentioned.
> 
> If you feel that decoupling WAL application is absolutely essential
> to have a credible feature, then you'd better bite the bullet and
> start working on the ship-WAL-back code.

Why not just failover? 

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-07 Thread Simon Riggs

On Tue, 2010-09-07 at 10:47 -0400, Tom Lane wrote:
> Simon Riggs  writes:
> > On Tue, 2010-09-07 at 09:27 +0300, Heikki Linnakangas wrote:
> >> For the sake of argument, yes that's what I was thinking. Now please 
> >> explain how *you're* thinking it should work.
> 
> > The WAL is sent from master to standby in 8192 byte chunks, frequently
> > including multiple commits. From standby, one reply per chunk. If we
> > need to wait for apply while nothing else is received, we do. 
> 
> That premise is completely false.  SR does not send WAL in page units.
> If it did, it would have the same performance problems as the old
> WAL-file-at-a-time implementation, just with slightly smaller
> granularity.

There's no dependence on pages in that proposal, so don't understand.

What aspect of the above would you change? and to what?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] git: uh-oh

2010-09-07 Thread Magnus Hagander

On Tue, Sep 7, 2010 at 17:07, Tom Lane  wrote:
> Magnus Hagander  writes:
>> On Tue, Sep 7, 2010 at 16:16, Tom Lane  wrote:
>>> If you want to try, and it doesn't take much time, go for it.  I was
>>> just saying I wouldn't complain if we decide to live with it as-is.
>
>> Ok. Do we have a way of identifying them - e.g. is it all the commits
>> with a certain commit msg?
>
> Look for
>        This commit was manufactured by cvs2svn to create branch ...

Ok, found a bunch of those (78 to be exact). And the issue with them
is we want to change the commit author on them to be whomever made the
first commit on the branch *after* that?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Alvaro Herrera

Excerpts from Robert Haas's message of mar sep 07 10:13:12 -0400 2010:

> > I am trying to solve performance problems with Czech tsearch. I checked
> > serialization and deserialization, but this decreases load time only to
> > 100ms (from 500), which is still too much for us. After some experiments
> > with mmap I think there is some chance to preallocate mmap memory, and
> > then use a special memory context based on mmap instead of malloc.
> > Theoretically I can copy the aset interface - this module will probably
> > never be in core (this problem is probably local - only Czech), but it
> > isn't nice. So I am asking.
> 
> I don't see how you could do anything with this that you can't do with
> the existing implementation.  It's not as if you can store pointers
> into an mmap'd block and then count on them being valid the next time
> you map the file...  it might not end up at the same offset.

Hmm, surely you could store offsets instead of absolute pointers.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Tom Lane

Markus Wanner  writes:
> On 09/07/2010 04:15 PM, Robert Haas wrote:
>> In theory, that's true, but if we do that, then there's an even bigger
>> problem: the slave might have replayed WAL ahead of the master
>> location; therefore the slave is now corrupt and a new base backup
>> must be taken.

> The slave isn't corrupt. It would suffice to "late abort" committed 
> transactions the master doesn't know about.

Oh yes it is.  If the slave replays WAL that didn't happen on the
master, it might for instance have heap tuples in TID slots that are
empty on the master, or index pages laid out differently from the
master.  Trying to apply additional WAL from the master will fail badly.

We can *not* allow the slave to replay WAL ahead of what is known
committed to disk on the master.  The only way to make that safe
is the compare-notes-and-ship-WAL-back approach that Robert mentioned.
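
The constraint above can be pictured as a simple bound on how far the
standby may replay. The following is only a sketch under stated
assumptions: WalPos, safe_replay_limit and assert_replay_is_safe are
hypothetical names, not PostgreSQL code, and the sketch assumes the
standby somehow learns the master's flushed position, which the thread
does not specify.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: a byte position in the WAL stream. */
typedef uint64_t WalPos;

/*
 * Furthest position the standby may replay without getting ahead of the
 * master: it can use neither WAL it has not received nor WAL the master
 * has not yet flushed to disk, so the limit is the smaller of the two.
 */
WalPos safe_replay_limit(WalPos received_on_standby, WalPos flushed_on_master)
{
    return received_on_standby < flushed_on_master
        ? received_on_standby
        : flushed_on_master;
}

/* Hypothetical sanity check: replay must never pass the safe limit. */
void assert_replay_is_safe(WalPos replayed, WalPos limit)
{
    assert(replayed <= limit);
}
```

If the standby has received WAL up to position 100 but the master has
only flushed up to 80, the limit is 80; the remaining 20 bytes must wait.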

If you feel that decoupling WAL application is absolutely essential
to have a credible feature, then you'd better bite the bullet and
start working on the ship-WAL-back code.

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Magnus Hagander  writes:
> On Tue, Sep 7, 2010 at 16:16, Tom Lane  wrote:
>> If you want to try, and it doesn't take much time, go for it.  I was
>> just saying I wouldn't complain if we decide to live with it as-is.

> Ok. Do we have a way of identifying them - e.g. is it all the commits
> with a certain commit msg?

Look for
        This commit was manufactured by cvs2svn to create branch ...

regards, tom lane


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Simon Riggs

On Tue, 2010-09-07 at 16:31 +0200, Markus Wanner wrote:
> On 09/07/2010 04:15 PM, Robert Haas wrote:
> > In theory, that's true, but if we do that, then there's an even bigger
> > problem: the slave might have replayed WAL ahead of the master
> > location; therefore the slave is now corrupt and a new base backup
> > must be taken.
> 
> The slave isn't corrupt. It would suffice to "late abort" committed 
> transactions the master doesn't know about.

The slave *might* be ahead of the master. And if it is, the case we're
discussing is where the master just crashed and *might* not even be
coming back at all, at least for a while. The standby does differ from
master, but with the master down I don't regard that as a useful
statement.

If we wait for fsync on master and then transfer to standby the times
are additive. If we do them concurrently the response times will be the
maximum response time of fsync/transfer, as Markus observes.
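
As a rough illustration of that arithmetic (a sketch only; the function
names are hypothetical and the model ignores everything except the two
waits being discussed):

```c
#include <assert.h>

/*
 * Commit latency when the master fsyncs first and only then ships the
 * WAL to the standby: the two waits add up.
 */
double serial_latency_ms(double fsync_ms, double transfer_ms)
{
    assert(fsync_ms >= 0.0 && transfer_ms >= 0.0);
    return fsync_ms + transfer_ms;
}

/*
 * Commit latency when fsync and transfer run concurrently: the commit
 * waits only for the slower of the two.
 */
double concurrent_latency_ms(double fsync_ms, double transfer_ms)
{
    assert(fsync_ms >= 0.0 && transfer_ms >= 0.0);
    return fsync_ms > transfer_ms ? fsync_ms : transfer_ms;
}
```

With 10 ms for each step, the serial case costs 20 ms and the concurrent
case 10 ms, which is the roughly 50% saving. The saving shrinks as the
two times diverge: with 10 ms and 2 ms it is 12 ms versus 10 ms.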

ISTM that most people would be more interested in reducing response
times by ~50% rather than in being exactly correct in an edge case. So
we should be planning that as a robustness option, not "it cannot be
done", which seems to be echoing around too much for my liking.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


Re: [HACKERS] knngist - 0.8

2010-09-07 Thread Teodor Sigaev


http://www.sigaev.ru/misc/builtin_knngist_core-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_itself-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_proc-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_contrib_pg_trgm-0.8.gz
http://www.sigaev.ru/misc/builtin_knngist_contrib_btree_gist-0.8.gz


New version, synced with CVS HEAD
http://www.sigaev.ru/misc/builtin_knngist_itself-0.8.2.gz
http://www.sigaev.ru/misc/builtin_knngist_contrib_btree_gist-0.8.1.gz


> AFAICS, these patches include no documentation.  That's pretty much a
> fatal flaw for a feature of this magnitude.  At an absolute minimum,
> you need to update the system catalog documentation and the
> documentation on CREATE / ALTER OPERATOR CLASS.  There might be some
> other places that need to be touched, also.

Oleg promised to do that



> +   if (opform->oprresult == BOOLOID)
> +       ereport(ERROR,
> +               (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> +                errmsg("index ordering operators must not return boolean")));
>
> My first thought was that this code was there to prevent people from
> doing the wrong thing by accident.  But I have a niggling feeling that
> you're actually relying on this for the correctness of the system.  I
> hope I'm wrong, because I don't think that would be a very good idea.

The question here is whether we really want to support boolean distance in
GiST. I think not, because it is strange to measure distance in true/false
units. I can't imagine such a real-life distance definition and have never
heard of one.

Also, pg_amop_opr_fam_index requires each operator to be unique within an
operator family, and a lot of places in the planner rely on that. I suppose
changing that would require a lot of work whose single aim would be to
support boolean distance in the ORDER BY clause.




> The GIST code could use more comments; and perhaps the names of some of
> the functions and structures could be chosen to be more descriptive.
> I think that what used to be called GISTSearchStack has apparently
> been replaced with DataPointer; it's not obvious to me that it's good
> to change the name, but if it is I don't think DataPointer is a good

GISTSearchStack is replaced by an RBTree (GISTScanOpaqueData->stack); the
tree's nodes contain a StackElem struct, which represents a list of pointers
at the same distance. Each pointer can point either to an inner index page or
to a heap tuple, and this struct is a DataPointer.

Note that the list of DataPointers in the StackElem struct is organized in a
non-obvious way: we keep a pointer to the head of the list and a pointer to
the middle of the list. A new pointer-to-heap is inserted at the beginning of
the list, a pointer-to-index-page in the middle. That's done because we would
like to:

1) pop pointers-to-heap as fast as possible, before any pointers-to-index-page
2) pop pointers-to-index-page for deeper pages (which are closer to leaf
   pages) first. That's good for KNN performance and emulates the classical
   depth-first search of an ordinary search.
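
The head/middle list described above could be sketched roughly as
follows. This is not the code from the patch: ScanPtr, ScanPtrList and
the scanlist_* functions are hypothetical stand-ins for DataPointer and
StackElem, and the "middle" is modelled here as the last pointer-to-heap
in the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for DataPointer. */
typedef struct ScanPtr {
    int is_heap;            /* 1 = points to a heap tuple, 0 = to an index page */
    int id;                 /* illustrative payload */
    struct ScanPtr *next;
} ScanPtr;

/* Hypothetical stand-in for the list kept in StackElem. */
typedef struct ScanPtrList {
    ScanPtr *head;          /* first element; heap pointers sit at the front */
    ScanPtr *last_heap;     /* the "middle": last heap pointer, NULL if none */
} ScanPtrList;

/* A new pointer-to-heap goes to the beginning of the list. */
void scanlist_push_heap(ScanPtrList *l, ScanPtr *p)
{
    assert(l != NULL && p != NULL);
    p->is_heap = 1;
    p->next = l->head;
    l->head = p;
    if (l->last_heap == NULL)
        l->last_heap = p;   /* first heap pointer also marks the middle */
}

/* A new pointer-to-index-page goes right after the heap pointers. */
void scanlist_push_index_page(ScanPtrList *l, ScanPtr *p)
{
    assert(l != NULL && p != NULL);
    p->is_heap = 0;
    if (l->last_heap == NULL) {
        p->next = l->head;  /* no heap pointers: the middle is the head */
        l->head = p;
    } else {
        p->next = l->last_heap->next;
        l->last_heap->next = p;
    }
}

/* Pop from the head: heap pointers first, then the newest index page. */
ScanPtr *scanlist_pop(ScanPtrList *l)
{
    ScanPtr *p;

    assert(l != NULL);
    p = l->head;
    if (p == NULL)
        return NULL;
    l->head = p->next;
    if (p == l->last_heap)
        l->last_heap = NULL;    /* popped the last heap pointer */
    p->next = NULL;
    return p;
}
```

Because index pages are inserted at the middle rather than appended,
the most recently added page (assumed here to be the deeper one) is
popped before older pages, which gives the depth-first flavour.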

> choice.  gistindex_keytest has been replaced (sort of) by
> processIndexTuple, which again seems more generic than what it
> replaced.
Renamed, comments are improved


> Minor nit: the word "shoould" is mis-spelled.

fixed

BTW, consistentFn is now able to "manage" tree traversal - even for an
ordinary search, GiST will choose the child page with minimal distance.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


Re: [HACKERS] proposal: tsearch dictionary initialization hook

2010-09-07 Thread Teodor Sigaev


Hm, what is the aim of this hook? It looks like a wrapper around the
dictionary init method.


> I propose a new hook type - that helps with controlling a life cycle
> of some tsearch dictionaries. This hook has minimal impact on
> performance - it's called once per session for one tsearch
> configuration.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Tom Lane

Robert Haas  writes:
> On Tue, Sep 7, 2010 at 9:27 AM, Pavel Stehule  wrote:
>> I am trying to solve performance problems with Czech tsearch. I checked
>> serialization and deserialization, but this decreases load time only to
>> 100ms (from 500), which is still too much for us. After some experiments
>> with mmap I think there is some chance to preallocate mmap memory, and
>> then use a special memory context based on mmap instead of malloc.
>> Theoretically I can copy the aset interface - this module will probably
>> never be in core (this problem is probably local - only Czech), but it
>> isn't nice. So I am asking.

> I don't see how you could do anything with this that you can't do with
> the existing implementation.  It's not as if you can store pointers
> into an mmap'd block and then count on them being valid the next time
> you map the file...  it might not end up at the same offset.

More to the point, this entire approach to speeding up dictionary loading
has already been proposed and rejected, and it'll get rejected again if
it's submitted.

The conclusion of the previous discussion was that we should build
"precompiled" dictionaries, using some pointer-free representation,
which would be stored in files that could be either mmap'd in or just
read in if running on a platform lacking mmap.  There is no need for
any shmem allocator in that implementation.
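
To illustrate what "pointer-free" can mean here, the following is a
minimal sketch, not the format anyone in the thread proposed: every
reference inside the blob is a byte offset from the start of the blob,
so the data stays valid at whatever address it is loaded. dict_build and
dict_word are hypothetical names, and the base address is assumed to be
suitably aligned for uint32_t (as a page-aligned address returned by
mmap() would be).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical blob layout:
 *   uint32_t nwords;
 *   uint32_t offset[nwords];   byte offset of each word from blob start
 *   char     words[];          NUL-terminated strings
 *
 * Writes the blob into buf and returns the bytes used, or 0 if it does
 * not fit in cap bytes.
 */
size_t dict_build(void *buf, size_t cap, const char *const *words, uint32_t n)
{
    uint32_t *hdr = buf;
    size_t pos = sizeof(uint32_t) * (1 + (size_t) n);   /* count + offsets */
    uint32_t i;

    assert(buf != NULL);
    if (pos > cap)
        return 0;
    hdr[0] = n;
    for (i = 0; i < n; i++) {
        size_t len = strlen(words[i]) + 1;

        if (len > cap - pos)
            return 0;
        hdr[1 + i] = (uint32_t) pos;        /* store an offset, not a pointer */
        memcpy((char *) buf + pos, words[i], len);
        pos += len;
    }
    return pos;
}

/* Resolve word i relative to wherever the blob currently lives. */
const char *dict_word(const void *base, uint32_t i)
{
    const uint32_t *hdr = base;

    assert(base != NULL);
    if (i >= hdr[0])
        return NULL;
    return (const char *) base + hdr[1 + i];
}
```

Since dict_word only ever adds an offset to the base it is given, the
same bytes can be written to a file once and later read or mapped at a
different address without any fix-up pass.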

regards, tom lane


Re: [HACKERS] Synchronous replication - patch status inquiry

2010-09-07 Thread Tom Lane

Simon Riggs  writes:
> On Tue, 2010-09-07 at 09:27 +0300, Heikki Linnakangas wrote:
>> For the sake of argument, yes that's what I was thinking. Now please 
>> explain how *you're* thinking it should work.

> The WAL is sent from master to standby in 8192 byte chunks, frequently
> including multiple commits. From standby, one reply per chunk. If we
> need to wait for apply while nothing else is received, we do. 

That premise is completely false.  SR does not send WAL in page units.
If it did, it would have the same performance problems as the old
WAL-file-at-a-time implementation, just with slightly smaller
granularity.

regards, tom lane


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Ron Mayer

Markus Wanner wrote:
> On 09/07/2010 02:16 PM, Robert Haas wrote:
>> practice, this means that the master and standby need to compare notes
>> on the ending WAL location and whichever one is further advanced needs
>> to stream the intervening records to the other.
> 
> Not necessarily, no. Remember that the client didn't get a commit
> confirmation. So reverting might also be a correct solution (i.e. not
> violating the durability constraint).

In that situation, wouldn't it be possible that a different client
queried the slave and already saw the result of that transaction
which would later be rolled back?



Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Tom Lane

Pavel Stehule  writes:
> I would like to use a special memory context for shared data (based on
> mmap) and I like the implementation of aset. There is only one
> difference - aset is based on malloc and I would like to use mmap.

> malloc() is used in AllocSetContextCreate and AllocSetAlloc. These
> procedures would have to be overridden, but the other code and data
> structures can be reused. This step could be useful for the previous
> discussion about more comfortable maintenance of shared memory.

> What do you think about it?

If you're proposing factoring aset.c into two levels, I don't think so.
That code is already a tremendous performance hot-spot and introducing
any more inefficiency into it doesn't seem like a good idea.  Especially
not for shared memory allocation, which is a feature that still has
no buy-in.  Also, you'd need to do more than just replace malloc: you'd
need to add locking capability.  That would make the code even uglier,
and slower, if it has to support locking or no locking dynamically.

Use the mcxt.c switch.  That's what it's there for.

regards, tom lane


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Magnus Hagander

On Tue, Sep 7, 2010 at 16:16, Tom Lane  wrote:
> Magnus Hagander  writes:
>> On Tue, Sep 7, 2010 at 15:53, Tom Lane  wrote:
>>> Michael Haggerty  writes:
>>>> Somebody could use "git filter-branch" to make this change after the
>>>> conversion, but I can't estimate how much work it would be.
>>>
>>> The conversion is already far better than I expected it would be when
>>> we were first discussing this switch, so my inclination is to just live
>>> with this one wart.
>
>> You're saying you don't "require" a fix on the latest issue here? Or
>> should we spend some time trying to figure out if we can fix it with
>> git-filter-branch?
>
> If you want to try, and it doesn't take much time, go for it.  I was
> just saying I wouldn't complain if we decide to live with it as-is.

Ok. Do we have a way of identifying them - e.g. is it all the commits
with a certain commit msg?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Markus Wanner


On 09/07/2010 04:15 PM, Robert Haas wrote:
> In theory, that's true, but if we do that, then there's an even bigger
> problem: the slave might have replayed WAL ahead of the master
> location; therefore the slave is now corrupt and a new base backup
> must be taken.

The slave isn't corrupt. It would suffice to "late abort" committed
transactions the master doesn't know about.

However, I realize that undoing of WAL isn't something that's
implemented (nor planned). So it's probably easier to forward the master
in such a case.

> Yeah, I hope we'll get there eventually.

Understood. Thanks.

Markus Wanner



Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Magnus Hagander  writes:
> On Tue, Sep 7, 2010 at 15:53, Tom Lane  wrote:
>> Michael Haggerty  writes:
>>> Somebody could use "git filter-branch" to make this change after the
>>> conversion, but I can't estimate how much work it would be.
>> 
>> The conversion is already far better than I expected it would be when
>> we were first discussing this switch, so my inclination is to just live
>> with this one wart.

> You're saying you don't "require" a fix on the latest issue here? Or
> should we spend some time trying to figure out if we can fix it with
> git-filter-branch?

If you want to try, and it doesn't take much time, go for it.  I was
just saying I wouldn't complain if we decide to live with it as-is.

regards, tom lane


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 9:45 AM, Markus Wanner  wrote:
> On 09/07/2010 02:16 PM, Robert Haas wrote:
>>
>> Right, definitely.  The trouble is that if they happen concurrently,
>> and there's a crash, you have to be prepared for the possibility that
>> either one of the two has completed and the other is not.
>
> Understood.
>
>> In
>> practice, this means that the master and standby need to compare notes
>> on the ending WAL location and whichever one is further advanced needs
>> to stream the intervening records to the other.
>
> Not necessarily, no. Remember that the client didn't get a commit
> confirmation. So reverting might also be a correct solution (i.e. not
> violating the durability constraint).

In theory, that's true, but if we do that, then there's an even bigger
problem: the slave might have replayed WAL ahead of the master
location; therefore the slave is now corrupt and a new base backup
must be taken.

>> This would be an
>> awesome feature, but it's hard, so for a first version, it makes sense
>> to commit on the master first and then on the standby after the master
>> is known done.
>
> The obvious downside of that is that latency adds up, instead of just being
> the max of the two operations. And that for normal operation. While at best
> it saves an un-confirmed transaction in the failure case.
>
> It might be harder to implement, yes.

Yeah, I hope we'll get there eventually.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 9:27 AM, Pavel Stehule  wrote:
> 2010/9/7 Robert Haas :
>> On Tue, Sep 7, 2010 at 4:53 AM, Pavel Stehule  
>> wrote:
>>> I would like to use a special memory context for shared data (based on
>>> mmap) and I like the implementation of aset. There is only one
>>> difference - aset is based on malloc and I would like to use mmap.
>>>
>>> malloc() is used in AllocSetContextCreate and AllocSetAlloc. These
>>> procedures would have to be overridden, but the other code and data
>>> structures can be reused. This step could be useful for the previous
>>> discussion about more comfortable maintenance of shared memory.
>>>
>>> What do you think about it?
>>
>> What would this be good for?
>>
>
> I am trying to solve performance problems with Czech tsearch. I tried
> serialization and deserialization, but that only decreased the load
> time to 100 ms (from 500 ms), which is still too much for us. After
> some experimenting with mmap, I think there is a chance to preallocate
> mmap memory and then use a special memory context based on mmap
> instead of malloc. Theoretically I could copy the aset interface -
> this module will probably never be in core (the problem is probably
> local - only Czech) - but that isn't nice. So I am asking.

I don't see how you could do anything with this that you can't do with
the existing implementation.  It's not as if you can store pointers
into an mmap'd block and then count on them being valid the next time
you map the file...  it might not end up at the same offset.
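
To illustrate the address problem: a raw pointer stored inside a mapped
block is only valid while the block sits at the same address. A common
workaround, which nobody in this thread has proposed and which the aset
code does not do, is to store offsets from the start of the block
instead. A minimal sketch, with all names (rel_off, Node, node_at and so
on) made up for the example, and a plain buffer standing in for the
mapped file:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset from the start of the region; 0 is reserved to mean "null". */
typedef uint32_t rel_off;

typedef struct
{
    rel_off next;   /* offset of the next node, not a pointer */
    int     value;
} Node;

/* Turn an offset into a pointer that is valid for this mapping only. */
static Node *
node_at(void *base, rel_off off)
{
    return off == 0 ? NULL : (Node *) ((char *) base + off);
}

/* Build a three-node list (values 1, 2, 3) and return the head offset. */
static rel_off
build_demo_list(void *base, size_t size)
{
    rel_off first = (rel_off) sizeof(Node);   /* skip offset 0 */
    size_t  i;

    assert(size >= 4 * sizeof(Node));
    for (i = 0; i < 3; i++)
    {
        rel_off off = first + (rel_off) (i * sizeof(Node));
        Node   *n = node_at(base, off);

        n->value = (int) i + 1;
        n->next = (i < 2) ? off + (rel_off) sizeof(Node) : 0;
    }
    return first;
}

/* Walk the list relative to whatever base address is current. */
static int
sum_list(void *base, rel_off head)
{
    int   total = 0;
    Node *n;

    for (n = node_at(base, head); n != NULL; n = node_at(base, n->next))
        total += n->value;
    return total;
}
```

Copying the region to a different address and walking it again from the
same head offset gives the same sum, which would not hold if next were a
raw pointer. The price is that every structure living in the block has
to be written this way, so existing pointer-based code cannot simply be
reused.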

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] Fix for pg_upgrade's forcing pg_controldata into English

2010-09-07 Thread Bruce Momjian

Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian  writes:
> > > Tom Lane wrote:
> > >> I certainly hope that pg_regress isn't freeing the strings it passes
> > >> to putenv() ...
> > 
> > > pg_regress does not restore these settings (it stays with C/English) so
> > > the code is different.
> > 
> > That's not what I'm on about.  You're trashing strings that are part of
> > the live environment.  It might accidentally fail to fail for you, if
> > your version of free() doesn't immediately clobber the released storage,
> > but it's still broken.  Read the putenv() man page.
> > 
> > + #ifndef WIN32
> > +   char   *envstr = (char *) pg_malloc(ctx, strlen(var) +
> > +   strlen(val) + 1);
> > + 
> > +   sprintf(envstr, "%s=%s", var, val);
> > +   putenv(envstr);
> > +   pg_free(envstr);
> > 
> > + #else
> > +   SetEnvironmentVariableA(var, val);
> > + #endif
> > 
> > The fact that there is no such free() in pg_regress is not an oversight
> > or shortcut.
> 
> Interesting.  I did not know this and it was not clear from my manual
> page or FreeBSD's manual page, but Linux clearly does this.
> 
> Updated patch attached.

Applied to HEAD and 9.0.X.  Thanks for the ideas/review.
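
For the archives, a minimal sketch of the point Tom made, assuming a
POSIX system. This is not the applied patch; the helper name
set_env_var is made up for the example. POSIX specifies that putenv()
makes the passed string itself part of the environment, so the buffer
must stay allocated for as long as the variable is set:

```c
#define _XOPEN_SOURCE 700   /* request the putenv() declaration */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: set var=val via putenv().
 * Returns 0 on success, -1 on failure.
 */
static int
set_env_var(const char *var, const char *val)
{
    size_t  len;
    char   *envstr;

    assert(var != NULL && val != NULL);

    /* Room for var, '=', val and the terminating NUL. */
    len = strlen(var) + strlen(val) + 2;
    envstr = malloc(len);
    if (envstr == NULL)
        return -1;
    snprintf(envstr, len, "%s=%s", var, val);

    if (putenv(envstr) != 0)
    {
        /* Not added to the environment, so freeing is safe here. */
        free(envstr);
        return -1;
    }

    /*
     * Deliberately no free(envstr): the environment now points at this
     * buffer, and releasing it would leave a dangling entry.
     */
    return 0;
}
```

The buffer is intentionally leaked. For a handful of variables set once
per run that is harmless, and it is the same trade-off Tom describes for
pg_regress.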

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Robert Haas

On Tue, Sep 7, 2010 at 9:56 AM, Magnus Hagander  wrote:
> You're saying you don't "require" a fix on the latest issue here? Or
> should we spend some time trying to figure out if we can fix it with
> git-filter-branch?

I think that "the latest issue here" is the issue of how files get
added to branches, which we discussed before with pretty much the same
set of conclusions.  I'm not wild about the way that's getting
converted, but I'm not sure I care enough about it to argue with Tom.
However, I want to convince myself that the deletes we've done over
the years have been properly handled.  I need to look at Max's latest
conversion and I'll look at yours as well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Magnus Hagander

On Tue, Sep 7, 2010 at 15:53, Tom Lane  wrote:
> Michael Haggerty  writes:
>> Tom Lane wrote:
>>> So, if we're prepared to assert that we've never done that, could we
>>> have an option to cvs2git that is willing to use the first commit on
>>> a branch to represent the act of adding the file to the branch?
>
>> I'm afraid this would be pretty far down on my long todo list.
>
> Fair enough.
>
>> Somebody could use "git filter-branch" to make this change after the
>> conversion, but I can't estimate how much work it would be.
>
> The conversion is already far better than I expected it would be when
> we were first discussing this switch, so my inclination is to just live
> with this one wart.
>
> I spent more time over the weekend comparing various branches' histories
> between cvs2cl and Max's repository.  I found a lot of places where
> cvs2cl had problems :-(, but none where the git history could be blamed.
> I'm ready to sign off on this conversion process as being Good Enough,
> modulo two points:
>
> * Change the committer name assigned to manufactured commits, as already
> mentioned.
>
> * Please make the manufactured commits read "cvs2git" not "cvs2svn".
> I don't want people wondering in future when it was we used SVN.
>
> AFAIK both of these are trivial configuration fixes.

I'm actually re-running a migration right now with this - and with the
change to use rcs instead of cvs, to see if I can reproduce Max's
proper repository.

You're saying you don't "require" a fix on the latest issue here? Or
should we spend some time trying to figure out if we can fix it with
git-filter-branch?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] git: uh-oh

2010-09-07 Thread Tom Lane

Michael Haggerty  writes:
> Tom Lane wrote:
>> So, if we're prepared to assert that we've never done that, could we
>> have an option to cvs2git that is willing to use the first commit on
>> a branch to represent the act of adding the file to the branch?

> I'm afraid this would be pretty far down on my long todo list.

Fair enough.

> Somebody could use "git filter-branch" to make this change after the
> conversion, but I can't estimate how much work it would be.

The conversion is already far better than I expected it would be when
we were first discussing this switch, so my inclination is to just live
with this one wart.

I spent more time over the weekend comparing various branches' histories
between cvs2cl and Max's repository.  I found a lot of places where
cvs2cl had problems :-(, but none where the git history could be blamed.
I'm ready to sign off on this conversion process as being Good Enough,
modulo two points:

* Change the committer name assigned to manufactured commits, as already
mentioned.

* Please make the manufactured commits read "cvs2git" not "cvs2svn".
I don't want people wondering in future when it was we used SVN.

AFAIK both of these are trivial configuration fixes.

regards, tom lane


Re: [HACKERS] Synchronization levels in SR

2010-09-07 Thread Markus Wanner


On 09/07/2010 02:16 PM, Robert Haas wrote:
> Right, definitely.  The trouble is that if they happen concurrently,
> and there's a crash, you have to be prepared for the possibility that
> either one of the two has completed and the other is not.

Understood.

> In
> practice, this means that the master and standby need to compare notes
> on the ending WAL location and whichever one is further advanced needs
> to stream the intervening records to the other.

Not necessarily, no. Remember that the client didn't get a commit
confirmation. So reverting might also be a correct solution (i.e. not
violating the durability constraint).

> This would be an
> awesome feature, but it's hard, so for a first version, it makes sense
> to commit on the master first and then on the standby after the master
> is known done.

The obvious downside of that is that latency adds up, instead of just
being the max of the two operations. And that for normal operation.
While at best it saves an un-confirmed transaction in the failure case.

It might be harder to implement, yes.
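
To put numbers on the latency argument, a tiny sketch. The function
names and the microsecond figures are made up for illustration, and
standby_us is assumed to include the network round trip:

```c
#include <assert.h>

/* Standby is contacted only after the master's commit is known done. */
static long
latency_sequential_us(long master_us, long standby_us)
{
    assert(master_us >= 0 && standby_us >= 0);
    return master_us + standby_us;
}

/* Master and standby work at the same time; wait for the slower one. */
static long
latency_concurrent_us(long master_us, long standby_us)
{
    assert(master_us >= 0 && standby_us >= 0);
    return master_us > standby_us ? master_us : standby_us;
}
```

With, say, 2000 us for the local flush and 5000 us for the standby, the
sequential variant costs 7000 us per commit and the concurrent one
5000 us, and that difference is paid on every commit in normal
operation.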

Regards

Markus Wanner


Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Pavel Stehule

2010/9/7 Robert Haas :
> On Tue, Sep 7, 2010 at 4:53 AM, Pavel Stehule  wrote:
>> I would like to use a special memory context for shared data (based
>> on mmap), and I like the implementation of aset. There is only one
>> difference: aset is based on malloc and I would like to use mmap.
>>
>> malloc() is used in AllocSetContextCreate and AllocSetAlloc. These
>> procedures would have to be overridden, but the other code and data
>> structures could be reused. This step could also be useful for the
>> previous discussion about more comfortable maintenance of shared
>> memory.
>>
>> What do you think?
>
> What would this be good for?
>

I am trying to solve performance problems with Czech tsearch. I tried
serialization and deserialization, but that only decreased the load
time to 100 ms (from 500 ms), which is still too much for us. After
some experimenting with mmap, I think there is a chance to preallocate
mmap memory and then use a special memory context based on mmap
instead of malloc. Theoretically I could copy the aset interface -
this module will probably never be in core (the problem is probably
local - only Czech) - but that isn't nice. So I am asking.

Regards

Pavel Stehule

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>

