Re: [HACKERS] Git out of sync vs. CVS

2010-01-26 Thread Magnus Hagander
On Thu, Jan 21, 2010 at 17:11, Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 So add me to the list of people who think that if
 these are going to be recurring, we should look at moving from cvs
 to git as soon as 9.0 is released.

 The gating factor is not release schedule; it is the still-unaddressed
 tasks that must be done before we can consider moving.
 http://wiki.postgresql.org/wiki/Switching_PostgreSQL_from_CVS_to_Git

Assuming git-cvsserver works as advertised (which we should verify of
course) there are really only two points left:
 * Confirm past releases can be built identically from Git, using binary
   diff, which I intend to look at, and
 * Provide backport examples, which Heikki has promised to look at.
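
A minimal sketch of the "binary diff" check on release tarballs, assuming
one tarball was built from CVS and one from Git. The function names are
hypothetical, and only file names and contents are compared, not
timestamps or permissions:

```python
import hashlib
import tarfile


def tarball_digests(path):
    """Map each regular file in a tarball to the SHA-256 of its contents."""
    digests = {}
    with tarfile.open(path) as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


def tarballs_match(cvs_tarball, git_tarball):
    """True if both tarballs hold the same files with identical bytes."""
    return tarball_digests(cvs_tarball) == tarball_digests(git_tarball)
```

Comparing the two digest maps directly (rather than just the boolean)
would also show which files differ.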


Unless the NLS scripts actually do commits, in which case they also
have to be changed.

So the list really isn't very long. I think it's perfectly possible to
clear it off before the release. Because we still only want to change
after the release, or are you saying once those are fixed, we can
change even if we happen to be in beta at the time?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Git out of sync vs. CVS

2010-01-26 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 So add me to the list of people who think that if
 these are going to be recurring, we should look at moving from
 cvs to git as soon as 9.0 is released.
 
 The gating factor is not release schedule; it is the still-
 unaddressed tasks that must be done before we can consider moving.
 http://wiki.postgresql.org/wiki/Switching_PostgreSQL_from_CVS_to_Git
 
If you think people can work on that list without risk of delaying
the release, OK.  I was assuming that such work would be too
disruptive to work on at this point in a release cycle, and might
possibly pull time from folks who would otherwise be working on the
release.  Do you disagree?
 
-Kevin



Re: [HACKERS] Git out of sync vs. CVS

2010-01-22 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 So add me to the list of people who think that if
 these are going to be recurring, we should look at moving from cvs
 to git as soon as 9.0 is released.

The gating factor is not release schedule; it is the still-unaddressed
tasks that must be done before we can consider moving.
http://wiki.postgresql.org/wiki/Switching_PostgreSQL_from_CVS_to_Git

regards, tom lane



Re: [HACKERS] Git out of sync vs. CVS

2010-01-21 Thread Magnus Hagander
On Wed, Jan 20, 2010 at 15:36, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jan 20, 2010 at 4:27 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Magnus Hagander wrote:
 On Wed, Jan 20, 2010 at 09:52, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Jan 19, 2010 at 16:59, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 19, 2010 at 10:44 AM, Magnus Hagander mag...@hagander.net 
 wrote:
 On Mon, Jan 18, 2010 at 01:53, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:

 the Git repository is missing parts of two non-recent commits.
 We've seen this happen before.
 That seems like kind of a blasé attitude toward something upon which
 some people rely.
 For the record, I am one of those people. I use it for *all* my
 postgresql development. And this is a serious pain.
 FWIW, I am in favor of rewinding and making everyone rebase, but I
 think we should do it ASAP.
 Ok, I started looking at this.

 First, it's not at all clear to me what Peter means with his comments.
 But it happens to be that one of the commits he's referring to is all
 the way back in August. So we'd have to rewind it all that way. Do we
 really want to do that, or do we want to do a manual commit on the
 repository bringing it back in sync instead? (either by knowing what's
 wrong with those commits, or do a complete diff of cvs head vs git
 head)

 Actually, such a correction patch would be nice and short. Attached
 for reference. Thoughts?

 That seems better than rewinding the history all the way back to August.

 It seems pretty horrible to me.  That means we'll have a range of
 times 5 months long for which the git repository doesn't match CVS.

Yes.

But how bad is that really the way we do things now? It still works
perfectly fine for development against HEAD, which I believe is what
most people are using it for at this point. (As long as somebody keeps
finding these things when they happen, that is)

I'm going to do the fixup for now. We can always rewind past that one
later if we have to, it's not like it's going to get any worse.


 Admittedly, I understand that this is going to be extremely painful
 for anyone who (like Heikki) has to manage a substantial private
 branch.

Well, git actually picks that up reasonably well these days, but it's
still a bit of a pain. Also, all the links people have posted will no
longer be valid, etc.


 I haven't been in a hurry to see us move to git because the git mirror
 is, for most purposes, just as good.  But if the git mirror is going
 to start sucking, then I'm in a hurry.  The way I used to work before
 I learned git seems laughable now, and I do NOT want to go back.

I can only agree with this. I would very much like to see that
discussion opened again - after we've released 9.0. But for that
reason, it'd be good if we could take care of the issues listed on the
wiki page before that happens :-)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git out of sync vs. CVS

2010-01-21 Thread Magnus Hagander
On Tue, Jan 19, 2010 at 21:07, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 I wrote:

 Perhaps it is as simple, though, as using the client's time
 instead of the CVS server's time -- that's one of the things I've
 seen cause problems for this sort of thing using CVS before.

 I got a brief consult with a Ruby programmer here under the "if it's
 less than ten minutes you don't have to schedule it through a
 manager" rule.  From what we can see, fromcvs scans for all entries
 *after* a previous run time, but it isn't setting an upper bound
 on time during the scan.  I haven't found where it saves the time
 for the lower limit of the next run, but I rather suspect that it
 grabs the current time near the end of the scan.  If this is an
 accurate assessment, to avoid a window for lost commits, we'd have
 to fix a time before we started the scan to use as the upper bound
 for CVS commits to handle, and use it for the previous run time.

 There's still the possible issue of *whose* clock we're using for
 this.

 Reality check: does the frequency of lost CVS commits within git
 seem consistent with this theory?

Well, supposedly all our servers are synced with NTP. I know the main
cvs server is, and the git server is, but it goes past the anoncvs
server which is a hub.org server so I don't know for sure there - but
I think it is? So I don't think it's the machines-out-of-sync issue.
Or at least the window for that is *really* small.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git out of sync vs. CVS

2010-01-21 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Tue, Jan 19, 2010 at 21:07, Kevin Grittner
 Reality check: does the frequency of lost CVS commits within git
 seem consistent with this theory?

 Well, supposedly all our servers are synced with NTP. I know the main
 cvs server is, and the git server is, but it goes past the anoncvs
 server which is a hub.org server so I don't know for sure there - but
 I think it is? So I don't think it's the machines-out-of-sync issue.
 Or at least the window for that is *really* small.

I have noticed that CVS operations (at least from the user's viewpoint)
work in local time.  So even if the clocks are synced, a different TZ
setting could conceivably lead to issues.
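
To illustrate the TZ concern: a sketch showing how the same wall-clock
string maps to different instants under two TZ settings. The two offsets
are made up for the example, and nothing here establishes that fromcvs
or CVS actually mixes zones this way:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical zones: a committer at UTC-5 and a mirror host at UTC+1.
COMMITTER_TZ = timezone(timedelta(hours=-5))
MIRROR_TZ = timezone(timedelta(hours=1))


def interpret(wall_clock, tz):
    """Attach a zone to a naive 'YYYY-MM-DD HH:MM:SS' string."""
    naive = datetime.strptime(wall_clock, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=tz)


stamp = "2010-01-21 12:00:00"
skew = interpret(stamp, COMMITTER_TZ) - interpret(stamp, MIRROR_TZ)
print(skew)  # 6:00:00
```

With synced clocks the skew is exactly the difference between the two
UTC offsets, which is far larger than any NTP drift.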

regards, tom lane



Re: [HACKERS] Git out of sync vs. CVS

2010-01-21 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 I have noticed that CVS operations (at least from the user's
 viewpoint) work in local time.  So even if the clocks are synced,
 a different TZ setting could conceivably lead to issues.
 
Hmmm...  If that were the issue I would think we'd've seen the
problem more often.  From reading over the Ruby code, it appears to
me that if a commit happens when fromcvs is scanning for recent
commits, and that commit touches a part the scan has already passed, we'd
see anomalies like this, although my weak Ruby skills leave me less
than 100% sure.  The same skill deficiency means it would take me at
least three FTE days to fix the flaw in fromcvs, which I'd have to
do off-hours.  So add me to the list of people who think that if
these are going to be recurring, we should look at moving from cvs
to git as soon as 9.0 is released.
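
The suspected race can be sketched with a toy model. This is not
fromcvs code; the file names, the integer clock and run_pass() are all
made up to show the mechanism:

```python
def run_pass(visits, history, lower, upper=None):
    """Simulate one mirror pass.

    visits:  list of (file, time at which the scan reaches that file)
    history: file -> list of commit times
    A revision is picked up if it already exists when the file is visited,
    is newer than `lower`, and (when given) is not newer than `upper`.
    """
    seen = []
    for name, at in visits:
        for t in history.get(name, []):
            if lower < t <= at and (upper is None or t <= upper):
                seen.append((name, t))
    return seen


history = {"a.c": [12], "z.c": [12]}  # one commit touching two files at t=12
first = [("a.c", 11), ("z.c", 13)]    # pass 1 runs from t=11 to t=13
second = [("a.c", 21), ("z.c", 23)]   # pass 2 runs from t=21 to t=23

# Suspected behaviour: no upper bound, next lower bound taken at end of pass.
buggy = run_pass(first, history, lower=0) + run_pass(second, history, lower=13)

# Bounded variant: fix the upper bound before each pass and reuse it as
# the lower bound of the next one.
fixed = (run_pass(first, history, lower=0, upper=11)
         + run_pass(second, history, lower=11, upper=21))

print(buggy)  # [('z.c', 12)]
print(fixed)  # [('a.c', 12), ('z.c', 12)]
```

In the unbounded run the a.c half of the commit is never picked up,
which is the kind of partial commit reported at the top of the thread.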
 
-Kevin



Re: [HACKERS] Git out of sync vs. CVS

2010-01-21 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 So the list really isn't very long. I think it's perfectly possible to
 clear it off before the release. Because we still only want to change
 after the release, or are you saying once those are fixed, we can
 change even if we happen to be in beta at the time?

When and if we have the prerequisite tasks done, it'll be time enough to
think about exactly when to schedule the move.  Given the amount of
movement on the prerequisites in the past year, I'm not planning to
worry about it today.

regards, tom lane



Re: [HACKERS] Git out of sync vs. CVS

2010-01-21 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 So add me to the list of people who think that if
 these are going to be recurring, we should look at moving from
 cvs to git as soon as 9.0 is released.
 
 The gating factor is not release schedule; it is the still-
 unaddressed tasks that must be done before we can consider moving.
 http://wiki.postgresql.org/wiki/Switching_PostgreSQL_from_CVS_to_Git
 
 If you think people can work on that list without risk of delaying
 the release, OK.  I was assuming that such work would be too
 disruptive to work on at this point in a release cycle, and might
 possibly pull time from folks who would otherwise be working on the
 release.  Do you disagree?

Oh, if you meant that people should start dealing with those tasks after
release, that's fine with me.  I read your comment to be that we should
schedule the move for immediately after release, prerequisites or no.

regards, tom lane



Re: [HACKERS] Git out of sync vs. CVS

2010-01-20 Thread Magnus Hagander
On Tue, Jan 19, 2010 at 16:59, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 19, 2010 at 10:44 AM, Magnus Hagander mag...@hagander.net wrote:
 On Mon, Jan 18, 2010 at 01:53, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:

 the Git repository is missing parts of two non-recent commits.

 We've seen this happen before.

 That seems like kind of a blasé attitude toward something upon which
 some people rely.

 For the record, I am one of those people. I use it for *all* my
 postgresql development. And this is a serious pain.

 FWIW, I am in favor of rewinding and making everyone rebase, but I
 think we should do it ASAP.

Ok, I started looking at this.

First, it's not at all clear to me what Peter means with his comments.
But it happens to be that one of the commits he's referring to is all
the way back in August. So we'd have to rewind it all that way. Do we
really want to do that, or do we want to do a manual commit on the
repository bringing it back in sync instead? (either by knowing what's
wrong with those commits, or do a complete diff of cvs head vs git
head)
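
A "complete diff of cvs head vs git head" could be sketched roughly like
this, assuming both heads are checked out into local directories. The
function name and the ignore list are assumptions, not an existing tool:

```python
import filecmp
import os

# Version-control metadata that legitimately differs between checkouts.
IGNORED = ["CVS", ".git", ".cvsignore", ".gitignore"]


def diff_checkouts(cvs_dir, git_dir, rel=""):
    """Yield (kind, relative_path) for every difference between two trees."""
    dc = filecmp.dircmp(os.path.join(cvs_dir, rel),
                        os.path.join(git_dir, rel), ignore=IGNORED)
    for name in dc.left_only:
        yield ("only-in-cvs", os.path.join(rel, name))
    for name in dc.right_only:
        yield ("only-in-git", os.path.join(rel, name))
    # dircmp compares by stat() signature by default; force a byte comparison.
    _, mismatch, errors = filecmp.cmpfiles(dc.left, dc.right,
                                           dc.common_files, shallow=False)
    for name in mismatch + errors:
        yield ("differs", os.path.join(rel, name))
    for name in dc.common_dirs:
        yield from diff_checkouts(cvs_dir, git_dir, os.path.join(rel, name))
```

An empty result would mean the two heads are in sync; anything it yields
is a candidate for the manual fixup commit.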


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git out of sync vs. CVS

2010-01-20 Thread Magnus Hagander
On Wed, Jan 20, 2010 at 09:52, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Jan 19, 2010 at 16:59, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 19, 2010 at 10:44 AM, Magnus Hagander mag...@hagander.net 
 wrote:
 On Mon, Jan 18, 2010 at 01:53, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:

 the Git repository is missing parts of two non-recent commits.

 We've seen this happen before.

 That seems like kind of a blasé attitude toward something upon which
 some people rely.

 For the record, I am one of those people. I use it for *all* my
 postgresql development. And this is a serious pain.

 FWIW, I am in favor of rewinding and making everyone rebase, but I
 think we should do it ASAP.

 Ok, I started looking at this.

 First, it's not at all clear to me what Peter means with his comments.
 But it happens to be that one of the commits he's referring to is all
 the way back in August. So we'd have to rewind it all that way. Do we
 really want to do that, or do we want to do a manual commit on the
 repository bringing it back in sync instead? (either by knowing what's
 wrong with those commits, or do a complete diff of cvs head vs git
 head)

Actually, such a correction patch would be nice and short. Attached
for reference. Thoughts?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


git_head_cleanup.patch
Description: Binary data



Re: [HACKERS] Git out of sync vs. CVS

2010-01-20 Thread Heikki Linnakangas
Magnus Hagander wrote:
 On Wed, Jan 20, 2010 at 09:52, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Jan 19, 2010 at 16:59, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 19, 2010 at 10:44 AM, Magnus Hagander mag...@hagander.net 
 wrote:
 On Mon, Jan 18, 2010 at 01:53, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:

 the Git repository is missing parts of two non-recent commits.
 We've seen this happen before.
 That seems like kind of a blasé attitude toward something upon which
 some people rely.
 For the record, I am one of those people. I use it for *all* my
 postgresql development. And this is a serious pain.
 FWIW, I am in favor of rewinding and making everyone rebase, but I
 think we should do it ASAP.
 Ok, I started looking at this.

 First, it's not at all clear to me what Peter means with his comments.
 But it happens to be that one of the commits he's referring to is all
 the way back in August. So we'd have to rewind it all that way. Do we
 really want to do that, or do we want to do a manual commit on the
 repository bringing it back in sync instead? (either by knowing what's
 wrong with those commits, or do a complete diff of cvs head vs git
 head)
 
 Actually, such a correction patch would be nice and short. Attached
 for reference. Thoughts?

That seems better than rewinding the history all the way back to August.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Git out of sync vs. CVS

2010-01-20 Thread Robert Haas
On Wed, Jan 20, 2010 at 4:27 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Magnus Hagander wrote:
 On Wed, Jan 20, 2010 at 09:52, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Jan 19, 2010 at 16:59, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 19, 2010 at 10:44 AM, Magnus Hagander mag...@hagander.net 
 wrote:
 On Mon, Jan 18, 2010 at 01:53, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:

 the Git repository is missing parts of two non-recent commits.
 We've seen this happen before.
 That seems like kind of a blasé attitude toward something upon which
 some people rely.
 For the record, I am one of those people. I use it for *all* my
 postgresql development. And this is a serious pain.
 FWIW, I am in favor of rewinding and making everyone rebase, but I
 think we should do it ASAP.
 Ok, I started looking at this.

 First, it's not at all clear to me what Peter means with his comments.
 But it happens to be that one of the commits he's referring to is all
 the way back in August. So we'd have to rewind it all that way. Do we
 really want to do that, or do we want to do a manual commit on the
 repository bringing it back in sync instead? (either by knowing what's
 wrong with those commits, or do a complete diff of cvs head vs git
 head)

 Actually, such a correction patch would be nice and short. Attached
 for reference. Thoughts?

 That seems better than rewinding the history all the way back to August.

It seems pretty horrible to me.  That means we'll have a range of
times 5 months long for which the git repository doesn't match CVS.

Admittedly, I understand that this is going to be extremely painful
for anyone who (like Heikki) has to manage a substantial private
branch.

I haven't been in a hurry to see us move to git because the git mirror
is, for most purposes, just as good.  But if the git mirror is going
to start sucking, then I'm in a hurry.  The way I used to work before
I learned git seems laughable now, and I do NOT want to go back.

...Robert



Re: [HACKERS] Git out of sync vs. CVS

2010-01-20 Thread Heikki Linnakangas
Robert Haas wrote:
 On Wed, Jan 20, 2010 at 4:27 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Magnus Hagander wrote:
 Actually, such a correction patch would be nice and short. Attached
 for reference. Thoughts?
 That seems better than rewinding the history all the way back to August.
 
 It seems pretty horrible to me.  That means we'll have a range of
 times 5 months long for which the git repository doesn't match CVS.
 
 Admittedly, I understand that this is going to be extremely painful
 for anyone who (like Heikki) has to manage a substantial private
 branch.

I won't object to rewinding, it should be fairly painless to rebase.

 I haven't been in a hurry to see us move to git because the git mirror
 is, for most purposes, just as good.  But if the git mirror is going
 to start sucking, then I'm in a hurry.  The way I used to work before
 I learned git seems laughable now, and I do NOT want to go back.

My feelings exactly. I'm not in a hurry to switch because the mirror is
good enough for me. But if *I* have to spend time fixing the mirror
every few weeks, I'm not happy. Magnus has been kind enough to handle
the last mirror troubles, but I believe he shares the feeling.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Git out of sync vs. CVS

2010-01-20 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Magnus Hagander wrote:
 Actually, such a correction patch would be nice and short. Attached
 for reference. Thoughts?

 That seems better than rewinding the history all the way back to August.

+1 ... I'm just an interested observer not a user of the git repository,
but this approach seems far less work for everyone concerned.

regards, tom lane



Re: [HACKERS] Git out of sync vs. CVS

2010-01-19 Thread Magnus Hagander
On Mon, Jan 18, 2010 at 01:53, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:

 the Git repository is missing parts of two non-recent commits.

 We've seen this happen before.

 That seems like kind of a blasé attitude toward something upon which
 some people rely.

For the record, I am one of those people. I use it for *all* my
postgresql development. And this is a serious pain.

It has been brought up before. Nobody has come up with a completely
safe way to do it, because CVS simply doesn't have the capabilities
required.

And yes, it is annoying to have to deal with the issues with CVS at
the same time as people keep saying CVS is perfectly fine. It's not.
It's just that we are doing our best to work around the issues in it,
and sometimes that leads to these issues.


 When we (at Wisconsin State Courts) were using CVS and had scripts to
 automatically merge changes from one branch to another, we saw this
 sort of thing unless people were very careful to grab a timestamp in
 the past for their ranges and use it throughout the script.  Perhaps
 the script is just not careful enough?  (Said in total ignorance of
 what the PostgreSQL process here actually is)

That would be one way. However, AFAIK the tool we use (fromcvs)
doesn't support this. If somebody were to extend the tool with that,
it would be much appreciated. It's a Ruby tool though, so there's not
a thing I can do about it myself... And it's basically undocumented.

But yes, if we do that and set the timestamp far enough back in time,
that should make it reasonably safe. Given how long some operations
can take ((C) year change, release tagging IIRC, stuff like that),
this has to be a fairly large number, which means the git mirror will
lag even further behind. But if that's what we have to pay to make it
safe, I guess we should... The time would have to be long enough to
cover any cvs commit including potential network slowness during it
etc.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git out of sync vs. CVS

2010-01-19 Thread Robert Haas
On Tue, Jan 19, 2010 at 10:44 AM, Magnus Hagander mag...@hagander.net wrote:
 On Mon, Jan 18, 2010 at 01:53, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:

 the Git repository is missing parts of two non-recent commits.

 We've seen this happen before.

 That seems like kind of a blasé attitude toward something upon which
 some people rely.

 For the record, I am one of those people. I use it for *all* my
 postgresql development. And this is a serious pain.

FWIW, I am in favor of rewinding and making everyone rebase, but I
think we should do it ASAP.

...Robert



Re: [HACKERS] Git out of sync vs. CVS

2010-01-19 Thread Magnus Hagander
On Tuesday, January 19, 2010, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 19, 2010 at 10:44 AM, Magnus Hagander mag...@hagander.net wrote:
 On Mon, Jan 18, 2010 at 01:53, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:

 the Git repository is missing parts of two non-recent commits.

 We've seen this happen before.

 That seems like kind of a blasé attitude toward something upon which
 some people rely.

 For the record, I am one of those people. I use it for *all* my
 postgresql development. And this is a serious pain.

 FWIW, I am in favor of rewinding and making everyone rebase, but I
 think we should do it ASAP.

Got time to figure out exactly how far to rewind?

/Magnus


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git out of sync vs. CVS

2010-01-19 Thread Aidan Van Dyk
* Magnus Hagander mag...@hagander.net [100119 10:44]:
 
  When we (at Wisconsin State Courts) were using CVS and had scripts to
  automatically merge changes from one branch to another, we saw this
  sort of thing unless people were very careful to grab a timestamp in
  the past for their ranges and use it throughout the script.  Perhaps
  the script is just not careful enough?  (Said in total ignorance of
  what the PostgreSQL process here actually is)
 
 That would be one way. However, AFAIK the tool we use (fromcvs)
 doesn't support this. If somebody were to extend the tool with that,
 it would be much appreciated. It's a Ruby tool though, so there's not
 a thing I can do about it myself... And it's basically undocumented.
 
 But yes, if we do that and set the timestamp far enough back in time,
 that should make it reasonably safe. Given how long some operations
 can take ((C) year change, release tagging IIRC, stuff like that),
 this has to be a fairly large number, which means the git mirror will
 lag even further behind. But if that's what we have to pay to make it
 safe, I guess we should... The time would have to be long enough to
 cover any cvs commit including potential network slowness during it
 etc.

Well, when I was running my conversion, I took a cheap way, I just
rsynced twice (with a delay, I don't remember how long I decided was long
enough) and made sure the 2nd rsync didn't do anything, before I let
fromcvs at the copy of CVSROOT.

Sure, it's not perfect either, I based that on the hope that no single
CVS commit would have a period of $X of inactivity on the CVSROOT.
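
The "sync twice and make sure the second one did nothing" idea can be
approximated on a local copy like this. It is a sketch only: it compares
file sizes and mtimes instead of running rsync, and the function names
and the retry limit are assumptions:

```python
import os
import time


def snapshot(root):
    """Map each file under root to its (size, mtime in nanoseconds)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            state[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return state


def wait_until_quiet(root, delay_seconds, max_attempts=5, sleep=time.sleep):
    """Return True once two snapshots taken delay_seconds apart are identical."""
    before = snapshot(root)
    for _ in range(max_attempts):
        sleep(delay_seconds)
        after = snapshot(root)
        if after == before:
            return True
        before = after
    return False
```

The conversion would only be started when wait_until_quiet() returns
True. As with the rsync approach, this relies on no single commit
pausing for longer than the chosen delay.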

Of course, that could all be useless (for my PG conversion) if the PG
CVSROOT was itself an unstable point-in-time copy of the real CVSROOT, but
I was rsyncing CVSROOT of other projects too, so I needed it for my own
conversions...

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] Git out of sync vs. CVS

2010-01-19 Thread Kevin Grittner
Magnus Hagander mag...@hagander.net wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander  wrote:

 the Git repository is missing parts of two non-recent
 commits.

 We've seen this happen before.

 That seems like kind of a blasé attitude toward something upon
 which some people rely.
 
 For the record, I am one of those people. I use it for *all* my
 postgresql development. And this is a serious pain.
 
It appears I took your comment the wrong way.  Apologies.
 
 When we (at Wisconsin State Courts) were using CVS and had
 scripts to automatically merge changes from one branch to
 another, we saw this sort of thing unless people were very
 careful to grab a timestamp in the past for their ranges and use
 it throughout the script. Perhaps the script is just not careful
 enough?  (Said in total ignorance of what the PostgreSQL process
 here actually is)
 
 That would be one way. However, AFAIK the tool we use (fromcvs)
 doesn't support this. If somebody were to extend the tool with
 that, it would be much appreciated. It's a Ruby tool though, so
 there's not a thing I can do about it myself... And it's basically
 undocumented.
 
 But yes, if we do that and set the timestamp far enough back in
 time, that should make it reasonably safe. Given how long some
 operations can take ((C) year change, release tagging IIRC, stuff
 like that), this has to be a fairly large number, which means the
 git mirror will lag even further behind. But if that's what we
 have to pay to make it safe, I guess we should... The time would
 have to be long enough to cover any cvs commit including potential
 network slowness during it etc.
 
My Ruby skills are minimal, but we've got some Ruby gurus around
here -- maybe between my rough skills and a few impositions on the
others I could wrangle something.  Is there any particular version I
should be looking at?  The last official version I can find is
0.0.0.132 from May 3, 2009.
 
Although, if there's not some reasonably obvious fix (like
subtracting some fixed amount of time from a timestamp they're
already grabbing), perhaps we should just plan on limping along
until we can convert to git.
 
Oh, and what sort of delay do you feel would be long enough to
cover any cvs commit including potential network slowness during it
etc.?
 
-Kevin



Re: [HACKERS] Git out of sync vs. CVS

2010-01-19 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Oh, and what sort of delay do you feel would be long enough to
 cover any cvs commit including potential network slowness during it
 etc.?

Why should the script make any assumptions about delay at all?
It seems to me that the problem comes from failing to check for
changed files, no more and no less.  It would be much less of an
issue if a non-atomic CVS commit showed up as two separate GIT
commits with similar log messages.
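
A sketch of that alternative, tracking which file revisions have
already been imported instead of relying on time ranges. This is not how
fromcvs is known to work; FileRev and import_pass() are hypothetical:

```python
from collections import namedtuple

FileRev = namedtuple("FileRev", "path revision author log")


def import_pass(visible_revs, imported):
    """Turn every not-yet-imported file revision into commits.

    `imported` is a set of (path, revision) pairs carried between passes and
    updated in place. Revisions are grouped by (author, log), so the late half
    of a non-atomic CVS commit becomes a second commit with the same message.
    """
    pending = {}
    for rev in visible_revs:
        key = (rev.path, rev.revision)
        if key in imported:
            continue
        pending.setdefault((rev.author, rev.log), []).append(rev.path)
        imported.add(key)
    return [(author, log, sorted(paths))
            for (author, log), paths in pending.items()]
```

If the first pass sees only z.c and the second sees both a.c and z.c,
the result is two commits with the same author and log message, one per
file, and nothing is lost.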

regards, tom lane



Re: [HACKERS] Git out of sync vs. CVS

2010-01-19 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Oh, and what sort of delay do you feel would be long enough to
 cover any cvs commit including potential network slowness during
 it etc.?
 
 Why should the script make any assumptions about delay at all?
 It seems to me that the problem comes from failing to check for
 changed files, no more and no less.  It would be much less of an
 issue if a non-atomic CVS commit showed up as two separate GIT
 commits with similar log messages.
 
I was trying to be accommodating; if Magnus's take on this isn't a
consensus, I'll put forward in a little more detail what I had in
mind.
 
What we did with our scripts was to grab the current time *from the
CVS server* (since not all clocks are necessarily set accurately)
and use that as the end of a time range.  The end of the previous
time range was recorded on successful completion; we would use that
as the start of a time range.  Done carefully, that allows no
commits to be missed.  The only way something could be done twice
would be for the process to die after it had pushed through some
changes and before it reached completion and saved the time.
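
A rough sketch of that time-range bookkeeping, under stated assumptions: commits are modelled as (timestamp, name) pairs, `server_now` stands in for a time obtained from the CVS server, and `run_sync` and `state` are hypothetical names. It also assumes every commit stamped at or before the end time is already visible when the run starts:

```python
from datetime import datetime, timedelta


def run_sync(commits, state, server_now, apply):
    """Process commits stamped in the range (state['last_end'], server_now]."""
    end = server_now                 # fixed once, used for the whole run
    start = state["last_end"]
    batch = [c for c in commits if start < c[0] <= end]
    for commit in batch:
        apply(commit)
    state["last_end"] = end          # recorded only on successful completion
    return batch


t0 = datetime(2010, 1, 19, 12, 0, 0)
commits = [
    (t0 + timedelta(minutes=1), "commit 1"),
    (t0 + timedelta(minutes=7), "commit 2"),
]
state = {"last_end": t0}
applied = []

# Two consecutive runs; each range starts where the previous one ended.
run_sync(commits, state, t0 + timedelta(minutes=5), applied.append)
run_sync(commits, state, t0 + timedelta(minutes=10), applied.append)

print([name for _, name in applied])
```

The ranges are half-open, so adjacent runs neither overlap nor leave a gap. If `apply` raises part-way through, `last_end` is not advanced and the next run repeats the same range, which is the one duplicate case described above.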
 
Now, I haven't looked at the fromcvs code yet to know how easy or
hard it would be to use this logic within that package, so this is
still pretty hand-wavy.
 
-Kevin



Re: [HACKERS] Git out of sync vs. CVS

2010-01-19 Thread Aidan Van Dyk
* Tom Lane t...@sss.pgh.pa.us [100119 11:47]:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
  Oh, and what sort of delay do you feel would be long enough to
  cover any cvs commit including potential network slowness during it
  etc.?
 
 Why should the script make any assumptions about delay at all?
 It seems to me that the problem comes from failing to check for
 changed files, no more and no less.  It would be much less of an
 issue if a non-atomic CVS commit showed up as two separate GIT
 commits with similar log messages.

Well, I guess you could say:

  fromcvs should go back and recheck all the previous work it's done,
  and double check and make sure no new files have changed for the
  timestamp/log message pair it's already done, because CVS isn't atomic

But, I think that path leads to craziness... I mean, how far back?  CVS
is non-atomic enough that 2 (well, $N) people can commit separate
stuff, all with overlapping time stamps, and they can even commit stuff
in the past if they really want...

But all I have to say is: it's not perfect, but it's pretty good; just deal
with things as they come.  After all, it's CVS

;-)

If you want better than pretty good, drop CVS, do a one-time
conversion (a la parsecvs/cvs2git) and get on with life...  As long as
CVS is the tool of choice, pretty good is really good...

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] Git out of sync vs. CVS

2010-01-19 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 I haven't looked at the fromcvs code yet to know how easy or
 hard it would be to use this logic within that package
 
Well, now I have looked.  It's about 2,000 lines of pretty dense
Ruby code (not as many comments as one would hope, especially since
there appears to be *no* other documentation of any sort).  On a
quick scan, they seem to be *trying* to do what I suggested, which
means that some sort of fix could probably be worked out, but that
the issue could be subtle enough that it could be hard to find.
 
Perhaps it is as simple, though, as using the client's time instead
of the CVS server's time -- that's one of the things I've seen cause
problems for this sort of thing using CVS before.  I haven't spotted
where they're getting the time.
 
Is there anyone fluent in Ruby who wants to look at this and see how
they're getting it?
 
http://ww2.fs.ei.tum.de/~corecode/hg/fromcvs/log/132
 
By the way, is anyone working on fixing up the current problem? 
I've been talking about trying to prevent recurrences, but that's
not gonna help get the current problem solved.
 
-Kevin



Re: [HACKERS] Git out of sync vs. CVS

2010-01-19 Thread Kevin Grittner
I wrote:
 
 Perhaps it is as simple, though, as using the client's time
 instead of the CVS server's time -- that's one of the things I've
 seen cause problems for this sort of thing using CVS before.
 
I got a brief consult with a Ruby programmer here under the "if it's
less than ten minutes you don't have to schedule it through a
manager" rule.  From what we can see, fromcvs scans for all entries
*after* a previous run time, but it isn't setting an upper bound
on time during the scan.  I haven't found where it saves the time
for the lower limit of the next run, but I rather suspect that it
grabs the current time near the end of the scan.  If this is an
accurate assessment, to avoid a window for lost commits, we'd have
to fix a time before we start the scan, use it as the upper bound
for the CVS commits to handle, and save it as the previous run time
for the next run.
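
To make the suspected window concrete, here is a toy timeline, assuming integer seconds and a scan that runs from t=10 to t=20. Everything in it is hypothetical (the `sync` helper, the timestamps, the commit names); it only contrasts "take the next lower bound at the end of the scan" with "fix the upper bound before the scan":

```python
def sync(commits, lower, upper):
    """Return names of commits stamped in the half-open range (lower, upper]."""
    return [name for stamp, name in commits if lower < stamp <= upper]


# Hypothetical timeline in seconds; the first scan runs from t=10 to t=20.
SCAN_START, SCAN_END = 10, 20
before_scan = [(5, "A")]
during_scan = [(15, "B")]   # lands after the scanner already passed its file
all_commits = before_scan + during_scan
NO_LIMIT = float("inf")

# Suspected behaviour: no upper bound, next lower bound taken at scan end.
run1 = sync(before_scan, 0, NO_LIMIT)        # B is not visible yet
run2 = sync(all_commits, SCAN_END, NO_LIMIT)
lost = {"A", "B"} - set(run1 + run2)

# Proposed behaviour: upper bound fixed before the scan and reused
# as the lower bound of the next run.
fix1 = sync(before_scan, 0, SCAN_START)
fix2 = sync(all_commits, SCAN_START, 30)
recovered = fix1 + fix2

print("lost with end-of-scan cutoff:", sorted(lost))
print("handled with pre-scan cutoff:", recovered)
```

In this model commit B falls between the moment its file was scanned and the moment the cutoff was recorded, so only the pre-scan cutoff picks it up on the following run.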
 
There's still the possible issue of *whose* clock we're using for
this.
 
Reality check: does the frequency of lost CVS commits within git
seem consistent with this theory?
 
-Kevin



Re: [HACKERS] Git out of sync vs. CVS

2010-01-17 Thread Magnus Hagander
2010/1/17 Peter Eisentraut pete...@gmx.net:
 Maybe I'm hallucinating and someone could check this in their
 environment, but it appears to me that the Git repository is missing
 parts of two non-recent commits.  See attached patch.

I haven't looked at the repo in detail, but I bet this happened
because the git mirror grabbed its snapshot in the middle of a cvs
commit with multiple files. Since cvs doesn't have atomic commits, I
think that kind of thing can happen. Does that seem possible wrt these
commits specifically?

I don't really know how to fix that. It's kind of hard to do
transaction safe replication from a system without transactions ;)

As for fixing it, I guess we can try the
rewind-to-commit-before-this-and-rerun. That'll break people who have
branched after, but last time it seemed that most people's git clients
would clean that up automatically. Which commits are these exactly?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git out of sync vs. CVS

2010-01-17 Thread Peter Eisentraut
On Sun, 2010-01-17 at 20:50 +0100, Magnus Hagander wrote:
 As for fixing it, I guess we can try the
 rewind-to-commit-before-this-and-rerun. That'll break people who have
 branched after, but last time it seemed that most people's git clients
 would clean that up automatically. Which commits are these exactly?

These two belong together:

http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/start-scripts/freebsd?rev=1.5
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/start-scripts/osx/PostgreSQL?rev=1.4

And this is a separate one:

http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/config/python.m4?rev=1.17





Re: [HACKERS] Git out of sync vs. CVS

2010-01-17 Thread Magnus Hagander
2010/1/17 Peter Eisentraut pete...@gmx.net:
 On Sun, 2010-01-17 at 20:50 +0100, Magnus Hagander wrote:
 As for fixing it, I guess we can try the
 rewind-to-commit-before-this-and-rerun. That'll break people who have
 branched after, but last time it seemed that most people's git clients
 would clean that up automatically. Which commits are these exactly?

 These two belong together:

 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/start-scripts/freebsd?rev=1.5
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/start-scripts/osx/PostgreSQL?rev=1.4

 And this is a separate one:

 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/config/python.m4?rev=1.17

Well, if we're going to roll something back in git, it's the git
commits that are interesting, to figure out how far back in time to
go.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git out of sync vs. CVS

2010-01-17 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 2010/1/17 Peter Eisentraut pete...@gmx.net:
 Maybe I'm hallucinating and someone could check this in their
 environment, but it appears to me that the Git repository is missing
 parts of two non-recent commits.  See attached patch.

 I haven't looked at the repo in detail, but I bet this happened
 because the git mirror grabbed its snapshot in the middle of a cvs
 commit with multiple files. Since cvs doesn't have atomic commits, I
 think that kind of thing can happen.

That would explain a single CVS commit appearing as two separate commits
in the git history; but it hardly seems like an acceptable excuse for
missing changes altogether, which is what I think Peter said he saw.

regards, tom lane



Re: [HACKERS] Git out of sync vs. CVS

2010-01-17 Thread Magnus Hagander
2010/1/17 Tom Lane t...@sss.pgh.pa.us:
 Magnus Hagander mag...@hagander.net writes:
 2010/1/17 Peter Eisentraut pete...@gmx.net:
 Maybe I'm hallucinating and someone could check this in their
 environment, but it appears to me that the Git repository is missing
 parts of two non-recent commits.  See attached patch.

 I haven't looked at the repo in detail, but I bet this happened
 because the git mirror grabbed its snapshot in the middle of a cvs
 commit with multiple files. Since cvs doesn't have atomic commits, I
 think that kind of thing can happen.

 That would explain a single CVS commit appearing as two separate commits
 in the git history; but it hardly seems like an acceptable excuse for
 missing changes altogether, which is what I think Peter said he saw.

It's likely the combination of that, and the cvs to git sync script
not considering that this can happen. So when it does the second pass
(once it's all been synced) it detects it as a single commit, and
doesn't re-import it.
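
The failure mode described above can be sketched as follows. This is an assumption about how such a script might behave, not a reading of the actual sync code: `import_pass`, the author, the log message and the file names are all made up. The importer treats (author, log message) as the identity of a commit and skips any identity it has already imported:

```python
from collections import defaultdict


def import_pass(visible_changes, seen_commits, mirror):
    """Naive importer: one git commit per (author, log message) pair."""
    grouped = defaultdict(list)
    for author, message, path in visible_changes:
        grouped[(author, message)].append(path)
    for key, paths in grouped.items():
        if key in seen_commits:
            continue                 # already imported: late files are skipped
        mirror.extend(paths)
        seen_commits.add(key)


seen_commits = set()
mirror = []                          # files whose change reached git

# First pass runs mid-commit and sees only one of the two files.
import_pass([("alice", "Fix scripts", "a.c")], seen_commits, mirror)

# Second pass sees the whole commit, recognises it and imports nothing.
import_pass(
    [("alice", "Fix scripts", "a.c"), ("alice", "Fix scripts", "b.c")],
    seen_commits,
    mirror,
)

print(mirror)
```

After both passes the change to `b.c` never reaches the mirror, which matches the symptom of a commit with parts missing rather than one split in two.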

We've seen this happen before.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git out of sync vs. CVS

2010-01-17 Thread Kevin Grittner
Magnus Hagander wrote:
 
 the Git repository is missing parts of two non-recent commits.
 
 We've seen this happen before.
 
That seems like kind of a blasé attitude toward something upon which
some people rely.
 
When we (at Wisconsin State Courts) were using CVS and had scripts to
automatically merge changes from one branch to another, we saw this
sort of thing unless people were very careful to grab a timestamp in
the past for their ranges and use it throughout the script.  Perhaps
the script is just not careful enough?  (Said in total ignorance of
what the PostgreSQL process here actually is.)
 
-Kevin
