Re: [Monotone-devel] cvs

2013-07-17 Thread Markus Wanner
Hendrik,

On 07/15/2013 07:21 PM, Hendrik Boom wrote:
 Just wondering what the current status of monotone's CVS support is;
 in particular, cvs_import, cvs_pull, cvs_sync, cvs_takeover, and I've
 heard there's even a cvs_push.

Standard monotone's cvs_import works fine for simple cases. I didn't do
any work on the cvs_import branch and I don't think it's in a usable state.

You might want to check if cvs2svn can be of help. They have a nice git
export function and their CVS sanitizer code is field proven. I'm not
sure if you can get that into monotone, though.

 Is the conversion a one-time event, or can it keep up with further
 revisions on the CVS site without having to start over?

These are all one-time conversion options, which need access to the RCS
files on the CVS server.

tailor may be an option, if you want a continuous mirror. It certainly
has a monotone plugin. I'm not sure what the status of cvs_sync is, but
it's intended to provide continuous synchronization as well.

Regards

Markus Wanner

___
Monotone-devel mailing list
Monotone-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/monotone-devel


[Monotone-devel] cvs

2013-07-15 Thread Hendrik Boom
Just wondering what the current status of monotone's CVS support is;
in particular, cvs_import, cvs_pull, cvs_sync, cvs_takeover, and I've
heard there's even a cvs_push.

Do any of these work well?  Do they come close to importing most of the 
history in an intelligible way?

Is the conversion a one-time event, or can it keep up with further
revisions on the CVS site without having to start over?

-- hendrik




[Monotone-devel] cvs-import

2010-08-25 Thread hendrik
I just looked again at the documentation page
http://monotone.ca/docs/RCS.html#RCS

Perhaps I remember wrong, but I thought a year or so ago cvs_import was
hedged with limitations and warnings -- things like it would only import
one branch, and the like.

I had been considering modifying cvs2svn to turn it into a cvs2mtn.  Now
the documentation seems to indicate that
mtn cvs_import pathname
does the whole job?  Have things changed since then?  Does this mean
that I no longer have to build cvs2mtn?  If so, thanks, especially since
I haven't had any real time to work on that in the past year, so no work
is wasted.

-- hendrik



RE: [Monotone-devel] cvs and monotone

2007-06-04 Thread Kelly F. Hickel
Somehow I missed Markus' reply, and just found it in the archives,
so

Markus Schiltknecht
Sun, 27 May 2007 11:55:04 -0700

Hi,

Kelly F. Hickel wrote:

Is reality really as dark as it seems at the moment?

If you really need connected branches, I fear the answer is yes.
Yes, we pretty much do.  We could choose a small subset of branches, but
being unable to do it at all seems like a pretty big roadblock.


I'm still struggling with the cvsimport-branch-reconstruction branch of
monotone. But CVS is so wicked and brain damaged that it's very hard to
get usable information from it.

Is your CVS repository publicly available?
No, it's not.

Regards

Markus

-- 

Kelly F. Hickel
Senior Software Architect
MQSoftware, Inc
952.345.8677
[EMAIL PROTECTED]


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On
 Behalf Of Kelly F. Hickel
 Sent: Friday, May 25, 2007 11:25 AM
 To: monotone-devel@nongnu.org
 Subject: [Monotone-devel] cvs and monotone
 
 Was about to do another test import of our cvs repo to play around
 with the new mtn.  Ran into the fact that the cvs_import command has a
 new required argument (or at least one I don't remember using) named
 --branch.  What's that for??
 
 While googling for the answer to that, I ran into this page
 http://www.venge.net/mtn-wiki/MonotoneAndCVS which contains the
 following terribly disturbing nugget:
 
 There is an important limitation, though. This method doesn't
 presently try to attach branches to their parents, either on the
 mainline or on other branches; instead, each CVS branch gets its own
 separate linear history in the resulting monotone db.
 
 That's pretty amazingly disturbing, especially since I don't believe
 I'd ever seen that statement before.
 
 Assuming that's actually true, it seems to be a pretty big problem.
 We'd discussed just skimming the currently active branches from cvs
 into monotone, but even that is problematic if it's not going back to
 the most recent common ancestor of the branches.
 
 I also ran across this: http://www.venge.net/mtn-wiki/CvsImport which
 at least gives me hope.
 
 
 Is reality really as dark as it seems at the moment?
 
 Thanks,
 
 
 --
 Kelly F. Hickel
 Senior Software Architect
 MQSoftware, Inc
 952.345.8677
 [EMAIL PROTECTED]
 
 
 


Re: [Monotone-devel] cvs and monotone

2007-05-27 Thread Markus Schiltknecht

Hi,

Kelly F. Hickel wrote:

Is reality really as dark as it seems at the moment?


If you really need connected branches, I fear the answer is yes.

I'm still struggling with the cvsimport-branch-reconstruction branch of 
monotone. But CVS is so wicked and brain damaged that it's very hard to 
get usable information from it.


Is your CVS repository publicly available?

Regards

Markus





Re: [Monotone-devel] cvs import

2006-09-15 Thread Markus Schiltknecht

Hi,

Nathaniel Smith wrote:

   .- A --.
  /        \
--x          x-- C
  \        /
   '- B --'


You can't do this, unless you want to do some sort of inexact inverse
patching -- you would need to know what file-1 looks like with only A,
and what file-1 looks like with only B, but you don't.


That's where I've been heading. I don't know if it's doable. But the
reasoning behind it was something like: if CVS is able to commit to A and B
twice, no matter in which order, those changes probably didn't conflict.
Thus we could extract them and apply them separately.


Would a star merge with the previous commit and A and B tell us more? Or 
a reverse look at it with 'ancestor' C and A and B?


But that looks like micro optimization anyway.


You could fork into one A/B revision and one B/A revision, but that
doesn't seem helpful.

Or even merge A and B into one single revision (since you can't 
determine exactly what belongs to A and what to B), thus:


AB -> C


Door A seems somewhat better than this, at least you get to preserve
all commit messages.


Hm.. you're right. The changelog could be put together, but we can't 
simply concatenate the authors...


Regards

Markus




Re: [Monotone-devel] cvs import

2006-09-15 Thread Petr Baudis
Dear diary, on Thu, Sep 14, 2006 at 03:53:24AM CEST, I got a letter
where Daniel Carosone [EMAIL PROTECTED] said that...
 On Wed, Sep 13, 2006 at 08:57:33PM -0400, Jon Smirl wrote:
  Mozilla is 120,000 files. The complexity comes from 10 years worth of
  history. A few of the files have around 1,700 revisions. There are
  about 1,600 branches and 1,000 tags. The branch number is inflated
  because cvs2svn is generating extra branches, the real number is
  around 700. The CVS repo takes 4.2GB disk space. cvs2svn turns this
  into 250,000 commits over about 1M unique revisions.
 
 Those numbers are pretty close to those in the NetBSD repository, and
 between them these probably represent just about the most extensive
 public CVS test data available. 

  Don't forget OpenOffice. It's just a shame that the OpenOffice CVS
tree is not available for cloning.

http://wiki.services.openoffice.org/wiki/SVNMigration

-- 
Petr Pasky Baudis
Stuff: http://pasky.or.cz/
Snow falling on Perl. White noise covering line noise.
Hides all the bugs too. -- J. Putnam




Re: [Monotone-devel] cvs import

2006-09-14 Thread Markus Schiltknecht

Hi,

Nathaniel Smith wrote:

Regarding the basic dependency-based algorithm, the approach of
throwing everything into blobs and then trying to tease them apart
again seems backwards.  What I'm thinking is, first we go through and
build the history graph for each file.  Now, advance a frontier across
all of these graphs simultaneously.  Your frontier is basically a
map filename -> CVS revision, that represents a tree snapshot.


Hm.. weren't you the one saying we should profit from the experience of
cvs2svn? Another question I'm asking myself: if it had been that
easy to write a sane CVS importer, why didn't cvs2svn do something like
that?


Anyway, I didn't want to go into discussing more algorithms here. And
the discussion is already way too noisy for my taste. I want to write
code, not emails  :-)



Regarding storing things on disk vs. in memory: we always used to
stress-test monotone's cvs importer with the gcc history; just a few
weeks ago someone did a test import of NetBSD's src repo (~180k
commits) on a desktop with 2 gigs of RAM.  It takes a pretty big
history to really require disk (and for that matter, people with
histories that big likely have a big enough organization that they can
get access to some big iron to run the conversion on -- and probably
will want to anyway, to make it run in reasonable time).


Full ack.


Probably the biggest technical advantage of having the converter built
into monotone is that it makes it easy to import the file contents.
Since this data is huge (100x the repo size, maybe?), and the naive
algorithm for reconstructing takes time that is quadratic in the depth
of history, this is very valuable.  I'm not sure what sort of dump
format one could come up with that would avoid making this step very
expensive.


I can imagine a dump format that is only loosely coupled to the file 
data and deltas. But it seems like a lot of work to write a generic 
format which performs well for all VCSes.



I also suspect that SVN's dump format is suboptimal at the metadata
level -- we would essentially have to run a lot of branch/tag
inferencing logic _again_ to go from SVN-style one giant tree with
branches described as copies, and multiple copies allowed for
branches/tags that are built up over time, to monotone-style
DAG of tree snapshots.  This would be substantially less annoying
inferencing logic than that needed to decipher CVS in the first place,
granted, and it's stuff we want to write at some point anyway to allow
SVN importing, but it adds another step where information could be
lost.  I may be biased because I grok monotone better, but I suspect
it would be much easier to losslessly convert a monotone-style history
to an svn-style history than vice versa, possibly a generic dumping
tool would want to generate output that looks more like monotone's
model?  


Yeah, and the GIT people want the generic dump to look more like GIT. And
then there are darcs, mercurial, etc...



Even if we _do_ end up writing two implementations of the algorithm,
we should share a test suite.  


Sure, but as cvs2svn has another license, I can't just copy them over 
:-(  I will write some tests, but if I write them in our monotone-lua 
testsuite, I'm sure nobody else is going to use them.


Regards

Markus





Re: [Monotone-devel] cvs import

2006-09-14 Thread Markus Schiltknecht

Hi,

the algorithm Nathaniel described looks simple, clean and logical to me.
What were the reasons for the more complex algorithms cvs2svn uses? In
what way is the proposed dependency-based one better?


Regards

Markus

Nathaniel Smith wrote:

I just read over the thread on the cvs2svn list about this -- I have a
few random thoughts.  Take them with a grain of salt, since I haven't
actually tried writing a CVS importer myself...

Regarding the basic dependency-based algorithm, the approach of
throwing everything into blobs and then trying to tease them apart
again seems backwards.  What I'm thinking is, first we go through and
build the history graph for each file.  Now, advance a frontier across
all of these graphs simultaneously.  Your frontier is basically a
map filename -> CVS revision, that represents a tree snapshot.  The
basic loop is:
  1) pick some subset of files to advance to their next revision
  2) slide the frontier one CVS revision forward on each of those
 files
  3) snapshot the new frontier (write it to the target VCS as a new
 tree commit)
  4) go to step 1
Obviously, this will produce a target VCS history that respects the
CVS dependency graph, so that's good; it puts a strict limit on how
badly whatever heuristics we use can screw us over if they guess wrong
about things.  Also, it makes the problem much simpler -- all the
heuristics are now in step 1, where we are given a bunch of possible
edits, and we have to pick some subset of them to accept next.






Re: [Monotone-devel] cvs import

2006-09-14 Thread Michael Haggerty
Nathaniel Smith writes:
 I just read over the thread on the cvs2svn list about this -- I have a
 few random thoughts.  Take them with a grain of salt, since I haven't
 actually tried writing a CVS importer myself...
 
 Regarding the basic dependency-based algorithm, the approach of
 throwing everything into blobs and then trying to tease them apart
 again seems backwards.  What I'm thinking is, first we go through and
 build the history graph for each file.  Now, advance a frontier across
 all of these graphs simultaneously.  Your frontier is basically a
 map filename -> CVS revision, that represents a tree snapshot.  The
 basic loop is:
   1) pick some subset of files to advance to their next revision
   2) slide the frontier one CVS revision forward on each of those
  files
   3) snapshot the new frontier (write it to the target VCS as a new
  tree commit)
   4) go to step 1
 Obviously, this will produce a target VCS history that respects the
 CVS dependency graph, so that's good; it puts a strict limit on how
 badly whatever heuristics we use can screw us over if they guess wrong
 about things.  Also, it makes the problem much simpler -- all the
 heuristics are now in step 1, where we are given a bunch of possible
 edits, and we have to pick some subset of them to accept next.
 
 This isn't a trivial problem.  I think the main thing you want to avoid
 is:
 1  2  3  4
 |  |  |  |
-o--o--o--o-   <-- current frontier
 |  |  |  |
 A  B  A  C
    |
    A
 say you have four files named 1, 2, 3, and 4.  We want to
 slide the frontier down, and the next edits were originally created by
 one of three commits, A, B, or C.  In this situation, we can take
 commit B, or we can take commit C, but we don't want to take commit A
 until _after_ we have taken commit B -- because otherwise we will end
 up splitting A up into two different commits, A1, B, A2.

The main problem with converting CVS repositories is its unreliable
timestamps.  Sometimes they are off by a few minutes; that would be no
problem for your algorithm.  But they might be off by hours (maybe a
timezone was set incorrectly), and it is not unusual to have a server
with a bad battery that resets its time to Jan 1 1970 after each reboot
for a while before somebody notices it.  Timestamps that are too far in
the future are probably rarer, but also occur.  CVS timestamps are
simply not to be trusted.

The best hope for correcting timestamp problems is pooling information
across files.  For example, you might have the following case:

  1   2
  |   |
  A   Z
  |
  B
  :
  Y
  |
  Z

where A..Y have correct timestamps but Z has an incorrect timestamp far
in the past.  It is clear from the dependency graph that Z was committed
after Y, and by implication revision Z of file 2 was committed at the
same time.  But your algorithm would grab revision Z of file 2 first,
even before revision A of file 1.

The point of the blob method that I proposed is that timestamps are
secondary in deciding what constitutes a changeset.  Any changeset
consistent with the dependency graph (subject maybe to some timestamp
heuristics *) is accepted.

[*] Typically, clock inaccuracies will affect all CVS revisions that
made up a change set.  Therefore the suggestion to split blobs that have
more than (say) a 5 minute time gap within them.

 There are a lot of approaches one could take here, on up to pulling
 out a full-on optimal constraint satisfaction system (if we can route
 chips, we should be able to pick a good ordering for accepting CVS
 edits, after all).  A really simple heuristic, though, would be to
 just pick the file whose next commit has the earliest timestamp, then
 group in all the other next commits with the same commit message,
 and (maybe) a similar timestamp.  I have a suspicion that this
 heuristic will work really, really, well in practice.  Also, it's
 cheap to apply, and worst case you accidentally split up a commit that
 already had wacky timestamps, and we already know that we _have_ to do
 that in some cases.
 
 Handling file additions could potentially be slightly tricky in this
 model.  I guess it is not so bad, if you model added files as being
 present all along (so you never have to add whole new entries to
 the frontier), with each file starting out in a pre-birth state, and
 then addition of the file is the first edit performed on top of that,
 and you treat these edits like any other edits when considering how to
 advance the frontier.
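The pre-birth idea sketched in that paragraph could be modeled with a sentinel value (hypothetical code, purely to make the bookkeeping concrete):

```python
# Hypothetical sketch of the "pre-birth" modeling: every file has a
# frontier entry from the start, and its addition is just its first edit.
PRE_BIRTH = object()  # sentinel: file not yet added, in CVS terms

def initial_frontier(filenames):
    """The frontier's key set never changes; files begin pre-birth."""
    return {name: PRE_BIRTH for name in filenames}

def advance(frontier, name, new_rev):
    """Advancing past the pre-birth state is the file's 'add' event."""
    was_unborn = frontier[name] is PRE_BIRTH
    frontier[name] = new_rev
    return was_unborn  # True means this edit added the file
```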
 
 I have no particular idea on how to handle tags and branches here;
 I've never actually wrapped my head around CVS's model for those.
 I'm not seeing any obvious problem with handling them, though.

Tags and branches do not have any timestamps at all in CVS.  (You can
sometimes put bounds on the timestamps: a branch must have been created
after the version from which it sprouts, and before the first commit on
the branch (if there ever was a commit on the branch).)  And it is not
possible to distinguish whether two branches/tags sprouted from the same
revision of a file or whether one sprouted from the other.  So a
date-based method has to work hard to get tags and branches correct.

Re: [Monotone-devel] cvs import

2006-09-14 Thread Markus Schiltknecht

Hi,

Michael Haggerty wrote:

The main problem with converting CVS repositories is its unreliable
timestamps.  Sometimes they are off by a few minutes; that would be no
problem for your algorithm.  But they might be off by hours (maybe a
timezone was set incorrectly), and it is not unusual to have a server
with a bad battery that resets its time to Jan 1 1970 after each reboot
for a while before somebody notices it.  Timestamps that are too far in
the future are probably rarer, but also occur.  CVS timestamps are
simply not to be trusted.

The best hope for correcting timestamp problems is pooling information
across files.  For example, you might have the following case:

  1   2
  |   |
  A   Z
  |
  B
  :
  Y
  |
  Z

where A..Y have correct timestamps but Z has an incorrect timestamp far
in the past.  It is clear from the dependency graph that Z was committed
after Y, and by implication revision Z of file 2 was committed at the
same time.  But your algorithm would grab revision Z of file 2 first,
even before revision A of file 1.


But you could use another method to determine what to commit first. One
which takes only the dependency graph into account.


The simplest variant would be:

1. randomly choose a commit (or take the one with the lowest timestamp
   for a mostly good starter)

2. collect the other file's commits which seem to belong to the same
   revision (for me, a revision is a set of files, as in monotone. I
   don't know what terms you use here, probably we should define a
   set of terms to discuss such issues and avoid confusion.)

3. check if any of those file commits conflict in the dependency graph.
   I.e. in your example above file 1 would also find a commit Z, but
   it conflicts A, B, ... and Y.

   If there are conflicts, take the first one in your graph (A) and
   repeat from step 2 with that commit. Otherwise continue.

4. You now have the 'next' revision to commit (next in the dependency
   graph sense).


With such an algorithm, you won't rely on the timestamps, but only on 
the dependencies. Thus, what other advantages would the blob method have?



Tags and branches do not have any timestamps at all in CVS.  (You can
sometimes put bounds on the timestamps: a branch must have been created
after the version from which it sprouts, and before the first commit on
the branch (if there ever was a commit on the branch).)  And it is not
possible to distinguish whether two branches/tags sprouted from the same
revision of a file or whether one sprouted from the other.  So a
date-based method has to work hard to get tags and branches correct.


But in the above way, none of it would be timestamp based. You could, as 
you do in your blob method, insert tag and branch 'events', which would 
be dependent on a commit event of a certain file. You would then not get 
a 'revision' in step 4 above, but a branch or tag.


(Don't get me wrong, I think the blob method is better. Because I
suspect importing a CVS repository can't be that simple. But I'm missing
proof of that.)


Regards

Markus




Re: [Monotone-devel] cvs import

2006-09-14 Thread Michael Haggerty
Markus Schiltknecht wrote:
 Michael Haggerty wrote:
 The main problem with converting CVS repositories is its unreliable
 timestamps.  Sometimes they are off by a few minutes; that would be no
 problem for your algorithm.  But they might be off by hours (maybe a
 timezone was set incorrectly), and it is not unusual to have a server
 with a bad battery that resets its time to Jan 1 1970 after each reboot
 for a while before somebody notices it.  Timestamps that are too far in
 the future are probably rarer, but also occur.  CVS timestamps are
 simply not to be trusted.

 The best hope for correcting timestamp problems is pooling information
 across files.  For example, you might have the following case:

   1   2
   |   |
   A   Z
   |
   B
   :
   Y
   |
   Z

 where A..Y have correct timestamps but Z has an incorrect timestamp far
 in the past.  It is clear from the dependency graph that Z was committed
 after Y, and by implication revision Z of file 2 was committed at the
 same time.  But your algorithm would grab revision Z of file 2 first,
 even before revision A of file 1.
 
 But you could use another method to determine what to commit first. One
 which takes only dependency graph into account.
 
 The simplest variant would be:
 
 1. randomly choose a commit (or take the one with the lowest timestamp
for a mostly good starter)
 
 2. collect the other file's commits which seem to belong to the same
revision (for me, a revision is a set of files, as in monotone. I
don't know what terms you use here, probably we should define a
set of terms to discuss such issues and avoid confusion.)
 
 3. check if any of those file commits conflict in the dependency graph.
I.e. in your example above file 1 would also find a commit Z, but
it conflicts A, B, ... and Y.
 
If there are conflicts, take the first one in your graph (A) and
repeat from step 2 with that commit. Otherwise continue.
 
 4. You now have the 'next' revision to commit (next in the dependency
graph sense).
 
 
 With such an algorithm, you won't rely on the timestamps, but only on
 the dependencies. Thus, what other advantages would the blob method have?

Step 2 is essentially the creation of a blob, isn't it?

And steps 2 and 3 could be an infinite loop, because of

   1   2
   |   |
   A   B
   |   |
   B   A

This can arise if two (nonatomic, remember) CVS commits are going on at
the same time, even without clock errors.  Of course more complicated
loops can also arise.

 Tags and branches do not have any timestamps at all in CVS.  (You can
 sometimes put bounds on the timestamps: a branch must have been created
 after the version from which it sprouts, and before the first commit on
 the branch (if there ever was a commit on the branch).)  And it is not
 possible to distinguish whether two branches/tags sprouted from the same
 revision of a file or whether one sprouted from the other.  So a
 date-based method has to work hard to get tags and branches correct.
 
 But in the above way, none of it would be timestamp based. You could, as
 you do in your blob method, insert tag and branch 'events', which would
 be dependent on a commit event of a certain file. You would then not get
 a 'revision' in step 4 above, but a branch or tag.
 
 (Don't get me wrong, I think the blob method is better. Because I
 suspect importing a CVS repository can't be that simple. But I'm missing
 proof of that.)

Yes, but branches and especially tags are very slippery.  They don't
even have to be created (chronologically) before a succeeding commit on
the same file.  So you'll have branch/tag events rising to the top of
the frontier and you need some way to decide when to process them.

Not that this part is much easier in the blob scheme, except that from
early on you have a global picture of the topology of branches/tags so I
think it should be easier to design the heuristics that will be needed.

Michael




Re: [Monotone-devel] cvs import

2006-09-14 Thread Markus Schiltknecht

Hi,

Michael Haggerty wrote:

Markus Schiltknecht wrote:

With such an algorithm, you won't rely on the timestamps, but only on
the dependencies. Thus, what other advantages would the blob method have?


Step 2 is essentially the creation of a blob, isn't it?


Sure. Except that you won't have inter-blob dependencies to resolve.


And steps 2 and 3 could be an infinite loop, because of

   1   2
   |   |
   A   B
   |   |
   B   A


True, but you could easily check for that. Just remember what you've 
already tried and don't try again. To me the question is: what to do 
then? Split A into two commits around B:


A1 -> B -> A2 -> C

Or (for monotone or git): try to separate into individual commits (not
always possible) and create two heads, which then merge later on. I.e.:


   .- A --.
  /        \
--x          x-- C
  \        /
   '- B --'

Or even merge A and B into one single revision (since you can't 
determine exactly what belongs to A and what to B), thus:


AB -> C


This can arise if two (nonatomic, remember) CVS commits are going on at
the same time, even without clock errors.  Of course more complicated
loops can also arise.


Yes, but the problem stays the same for Nathaniel's continuous algorithm 
and for your blob-method.



Yes, but branches and especially tags are very slippery.  They don't
even have to be created (chronologically) before a succeeding commit on
the same file.  So you'll have branch/tag events rising to the top of
the frontier and you need some way to decide when to process them.


If you apply the exact same algorithm for 'commit', 'tag' and 'branch'
events, I don't see a problem there.


Except that the 'loop' resolution will work differently if your loop
consists of more than just commits.



Not that this part is much easier in the blob scheme, except that from
early on you have a global picture of the topology of branches/tags so I
think it should be easier to design the heuristics that will be needed.


Ah, that's a difference. What do we gain with the 'global picture'?

Regards

Markus





Re: [Monotone-devel] cvs import

2006-09-14 Thread Nathaniel Smith
On Thu, Sep 14, 2006 at 10:05:42AM +0200, Markus Schiltknecht wrote:
 Hi,
 
 Nathaniel Smith wrote:
 Regarding the basic dependency-based algorithm, the approach of
 throwing everything into blobs and then trying to tease them apart
 again seems backwards.  What I'm thinking is, first we go through and
 build the history graph for each file.  Now, advance a frontier across
 all of these graphs simultaneously.  Your frontier is basically a
 map filename -> CVS revision, that represents a tree snapshot.
 
 Hm.. weren't you the one saying we should profit from the experience of 
 cvs2svn?

Yes, and apparently their experience is saying that their algorithm
could be improved :-).

 Another question I'm asking myself: if it had been that
 easy to write a sane CVS importer, why didn't cvs2svn do something like
 that?

I don't know, that's why I asked them :-).

 I also suspect that SVN's dump format is suboptimal at the metadata
 level -- we would essentially have to run a lot of branch/tag
 inferencing logic _again_ to go from SVN-style one giant tree with
 branches described as copies, and multiple copies allowed for
 branches/tags that are built up over time, to monotone-style
 DAG of tree snapshots.  This would be substantially less annoying
 inferencing logic than that needed to decipher CVS in the first place,
 granted, and it's stuff we want to write at some point anyway to allow
 SVN importing, but it adds another step where information could be
 lost.  I may be biased because I grok monotone better, but I suspect
 it would be much easier to losslessly convert a monotone-style history
 to an svn-style history than vice versa, possibly a generic dumping
 tool would want to generate output that looks more like monotone's
 model?  
 
 Yeah, and the GIT people want the generic dump to look more like GIT. And
 then there are darcs, mercurial, etc...

Well, monotone, git, and mercurial at least all share a design
heritage, and would want pretty much the same format... :-)

 Even if we _do_ end up writing two implementations of the algorithm,
 we should share a test suite.  
 
 Sure, but as cvs2svn has another license, I can't just copy them over 
 :-(  I will write some tests, but if I write them in our monotone-lua 
 testsuite, I'm sure nobody else is going to use them.

Duh, I forgot about the license thing :-(.

Tests could be written in a somewhat standardized way, and then we
could just have a harness to run them in our testsuite, others could
have harnesses to run them in their testsuites, while keeping the
actual test data shared.

-- Nathaniel

-- 
Eternity is very long, especially towards the end.
  -- Woody Allen




Re: [Monotone-devel] cvs import

2006-09-14 Thread Nathaniel Smith
On Thu, Sep 14, 2006 at 01:14:11PM +0200, Markus Schiltknecht wrote:
 Hi,
 
 Michael Haggerty wrote:
 Markus Schiltknecht wrote:
 With such an algorithm, you won't rely on the timestamps, but only on
 the dependencies. Thus, what other advantages would the blob method have?
 
 Step 2 is essentially the creation of a blob, isn't it?
 
 Sure. Except that you won't have inter-blob dependencies to resolve.
 
 And steps 2 and 3 could be an infinite loop, because of
 
1   2
|   |
A   B
|   |
B   A
 
 True, but you could easily check for that. Just remember what you've 
 already tried and don't try again. To me the question is: what to do 
 then? Split A into two commits around B:
 
 A1 -> B -> A2 -> C
 
 Or (for monotone or git): try to separate into individual commits (not
 always possible) and create two heads, which then merge later on. I.e.:
 
    .- A --.
   /        \
 --x          x-- C
   \        /
    '- B --'

You can't do this, unless you want to do some sort of inexact inverse
patching -- you would need to know what file-1 looks like with only A,
and what file-1 looks like with only B, but you don't.

You could fork into one A/B revision and one B/A revision, but that
doesn't seem helpful.

 Or even merge A and B into one single revision (since you can't 
 determine exactly what belongs to A and what to B), thus:
 
 AB - C

Door A seems somewhat better than this, at least you get to preserve
all commit messages.

-- Nathaniel

-- 
When the flush of a new-born sun fell first on Eden's green and gold,
Our father Adam sat under the Tree and scratched with a stick in the mould;
And the first rude sketch that the world had seen was joy to his mighty heart,
Till the Devil whispered behind the leaves, It's pretty, but is it Art?
  -- The Conundrum of the Workshops, Rudyard Kipling




Re: [Monotone-devel] cvs import

2006-09-14 Thread Shawn Pearce
Petr Baudis [EMAIL PROTECTED] wrote:
   Don't forget OpenOffice. It's just a shame that the OpenOffice CVS
 tree is not available for cloning.
 
   http://wiki.services.openoffice.org/wiki/SVNMigration

Hmm, the KDE repo is even larger than Mozilla: 19 GB in CVS and
499,367 revisions.  Question is, are those distinct file revisions
or SVN revisions?  And just what machine did they use that completed
that conversion in 38 hours?

-- 
Shawn.




[Monotone-devel] cvs import

2006-09-13 Thread Markus Schiltknecht

Hi,

I've been trying to understand the cvsimport algorithm used by monotone 
and wanted to adjust that to be more like the one in cvs2svn.


I've had some problems with cvs2svn itself and began to question the 
algorithm used there. It turned out that the cvs2svn people have 
discussed an improved algorithm and are about to write a cvs2svn 2.0. 
The main problem with the current algorithm is that it depends on the 
timestamp information stored in the CVS repository.


Instead, it would be much better to just take the dependencies of the 
revisions into account, considering the timestamp an attribute of the 
revision that is irrelevant for the import.


Now, that can be used to convert from CVS to about anything else. 
Obviously we were discussing subversion, but then there was git, 
too. And monotone.


I'm beginning to wonder whether one could come up with a generally useful 
cleaned-and-sane-CVS-changeset-dump-format, which could then be used by 
importers to all sorts of VCSes. This would make monotone's cvsimport 
function dependent on cvs2svn (and therefore python). But the general 
try-to-get-something-useful-from-an-insane-CVS-repository algorithm 
would only have to be written once.


On the other hand, I see that lots of the cvsimport functionality for 
monotone has already been written (rcs file parsing, stuffing files, 
file deltas and complete revisions into the monotone database, etc..). 
Changing it to a better algorithm does not seem to be _that_ much work 
anymore. Plus the hard part seems to be to come up with a good 
algorithm, not implementing it. And we could still exchange our 
experience with the general algorithm with the cvs2svn people.


Plus, the guy who mentioned git pointed out that git needs quite a 
different dump-format than subversion to do an efficient conversion. I 
think coming up with a generally-usable dump format would not be that easy.


So you see, I'm slightly favoring the second implementation approach 
with a C++ implementation inside monotone.


Thoughts or comments?

Markus




Re: [Monotone-devel] cvs import

2006-09-13 Thread Markus Schiltknecht

Sorry, I forgot to mention some pointers:

Here is the thread where I've started the discussion about the cvs2svn 
algorithm:

http://cvs2svn.tigris.org/servlets/ReadMsg?list=dev&msgNo=1599

And this is a proposal for an algorithm to do cvs imports independent of 
the timestamp:

http://cvs2svn.tigris.org/servlets/ReadMsg?list=dev&msgNo=1451

Regards

Markus




Re: [Monotone-devel] cvs import

2006-09-13 Thread Nathaniel Smith
On Wed, Sep 13, 2006 at 07:46:40PM +0200, Markus Schiltknecht wrote:
 Hi,
 
 I've been trying to understand the cvsimport algorithm used by monotone 
 and wanted to adjust that to be more like the one in cvs2svn.
 
 I've had some problems with cvs2svn itself and began to question the 
 algorithm used there. It turned out that the cvs2svn people have 
 discussed an improved algorithm and are about to write a cvs2svn 2.0. 
 The main problem with the current algorithm is that it depends on the 
 timestamp information stored in the CVS repository.
 
 Instead, it would be much better to just take the dependencies of the 
 revisions into account, considering the timestamp an attribute of the 
 revision that is irrelevant for the import.

I just read over the thread on the cvs2svn list about this -- I have a
few random thoughts.  Take them with a grain of salt, since I haven't
actually tried writing a CVS importer myself...

Regarding the basic dependency-based algorithm, the approach of
throwing everything into blobs and then trying to tease them apart
again seems backwards.  What I'm thinking is, first we go through and
build the history graph for each file.  Now, advance a frontier across
all of these graphs simultaneously.  Your frontier is basically a
map filename -> CVS revision, that represents a tree snapshot.  The
basic loop is:
  1) pick some subset of files to advance to their next revision
  2) slide the frontier one CVS revision forward on each of those
 files
  3) snapshot the new frontier (write it to the target VCS as a new
 tree commit)
  4) go to step 1
Obviously, this will produce a target VCS history that respects the
CVS dependency graph, so that's good; it puts a strict limit on how
badly whatever heuristics we use can screw us over if they guess wrong
about things.  Also, it makes the problem much simpler -- all the
heuristics are now in step 1, where we are given a bunch of possible
edits, and we have to pick some subset of them to accept next.
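The loop above might be sketched roughly like this. This is a minimal illustration only; `Rev`, `pick_next_subset`, and the single-file earliest-timestamp rule are assumptions for the sketch, not code from monotone or cvs2svn:

```python
from collections import namedtuple

# One CVS file revision: an RCS revision id plus its commit timestamp.
Rev = namedtuple("Rev", ["rev_id", "time"])

def pick_next_subset(histories, frontier):
    """Step 1: choose which files to advance -- here, simply the one
    file whose next CVS revision carries the earliest timestamp."""
    candidates = [(histories[n][frontier[n] + 1].time, n)
                  for n in histories
                  if frontier[n] + 1 < len(histories[n])]
    return [min(candidates)[1]]

def sweep(histories):
    """histories: dict filename -> list of Rev, oldest first.
    Returns one tree snapshot (filename -> Rev) per synthesized commit."""
    frontier = {name: 0 for name in histories}  # index of current revision
    commits = []
    while any(frontier[n] + 1 < len(histories[n]) for n in histories):
        subset = pick_next_subset(histories, frontier)   # step 1
        for name in subset:                              # step 2
            frontier[name] += 1
        commits.append({n: histories[n][frontier[n]]     # step 3
                        for n in histories})
    return commits
```

Whatever heuristic replaces `pick_next_subset`, the output still respects each file's revision order by construction, which is the point made above.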

This isn't a trivial problem.  I think the main thing you want to avoid
is:
1  2  3  4
|  |  |  |
  --o--o--o--o- -- current frontier
|  |  |  |
A  B  A  C
   |
   A
say you have four files named 1, 2, 3, and 4.  We want to
slide the frontier down, and the next edits were originally created by
one of three commits, A, B, or C.  In this situation, we can take
commit B, or we can take commit C, but we don't want to take commit A
until _after_ we have taken commit B -- because otherwise we will end
up splitting A up into two different commits, A1, B, A2.

There are a lot of approaches one could take here, on up to pulling
out a full-on optimal constraint satisfaction system (if we can route
chips, we should be able to pick a good ordering for accepting CVS
edits, after all).  A really simple heuristic, though, would be to
just pick the file whose next commit has the earliest timestamp, then
group in all the other next commits with the same commit message,
and (maybe) a similar timestamp.  I have a suspicion that this
heuristic will work really, really well in practice.  Also, it's
cheap to apply, and worst case you accidentally split up a commit that
already had wacky timestamps, and we already know that we _have_ to do
that in some cases.
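That grouping heuristic could look something like the following sketch; the 5-minute window and all names here are illustrative assumptions, not actual importer code:

```python
WINDOW = 300  # seconds; arbitrary "similar timestamp" threshold

def group_next_commit(next_revs):
    """next_revs: dict filename -> (timestamp, log_message) of the
    revision immediately below the frontier for each file.
    Returns the files whose next revisions look like one CVS commit."""
    # Seed with the file whose next revision is earliest...
    seed = min(next_revs, key=lambda n: next_revs[n][0])
    seed_time, seed_msg = next_revs[seed]
    # ...then pull in every file sharing the log message whose
    # timestamp falls within the window of the seed.
    return {n for n, (t, msg) in next_revs.items()
            if msg == seed_msg and abs(t - seed_time) <= WINDOW}
```

Worst case, a commit with wildly drifting timestamps gets split, which, as noted, is sometimes unavoidable anyway.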

Handling file additions could potentially be slightly tricky in this
model.  I guess it is not so bad, if you model added files as being
present all along (so you never have to add whole new entries to
the frontier), with each file starting out in a pre-birth state, and
then addition of the file is the first edit performed on top of that,
and you treat these edits like any other edits when considering how to
advance the frontier.

I have no particular idea on how to handle tags and branches here;
I've never actually wrapped my head around CVS's model for those :-).
I'm not seeing any obvious problem with handling them, though.

In this approach, incremental conversion is cheap, easy, and robust --
simply remember what frontier corresponded to the final revision
imported, and restart the process directly at that frontier.


Regarding storing things on disk vs. in memory: we always used to
stress-test monotone's cvs importer with the gcc history; just a few
weeks ago someone did a test import of NetBSD's src repo (~180k
commits) on a desktop with 2 gigs of RAM.  It takes a pretty big
history to really require disk (and for that matter, people with
histories that big likely have a big enough organization that they can
get access to some big iron to run the conversion on -- and probably
will want to anyway, to make it run in reasonable time).

 Now, that can be used to convert from CVS to about anything else. 
 Obviously we were discussing about subversion, but then there was git, 
 too. And monotone.
 
 I'm beginning to question if one could come up with a generally useful 
 cleaned-and-sane-CVS-changeset-dump-format, which could then be used by 
 importers to all sorts of VCSes. 

Re: [Monotone-devel] cvs import

2006-09-13 Thread Daniel Carosone
On Wed, Sep 13, 2006 at 03:52:00PM -0700, Nathaniel Smith wrote:
 This isn't a trivial problem.  I think the main thing you want to avoid
 is:
 1  2  3  4
 |  |  |  |
   --o--o--o--o- -- current frontier
 |  |  |  |
 A  B  A  C
|
A
 There are a lot of approaches one could take here, on up to pulling
 out a full-on optimal constraint satisfaction system (if we can route
 chips, we should be able to pick a good ordering for accepting CVS
 edits, after all).  A really simple heuristic, though, would be to
 just pick the file whose next commit has the earliest timestamp, then
 group in all the other next commits with the same commit message,
 and (maybe) a similar timestamp.  

Pick the earliest first, or more generally: take all the file commits
immediately below the frontier.  Find revs further below the frontier
(up to some small depth or time limit) on other files that might match
them, based on changelog etc (the same grouping you describe, and we
do now).  Eliminate any of those that are not entirely on the frontier
(ie, have some other revision in the way, as with file 2).  Commit the
remaining set in time order. [*]
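The elimination step just described could be expressed as a small predicate; this is an assumed representation for illustration, not code from any real importer:

```python
def entirely_on_frontier(candidate, next_rev):
    """candidate: dict filename -> rev id of one grouped CVS commit.
    next_rev: dict filename -> rev id immediately below the frontier.
    True iff no other revision stands in the way for any member file,
    i.e. the whole candidate commit sits directly on the frontier."""
    return all(next_rev.get(f) == r for f, r in candidate.items())
```

In the four-file example above, a candidate touching files 1 and 3 with commit A passes, while one also claiming file 2's A fails because B is in the way there.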

If you wind up with an empty set, then you need to split revs, but at
this point you have only conflicting revs on the frontier (i.e. you've
already committed all the other revs you can that might have avoided
this need, whereas we currently might be doing this too often).

For time order, you could look at each rev as having a time window,
from the first to last commit matching.  If the rev windows are
non-overlapping, commit them in order.  If the rev windows overlap, at
this point we already know the file changes don't overlap - we *could*
commit these as parallel heads and merge them, to better model the
original developer's overlapping commits.
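The time-window rule could be sketched like so; the `(first, last)` tuple representation is an assumption for illustration:

```python
def schedule(windows):
    """windows: list of (first_ts, last_ts), one per candidate commit.
    Pairwise-disjoint windows can be committed serially in time order;
    any overlap signals commits that could become parallel heads."""
    ordered = sorted(windows)
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:        # windows overlap
            return ("parallel", windows)
    return ("serial", ordered)
```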

 Handling file additions could potentially be slightly tricky in this
 model.  I guess it is not so bad, if you model added files as being
 present all along (so you never have to add add whole new entries to
 the frontier), with each file starting out in a pre-birth state, and
 then addition of the file is the first edit performed on top of that,
 and you treat these edits like any other edits when considering how to
 advance the frontier.

CVS allows resurrections too..

 I have no particular idea on how to handle tags and branches here;
 I've never actually wrapped my head around CVS's model for those :-).
 I'm not seeing any obvious problem with handling them, though.

Tags could be modelled as another 'event' in the file graph, like a
commit. If your frontier advances through both revisions and a 'tag
this revision' event, the same sequencing as above would work. If tags
had been moved, this would wind up with a sequence in which commits
intervened between tagging events, and we'd need to split the commits
such that we could end up with a revision matching the tagged content.

 In this approach, incremental conversion is cheap, easy, and robust --
 simply remember what frontier corresponded to the final revision
 imported, and restart the process directly at that frontier.

Hm. Except for the tagging idea above, because tags can be applied
behind a live cvs frontier.

--
Dan.




Re: [Monotone-devel] cvs import

2006-09-13 Thread Daniel Carosone
On Thu, Sep 14, 2006 at 09:21:39AM +1000, Daniel Carosone wrote:
  I have no particular idea on how to handle tags and branches here;
  I've never actually wrapped my head around CVS's model for those :-).
  I'm not seeing any obvious problem with handling them, though.
 
 Tags could be modelled as another 'event' in the file graph, like a
 commit. If your frontier advances through both revisions and a 'tag
 this revision' event, the same sequencing as above would work.

Likewise, if we had file branched events in the file lifeline (based
on the rcs id's), then we would be sure to always have a monotone
revision that corresponded to the branching event, where we could
attach the revisions in the branch.

Because we can't split tags, and can't split branch events, we will
end up splitting file commits (down to individual commits per file) in
order to arrive at the revisions we need for those.

Because tags and branches can be across subsets of the tree, we gain
some scheduling flexibility about where in the reconstructed sequence
they can come.

Many well-managed CVS repositories will use good practices, such as
having a branch base tag.  If they do, then they will help this
algorithm produce correct results.

Once we have a branch with a base starting revision, we can pretty
much treat it independently from there: make a whole new set of file
lifelines along the RCS branches and a new frontier for it.

--
Dan.




Re: [Monotone-devel] cvs import

2006-09-13 Thread Jon Smirl

On 9/13/06, Nathaniel Smith [EMAIL PROTECTED] wrote:

On Wed, Sep 13, 2006 at 04:42:01PM -0700, Keith Packard wrote:
 However, this means that parsecvs must hold the entire tree state in
 memory, which turned out to be its downfall with large repositories.
 Worked great for all of X.org, not so good with Mozilla.

Does anyone know how big Mozilla (or other humonguous repos, like KDE)
are, in terms of number of files?


Mozilla is 120,000 files. The complexity comes from 10 years worth of
history. A few of the files have around 1,700 revisions. There are
about 1,600 branches and 1,000 tags. The branch number is inflated
because cvs2svn is generating extra branches, the real number is
around 700. The CVS repo takes 4.2GB disk space. cvs2svn turns this
into 250,000 commits over about 1M unique revisions.



A few numbers for repositories I had lying around:
  Linux kernel -- ~21,000
  gcc -- ~42,000
  NetBSD src repo -- ~100,000
  uClinux distro -- ~110,000

These don't seem very intimidating... even if it takes an entire
kilobyte per CVS revision to store the information about it that we
need to make decisions about how to move the frontier... that's only
110 megabytes for the largest of these repos.  The frontier sweeping
algorithm only _needs_ to have available the current frontier, and the
current frontier+1.  Storing information on every version of every
file in memory might be worse; but since the algorithm accesses this
data in a linear way, it'd be easy enough to stick those in a
lookaside table on disk if really necessary, like a bdb or sqlite file
or something.

(Again, in practice storing all the metadata for the entire 180k
revisions of the 100k files in the netbsd repo was possible on a
desktop.  Monotone's cvs_import does try somewhat to be frugal about
memory, though, interning strings and suchlike.)

-- Nathaniel

--
When the flush of a new-born sun fell first on Eden's green and gold,
Our father Adam sat under the Tree and scratched with a stick in the mould;
And the first rude sketch that the world had seen was joy to his mighty heart,
Till the Devil whispered behind the leaves, It's pretty, but is it Art?
  -- The Conundrum of the Workshops, Rudyard Kipling
-
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html




--
Jon Smirl
[EMAIL PROTECTED]




Re: [Monotone-devel] cvs import

2006-09-13 Thread Daniel Carosone
On Wed, Sep 13, 2006 at 08:57:33PM -0400, Jon Smirl wrote:
 Mozilla is 120,000 files. The complexity comes from 10 years worth of
 history. A few of the files have around 1,700 revisions. There are
 about 1,600 branches and 1,000 tags. The branch number is inflated
 because cvs2svn is generating extra branches, the real number is
 around 700. The CVS repo takes 4.2GB disk space. cvs2svn turns this
 into 250,000 commits over about 1M unique revisions.

Those numbers are pretty close to those in the NetBSD repository, and
between them these probably represent just about the most extensive
public CVS test data available. 

I've only done imports of individual top-level dirs (what used to be
modules), like src and pkgsrc, because they're used independently and
don't really overlap.

src had about 180k commits over 1M versions of 120k files, 1000 tags
and 260 branches. pkgsrc had 110k commits over about half as many
files and versions thereof.  We too have a few hot files, one had
13,625 revisions.  xsrc adds a bunch more files and content, but not
many versions; that's mostly vendor branches and only some local
changes.  Between them the cvs ,v files take up 4.7G covering about 13
years of history.

One thing that was interesting was that src used to be several
different modules, but we rearranged the repository at one point to
match the checkout structure these modules produced (combining them
all under the src dir).  This doesn't seem to have upset the import at
all.  Just about every other form of CVS evil has been perpetrated in
this repository at some stage or other too, but always very carefully.

--
Dan.




Re: [Monotone-devel] cvs import

2006-09-13 Thread Shawn Pearce
Daniel Carosone [EMAIL PROTECTED] wrote:
 On Wed, Sep 13, 2006 at 08:57:33PM -0400, Jon Smirl wrote:
  Mozilla is 120,000 files. The complexity comes from 10 years worth of
  history. A few of the files have around 1,700 revisions. There are
  about 1,600 branches and 1,000 tags. The branch number is inflated
  because cvs2svn is generating extra branches, the real number is
  around 700. The CVS repo takes 4.2GB disk space. cvs2svn turns this
  into 250,000 commits over about 1M unique revisions.
 
 Those numbers are pretty close to those in the NetBSD repository, and
 between them these probably represent just about the most extensive
 public CVS test data available. 

I don't know exactly how big it is but the Gentoo CVS repository
is also considered to be very large (about the size of the Mozilla
repository) and just as difficult to import.  It's either crashed or
taken about a month to process with the current Git CVS-to-Git tools.

Since I know that the bulk of the Gentoo CVS repository is the
portage tree I did a quick find|wc -l in my /usr/portage; its about
124,500 files.

It's interesting that Gentoo has almost as large a repository given
that its such a young project, compared to NetBSD and Mozilla.  :-)

-- 
Shawn.




Re: [Monotone-devel] cvs import

2006-09-13 Thread Shawn Pearce
Keith Packard [EMAIL PROTECTED] wrote:
 On Wed, 2006-09-13 at 15:52 -0700, Nathaniel Smith wrote:
 
  Regarding the basic dependency-based algorithm, the approach of
  throwing everything into blobs and then trying to tease them apart
  again seems backwards.  What I'm thinking is, first we go through and
  build the history graph for each file.  Now, advance a frontier across
  all of these graphs simultaneously.  Your frontier is basically a
  map filename -> CVS revision, that represents a tree snapshot. 
 
 Parsecvs does this, except backwards from now into the past; I found it
 easier to identify merge points than branch points (Oh, look, these two
 branches are the same now, they must have merged).

Why not let Git do that?  If two branches are the same in CVS then
shouldn't they have the same tree SHA1 in Git?  Surely comparing
20 bytes of SHA1 is faster than almost any other comparison...
 
 However, this means that parsecvs must hold the entire tree state in
 memory, which turned out to be its downfall with large repositories.
 Worked great for all of X.org, not so good with Mozilla.

Any chance that can be paged in on demand from some sort of work
file?  git-fast-import hangs onto a configurable number of tree
states (default of 5) but keeps them in an LRU chain and dumps the
ones that aren't current.

-- 
Shawn.




Re: [Monotone-devel] cvs import

2006-09-13 Thread Daniel Carosone
On Wed, Sep 13, 2006 at 10:30:17PM -0400, Shawn Pearce wrote:
 I don't know exactly how big it is but the Gentoo CVS repository
 is also considered to be very large (about the size of the Mozilla
 repository) and just as difficult to import.  Its either crashed or
 taken about a month to process with the current Git CVS-Git tools.

Ah, thanks for the tip.

 Since I know that the bulk of the Gentoo CVS repository is the
 portage tree I did a quick find|wc -l in my /usr/portage; its about
 124,500 files.
 
 It's interesting that Gentoo has almost as large a repository given
 that its such a young project, compared to NetBSD and Mozilla.  :-)

Portage uses files and thus CVS very differently, though.  Each ebuild
for each package revision of each version of a third-party package
(like, say, monotone 0.28 and 0.29, and -r1, -r2 pkg bumps of those if
they were needed) is its own file that's added, maybe edited a couple
of times, and then deleted again later as new versions are added and
older ones retired.  These are copies and renames in the workspace,
but are invisible to CVS.  This uses up lots more files than a single
long-lived build that gets edited each time; the Attic dirs must have
huge numbers of files, way beyond the number that are live now.

This lets portage keep builds around in a HEAD checkout for multiple
versions at once, tagged internally with different statuses.
Effectively, these tags take the place of VCS-based branches and
releases, and are more flexible for end users tracking their favourite
applications while keeping the rest of their system stable.

If they had a VCS that supported file cloning and/or renaming, and
used that to follow history between these ebuild files, things would
be very different. There are some interesting use cases for VCS tools
in supporting this behaviour nicely, too.  

--
Dan.



Re: [Monotone-devel] CVS import errors

2005-08-26 Thread Måns Rullgård
Måns Rullgård [EMAIL PROTECTED] writes:

 I'm trying to import a CVS repository into monotone.  All goes
 seemingly well, in that there are no warnings or error messages.
 However, when I check it out, I notice that a lot of the files are old
 versions, and some are missing altogether.  The set is not consistent
 with any point in the past, either.  If I import only a subset of the
 repository (a few files), I get different versions, sometimes even the
 latest.

 I reported this to the bug tracker a week ago, but it appears to have
 gone unnoticed there.  For reference, the report there is at URL
 https://savannah.nongnu.org/bugs/?func=detailitemitem_id=14151, where
 I also attached the failing repo.

Please, could someone at least comment on this?  Or should I be
looking for a replacement for monotone?

-- 
Måns Rullgård
[EMAIL PROTECTED]




Re: [Monotone-devel] CVS import errors

2005-08-26 Thread Måns Rullgård
Nathaniel Smith [EMAIL PROTECTED] writes:

 On Fri, Aug 26, 2005 at 08:36:01AM +0100, Måns Rullgård wrote:
 Måns Rullgård [EMAIL PROTECTED] writes:
 
  I'm trying to import a CVS repository into monotone.  All goes
  seemingly well, in that there are no warnings or error messages.
  However, when I check it out, I notice that a lot of the files are old
  versions, and some are missing altogether.  The set is not consistent
  with any point in the past, either.  If I import only a subset of the
  repository (a few files), I get different versions, sometimes even the
  latest.
 
  I reported this to the bug tracker a week ago, but it appears to have
  gone unnoticed there.  For reference, the report there is at URL
  https://savannah.nongnu.org/bugs/?func=detailitemitem_id=14151, where
  I also attached the failing repo.
 
 Please, could someone at least comment on this?  Or should I be
 looking for a replacement for monotone?

 Sorry about that.

 Unfortunately, the answer is yes, it seems to be broken; but, as
 you've seen, no-one seems to have time to look at it ATM :-/.

 Other options are to use Tailor:
http://www.darcs.net/DarcsWiki/Tailor
 or to check out and build the net.venge.monotone.cvssync branch, which
 is a version of monotone with a different, incremental CVS importer
 built in.

 Given that this repo seems to have been converted from BK (and I'm
 suspicious that this might be related to our problems importing it,
 CVS files have ill-defined structure in some ways and it's possible
 that bkcvs is generating something that CVS can read but would never
 itself produce), you might have some luck writing a script based on
 tridge's sourcepuller program.  In principle, this could preserve
 the full merge history graph, rather than the degraded linearization
 bkcvs produces.  The Xaraya folks might have some insight into good
 ways to go straight BK-monotone.

I didn't know it was possible to go from BK directly to anything
else.  Thanks for the pointers.

 As for other systems, your best bet is probably SVN; cvs2svn is the
 only CVS converter that can do better than the above options (except,
 possibly, for some unreleased software that Canonical uses).

I don't like SVN, being all centralized and that.

-- 
Måns Rullgård
[EMAIL PROTECTED]

