(I apologize for top-posting, I still haven’t figured out how to fix my email 
client)

 

There are nearly 94k commits in the git repo, and I expect the hg repo has the 
same number. That’s a tad more than 10,000.

 

I’ll definitely take a look at that tool; my main weakness is that I don’t know 
hg commands, but comparing individual commits is most definitely better.

 

@Ethan: I meant that I would write all the output to a file for comparison, but 
apparently that’s not a very good idea, so I’ll drop it.

 

I’ll look at the tool and see what I can do. I’ll try to document my findings 
if I can’t come up with a good solution, and probably even if I do.

 

Cheers,

-Emanuel

 

From: Senthil Kumaran [mailto:sent...@uthcode.com] 
Sent: Sunday, May 08, 2016 8:43 PM
To: Émanuel Barry
Cc: core-workflow
Subject: Re: [core-workflow] Some questions

 

Hi Émanuel,

 

On Sun, May 8, 2016 at 4:40 PM, Émanuel Barry <vgr...@live.ca> wrote:

Take every Xth commit (say, every 100th or 1000th commit, or even every commit 
if we decide to be insane^Wprecise), store the hashes of all files at that 
revision, possibly along with the file tree, in a .py file as a list or dict, 
or as JSON or anything you prefer. Then I upload it for you to look at and you 
can compare it with the mercurial repo. Or we run the same script on the 
mercurial repo and compare the resulting files.
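
A minimal sketch of the hashing step described above, assuming the revision of 
interest has already been checked out into a directory. The names hash_tree and 
write_manifest are hypothetical and not part of any existing tool, and SHA-256 
is only one possible choice of hash. Files that exist for only one VCS (for 
example .hgignore or .gitignore) would likely need excluding as well.

```python
import hashlib
import json
import os


def hash_tree(root, skip_dirs=(".git", ".hg")):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune VCS metadata in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            # Use "/" separators so manifests compare equal across platforms.
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            manifest[rel] = digest
    return manifest


def write_manifest(root, out_path):
    """Hash the tree first, then dump it as JSON with sorted keys."""
    manifest = hash_tree(root)
    with open(out_path, "w", encoding="utf-8") as out:
        json.dump(manifest, out, indent=2, sort_keys=True)
```

Running the same function over a checkout of each repo at matching revisions 
would give two JSON files that can be compared directly.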

 

If we store anything externally, that could start limiting us.

 

I looked at the problem from this angle: the final cpython git repo has ~10000 
commits in the master branch. That's not a large number to deal with. The 
original hg repo should have exactly the same number of commits. We have to 
diff each pair of corresponding commits, including merge commits, and check 
that their contents are the same. If we encounter anything where the git repo 
differs in content or history from the hg repo, we alert and fail.
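
As a rough sketch of that alert-and-fail check: assume each commit has already 
been reduced to a dict mapping file paths to content hashes, and that commits 
from the two repos can be paired up in order. Both are assumptions, and the 
function names below are hypothetical rather than anything in the tool.

```python
def diff_manifests(git_manifest, hg_manifest):
    """Return sorted paths whose hashes differ or that exist on one side only."""
    paths = set(git_manifest) | set(hg_manifest)
    return sorted(p for p in paths if git_manifest.get(p) != hg_manifest.get(p))


def validate(git_manifests, hg_manifests):
    """Compare two histories commit by commit; raise on the first mismatch.

    Each argument is a list of {path: hash} dicts, one per commit, in the
    same order for both repos (an assumption of this sketch).
    """
    if len(git_manifests) != len(hg_manifests):
        raise ValueError(
            "commit counts differ: git=%d hg=%d"
            % (len(git_manifests), len(hg_manifests))
        )
    for index, (git_m, hg_m) in enumerate(zip(git_manifests, hg_manifests)):
        changed = diff_manifests(git_m, hg_m)
        if changed:
            raise ValueError("commit #%d differs in: %s" % (index, changed))
    # Number of commits that were checked and found identical.
    return len(git_manifests)
```

Raising on the first divergence keeps the failure easy to locate; collecting 
every mismatch instead would be a small change if a full report is preferred.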

 

Since this is a history-checking operation, we could complete it in O(minutes), 
or ~1 hour, to validate the repos. This will give us confidence in the 
migration, and will help us evaluate multiple hg -> git repos that have been 
migrated at different points in time.

 

This feature will go in this tool: 
https://github.com/orsenthil/cpython-hg-to-git , which we will use to migrate, 
sync, and validate hg->git repos.

If interested, you could research an efficient way to do the above operation 
and submit a pull request against that tool.

 

HTH,

Senthil

 

 

 

 

_______________________________________________
core-workflow mailing list
core-workflow@python.org
https://mail.python.org/mailman/listinfo/core-workflow
This list is governed by the PSF Code of Conduct: 
https://www.python.org/psf/codeofconduct
