On Wed, 27 Nov 2019, Maxim Kuvyrkov wrote:

> IMO, we should aim to convert complete SVN history frozen at a specific
> point. So that if we don't want to convert some of the branches or tags
> to git, then we should delete them from SVN repository before
> conversion.

Sure, we could do that. Eric, can you confirm that, with current
reposurgeon, if a branch or tag was deleted in SVN and does not appear in
the final revision of /branches or /tags, it should not appear in the
resulting converted repository, so that any cases where reposurgeon fails
to reflect such a deletion-in-SVN should be reported as a reposurgeon bug?
And that the same applies where a branch or tag was renamed - that only
the new name, not the old one, should appear in the converted repository?

There are quite a few deletions in gcc.lift for tags that do not actually
appear in /tags in the current SVN repository, but I'm not sure how many
are actually relevant with current reposurgeon.

Then there is the policy question to resolve about which tags we actually
want to keep (I'm presuming we want to keep essentially all branches by
default, absent any clear malformations that make us decide to delete a
given branch). Tags in the following classes make up the bulk of those
currently in the SVN repository, and right now are deleted in gcc.lift
under the comment "Tentative tag removals - might be backed out":

(a) Tags for branchpoints and tracking merge status - largely from the CVS
era when they were needed to work around the lack of a global version
number for the whole source tree, but also from the early SVN era when
there were global revision numbers but SVN merge tracking hadn't yet been
implemented.

(b) Tags for vendor-branch imports of third-party software (maybe tagging
just part of the source tree, in the CVS era). That's the GC_*, LIBTOOL_*
and ZLIB_* tags, for example.

(c) Tags for other (non-GCC) projects, from when ,v files were linked
between different repositories (most of those have already been deleted in
SVN, however).

(d) Tags for snapshots (most of those have already been deleted in SVN,
however).

(e) Tags with the same names as branches (at least three of them).
Formally that's valid in git just as it is in SVN - there's no conflict
between refs/heads/<name> and refs/tags/<name> - but in practice it's
confusing and such tags were surely created by mistake.

(f) Tags for distributor releases as opposed to official GCC / EGCS
releases.

All of the above except for (f) are not particularly idiomatic to have in
git, simply because git tracks merges and a global commit identifier can
be used to refer to a snapshot etc. rather than needing a tag. (Because
SVN does merge tracking, I don't think many tags in the classes (a)
through (e) have been created recently in SVN either - they are generally
from the CVS or early SVN eras, or created by mistake.)

So what do people think about tags in the above classes? Should some or
all of those be deleted in SVN?

(Note: my checks of parents for cvs2svn-created tags were limited to those
currently kept by gcc.lift. If we decide to keep more tags from the
cvs2svn era I'll need to run further checks of tag parents.)

-- 
Joseph S. Myers
jos...@codesourcery.com