I thought the general consensus at minimum was to investigate a git mirror that stripped some artifacts out (jars etc) to lighten up the work of the process. If at some point the project switched to git, such a mirror might be a suitable git repo for the project with archived older versions in SVN.
I think probably what is lacking is a volunteer to figure it all out.

-Doug

On Tue, Dec 15, 2015 at 11:32 AM, Mark Miller <markrmil...@gmail.com> wrote:

> Anyone willing to lead this discussion to some kind of better resolution?
> Did that whole back and forth help with any ideas on the best path forward?
> I know it's a complicated issue, git / svn, the light side, the dark side,
> but doesn't GitHub also depend on this mirroring? It's going to be super
> annoying when I can no longer pull from a relatively up to date git remote.
>
> Who has boiled down the correct path?
>
> - Mark
>
> On Wed, Dec 9, 2015 at 6:07 AM Dawid Weiss <dawid.we...@gmail.com> wrote:
>
>> FYI.
>>
>> - All of Lucene's SVN, incremental deltas, uncompressed: 5.0G
>> - the above, tar.bz2: 1.2G
>>
>> Sadly, I didn't succeed at recreating a local SVN repo from those
>> incremental dumps. svnadmin load fails with a cryptic error related to
>> the fact that revision number of node-copy operations refer to
>> original SVN numbers and they're apparently renumbered on import.
>> svnadmin isn't smart enough to somehow keep a reference of those
>> original numbers and svndumpfilter can't work with incremental dump
>> files... A seemingly trivial task of splitting a repo on a clean
>> boundary seems incredibly hard with SVN...
>>
>> If anybody wishes to play with the dump files, here they are:
>> http://goo.gl/m6q3J8
>>
>> Dawid
>>
>> On Tue, Dec 8, 2015 at 10:49 PM, Upayavira <u...@odoko.co.uk> wrote:
>> > You can't avoid having the history in SVN. The ASF has one large repo, and
>> > won't be deleting that repo, so the history will survive in perpetuity,
>> > regardless of what we do now.
>> >
>> > Upayavira
>> >
>> > On Tue, Dec 8, 2015, at 09:24 PM, Doug Turnbull wrote:
>> >
>> > It seems you'd want to preserve that history in a frozen/archived Apache Svn
>> > repo for Lucene. Then make the new git repo slimmer before switching. Folks
>> > that want very old versions or doing research can at least go through the
>> > original SVN repo.
>> >
>> > On Tuesday, December 8, 2015, Dawid Weiss <dawid.we...@gmail.com> wrote:
>> >
>> > One more thing, perhaps of importance, the raw Lucene repo contains
>> > all the history of projects that then turned top-level (Nutch,
>> > Mahout). These could also be dropped (or ignored) when converting to
>> > git. If we agree JARs are not relevant, why should projects not
>> > directly related to Lucene/ Solr be?
>> >
>> > Dawid
>> >
>> > On Tue, Dec 8, 2015 at 10:05 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
>> >>> Don’t know how much we have of historic jars in our history.
>> >>
>> >> I actually do know. Or will know. In about ~10 hours. I wrote a script
>> >> that does the following:
>> >>
>> >> 1) git log all revisions touching https://svn.apache.org/repos/asf/lucene
>> >> 2) grep revision numbers
>> >> 3) use svnrdump to get every single commit (revision) above, in
>> >> incremental mode.
>> >>
>> >> This will allow me to:
>> >>
>> >> 1) recreate only Lucene/ Solr SVN, locally.
>> >> 2) measure the size of SVN repo.
>> >> 3) measure the size of any conversion to git (even if it's one-by-one
>> >> checkout, then-sync with git).
>> >>
>> >> From what I see up until now size should not be an issue at all. Even
>> >> with all binary blobs so far the SVN incremental dumps measure ~3.7G
>> >> (and I'm about 75% done). There is one interesting super-large commit,
>> >> this one:
>> >>
>> >> svn log -r1240618 https://svn.apache.org/repos/asf/lucene
>> >> ------------------------------------------------------------------------
>> >> r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1 line
>> >>
>> >> LUCENE-2748: bring in old Lucene docs
>> >>
>> >> This commit diff weighs... wait for it... 1.3G! I didn't check what
>> >> it actually was.
>> >>
>> >> Will keep you posted.
>> >>
>> >> D.
--
Doug Turnbull | Search Relevance Consultant | OpenSource Connections <http://opensourceconnections.com>, LLC | 240.476.9983
Author: Relevant Search <http://manning.com/turnbull>
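
For anyone wanting to pick this up: the three-step dump procedure Dawid describes in the quoted message (list the revisions touching the Lucene path, pull out the revision numbers, fetch each one with svnrdump in incremental mode) might look roughly like the Python sketch below. This is not Dawid's actual script, which was not posted. It assumes the revision list comes from `svn log -q` (the thread says "git log"), and the function names and the `dumps` output directory are made up for illustration.

```python
import os
import re
import subprocess

# URL quoted in the thread; everything else below is an illustrative assumption.
REPO_URL = "https://svn.apache.org/repos/asf/lucene"

# Header line of an `svn log` entry, e.g.
# "r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1 line"
_REV_HEADER = re.compile(r"^r(\d+) \|", re.MULTILINE)


def parse_revisions(svn_log_text):
    """Step 2: extract revision numbers from `svn log` output, oldest first."""
    return sorted({int(rev) for rev in _REV_HEADER.findall(svn_log_text)})


def dump_command(rev, url=REPO_URL):
    """Step 3: build the svnrdump call for a single revision, incremental mode."""
    return ["svnrdump", "dump", "--incremental", "-r", str(rev), url]


def dump_all(out_dir="dumps", url=REPO_URL):
    """Steps 1-3 end to end. Needs network access and the svn client tools."""
    os.makedirs(out_dir, exist_ok=True)
    # Step 1: list every revision touching the path (-q omits commit messages).
    log = subprocess.run(
        ["svn", "log", "-q", url],
        check=True, capture_output=True, text=True,
    ).stdout
    for rev in parse_revisions(log):
        # One dump file per revision, named after the original SVN number.
        with open(os.path.join(out_dir, f"r{rev}.dump"), "wb") as fh:
            subprocess.run(dump_command(rev, url), check=True, stdout=fh)
```

Calling `dump_all()` would write one file per revision. Loading those files back with svnadmin is a separate problem, as Dawid's follow-up about renumbered node-copy revisions explains.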