Anyone willing to lead this discussion to some kind of better resolution? Did that whole back and forth help with any ideas on the best path forward? I know it's a complicated issue, git / svn, the light side, the dark side, but doesn't GitHub also depend on this mirroring? It's going to be super annoying when I can no longer pull from a relatively up-to-date git remote.
Who has boiled down the correct path?

- Mark

On Wed, Dec 9, 2015 at 6:07 AM Dawid Weiss <dawid.we...@gmail.com> wrote:

> FYI.
>
> - All of Lucene's SVN, incremental deltas, uncompressed: 5.0G
> - the above, tar.bz2: 1.2G
>
> Sadly, I didn't succeed at recreating a local SVN repo from those
> incremental dumps. svnadmin load fails with a cryptic error related to
> the fact that revision numbers of node-copy operations refer to
> original SVN numbers and they're apparently renumbered on import.
> svnadmin isn't smart enough to somehow keep a reference of those
> original numbers and svndumpfilter can't work with incremental dump
> files... A seemingly trivial task of splitting a repo on a clean
> boundary seems incredibly hard with SVN...
>
> If anybody wishes to play with the dump files, here they are:
> http://goo.gl/m6q3J8
>
> Dawid
>
> On Tue, Dec 8, 2015 at 10:49 PM, Upayavira <u...@odoko.co.uk> wrote:
> > You can't avoid having the history in SVN. The ASF has one large repo, and
> > won't be deleting that repo, so the history will survive in perpetuity,
> > regardless of what we do now.
> >
> > Upayavira
> >
> > On Tue, Dec 8, 2015, at 09:24 PM, Doug Turnbull wrote:
> >
> > It seems you'd want to preserve that history in a frozen/archived Apache SVN
> > repo for Lucene. Then make the new git repo slimmer before switching. Folks
> > who want very old versions or are doing research can at least go through the
> > original SVN repo.
> >
> > On Tuesday, December 8, 2015, Dawid Weiss <dawid.we...@gmail.com> wrote:
> >
> > One more thing, perhaps of importance, the raw Lucene repo contains
> > all the history of projects that then turned top-level (Nutch,
> > Mahout). These could also be dropped (or ignored) when converting to
> > git. If we agree JARs are not relevant, why should projects not
> > directly related to Lucene/Solr be?
> >
> > Dawid
> >
> > On Tue, Dec 8, 2015 at 10:05 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
> >>> Don’t know how much we have of historic jars in our history.
> >>
> >> I actually do know. Or will know. In about ~10 hours. I wrote a script
> >> that does the following:
> >>
> >> 1) git log all revisions touching https://svn.apache.org/repos/asf/lucene
> >> 2) grep revision numbers
> >> 3) use svnrdump to get every single commit (revision) above, in
> >> incremental mode.
> >>
> >> This will allow me to:
> >>
> >> 1) recreate only Lucene/Solr SVN, locally.
> >> 2) measure the size of SVN repo.
> >> 3) measure the size of any conversion to git (even if it's one-by-one
> >> checkout, then-sync with git).
> >>
> >> From what I see up until now size should not be an issue at all. Even
> >> with all binary blobs so far the SVN incremental dumps measure ~3.7G
> >> (and I'm about 75% done). There is one interesting super-large commit,
> >> this one:
> >>
> >> svn log -r1240618 https://svn.apache.org/repos/asf/lucene
> >> ------------------------------------------------------------------------
> >> r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1 line
> >>
> >> LUCENE-2748: bring in old Lucene docs
> >>
> >> This commit diff weighs... wait for it... 1.3G! I didn't check what
> >> it actually was.
> >>
> >> Will keep you posted.
> >>
> >> D.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
> > --
> > Doug Turnbull | Search Relevance Consultant | OpenSource Connections, LLC | 240.476.9983
> > Author: Relevant Search
> > This e-mail and all contents, including attachments, is considered to be
> > Company Confidential unless explicitly stated otherwise, regardless of
> > whether attachments are marked as such.

--
- Mark
about.me/markrmiller
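
For anyone who wants to experiment with the three-step procedure quoted above (list the revisions touching the Lucene path, pull out the revision numbers, fetch each one with svnrdump in incremental mode), here is a rough Python sketch. It is not Dawid's actual script, which was not posted to the list. The function names `parse_revisions` and `dump_commands` are made up for illustration, and the sketch assumes the revision list comes from `svn log` output in the format shown in the thread. It only builds the svnrdump command lines and does not contact the server.

```python
import re

# Matches the header line of an `svn log` entry, for example:
# "r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1 line"
REVISION_HEADER = re.compile(r"^r(\d+) \|")


def parse_revisions(svn_log_text):
    """Return sorted, de-duplicated revision numbers found in `svn log` output.

    Assumption: a commit message line that itself starts with "r<digits> |"
    would be picked up too; this sketch does not guard against that.
    """
    revisions = set()
    for line in svn_log_text.splitlines():
        match = REVISION_HEADER.match(line)
        if match:
            revisions.add(int(match.group(1)))
    return sorted(revisions)


def dump_commands(url, revisions):
    """Build one `svnrdump dump --incremental` argument list per revision."""
    return [
        ["svnrdump", "dump", "--incremental", "-r", str(rev), url]
        for rev in revisions
    ]


if __name__ == "__main__":
    # Sample text copied from the log entry quoted in the thread.
    sample_log = (
        "------------------------------------------------------------------------\n"
        "r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1 line\n"
        "\n"
        "LUCENE-2748: bring in old Lucene docs\n"
    )
    url = "https://svn.apache.org/repos/asf/lucene"
    for command in dump_commands(url, parse_revisions(sample_log)):
        # Print only; running these would hit the ASF server.
        print(" ".join(command))
```

Each argument list could be handed to `subprocess.run` with stdout redirected to a per-revision file. Whether the resulting dumps load cleanly is a separate question, as the svnadmin load failure described above shows.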