I thought the general consensus, at a minimum, was to investigate a git mirror
that strips out some artifacts (jars etc.) to lighten the work of the
process. If the project switched to git at some point, such a mirror might
be a suitable git repo for the project, with older versions archived in SVN.

I think what is probably lacking is a volunteer to figure it all out.

-Doug

On Tue, Dec 15, 2015 at 11:32 AM, Mark Miller <markrmil...@gmail.com> wrote:

> Anyone willing to lead this discussion to some kind of better resolution?
> Did that whole back and forth help with any ideas on the best path forward?
> I know it's a complicated issue, git / svn, the light side, the dark side,
> but doesn't GitHub also depend on this mirroring? It's going to be super
> annoying when I can no longer pull from a relatively up to date git remote.
>
> Who has boiled down the correct path?
>
> - Mark
>
> On Wed, Dec 9, 2015 at 6:07 AM Dawid Weiss <dawid.we...@gmail.com> wrote:
>
>> FYI.
>>
>> - All of Lucene's SVN, incremental deltas, uncompressed: 5.0G
>> - the above, tar.bz2: 1.2G
>>
>> Sadly, I didn't succeed at recreating a local SVN repo from those
>> incremental dumps. svnadmin load fails with a cryptic error related to
>> the fact that the revision numbers in node-copy operations refer to the
>> original SVN numbers, and they're apparently renumbered on import.
>> svnadmin isn't smart enough to somehow keep a reference to those
>> original numbers and svndumpfilter can't work with incremental dump
>> files... A seemingly trivial task of splitting a repo on a clean
>> boundary seems incredibly hard with SVN...
>>
>> If anybody wishes to play with the dump files, here they are:
>> http://goo.gl/m6q3J8
>>
>> Dawid
>>
>> On Tue, Dec 8, 2015 at 10:49 PM, Upayavira <u...@odoko.co.uk> wrote:
>> > You can't avoid having the history in SVN. The ASF has one large repo, and
>> > won't be deleting that repo, so the history will survive in perpetuity,
>> > regardless of what we do now.
>> >
>> > Upayavira
>> >
>> > On Tue, Dec 8, 2015, at 09:24 PM, Doug Turnbull wrote:
>> >
>> > It seems you'd want to preserve that history in a frozen/archived Apache SVN
>> > repo for Lucene. Then make the new git repo slimmer before switching. Folks
>> > who want very old versions or are doing research can at least go through the
>> > original SVN repo.
>> >
>> > On Tuesday, December 8, 2015, Dawid Weiss <dawid.we...@gmail.com> wrote:
>> >
>> > One more thing, perhaps of importance, the raw Lucene repo contains
>> > all the history of projects that then turned top-level (Nutch,
>> > Mahout). These could also be dropped (or ignored) when converting to
>> > git. If we agree JARs are not relevant, why should projects not
>> > directly related to Lucene/ Solr be?
>> >
>> > Dawid
>> >
>> > On Tue, Dec 8, 2015 at 10:05 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
>> >>> Don’t know how much we have of historic jars in our history.
>> >>
>> >> I actually do know. Or will know. In about ~10 hours. I wrote a script
>> >> that does the following:
>> >>
>> >> 1) svn log all revisions touching https://svn.apache.org/repos/asf/lucene
>> >> 2) grep revision numbers
>> >> 3) use svnrdump to get every single commit (revision) above, in
>> >> incremental mode.
>> >>
>> >> This will allow me to:
>> >>
>> >> 1) recreate only Lucene/ Solr SVN, locally.
>> >> 2) measure the size of SVN repo.
>> >> 3) measure the size of any conversion to git (even if it's one-by-one
>> >> checkout, then-sync with git).
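The three script steps quoted above (log the revisions, extract the revision numbers, dump each revision incrementally) could be sketched roughly as follows. This is only an illustration and not the actual script from the thread: the function names and the `r<N>.dump` output naming are hypothetical, and it assumes the `svn` and `svnrdump` command-line tools are on the PATH.

```python
import re
import subprocess

REPO_URL = "https://svn.apache.org/repos/asf/lucene"

# Header lines of `svn log -q` look like: "r1240618 | gsingers | 2012-02-04 ..."
_REV_LINE = re.compile(r"^r(\d+) \|", re.MULTILINE)


def parse_revisions(svn_log_output):
    """Return the revision numbers found in `svn log -q` output, oldest first."""
    return sorted({int(rev) for rev in _REV_LINE.findall(svn_log_output)})


def dump_command(url, revision):
    """Build the svnrdump invocation for one incremental single-revision dump."""
    return ["svnrdump", "dump", url, "-r", str(revision), "--incremental"]


def dump_all(url=REPO_URL, out_dir="."):
    """List the revisions touching `url`, then dump each to <out_dir>/r<N>.dump.

    The output file naming is an assumption made for this sketch.
    """
    log = subprocess.run(
        ["svn", "log", "-q", url],
        check=True, capture_output=True, text=True,
    ).stdout
    for rev in parse_revisions(log):
        with open(f"{out_dir}/r{rev}.dump", "wb") as out:
            subprocess.run(dump_command(url, rev), check=True, stdout=out)
```

Calling `dump_all()` would contact the remote repository and, judging by the timings mentioned in the thread, run for many hours; `parse_revisions` and `dump_command` can be tried offline on saved `svn log -q` output.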
>> >>
>> >> From what I see up until now size should not be an issue at all. Even
>> >> with all binary blobs so far the SVN incremental dumps measure ~3.7G
>> >> (and I'm about 75% done). There is one interesting super-large commit,
>> >> this one:
>> >>
>> >> svn log -r1240618 https://svn.apache.org/repos/asf/lucene
>> >>
>> >> ------------------------------------------------------------------------
>> >> r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1 line
>> >>
>> >> LUCENE-2748: bring in old Lucene docs
>> >>
>> >> This commit diff weighs... wait for it... 1.3G! I didn't check what
>> >> it actually was.
>> >>
>> >> Will keep you posted.
>> >>
>> >> D.
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>> >
>> >
>> >
>> >
>> > --
>> > Doug Turnbull | Search Relevance Consultant | OpenSource Connections, LLC |
>> > 240.476.9983
>> > Author:Relevant Search
>> > This e-mail and all contents, including attachments, is considered to be
>> > Company Confidential unless explicitly stated otherwise, regardless of
>> > whether attachments are marked as such.
>> >
>> >
>>
>> --
> - Mark
> about.me/markrmiller
>



-- 
*Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections
<http://opensourceconnections.com>, LLC | 240.476.9983
Author: Relevant Search <http://manning.com/turnbull>
This e-mail and all contents, including attachments, is considered to be
Company Confidential unless explicitly stated otherwise, regardless
of whether attachments are marked as such.
