You can't avoid having the history in SVN. The ASF has one large repo,
and won't be deleting that repo, so the history will survive in
perpetuity, regardless of what we do now.

Upayavira

On Tue, Dec 8, 2015, at 09:24 PM, Doug Turnbull wrote:
> It seems you'd want to preserve that history in a frozen/archived
> Apache SVN repo for Lucene. Then make the new git repo slimmer before
> switching. Folks that want very old versions or are doing research can
> at least go through the original SVN repo.
>
> On Tuesday, December 8, 2015, Dawid Weiss
> <dawid.we...@gmail.com> wrote:
>> One more thing, perhaps of importance, the raw Lucene repo contains
>> all the history of projects that then turned top-level (Nutch,
>> Mahout). These could also be dropped (or ignored) when converting to
>> git. If we agree JARs are not relevant, why should projects not
>> directly related to Lucene/ Solr be?
>>
>> Dawid
>>
>> On Tue, Dec 8, 2015 at 10:05 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
>> >> Don’t know how much we have of historic jars in our history.
>> >
>> > I actually do know. Or will know. In about ~10 hours. I wrote a script
>> > that does the following:
>> >
>> > 1) git log all revisions touching
>> >    https://svn.apache.org/repos/asf/lucene
>> > 2) grep revision numbers
>> > 3) use svnrdump to get every single commit (revision) above, in
>> >    incremental mode.
>> >
>> > This will allow me to:
>> >
>> > 1) recreate only Lucene/ Solr SVN, locally.
>> > 2) measure the size of SVN repo.
>> > 3) measure the size of any conversion to git (even if it's one-by-one
>> >    checkout, then-sync with git).
>> >
>> > From what I see up until now size should not be an issue at all. Even
>> > with all binary blobs so far the SVN incremental dumps measure ~3.7G
>> > (and I'm about 75% done). There is one interesting super-large commit,
>> > this one:
>> >
>> > svn log -r1240618 https://svn.apache.org/repos/asf/lucene
>> > ------------------------------------------------------------------------
>> > r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1 line
>> >
>> > LUCENE-2748: bring in old Lucene docs
>> >
>> > This commit diff weighs... wait for it... 1.3G! I didn't check what
>> > it actually was.
>> >
>> > Will keep you posted.
>> >
>> > D.
>>
>
>
> --
> Doug Turnbull | Search Relevance Consultant | OpenSource
> Connections[1], LLC | 240.476.9983 | Author: Relevant Search[2]
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless of
> whether attachments are marked as such.



Links:

  1. http://opensourceconnections.com
  2. http://manning.com/turnbull
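
The three steps Dawid lists in the quoted message (log the revisions touching the Lucene path, pull out the revision numbers, dump each one incrementally with svnrdump) could be sketched roughly as below. This is not his script, which was not posted. It assumes `svn log` is what was meant in step 1, since the URL is an SVN one, and the function names and the `dumps` output directory are made up for illustration.

```python
import os
import re
import subprocess

REPO_URL = "https://svn.apache.org/repos/asf/lucene"

# `svn log` header lines look like:
# r1240618 | gsingers | 2012-02-04 22:45:17 +0100 (Sat, 04 Feb 2012) | 1 line
REV_HEADER = re.compile(r"^r(\d+) \|", re.MULTILINE)


def revisions_from_log(log_text):
    """Step 2: extract revision numbers from `svn log` output (ascending, unique)."""
    return sorted({int(rev) for rev in REV_HEADER.findall(log_text)})


def dump_command(rev, url=REPO_URL):
    """Step 3: build the svnrdump call that dumps one revision incrementally."""
    return ["svnrdump", "dump", "--incremental", "-r", str(rev), url]


def dump_all(out_dir="dumps"):
    """Steps 1-3 together. Needs network access and the svn/svnrdump binaries."""
    os.makedirs(out_dir, exist_ok=True)
    # Step 1: list every revision touching the Lucene path (-q omits messages).
    log_text = subprocess.run(
        ["svn", "log", "-q", REPO_URL],
        capture_output=True, text=True, check=True,
    ).stdout
    for rev in revisions_from_log(log_text):
        # One dump file per revision (hypothetical naming scheme).
        with open(os.path.join(out_dir, f"r{rev}.dump"), "wb") as out:
            subprocess.run(dump_command(rev), stdout=out, check=True)
```

Calling `dump_all()` would start the full download, which by the estimate in the thread takes many hours and several gigabytes. The per-revision dump files could then be loaded in order into a local repository with `svnadmin load` to measure its size.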
