Re: Lucene/Solr git mirror will soon turn off

Mark Miller Fri, 18 Dec 2015 09:26:57 -0800

I've filed https://issues.apache.org/jira/browse/LUCENE-6937 as a parent
issue to discuss and work through a migration.


I'm going to assume we are going to go ahead with this until someone steps
up and says otherwise. So far we seem to have consensus. In any case, that
JIRA is probably the best place to voice dissent.

With the complete Git repo, we still have to look at the build and any
other implications. Once that is done, we should probably open an INFRA
JIRA issue to start discussing what the INFRA team needs from us to
complete a migration.

- Mark

On Fri, Dec 18, 2015 at 12:05 PM Dawid Weiss <dawid.we...@gmail.com> wrote:

>
> I've made some comments about the conversion process here:
>
> https://issues.apache.org/jira/browse/LUCENE-6933?focusedCommentId=15064208#comment-15064208
>
> Feel free to try it out.
> https://github.com/dweiss/lucene-solr-svn2git
>
> I don't know what the next steps are. This looks like a good starting
> point to switch over to git with all the development? The only thing I
> still plan on doing is getting rid of a few large binary blobs in
> historical resources, but even without it this seems acceptable size-wise
> (~200mb).
>
> Dawid
>
>
>
> On Thu, Dec 17, 2015 at 9:13 AM, Dawid Weiss <dawid.we...@gmail.com>
> wrote:
>
>>
>> > The question I had (I am sure a very dumb one): WHY do we care about
>> > history preserved perfectly in Git?
>>
>> For me it's for sentimental, archival and task-challenge reasons.
>> Robert's requirement is that git praise/blame/log works on a given file
>> and shows its true history of changes. Everyone has his own reasons I
>> guess. If the initial clone is small enough then I see no problem in
>> keeping the history if we can preserve it.
>>
>> Dawid
>>
>>
>>
>> On Thu, Dec 17, 2015 at 4:52 AM, david.w.smi...@gmail.com <
>> david.w.smi...@gmail.com> wrote:
>>
>>> +1 totally agree.  Anyway, the bloat should largely be the binaries &
>>> unrelated projects, not code (small text files).
>>>
>>> On Wed, Dec 16, 2015 at 10:36 PM Doug Turnbull <
>>> dturnb...@opensourceconnections.com> wrote:
>>>
>>>> In defense of more history immediately available--it is often far more
>>>> useful to poke around code history/run blame to figure out some code than
>>>> to take it at face value. Putting this in a secondary place like the
>>>> Apache SVN repo IMO reduces the readability of the code itself. This is
>>>> doubly true for new developers that won't know about Apache's SVN. And
>>>> Lucene can be quite intricate code. Further, in my own work poking around in
>>>> github mirrors I frequently hit the current cutoff, which is one reason I
>>>> stopped using them for anything but the casual investigation.
>>>>
>>>> I'm not totally against a cutoff point, but I'd advocate for exhausting
>>>> other options first, such as trimming out unrelated projects, binaries, etc.
>>>>
>>>> -Doug
>>>>
>>>>
>>>> On Wednesday, December 16, 2015, Shawn Heisey <apa...@elyograg.org>
>>>> wrote:
>>>>
>>>>> On 12/16/2015 5:53 PM, Alexandre Rafalovitch wrote:
>>>>> > On 16 December 2015 at 00:44, Dawid Weiss <dawid.we...@gmail.com> wrote:
>>>>> >> 4) The size of JARs is really not an issue. The entire SVN repo I mirrored
>>>>> >> locally (including empty interim commits to cater for svn:mergeinfos) is 4G.
>>>>> >> If you strip the stuff like javadocs and side projects (Nutch, Tika, Mahout)
>>>>> >> then I bet the entire history can fit in 1G total. Of course stripping JARs
>>>>> >> is also doable.
>>>>> > I think this answered one of the issues. So, this is not something to focus on.
>>>>> >
>>>>> > The question I had (I am sure a very dumb one): WHY do we care about
>>>>> > history preserved perfectly in Git? Because that seems to be the real
>>>>> > bottleneck now. Does anybody still check out an intermediate commit
>>>>> > in the Solr 1.4 branch?
>>>>>
>>>>> I do not think we need every bit of history -- at least in the primary
>>>>> read/write repository.  I wonder how much of a size difference there
>>>>> would be between tossing all history before 5.0 and tossing all history
>>>>> before the ivy transition was completed.
>>>>>
>>>>> In the interests of reducing the size and download time of a clone
>>>>> operation, I definitely think we should trim history in the main repo to
>>>>> some arbitrary point, as long as the full history is available
>>>>> elsewhere.  It's my understanding that it will remain in svn.apache.org
>>>>> (possibly forever), and I think we could also create "historical"
>>>>> read-only git repos.
>>>>>
>>>>> Almost every time I am working on the code, I only care about the stable
>>>>> branch and trunk.  Sometimes I will check out an older 4.x tag so I can
>>>>> see the exact code referenced by a stacktrace in a user's error message,
>>>>> but when this is required, I am willing to go to an entirely different
>>>>> repository and chew up bandwidth/disk resources to obtain it, and I do
>>>>> not care whether it is git or svn.  As time marches on, fewer people
>>>>> will have reasons to look at the historical record.
>>>>>
>>>>> Thanks,
>>>>> Shawn
>>>>>
>>>>>
>>>> --
>>>> *Doug Turnbull **| *Search Relevance Consultant | OpenSource
>>>> Connections <http://opensourceconnections.com>, LLC | 240.476.9983
>>>> Author: Relevant Search <http://manning.com/turnbull>
>>>> This e-mail and all contents, including attachments, is considered to
>>>> be Company Confidential unless explicitly stated otherwise, regardless
>>>> of whether attachments are marked as such.
>>>>
>>>> --
>>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>>> http://www.solrenterprisesearchserver.com
>>>
>>
>>
> --
- Mark
about.me/markrmiller
