In defense of keeping more history immediately available: it is often far more useful to poke around the code history or run blame to figure out some code than to take it at face value. Putting that history in a secondary place like the Apache SVN repo IMO reduces the readability of the code itself. This is doubly true for new developers who won't know about Apache's SVN, and Lucene can be quite intricate code. Further, in my own work poking around in the GitHub mirrors I frequently hit the current cutoff, which is one reason I stopped using them for anything but casual investigation.
I'm not totally against a cutoff point, but I'd advocate for exhausting other options first, such as trimming out unrelated projects, binaries, etc.

-Doug

On Wednesday, December 16, 2015, Shawn Heisey <apa...@elyograg.org> wrote:

> On 12/16/2015 5:53 PM, Alexandre Rafalovitch wrote:
> > On 16 December 2015 at 00:44, Dawid Weiss <dawid.we...@gmail.com> wrote:
> >> 4) The size of JARs is really not an issue. The entire SVN repo I mirrored
> >> locally (including empty interim commits to cater for svn:mergeinfos) is 4G.
> >> If you strip the stuff like javadocs and side projects (Nutch, Tika, Mahout)
> >> then I bet the entire history can fit in 1G total. Of course stripping JARs
> >> is also doable.
> > I think this answered one of the issues. So, this is not something to focus on.
> >
> > The question I had (I am sure a very dumb one): WHY do we care about
> > history preserved perfectly in Git? Because that seems to be the real
> > bottleneck now. Does anybody still checks out an intermediate commit
> > in Solr 1.4 branch?
>
> I do not think we need every bit of history -- at least in the primary
> read/write repository. I wonder how much of a size difference there
> would be between tossing all history before 5.0 and tossing all history
> before the ivy transition was completed.
>
> In the interests of reducing the size and download time of a clone
> operation, I definitely think we should trim history in the main repo to
> some arbitrary point, as long as the full history is available
> elsewhere. It's my understanding that it will remain in svn.apache.org
> (possibly forever), and I think we could also create "historical"
> read-only git repos.
>
> Almost every time I am working on the code, I only care about the stable
> branch and trunk.
> Sometimes I will check out an older 4.x tag so I can
> see the exact code referenced by a stacktrace in a user's error message,
> but when this is required, I am willing to go to an entirely different
> repository and chew up bandwidth/disk resources to obtain it, and I do
> not care whether it is git or svn. As time marches on, fewer people
> will have reasons to look at the historical record.
>
> Thanks,
> Shawn

--
Doug Turnbull | Search Relevance Consultant | OpenSource Connections <http://opensourceconnections.com>, LLC | 240.476.9983
Author: Relevant Search <http://manning.com/turnbull>