I vote for 2!

I see no reason we should get rid of the history.

From what I have read, the ASF does not need legal clearance for
every historical code revision. What matters is the legal clearance at
the moment the code grant happens.

I don't believe the GitHub 2GB limit indicates anything except
their capacity and business decisions. The Linux kernel is close to 2GB,
OpenOffice is 1.5GB, Hadoop is 400MB, Lucene-Solr is 200MB, JMeter is
200MB, etc.

NetBeans is a project with over a decade of history and hundreds of
contributors. The first commit I see is from 1999.

Of course such a large and old project will have a large repository!

And as time passes each repository will only grow. I just read a
StackOverflow answer on how to determine the GitHub repository size and
their example for git/git mentioned it was 40MB -- it's, I believe, 200MB
now.

I also don't think 3) will save much space. I doubt there are many
JARs or temporary build results.

If the current repository turns out to be too much for Apache Infra, we
could decide later how to improve that, but as an Incubation goal I believe
just switching to git should be enough.



--emi

On Fri, Oct 7, 2016 at 12:16 AM, Mark Struberg <[email protected]>
wrote:

> Hi!
>
> I’ve migrated the NetBeans hg repo into GIT. Sadly this repo takes about
> 3.6 GiB and thus we cannot host it on github or Bitbucket (both have a 2GB
> limit).
> I am currently hosting the repo on a small private server.
> If anyone is interested then send me a private mail with your public key
> and I’ll give you access.
> Jaroslav, Geertjan and a few others already have a clone.
>
> There are basically 3 ways how we can handle this
>
> 1.) import a tarball into a fresh git repo. We would lose the history but
> we only have sources which are explicitly cleared by Oracle.
>
> 2.) import the full hg history. That is pretty thick which means it’s not
> that easy to clone. GitHub pull requests also won't work as we exceed the
> 2GB limit…
> In addition the hg repo currently also contains lots of GPL libraries like
> e.g. hibernate jar, etc. That’s something we don’t host at the ASF.
>
> 3.) Take the git import from hg and filter it. Remove all (most) jars,
> temporary build results etc. We might also get rid of a few old branches
> etc. If we keep the original hg repo around in read only mode then we
> should be able to lose tons of weight.
>
> I personally prefer option 3.
> But that is also the most labor intensive.
>
>
> LieGrue,
> strub
