Hi Emilian!

The problem with option 2 is that it won’t work nicely.
There are two problems, as sketched.

a.) The repo contains binaries which are GPL licensed. Those need to get kicked out of the repo anyway.

> What is important is the legal clearance at
> the moment the code grant happens.

Yes, but Oracle can only grant stuff under ALv2 where they own the rights themselves. They simply don’t own any rights for a hibernate.jar…

Also, the pure fact that it contains binaries at all is not really good. It’s called source code management for a reason. I am not sure whether the Ant build already uses Ivy. If not, then we need to improve this.
The repo also contains temporary build artifacts (well, unfortunately such things happen…).

b.) The repo size is about 3.6 GiB. That’s really huge. Devs would not even be able to git-clone this over to their own GitHub repos, as those are limited to 2GB. So how should we get pull requests in that case?

I agree with you that we should preserve the history, though. Hence the idea of moving the original hg repo to some other place and switching it into read-only mode, and having the new Git repo stripped down to the core parts (of course with their history). git-filter-branch is your friend.

LieGrue,
strub

> On 07.10.2016 at 11:37, Emilian Bold <[email protected]> wrote:
>
> I vote for 2!
>
> I see no reason we should get rid of the history.
>
> The way I have read before, ASF does not need to have a legal clearance for
> every historical code revision. What is important is the legal clearance at
> the moment the code grant happens.
>
> I don't believe the GitHub 2GB limit is any indicator of anything except
> their capacity and business decision. The Linux kernel is close to 2GB,
> OpenOffice is 1.5GB, Hadoop is 400MB, Lucene-Solr is 200MB, JMeter is
> 200MB, etc.
>
> NetBeans is a project with over a decade of history and hundreds of people.
> The first commit I see is from 1999.
>
> Of course such a large and old project will have a large repository!
>
> And as time passes each repository will only grow. I just read a
> StackOverflow answer on how to determine the GitHub repository size and
> their example for git/git mentioned it was 40MB -- it's, I believe, 200MB
> now.
>
> I also don't think 3) will result in much economy. I doubt there are many
> JARs or temporary build results.
>
> If the current repository turns out too much for the Apache Infra we could
> decide in time how to improve that, but as an Incubation goal I believe
> just switching to git should be enough.
>
> --emi
>
> On Fri, Oct 7, 2016 at 12:16 AM, Mark Struberg <[email protected]>
> wrote:
>
>> Hi!
>>
>> I’ve migrated the NetBeans hg repo into Git. Sadly this repo takes about
>> 3.6 GiB and thus we cannot host it on GitHub or Bitbucket (both have a 2GB
>> limit).
>> I am currently hosting the repo on a small private server.
>> If anyone is interested then send me a private mail with your public key
>> and I’ll give you access.
>> Jaroslav, Geertjan and a few others already have a clone.
>>
>> There are basically 3 ways we can handle this:
>>
>> 1.) Import a tarball into a fresh git repo. We would lose the history but
>> we would only have sources which are explicitly cleared by Oracle.
>>
>> 2.) Import the full hg history. That is pretty thick, which means it’s not
>> that easy to clone. GitHub pull requests also won’t work as we exceed the
>> 2GB limit…
>> In addition, the hg repo currently also contains lots of GPL libraries like
>> e.g. the hibernate jar, etc. That’s something we don’t host at the ASF.
>>
>> 3.) Take the git import from hg and filter it. Remove all (most) jars,
>> temporary build results etc. We might also get rid of a few old branches
>> etc. If we keep the original hg repo around in read-only mode then we
>> should be able to lose tons of weight.
>>
>> I personally prefer option 3.
>> But that is also the most labor intensive.
>>
>> LieGrue,
>> strub
