> a.) the repo contains binaries which are GPL licensed. That needs to get kicked out of the repo anyway.

Could you give an example for this? Like, a revision I can look at?

> b.) the repo size is about 3.6 GiB. That’s really huge. Devs would not even be able to git-clone this over to their own github repos as those are limited to 2GB.

Like I've said, I see the linux kernel will grow over 2GB soon. I wonder what will github do then.

I don't see 4GB as something huge nowadays. What is that, 2 hours of YouTube at 720p? And any individual will do that only once. Further updates will be incremental.

> So how should we get pull requests in that case?

How do kernel contributors do pull requests? A patch attached to an issue or email would work just fine.

--emi

On Fri, Oct 7, 2016 at 1:42 PM, Mark Struberg <[email protected]> wrote:

> Hi Emilian!
>
> The problem with 2 is that it won’t work nicely.
>
> There are 2 problems as sketched.
>
> a.) the repo contains binaries which are GPL licensed. That needs to get kicked out of the repo anyway.
>
> > What is important is the legal clearance at the moment the code grant happens.
>
> Yes, but Oracle can only grant stuff under ALv2 where they own the rights themselves. They simply don’t own any rights for a hibernate.jar…
>
> Also the pure fact that it contains binaries at all is not really good. It’s called source code management for a reason.
> Not sure if the ant build already uses ivy. If not then we need to improve this.
> It also contains temporary build artifacts (well, unfortunately such things happen…)
>
> b.) the repo size is about 3.6 GiB. That’s really huge. Devs would not even be able to git-clone this over to their own github repos as those are limited to 2GB.
> So how should we get pull requests in that case?
>
> I agree with you that we should preserve the history though.
> Thus the idea with moving over the original hg repo to some other place and switch it into read-only mode.
> And have the new GIT repo stripped down to the core parts (of course with their history).
> git-filter-branch is your friend.
>
> LieGrue,
> strub
>
> > On 07.10.2016 at 11:37, Emilian Bold <[email protected]> wrote:
> >
> > I vote for 2!
> >
> > I see no reason we should get rid of the history.
> >
> > The way I have read before, ASF does not need to have a legal clearance for every historical code revision. What is important is the legal clearance at the moment the code grant happens.
> >
> > I don't believe the GitHub 2GB limit is any indicator of anything except their capacity and business decision. The Linux kernel is close to 2GB, OpenOffice is 1.5GB, Hadoop is 400MB, Lucene-Solr is 200MB, JMeter is 200MB, etc.
> >
> > NetBeans is a project with over a decade of history with hundreds of people. The first commit I see is from 1999.
> >
> > Of course such a large and old project will have a large repository!
> >
> > And as time passes each repository will only grow. I just read a StackOverflow answer on how to determine the GitHub repository size and their example for git/git mentioned it was 40MB -- it's, I believe, 200MB now.
> >
> > I also don't think 3) will result in much economy. I doubt there are many JARs or temporary build results.
> >
> > If the current repository turns out too much for the Apache Infra we could decide in time how to improve that, but as an Incubation goal I believe just switching to git should be enough.
> >
> > --emi
> >
> > On Fri, Oct 7, 2016 at 12:16 AM, Mark Struberg <[email protected]> wrote:
> >
> >> Hi!
> >>
> >> I’ve migrated the NetBeans hg repo into GIT. Sadly this repo takes about 3.6 GiB and thus we cannot host it on github or Bitbucket (both have a 2GB limit).
> >> I am currently hosting the repo on a small private server.
> >> If anyone is interested then send me a private mail with your public key and I’ll give you access.
> >> Jaroslav, Geertjan and a few others already have a clone.
> >>
> >> There are basically 3 ways how we can handle this
> >>
> >> 1.) import a tarball into a fresh git repo. We would lose the history but we only have sources which are explicitly cleared by Oracle.
> >>
> >> 2.) import the full hg history. That is pretty thick which means it’s not that easy to clone. github pull requests also won't work as we exceed the 2GB limit…
> >> In addition the hg repo currently also contains lots of GPL libraries like e.g. hibernate jar, etc. That’s something we don’t host at the ASF.
> >>
> >> 3.) Take the git import from hg and filter it. Remove all (most) jars, temporary build results etc. We might also get rid of a few old branches etc. If we keep the original hg repo around in read only mode then we should be able to lose tons of weight.
> >>
> >> I personally prefer option 3.
> >> But that is also the most labor intensive.
> >>
> >> LieGrue,
> >> strub
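
For reference, the filtering described in option 3 (and the "git-filter-branch is your friend" remark) could look roughly like the sketch below. It is only an illustration under assumptions: the pattern "*.jar" and the helper names build_filter_command and strip_history are hypothetical, and the thread does not say which paths would actually be removed or which exact invocation was intended. The helper only builds the argument list; running it rewrites every commit, so it should only ever be pointed at a throwaway clone.

```python
import shlex
import subprocess


def build_filter_command(patterns):
    """Build the argv for a `git filter-branch` run that drops matching paths.

    `patterns` are git pathspecs such as "*.jar" (a hypothetical choice).
    """
    # The index filter is evaluated by a shell for each commit, so every
    # pattern is quoted to make sure git, not the shell, expands the glob.
    quoted = " ".join(shlex.quote(p) for p in patterns)
    index_filter = "git rm -r -q --cached --ignore-unmatch -- " + quoted
    return [
        "git", "filter-branch",
        "--index-filter", index_filter,  # edit the index only, no checkout
        "--prune-empty",                 # drop commits left without changes
        "--tag-name-filter", "cat",      # keep tags, pointing at new commits
        "--", "--all",                   # rewrite every branch and tag
    ]


def strip_history(repo_dir, patterns):
    """Run the rewrite inside `repo_dir` (use a disposable clone)."""
    subprocess.run(build_filter_command(patterns), cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(shlex.quote(arg) for arg in build_filter_command(["*.jar"])))
```

Since all commit hashes change after such a rewrite, keeping the original hg repo around in read-only mode, as proposed above, is what preserves the unfiltered history.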
