> a.) the repo contains binaries which are GPL licensed. That needs to get
> kicked out of the repo anyway.

Could you give an example for this? Like, a revision I can look at?

> b.) the repo size is about 3.6 GiB. That’s really huge. Devs would not
> even be able to git-clone this over to their own github repos as those are
> limited to 2GB.

Like I've said, I see the Linux kernel will grow over 2GB soon. I wonder
what GitHub will do then.

I don't see 4GB as something huge nowadays. What is that, 2 hours of
YouTube at 720p? And any individual will download it only once. Further
updates will be incremental.

> So how should we get pull requests in that case?

How do kernel contributors do pull requests? A patch attached to an issue
or email would work just fine.



--emi

On Fri, Oct 7, 2016 at 1:42 PM, Mark Struberg <[email protected]>
wrote:

> Hi Emilian!
>
> The problem with 2 is that it won’t work nicely.
>
> There are 2 problems as sketched.
>
> a.) the repo contains binaries which are GPL licensed. That needs to get
> kicked out of the repo anyway.
>
> > What is important is the legal clearance at
> > the moment the code grant happens.
> Yes, but Oracle can only grant stuff under ALv2 where they own the rights
> themselves. They simply don’t own any rights for a hibernate.jar…
>
> Also the mere fact that it contains binaries at all is not really good.
> It’s called source code management for a reason.
> Not sure if the ant build already uses ivy. If not then we need to improve
> this.
> It also contains temporary build artifacts (well, unfortunately such
> things happen…)
>
>
> b.) the repo size is about 3.6 GiB. That’s really huge. Devs would not
> even be able to git-clone this over to their own github repos as those are
> limited to 2GB.
> So how should we get pull requests in that case?
>
> I agree with you that we should preserve the history though.
> Thus the idea of moving the original hg repo to some other place
> and switching it into read-only mode.
> And have the new git repo stripped down to the core parts (of course with
> their history).
> git-filter-branch is your friend.
>
> LieGrue,
> strub
>
>
> > Am 07.10.2016 um 11:37 schrieb Emilian Bold <[email protected]>:
> >
> > I vote for 2!
> >
> > I see no reason we should get rid of the history.
> >
> > As I have read before, ASF does not need to have a legal clearance for
> > every historical code revision. What is important is the legal clearance at
> > the moment the code grant happens.
> >
> > I don't believe the GitHub 2GB limit is any indicator of anything except
> > their capacity and business decision. The Linux kernel is close to 2GB,
> > OpenOffice is 1.5GB, Hadoop is 400MB, Lucene-Solr is 200MB, JMeter is
> > 200MB, etc.
> >
> > NetBeans is a project with over a decade of history and hundreds of people.
> > The first commit I see is from 1999.
> >
> > Of course such a large and old project will have a large repository!
> >
> > And as time passes each repository will only grow. I just read a
> > StackOverflow answer on how to determine the GitHub repository size and
> > their example for git/git mentioned it was 40MB -- it's, I believe, 200MB
> > now.
> >
> > I also don't think 3) will result in much economy. I doubt there are many
> > JARs or temporary build results.
> >
> > If the current repository turns out to be too much for Apache Infra, we could
> > decide in time how to improve that, but as an Incubation goal I believe
> > just switching to git should be enough.
> >
> >
> >
> > --emi
> >
> > On Fri, Oct 7, 2016 at 12:16 AM, Mark Struberg <[email protected]>
> > wrote:
> >
> >> Hi!
> >>
> >> I’ve migrated the NetBeans hg repo into GIT. Sadly this repo takes about
> >> 3.6 GiB and thus we cannot host it on github or Bitbucket (both have a 2GB
> >> limit).
> >> I am currently hosting the repo on a small private server.
> >> If anyone is interested then send me a private mail with your public key
> >> and I’ll give you access.
> >> Jaroslav, Geertjan and a few others already have a clone.
> >>
> >> There are basically 3 ways we can handle this:
> >>
> >> 1.) import a tarball into a fresh git repo. We would lose the history, but
> >> we would only have sources which are explicitly cleared by Oracle.
> >>
> >> 2.) import the full hg history. That is pretty thick which means it’s not
> >> that easy to clone. github pull requests also won’t work as we exceed the
> >> 2GB limit…
> >> In addition the hg repo currently also contains lots of GPL libraries,
> >> e.g. the hibernate jar. That’s something we don’t host at the ASF.
> >>
> >> 3.) Take the git import from hg and filter it. Remove all (most) jars,
> >> temporary build results etc. We might also get rid of a few old branches
> >> etc. If we keep the original hg repo around in read-only mode then we
> >> should be able to lose tons of weight.
> >>
> >> I personally prefer option 3.
> >> But that is also the most labor intensive.
> >>
> >>
> >> LieGrue,
> >> strub
>
>
