You said:

"We gain independence from a corporate entity controlling the infrastructure 
and data we generate in development"

I understand the appeal of independence from a corporate entity.

But I think it's worth asking: what does that independence look like in 
practice? If we move to Codeberg, we're still relying on a third party, just a 
smaller, volunteer-run one. If we self-host, someone in our community needs to 
maintain servers, handle security, manage backups, and keep CI running. Right 
now Codeberg's CI only supports amd64, so our arm64 and multi-distro builds 
would need self-hosted runners that we'd have to provision and maintain 
ourselves.

As for the data we generate in development, it's an open source project. Our 
commits, issues, and discussions are public by design. That's the deal we made 
when we chose this model, and it's a good deal. What additional control would a 
different platform give us over data that is meant to be open?

I'm not dismissing the concern. I just want to make sure the cost of acting on 
it is proportionate to what we actually gain.

Best regards,
Luca



On March 27, 2026 6:47:24 PM GMT+08:00, Luca Toniolo <[email protected]> wrote:
>Hi Bertho,
>
>I think it’s important to clarify what "rebuilding" actually looks like in 
>practice. Moving to Codeberg isn't just a matter of effort; it would be a 
>significant technical downgrade. Their CI currently lacks native ARM support 
>and runs on a single global queue, which would leave our infrastructure a 
>shadow of what it is on GitHub.
>
>We also have to consider discoverability. GitHub is the de-facto home for open 
>source, and most new contributors find projects where the code already lives. 
>Moving to a niche platform risks cutting off that pipeline.
>
>If this is largely a response to the recent news about GitHub using 
>interaction data for Copilot training by default, we could address that with a 
>simple automated PR message reminding contributors to opt out in their 
>settings. Is this specific policy change the main driver for you, or is there 
>a more fundamental issue at play?
>
>Best, Luca
>
>
>
>On March 27, 2026 5:29:45 PM GMT+08:00, Bertho Stultiens <[email protected]> 
>wrote:
>>On 3/27/26 9:27 AM, Luca Toniolo wrote:
>>> Copilot doing statistical analysis on publicly available GPL code is, if 
>>> anything, less than what the GPL already explicitly permits.
>>
>>Yes, as long as you abide by the license.
>>
>>But LLMs do much more than just statistical analysis. LLMs generate output 
>>from the training set, and people are encouraged to use that output.
>>LLMs are known to reproduce their input/training data. When they reproduce 
>>learned code, they strip the GPL license from that code. That is the real 
>>problem.
>>
>>The fact that we can't prevent these corporations from scraping and doing 
>>this is a fact of how the Internet works. However, the fact that they did it 
>>does not make it right or their use legal.
>>
>>
>>> Mailing list archives have been indexed by Google, crawled by the Wayback 
>>> Machine, scraped by researchers, and read by recruiters for as long as 
>>> they've existed. Our commit messages, review comments, and design 
>>> discussions have been public and searchable for years. That was true before 
>>> Copilot, and it would remain true if we moved to GitLab, Codeberg, or a 
>>> self-hosted Gitea instance tomorrow. None of these platforms prevent 
>>> scraping.
>>
>>It is not only about what is publicly visible on the site(s). It is also about 
>>how you use the site and the process by which you do things.
>>
>>The information that is available *inside* github about you and what you are 
>>doing is considerably more extensive than what can be viewed from the public 
>>record.
>>
>>The announcement from github makes, in principle, any and all data subject to 
>>input into their LLMs. That I cannot accept, and I will seriously consider my 
>>options.
>>
>>
>>> GPL enforcement, even in clear-cut cases of actual license violation, has 
>>> historically been rare and difficult. The FSF and SFLC have pursued only 
>>> the most egregious cases, and even those took years. LinuxCNC itself has 
>>> never enforced the GPL against anyone.
>>
>>The non-enforcement of copyright violations does _not_ make it alright to 
>>become an infringer or to condone copyright infringement. Besides, the cases 
>>that were enforced were victories for the GPL and made many an infringer think 
>>twice or back off.
>>
>>That is not to say that there are no uncaught infringers. There are, and we 
>>should all discourage that wherever and however we can.
>>
>>
>>> The idea of taking drastic action over something that may not even
>>> constitute a violation seems disproportionate.
>>That is unsettled case law.
>>
>>However, the action is not just taken over copyrights. The action would also 
>>be taken to prevent a commercial entity from exploiting internal insights 
>>they acquire from us using the site.
>>
>>Besides, it sends a strong message that their (github's) behaviour will 
>>result in users changing their ways.
>>
>>
>>> If we migrate off GitHub, what do we actually gain? We lose CI 
>>> infrastructure that works, we lose contributor familiarity, we lose 
>>> discoverability for new contributors, we lose issue and PR history, and we 
>>> solve nothing, because the code was already scraped, the mailing lists were 
>>> already indexed,
>>
>>We gain independence from a corporate entity controlling the infrastructure 
>>and data we generate in development.
>>
>>CI is not that difficult, but we'd need to rebuild. IMO a small price for 
>>what we gain.
>>
>>Commit history is in git. We can extract issues and PR data. You know, scrape 
>>it? ;-)
>>
>>Discoverability, hm... Use a search engine on the Internet: find linuxcnc.org 
>>-> link to development. How difficult is that? Not that we've been very 
>>active at promoting ourselves in the past 20 years or so...
>>
>>
>>> and the next platform will face the same reality.
>>The next platform will not necessarily have that same reality. That is why 
>>Codeberg is such a good option: they are a non-profit with an outspoken goal 
>>to support and further FOSS 
>>(https://docs.codeberg.org/getting-started/what-is-codeberg/).
>>
>>-- 
>>Greetings Bertho
>>
>>(disclaimers are disclaimed)
>>
>>
>>
>>_______________________________________________
>>Emc-developers mailing list
>>[email protected]
>>https://lists.sourceforge.net/lists/listinfo/emc-developers

