Re: [Emc-developers] Forced copilot - please disable

gene heskett Fri, 27 Mar 2026 06:12:05 -0700

On 3/27/26 07:12, Luca Toniolo wrote:

You said


"We gain independence from a corporate entity controlling the infrastructure and 
data we generate in development"

I understand the appeal of independence from a corporate entity.

But I think it's worth asking: what does that independence look like in 
practice? If we move to Codeberg, we're still relying on a third party, just a 
smaller, volunteer-run one. If we self-host, someone in our community needs to 
maintain servers, handle security, manage backups, and keep CI running. Right 
now Codeberg's CI only supports amd64, so our arm64 and multi-distro builds 
would need self-hosted runners that we'd have to provision and maintain 
ourselves.

As for the data we generate in development, it's an open source project. Our 
commits, issues, and discussions are public by design. That's the deal we made 
when we chose this model, and it's a good deal. What additional control would a 
different platform give us over data that is meant to be open?

I'm not dismissing the concern. I just want to make sure the cost of acting on 
it is proportionate to what we actually gain.

Best regards,
Luca



On March 27, 2026 6:47:24 PM GMT+08:00, Luca Toniolo <[email protected]> wrote:

Hi Bertho,

I think it’s important to clarify what "rebuilding" actually looks like in 
practice. Moving to Codeberg isn't just a matter of effort, it would be a significant 
technical downgrade. Their CI currently lacks native ARM support and runs on a single 
global queue, which would leave our infrastructure a shadow of what it is on GitHub.

We also have to consider discoverability. GitHub is the de-facto home for open 
source, and most new contributors find projects where the code already lives. 
Moving to a niche platform risks cutting off that pipeline.

If this is largely a response to the recent news about GitHub using interaction 
data for Copilot training by default, we could address that with a simple 
automated PR message reminding contributors to opt out in their settings. Is 
this specific policy change the main driver for you, or is there a more 
fundamental issue at play?

Best, Luca



On March 27, 2026 5:29:45 PM GMT+08:00, Bertho Stultiens <[email protected]> 
wrote:

On 3/27/26 9:27 AM, Luca Toniolo wrote:

Copilot doing statistical analysis on publicly available GPL code is, if 
anything, less than what the GPL already explicitly permits.

Yes, as long as you abide by the license.

But LLMs do much more than just statistical analysis. LLMs generate output from 
the training set and people are encouraged to use that output.
The problem is that LLMs are known to reproduce their input/training data. The 
problem is that they reproduce training/learned code and strip the GPL 
license from that code. That is the real problem.

The fact that we can't prevent these corporations from scraping and doing this 
is a fact of how the Internet works. However, the fact that they did it does 
not make it right or their use legal.

Mailing list archives have been indexed by Google, crawled by the Wayback 
Machine, scraped by researchers, and read by recruiters for as long as they've 
existed. Our commit messages, review comments, and design discussions have been 
public and searchable for years. That was true before Copilot, and it would 
remain true if we moved to GitLab, Codeberg, or a self-hosted Gitea instance 
tomorrow. None of these platforms prevent scraping.

It is not only about what is publicly visible on the site(s). It is about how 
the site is used and the process by which you do things.

The information that is available *inside* github about you and what you are 
doing is considerably more extensive than what can be viewed from the public record.

The announcement from github makes, in principle, any and all data subject to 
input into their LLMs. That I cannot accept and will seriously consider my 
options.

GPL enforcement, even in clear-cut cases of actual license violation, has 
historically been rare and difficult. The FSF and SFLC have pursued only the 
most egregious cases, and even those took years. LinuxCNC itself has never 
enforced the GPL against anyone.

The non-enforcement of copyright violations does _not_ make it alright to 
become an infringer or to condone copyright infringement. Besides, the cases 
that were enforced were victories for the GPL and made many an infringer think 
twice or back off.

That is not to say that there are no uncaught infringers. There are many, and 
we should all discourage that wherever and however we can.

The idea of taking drastic action over something that may not even
constitute a violation seems disproportionate.

That is unsettled case law.

However, the action is not just taken over copyrights. The action would also be 
taken to prevent a commercial entity from exploiting internal insights they 
acquire from us using the site.

Besides, it sends a strong message that their (github's) behaviour will result 
in users changing their ways.

If we migrate off GitHub, what do we actually gain? We lose CI infrastructure 
that works, we lose contributor familiarity, we lose discoverability for new 
contributors, we lose issue and PR history, and we solve nothing, because the 
code was already scraped, the mailing lists were already indexed,

We gain independence from a corporate entity controlling the infrastructure and 
data we generate in development.

CI is not that difficult, but we'd need to rebuild. IMO a small price for what 
we gain.

Commit history is in git. We can extract issues and PR data. You know, scrape 
it? ;-)

Discoverability, hm... Use a search engine on the Internet: find linuxcnc.org 
-> link to development. How difficult is that? Not that we've been very active 
at promoting ourselves in the past 20 years or so.

and the next platform will face the same reality.

The next platform will not necessarily have that same reality. That is why 
Codeberg is such a good option, they are a non-profit with an outspoken goal to 
support and further FOSS 
(https://docs.codeberg.org/getting-started/what-is-codeberg/).

--
Greetings Bertho

(disclaimers are disclaimed)

Here is where I'd have to disagree, Bertho. The fact, mentioned previously in this thread, that github is stripping the GPLv2 license before spitting out our code to the rest of the world s/b grounds for a legal action. That, as M$ knows well, will need deep pockets we don't have. Codeberg may be a good idea in 3 or 5 years as development proceeds but for our purposes NOW, not so much if it has no arm support.

Keep looking for a github workalike that honors the GPLv2, or sell the farm for .1 cents a section.



_______________________________________________
Emc-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/emc-developers


Cheers, Gene Heskett, CET.
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Don't poison our oceans, interdict drugs at the src.


