Coincidentally, this is discussed in a paper that will appear next week
on arXiv:
(and here for the impatient:
https://www.dropbox.com/scl/fi/2tjmdemfk4mfy7fxpm3vq/AIPolicy.pdf?rlkey=fx0it2h8v5r517t3u1lj2lye9&st=f2fjvkmd&dl=0)
AI Policy, Disclosure, and Human in the Loop: How Are Contribution
Guidelines Adapting to GenAI?
* Andre Hora
* Romain Robbes
/[...] This paper provides an initial empirical study to explore how
open source projects are adapting to GenAI contributions. We analyzed
1,000 popular GitHub repositories and identified 118 AI policies for
contributors. Our results show that (1) 78% of the AI policies allow
contributions generated with GenAI, while 22% explicitly discourage
their use; (2) 51% of the AI policies require the disclosure of
AI-generated contributions; and (3) 74% of the AI policies require a
human in the loop during contribution. [...]/
nicolas
On 2026-05-14 09:38, Aaron Wohl via Pharo-dev wrote:
Human sign-off would be a great system. Unfortunately, the current
setup requires humans to generate fixes for Pharo. The discussion is
to allow AI-generated proposed fixes. I don't understand the
copyright issue, so I will let someone who does explain why AI fixes
are a copyright problem.
The usual anti-AI copyright issues are 1) you can't copyright the work
of an AI and 2) AIs will always steal other people's work and lie about
where it came from.
----- Original message -----
From: Sven Van Caekenberghe <[email protected]>
To: Pharo Development List <[email protected]>
Cc: stephane ducasse <[email protected]>,
[email protected], Aaron Wohl <[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Thursday, May 14, 2026 2:52 AM
AI Coding Assistants — The Linux Kernel documentation
<https://docs.kernel.org/process/coding-assistants.html>
It is great that you did this and important. But each change has to be
validated by a capable VM developer, who takes responsibility.
On 14 May 2026, at 04:42, Aaron Wohl via Pharo-dev
<[email protected]> wrote:
It has been in the news that Anthropic delayed the release of Mythos
to invite groups to fix the AI's bugs. The news reports suggest that
the big open-source projects are accepting AI bug fixes (not just bug
reports). If the Linux kernel and Chrome are accepting AI code, how
can they do it when Pharo cannot?
https://www.linuxfoundation.org/blog/project-glasswing-gives-maintainers-advanced-ai-to-secure-open-source
I am not suggesting that AI code should be blindly accepted, but some of
the AI-reported bugs and fixes are fine. It is faster to have a human
review fixes than to write them.
----- Original message -----
From: stephane ducasse <[email protected]>
To: Aaron Wohl <[email protected]>
Cc: [email protected], Pharo-dev <[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Wednesday, May 13, 2026 3:26 PM
Thanks, we will check this.
On 13 May 2026, at 19:24, Aaron Wohl <[email protected]> wrote:
I had Claude create PRs for the current git main and for the version
12 you asked for.
I didn't submit them because they are AI-generated. I have no
interest in doing manual work that an AI can do. On copyright: if
not being able to copyright AI work is the issue, perhaps you could
copyright the collection (of source lines)?
For pharo-12 there are 93 bug fixes posted here.
https://www.awohl.com/pharo-bugs-2026-05-13/
----- Original message -----
From: stephane ducasse <[email protected]>
To: Aaron Wohl <[email protected]>
Cc: [email protected], Pharo Development List
<[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Wednesday, May 13, 2026 8:14 AM
Hi Aaron,
Today I discussed this with Guille (who was, and still is, a bit sick
and recovering) but did not have the opportunity to discuss it with
Pablo (who is on vacation).
We are interested in pull requests that improve the security and code
of the VM and plugins.
So your analyses are definitely worthwhile.
We would love to have PRs, and for us it would be great if you wrote
them rather than an AI, so that we can control the copyright concerns.
So thank you for your time and ideas.
Guille may contact you directly.
S (this week we have 3 working days, and next week we also have some
non-working days).
The month of May is a gruyere (full of holes).
On 12 May 2026, at 09:10, Aaron Wohl via Pharo-dev
<[email protected]> wrote:
I regularly have Claude.AI review my code (https://awohl.com), so I
pointed Claude at the Pharo-VM to see what it could find. A lot of
the issues it found are things like strcpy without bounds checks
on strings no one would ever make that big. However, it found
some issues that are always triggered.
Full list of issues
https://github.com/avwohl/iospharo/blob/main/docs/pharo-vm-code-reivew-2026-05-11.md
I did not make an AI-generated patch list. [email protected]
mentioned copyright concerns raised by [email protected] about
accepting AI code. I assume this is due to the inability to
copyright AI-generated work and to AI's tendency to steal others'
work and lie about its origins. However, a lot of the fixes are
so trivial, like off-by-one errors, that I don't know how much of
an issue it is. If you want me to PR fixes and/or tests that
trigger for the full list, or for the always-triggered list, let me know.
Many of the issues would only happen if someone were trying to
break things (MAXLEN symlinks, damaged image files). If one were of
that mindset, those could be called AI slop. However, here is a short
list of things that always cause issues in everyday operation. For
example, every plugin load damages heap memory, and the nightly
build of signed code disables SSL checks, so a simple DNS hack
could get malicious code signed.
Source: ~/pharo.md (review of /Users/wohl/esrc/pharo-vm @ pharo-10,
2026-05-11)
Filter: only items that fire in normal benign operation, not edge cases
requiring huge/crafted strings or attacker-chosen sizes.
================================================================
A. ALWAYS — MEMORY DAMAGE / UNDEFINED BEHAVIOR
================================================================
1. sqNamedPrims.c:56-57 — calloc(sizeof(ModuleEntry)+strlen(name))
then strcpy writes strlen+1; 1-byte heap overflow on every plugin
load. [#3.26]
2. ffi/callbacks/callbacks.c:14-32 — stack-allocated
CallbackInvocation registered in runner->callbackStack/global
queue; sig_longjmp exit leaves dangling stack pointer after every
same-thread callback. [#2.8]
3. ffi/typesPrimitives.c:174-188 — setHandler(receiver, structType)
stores pointer before failed()/ffi_get_struct_offsets checks; any
error path frees memory while receiver still holds the dangling
handle. [#2.6]
4. threadSafeQueue.c:113-137 — queue->first and node->element read
outside the mutex; lock-holder's free(node) leaves the other
consumer walking freed memory on every concurrent dequeue (hit by
every FFI workload). [#3.21]
5. SocketPluginImpl.c:1109-1123 — sqSocketDestroy frees PSP(s)
after sqSocketAbortConnection queues a closeHandler against pss;
AIO dispatch fires on freed memory. [#3.22]
6. ffi/utils.c:43-50 — readString returns an un-pinned image-memory
pointer; GC between strlen() and the caller's strcpy invalidates
the length and the address. [#5.27]
7. ffi/callbacks/callbacks.c:24-29 — runner->callbackStack chain
updated without any lock; reentrant callbacks from multiple threads
on the same Runner corrupt the linked list. [#5.4]
8. pathUtilities.c:233-237 — strrchr(name,'.') result stored in
fileExtension, but the NULL guard tests the unrelated `extension`
variable; strcmp(NULL,...) crashes on any directory entry without a
dot (e.g. "Makefile"). [#4.2]
9. pathUtilities.c:163 — first[strlen(first)-1] reads first[-1]
(one byte before the buffer) whenever first is "", reachable from
parameters.c:210-212 fallback. [#4.3]
10. externalPrimitives.c:57,66 — module path assembled in a
file-static moduleNameBuffer with no lock; concurrent loads
scribble each other's path and dlopen/LoadLibrary sees a torn
string. [#5.3]
11. debugUnix.c:88-95,122-162 — SIGSEGV/SIGBUS/SIGFPE handler calls
fopen/vfprintf/backtrace_symbols_fd/ctime_r/semaphore_wait (none
async-signal-safe) and uses SA_NODEFER so the handler can re-enter
itself on every crash. [#5.1]
12. debugUnix.c:123,144,154 — sigaction.sa_mask never initialized
via sigemptyset for term_handler_action and sigpipe_handler_action;
kernel reads uninitialized stack to decide what to mask on every
install. [#5.2]
13. debug.c:57-66 — glibc strerror_r returns a pointer that may not
write to the supplied buffer; caller prints the buffer
unconditionally, leaking uninitialized stack bytes on every error
path. [#5.7]
14. SocketPluginImpl.c:2494-2496 — sqSocket lastError stored in a
file-static, clobbered across sockets; every concurrent socket
failure overwrites another socket's error state. [#5.17]
15. aioWin.c:457/465 — heap-interior alias into allHandles
transiently equals a freshly malloc'd region; any early return
between the two assignments leaves a free-of-interior or double-use
hazard. [#1/§9 PARTIAL]
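For readers who want to see the shape of the simplest fixes, here is a minimal C sketch of the patterns behind items A.1, A.8 and A.9. It is not the VM's code. The struct layout, the function names and the '/' separator check are assumptions made for illustration. Whether the overflow reported in A.1 is real depends on how the actual ModuleEntry declares its name field, which the list does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the VM's ModuleEntry; the real layout may differ. */
typedef struct module_entry {
    struct module_entry *next;
    void *handle;
    char name[];              /* flexible array member: not counted by sizeof */
} module_entry;

/* Item A.1 pattern: reserve strlen(name) + 1 bytes so the NUL terminator fits. */
module_entry *module_entry_new(const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);
    module_entry *entry = calloc(1, sizeof *entry + len + 1);
    if (entry == NULL)
        return NULL;
    memcpy(entry->name, name, len + 1);   /* copies the terminator as well */
    return entry;
}

/* Item A.8 pattern: test the pointer strrchr returned, not another variable. */
bool has_extension(const char *file_name, const char *wanted)
{
    const char *dot = strrchr(file_name, '.');
    if (dot == NULL)
        return false;         /* e.g. "Makefile" has no dot at all */
    return strcmp(dot, wanted) == 0;
}

/* Item A.9 pattern: rule out the empty string before indexing len - 1.
 * The '/' comparison is an assumed example of what the caller checks. */
bool ends_with_separator(const char *path)
{
    size_t len = strlen(path);
    return len > 0 && path[len - 1] == '/';
}
```

With these helpers, has_extension("Makefile", ".c") returns false instead of passing NULL to strcmp, and ends_with_separator("") returns false without reading the byte before the buffer.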
================================================================
B. ALWAYS — SECURITY / CORRECTNESS (NOT MEMORY DAMAGE)
================================================================
TLS / SqueakSSL
---------------
1. sqUnixSSL.c:89-143 — SSL_CTX_set_verify never called;
SSL_get_verify_result returns X509_V_OK by default → no certificate
validation on any TLS connection. [#2.9]
2. sqUnixSSL.c:102-107 — SSLv23_method with only SSLv2/v3 disabled;
TLS 1.0 and 1.1 still accepted on every handshake. [#2.9 / #6.3]
3. sqUnixSSL.c:115 — cipher list "!ADH:HIGH:MEDIUM:@STRENGTH"
permits MEDIUM ciphers. [#2.9]
4. sqUnixSSL.c:107 — no SSL_OP_NO_COMPRESSION (CRIME), no
SSL_OP_NO_RENEGOTIATION, no SSL_OP_CIPHER_SERVER_PREFERENCE. [#6.3]
5. sqWin32SSL.c:269-275,215,349-353 — sqExtractPeerName copies
serverName verbatim into peerName instead of extracting the cert
subject; epp.pwszServerName=NULL disables SChannel's hostname
check, so image-side peerName==serverName is meaningless on every
connection. [#3.19]
6. sqMacSSL.c:154-201,262-272,363-383 —
kSSLSessionOptionBreakOnServerAuth disables auto verification;
manual SecTrustEvaluate runs with no SSL policy carrying the
hostname, so hostname is never checked. [#3.20]
7. sqWin32SSL.c:216-218 — SP_PROT_TLS1_0/1_1/1_2 enabled for both
client and server roles. [#6.1]
8. sqMacSSL.c:154-164 — SSLSetProtocolVersionMin(ctx,
kTLSProtocol1) sets minimum to TLS 1.0. [#6.2]
VM internals
------------
9. memoryUnix.c:66-89,109-111 — JIT pages mmap'd
PROT_READ|PROT_WRITE|PROT_EXEC permanently;
sqMakeMemoryExecutableFromTo / NotExecutable hooks are commented
out, so W^X is defeated on Linux/FreeBSD. [#5.24]
10. debug.c:45 — error(char*) forwards the argument as the format
string into the vfprintf-style logger; exported API contract leaks
a %n/%s primitive to any future caller-controlled string. [#4.10]
11. ffi/typesPrimitives.c:170-172 — getHandler() returns the first
slot of any oop with no class tag check; libffi consumes
attacker-shaped ffi_type fields on every struct cif build, giving a
controlled-dispatch primitive to anyone who can register an FFI
struct. [#2.5]
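Item 10 is the classic format-string pattern, and the usual remedy is to pass caller text as data through a constant "%s" format. The sketch below assumes a hypothetical vfprintf-style logger that writes into a buffer so the result can be inspected. The names log_message, report_error and last_message are illustrative and are not the VM's API.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Demo sink so the formatted result can be inspected (hypothetical). */
char last_message[256];

/* Hypothetical printf-style logger standing in for the VM's logging call. */
void log_message(const char *format, ...)
{
    va_list args;
    va_start(args, format);
    vsnprintf(last_message, sizeof last_message, format, args);
    va_end(args);
}

/* The caller's text is an argument to "%s", never the format itself,
 * so any '%' sequences inside it are printed literally. */
void report_error(const char *message)
{
    log_message("%s", message);
}
```

Calling report_error("disk 100%n full") stores the text unchanged, whereas forwarding the same string as the format would make the logger interpret %n.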
Build / supply chain (every build / every CI run)
-------------------------------------------------
12. Jenkinsfile:84,249 — fetch-and-execute installer via `wget … |
bash` over plain HTTP. [#7]
13. scripts/runTests.sh:31 — `wget -O - https://get.pharo.org/64/80
| bash`, executed from PR workflows. [#7]
14. scripts/installCygwin.ps1:7-9 — Cygwin installer + mirror
retrieved over plain HTTP. [#7]
15. cmake/importLibFFI.cmake / importLibGit2.cmake /
importSDL2.cmake — dependencies pinned to mutable git tags, no
commit-SHA. [#7]
16. macros.cmake:69-103 + every cmake/import*.cmake using
files.pharo.org — DownloadProject calls omit URL_HASH for libgit2,
libssh2, openssl, zlib, SDL2, cairo, pixman, libpng, freetype,
fontconfig, harfbuzz, gcc-runtime. [#7]
17. cmake/importFreetype2.cmake:47-49 — direct savannah.gnu.org
download with no URL_HASH. [#7]
18. docker/ubuntu-arm64/Dockerfile,
docker/debian10-armv7/Dockerfile — base images unpinned (no @sha256
digest). [#7]
19. Jenkinsfile:97-403 — every upload uses `scp -o
StrictHostKeyChecking=no` against files.pharo.org. [#7]
20. Jenkinsfile:138 + cmake/sign.cmake:8-11 — SIGN_CERT_PASSWORD
passed through environment under a broad withCredentials() block. [#7]
21. .github/workflows/continuous-integration-workflow.yaml:2 — `on:
[push, pull_request]` with no `permissions:` block; fork PRs can
edit runTests.sh and execute with the workflow GITHUB_TOKEN scope. [#7]
22. .github/workflows/...:11,14,68 — EOL runners (ubuntu-18.04,
windows-2016) and EOL actions (checkout@v1, upload-artifact@v1). [#7]
23. cmake/packaging.cmake:92 — CPACK_PACKAGE_CHECKSUM "SHA1". [#7]
24. scripts/installCygwin.ps1:35-48 — `cygwin -q` suppresses
signature warnings during install. [#7]
25. CMakeLists.txt:206,266-296 + cmake/Linux.cmake:1 — no
-D_FORTIFY_SOURCE=2, no -fstack-protector-strong, no -fPIE/-pie, no
-Wformat-security, no -Wl,-z,relro / -z,now / -z,noexecstack on
Linux release builds. [#7]
26. CMakeLists.txt:206,266-296 — -Wno-int-conversion and
-Wno-pointer-sign actively silenced; both classes of warning catch
real bugs. [#7]
27. cmake/Linux.cmake:1 — Linux rpath set to "." (relative to CWD)
instead of "$ORIGIN". [#7]
28. CMakeLists.txt:206 — Windows/Cygwin builds lack /GS, /guard:cf,
/DYNAMICBASE, /NXCOMPAT. [#7]
--
Nicolas Anquetil
Evref team -- Inria Lille