Coincidentally, this is discussed in a paper that will appear next week on arXiv (for the impatient, it is available here: https://www.dropbox.com/scl/fi/2tjmdemfk4mfy7fxpm3vq/AIPolicy.pdf?rlkey=fx0it2h8v5r517t3u1lj2lye9&st=f2fjvkmd&dl=0):


   AI Policy, Disclosure, and Human in the Loop: How Are Contribution
   Guidelines Adapting to GenAI?

 * Andre Hora
 * Romain Robbes

/[...] This paper provides an initial empirical study to explore how open source projects are adapting to GenAI contributions. We analyzed 1,000 popular GitHub repositories and identified 118 AI policies for contributors. Our results show that (1) 78% of the AI policies allow contributions generated with GenAI, while 22% explicitly discourage their use; (2) 51% of the AI policies require the disclosure of AI-generated contributions; and (3) 74% of the AI policies require a human in the loop during contribution. [...]/

nicolas

On 2026-05-14 09:38, Aaron Wohl via Pharo-dev wrote:
Human sign-off would be a great system.  Unfortunately, the current setup requires humans to generate fixes for Pharo.   The discussion is to allow AI-generated proposed fixes.  I don't understand the copyright issue, so I will let someone who does explain why AI fixes are a copyright problem.

The usual anti-AI copyright issues are 1) you can't copyright the work of an AI and 2) AIs will always steal other people's work and lie about where it came from.

----- Original message -----
From: Sven Van Caekenberghe <[email protected]>
To: Pharo Development List <[email protected]>
Cc: stephane ducasse <[email protected]>, [email protected], Aaron Wohl <[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Thursday, May 14, 2026 2:52 AM

AI Coding Assistants — The Linux Kernel documentation <https://docs.kernel.org/process/coding-assistants.html>


It is great, and important, that you did this. But each change has to be validated by a capable VM developer, who takes responsibility.

On 14 May 2026, at 04:42, Aaron Wohl via Pharo-dev <[email protected]> wrote:

It has been in the news that Anthropic delayed the release of Mythos to invite groups to fix the AI's bugs. The news reports suggest that the big open-source projects are accepting AI bug fixes (not just bug reports). If the Linux kernel and Chrome are accepting AI code, how can they do it and Pharo cannot?
https://www.linuxfoundation.org/blog/project-glasswing-gives-maintainers-advanced-ai-to-secure-open-source
I am not suggesting that AI code should be blindly accepted, but some of the AI-found bugs and fixes are fine. It is faster for a human to review fixes than to write them.

----- Original message -----
From: stephane ducasse <[email protected]>
To: Aaron Wohl <[email protected]>
Cc: [email protected], Pharo-dev <[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Wednesday, May 13, 2026 3:26 PM

Thanks, we will check this.


On 13 May 2026, at 19:24, Aaron Wohl <[email protected]> wrote:

I had Claude create PRs for the current git main and for the version 12 you asked for. I didn't submit them because they are AI-generated.  I have no interest in doing manual work that an AI can do.  On copyright: if not being able to copyright AI work is the issue, perhaps you could copyright the collection (of source lines)?

For pharo-12 there are 93 bug fixes posted here.
https://www.awohl.com/pharo-bugs-2026-05-13/

----- Original message -----
From: stephane ducasse <[email protected]>
To: Aaron Wohl <[email protected]>
Cc: [email protected], Pharo Development List <[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Wednesday, May 13, 2026 8:14 AM

Hi aaron

Today I discussed with Guille (who was, and still is, a bit sick and recovering) but did not have the opportunity to discuss with Pablo (who is on vacation).

We are interested in pull requests that improve the security and code of the VM and plugins.
So your analyses are definitely worthwhile.

We would love to have PRs, and for us it is best if you write them rather than an AI, so that we can control the copyright concerns.

So thank you for your time and idea.
Guille may contact you directly.

S (this week we have 3 working days, and next week we also have some non-working days).
The month of May is a gruyère.


On 12 May 2026, at 09:10, Aaron Wohl via Pharo-dev <[email protected]> wrote:

I regularly have Claude.AI review my code (https://awohl.com). I pointed Claude at the Pharo-VM to see what it could find. A lot of the issues it found are things like strcpy without bounds checks, with strings no one would ever make that big. However, it found some issues that are always triggered.

Full list of issues
https://github.com/avwohl/iospharo/blob/main/docs/pharo-vm-code-reivew-2026-05-11.md

I did not make an AI-generated patch list. [email protected] mentioned copyright concerns raised by [email protected] about accepting AI code. I assume this is due to the inability to copyright AI-generated work and to AI's tendency to steal others' work and lie about its origins. However, a lot of the fixes are so trivial, like off-by-one errors, that I don't know how much of an issue it is. If you want me to PR fixes and/or tests that trigger for the full list, or for the always list, let me know.

Many of the issues would only happen if someone were trying to break things (MAXLEN symlinks, damaged image files). If one were of that mindset, it could be called AI slop. However, here is a short list of things that always cause issues in everyday operation: for example, every plugin load damages heap memory, and the nightly build of signed code disables SSL checks, so a simple DNS hack could get malicious code signed.

Source: ~/pharo.md (review of /Users/wohl/esrc/pharo-vm @ pharo-10, 2026-05-11)
Filter: only items that fire in normal benign operation, not edge cases
requiring huge/crafted strings or attacker-chosen sizes.

================================================================
A. ALWAYS — MEMORY DAMAGE / UNDEFINED BEHAVIOR
================================================================

1. sqNamedPrims.c:56-57 — calloc(sizeof(ModuleEntry)+strlen(name)) then strcpy writes strlen+1; 1-byte heap overflow on every plugin load. [#3.26]
2. ffi/callbacks/callbacks.c:14-32 — stack-allocated CallbackInvocation registered in runner->callbackStack/global queue; sig_longjmp exit leaves dangling stack pointer after every same-thread callback. [#2.8]
3. ffi/typesPrimitives.c:174-188 — setHandler(receiver, structType) stores pointer before failed()/ffi_get_struct_offsets checks; any error path frees memory while receiver still holds the dangling handle. [#2.6]
4. threadSafeQueue.c:113-137 — queue->first and node->element read outside the mutex; lock-holder's free(node) leaves the other consumer walking freed memory on every concurrent dequeue (hit by every FFI workload). [#3.21]
5. SocketPluginImpl.c:1109-1123 — sqSocketDestroy frees PSP(s) after sqSocketAbortConnection queues a closeHandler against pss; AIO dispatch fires on freed memory. [#3.22]
6. ffi/utils.c:43-50 — readString returns an un-pinned image-memory pointer; GC between strlen() and the caller's strcpy invalidates the length and the address. [#5.27]
7. ffi/callbacks/callbacks.c:24-29 — runner->callbackStack chain updated without any lock; reentrant callbacks from multiple threads on the same Runner corrupt the linked list. [#5.4]
8. pathUtilities.c:233-237 — strrchr(name,'.') result stored in fileExtension, but the NULL guard tests the unrelated `extension` variable; strcmp(NULL,...) crashes on any directory entry without a dot (e.g. "Makefile"). [#4.2]
9. pathUtilities.c:163 — first[strlen(first)-1] reads first[-1] (one byte before the buffer) whenever first is "", reachable from parameters.c:210-212 fallback. [#4.3]
10. externalPrimitives.c:57,66 — module path assembled in a file-static moduleNameBuffer with no lock; concurrent loads scribble each other's path and dlopen/LoadLibrary sees a torn string. [#5.3]
11. debugUnix.c:88-95,122-162 — SIGSEGV/SIGBUS/SIGFPE handler calls fopen/vfprintf/backtrace_symbols_fd/ctime_r/semaphore_wait (none async-signal-safe) and uses SA_NODEFER so the handler can re-enter itself on every crash. [#5.1]
12. debugUnix.c:123,144,154 — sigaction.sa_mask never initialized via sigemptyset for term_handler_action and sigpipe_handler_action; kernel reads uninitialized stack to decide what to mask on every install. [#5.2]
13. debug.c:57-66 — glibc strerror_r returns a pointer that may not write to the supplied buffer; caller prints the buffer unconditionally, leaking uninitialized stack bytes on every error path. [#5.7]
14. SocketPluginImpl.c:2494-2496 — sqSocket lastError stored in a file-static, clobbered across sockets; every concurrent socket failure overwrites another socket's error state. [#5.17]
15. aioWin.c:457/465 — heap-interior alias into allHandles transiently equals a freshly malloc'd region; any early return between the two assignments leaves a free-of-interior or double-use hazard. [#1/§9 PARTIAL]

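Item A.1 describes an off-by-one allocation: the buffer is sized for the characters of the name but not for the terminating NUL that strcpy also writes. Below is a minimal sketch of the corrected pattern. It assumes the name is stored inline at the end of the struct through a flexible array member; the struct layout and the function name module_entry_new are hypothetical and not taken from the VM sources. If the real struct declares the field as name[1], sizeof already covers one byte of the name, so the finding should be rechecked against the actual declaration before patching.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the VM's ModuleEntry; the real layout may differ. */
typedef struct ModuleEntry {
    struct ModuleEntry *next;
    void *handle;
    char name[];            /* flexible array member: adds no bytes to sizeof */
} ModuleEntry;

/* Allocate an entry with room for the name and its terminating NUL. */
ModuleEntry *module_entry_new(const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);

    /* len + 1: strcpy/memcpy of a C string writes len characters plus '\0'. */
    ModuleEntry *entry = calloc(1, sizeof(ModuleEntry) + len + 1);
    if (entry == NULL)
        return NULL;

    memcpy(entry->name, name, len + 1);
    return entry;
}
```

A caller would use it as `ModuleEntry *e = module_entry_new("FilePlugin");` and release it with `free(e)`.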

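Items A.8 and A.9 are both missing guards on string lookups: one tests the wrong variable for NULL after strrchr, the other indexes the last character of a possibly empty string. A minimal sketch of the guarded versions follows. The function names has_extension and ends_with_separator are hypothetical helpers written for illustration, not functions from pathUtilities.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if name's last ".xxx" suffix equals wanted (e.g. ".so").
 * The NULL check is made on the strrchr result itself, so a name
 * without any dot ("Makefile") is rejected instead of crashing strcmp. */
bool has_extension(const char *name, const char *wanted)
{
    assert(name != NULL && wanted != NULL);
    const char *found = strrchr(name, '.');
    if (found == NULL)
        return false;
    return strcmp(found, wanted) == 0;
}

/* True if path is non-empty and its last character is '/'.
 * The length is checked first, so "" never leads to reading path[-1]. */
bool ends_with_separator(const char *path)
{
    assert(path != NULL);
    size_t len = strlen(path);
    return len > 0 && path[len - 1] == '/';
}
```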
================================================================
B. ALWAYS — SECURITY / CORRECTNESS (NOT MEMORY DAMAGE)
================================================================

TLS / SqueakSSL
---------------
1. sqUnixSSL.c:89-143 — SSL_CTX_set_verify never called; SSL_get_verify_result returns X509_V_OK by default → no certificate validation on any TLS connection. [#2.9]
2. sqUnixSSL.c:102-107 — SSLv23_method with only SSLv2/v3 disabled; TLS 1.0 and 1.1 still accepted on every handshake. [#2.9 / #6.3]
3. sqUnixSSL.c:115 — cipher list "!ADH:HIGH:MEDIUM:@STRENGTH" permits MEDIUM ciphers. [#2.9]
4. sqUnixSSL.c:107 — no SSL_OP_NO_COMPRESSION (CRIME), no SSL_OP_NO_RENEGOTIATION, no SSL_OP_CIPHER_SERVER_PREFERENCE. [#6.3]
5. sqWin32SSL.c:269-275,215,349-353 — sqExtractPeerName copies serverName verbatim into peerName instead of extracting the cert subject; epp.pwszServerName=NULL disables SChannel's hostname check, so image-side peerName==serverName is meaningless on every connection. [#3.19]
6. sqMacSSL.c:154-201,262-272,363-383 — kSSLSessionOptionBreakOnServerAuth disables auto verification; manual SecTrustEvaluate runs with no SSL policy carrying the hostname, so hostname is never checked. [#3.20]
7. sqWin32SSL.c:216-218 — SP_PROT_TLS1_0/1_1/1_2 enabled for both client and server roles. [#6.1]
8. sqMacSSL.c:154-164 — SSLSetProtocolVersionMin(ctx, kTLSProtocol1) sets minimum to TLS 1.0. [#6.2]

VM internals
------------
9. memoryUnix.c:66-89,109-111 — JIT pages mmap'd PROT_READ|PROT_WRITE|PROT_EXEC permanently; sqMakeMemoryExecutableFromTo / NotExecutable hooks are commented out, so W^X is defeated on Linux/FreeBSD. [#5.24]
10. debug.c:45 — error(char*) forwards the argument as the format string into the vfprintf-style logger; exported API contract leaks a %n/%s primitive to any future caller-controlled string. [#4.10]
11. ffi/typesPrimitives.c:170-172 — getHandler() returns the first slot of any oop with no class tag check; libffi consumes attacker-shaped ffi_type fields on every struct cif build, giving a controlled-dispatch primitive to anyone who can register an FFI struct. [#2.5]

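Item B.10 is the usual format-string hazard: text supplied by a caller is used as the format argument of a printf-style function. The common remedy is to pass a fixed "%s" format and hand the text over as data. A minimal sketch, assuming a printf-style logger; log_message and report_error are hypothetical names standing in for the VM's logger and its error(char*) entry point, and the logger here writes into a caller-supplied buffer only so the behaviour is easy to check.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical printf-style logger; formats into out instead of a log file. */
int log_message(char *out, size_t out_size, const char *format, ...)
{
    va_list args;
    va_start(args, format);
    int written = vsnprintf(out, out_size, format, args);
    va_end(args);
    return written;
}

/* Forward the caller's text as an argument, never as the format string.
 * Any '%' sequences inside message are copied literally, not interpreted. */
int report_error(char *out, size_t out_size, const char *message)
{
    assert(message != NULL);
    return log_message(out, out_size, "%s", message);
}
```

With this shape, a message such as "disk 100%s full %n" is logged verbatim rather than being parsed for conversion specifiers.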
Build / supply chain (every build / every CI run)
-------------------------------------------------
12. Jenkinsfile:84,249 — fetch-and-execute installer via `wget … | bash` over plain HTTP. [#7]
13. scripts/runTests.sh:31 — `wget -O - https://get.pharo.org/64/80 | bash`, executed from PR workflows. [#7]
14. scripts/installCygwin.ps1:7-9 — Cygwin installer + mirror retrieved over plain HTTP. [#7]
15. cmake/importLibFFI.cmake / importLibGit2.cmake / importSDL2.cmake — dependencies pinned to mutable git tags, no commit-SHA. [#7]
16. macros.cmake:69-103 + every cmake/import*.cmake using files.pharo.org — DownloadProject calls omit URL_HASH for libgit2, libssh2, openssl, zlib, SDL2, cairo, pixman, libpng, freetype, fontconfig, harfbuzz, gcc-runtime. [#7]
17. cmake/importFreetype2.cmake:47-49 — direct savannah.gnu.org download with no URL_HASH. [#7]
18. docker/ubuntu-arm64/Dockerfile, docker/debian10-armv7/Dockerfile — base images unpinned (no @sha256 digest). [#7]
19. Jenkinsfile:97-403 — every upload uses `scp -o StrictHostKeyChecking=no` against files.pharo.org. [#7]
20. Jenkinsfile:138 + cmake/sign.cmake:8-11 — SIGN_CERT_PASSWORD passed through environment under a broad withCredentials() block. [#7]
21. .github/workflows/continuous-integration-workflow.yaml:2 — `on: [push, pull_request]` with no `permissions:` block; fork PRs can edit runTests.sh and execute with the workflow GITHUB_TOKEN scope. [#7]
22. .github/workflows/...:11,14,68 — EOL runners (ubuntu-18.04, windows-2016) and EOL actions (checkout@v1, upload-artifact@v1). [#7]
23. cmake/packaging.cmake:92 — CPACK_PACKAGE_CHECKSUM "SHA1". [#7]
24. scripts/installCygwin.ps1:35-48 — `cygwin -q` suppresses signature warnings during install. [#7]
25. CMakeLists.txt:206,266-296 + cmake/Linux.cmake:1 — no -D_FORTIFY_SOURCE=2, no -fstack-protector-strong, no -fPIE/-pie, no -Wformat-security, no -Wl,-z,relro / -z,now / -z,noexecstack on Linux release builds. [#7]
26. CMakeLists.txt:206,266-296 — -Wno-int-conversion and -Wno-pointer-sign actively silenced; both classes of warning catch real bugs. [#7]
27. cmake/Linux.cmake:1 — Linux rpath set to "." (relative to CWD) instead of "$ORIGIN". [#7]
28. CMakeLists.txt:206 — Windows/Cygwin builds lack /GS, /guard:cf, /DYNAMICBASE, /NXCOMPAT. [#7]

--
Nicolas Anquetil
Evref team -- Inria Lille
