[Pharo-dev] Re: claude.ai code review of pharo-v

Nicolas Anquetil Sat, 16 May 2026 01:37:18 -0700

Coincidentally, this is discussed in a paper that will appear next week on arXiv (and here for the impatient: https://www.dropbox.com/scl/fi/2tjmdemfk4mfy7fxpm3vq/AIPolicy.pdf?rlkey=fx0it2h8v5r517t3u1lj2lye9&st=f2fjvkmd&dl=0):



   AI Policy, Disclosure, and Human in the Loop: How Are Contribution
   Guidelines Adapting to GenAI?

 * Andre Hora
 * Romain Robbes

/[...] This paper provides an initial empirical study to explore how open source projects are adapting to GenAI contributions. We analyzed 1,000 popular GitHub repositories and identified 118 AI policies for contributors. Our results show that (1) 78% of the AI policies allow contributions generated with GenAI, while 22% explicitly discourage their use; (2) 51% of the AI policies require the disclosure of AI-generated contributions; and (3) 74% of the AI policies require a human in the loop during contribution. [...]/


nicolas

On 2026-05-14 09:38, Aaron Wohl via Pharo-dev wrote:

Human sign-off would be a great system. Unfortunately, the current setup requires humans to generate fixes for Pharo. The discussion is to allow AI-generated proposed fixes. I don't understand the copyright issue, so I will let someone who does explain why AI fixes are a copyright problem.
The usual anti-AI copyright issues are 1) you can't copyright the work of an AI and 2) AIs will always steal other people's work and lie about where it came from.
----- Original message -----
From: Sven Van Caekenberghe <[email protected]>
To: Pharo Development List <[email protected]>
Cc: stephane ducasse <[email protected]>, [email protected], Aaron Wohl <[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Thursday, May 14, 2026 2:52 AM
AI Coding Assistants — The Linux Kernel documentation
<https://docs.kernel.org/process/coding-assistants.html>
It is great that you did this and important. But each change has to be validated by a capable VM developer, who takes responsibility.
On 14 May 2026, at 04:42, Aaron Wohl via Pharo-dev <[email protected]> wrote:
It has been in the news that Anthropic delayed the release of Mythos to invite groups to fix the AI's bugs. The news reports suggest that the big open-source projects are accepting AI bug fixes (not just bug reports). If the Linux kernel and Chrome are accepting AI code, how can they do it and Pharo cannot?
https://www.linuxfoundation.org/blog/project-glasswing-gives-maintainers-advanced-ai-to-secure-open-source
I am not suggesting that AI code should be blindly accepted; some of the AI bugs and fixes are fine. It is faster to have a human review than to generate fixes.
----- Original message -----
From: stephane ducasse <[email protected]>
To: Aaron Wohl <[email protected]>
Cc: [email protected], Pharo-dev <[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Wednesday, May 13, 2026 3:26 PM

Thanks, we will check this.
On 13 May 2026, at 19:24, Aaron Wohl <[email protected]> wrote:
I had Claude create PRs for the current git main and for the version 12 you asked for. I didn't submit them because they are AI-generated. I have no interest in doing manual work that an AI can do. On copyright: if not being able to copyright AI work is the issue, perhaps you could copyright the collection (of source lines)?
For pharo-12 there are 93 bug fixes posted here.
https://www.awohl.com/pharo-bugs-2026-05-13/

----- Original message -----
From: stephane ducasse <[email protected]>
To: Aaron Wohl <[email protected]>
Cc: [email protected], Pharo Development List <[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Wednesday, May 13, 2026 8:14 AM

Hi Aaron
Today I discussed with Guille (who was, and still is, a bit sick and recovering) and did not have the opportunity to discuss with Pablo (who is on vacation).
We are interested in pull requests that improve the security and code of the VM and plugins.
So your analyses are definitely worthwhile.
We would love to have PRs, and for us it would be great if you wrote them rather than an AI, so that we can control the copyright concerns.
So thank you for your time and idea.
Guille may contact you directly.
S (this week we have 3 working days) and next week we also have some non-working days.
The month of May is a gruyere.
On 12 May 2026, at 09:10, Aaron Wohl via Pharo-dev <[email protected]> wrote:
I regularly have Claude.AI review my code (https://awohl.com). I pointed Claude at the Pharo-VM to see what it could find. A lot of the issues it found are things like strcpy without bounds checks on strings no one would ever make that big. However, it found some issues that are always triggered.
Full list of issues
https://github.com/avwohl/iospharo/blob/main/docs/pharo-vm-code-reivew-2026-05-11.md
I did not make an AI-generated patch list. [email protected] mentioned copyright concerns by [email protected] about accepting AI code. I assume this is due to the inability to copyright AI-generated work and to AI's tendency to steal others' work and lie about its origins. However, a lot of the fixes are so trivial, like off-by-one errors, that I don't know how much of an issue it is. If you want me to PR fixes and/or tests that trigger for the full list, or the always list, let me know.
Many of the issues would only happen if someone were trying to break things (MAXLEN symlinks, damaged image files). If one were of that mindset, it could be called AI slop. However, here is a short list of things that always cause issues in everyday operation: for example, every plugin loaded damages heap memory, or the nightly build of signed code disables SSL checks, so a simple DNS hack could get malicious code signed.
Source: ~/pharo.md (review of /Users/wohl/esrc/pharo-vm @ pharo-10, 2026-05-11)
Filter: only items that fire in normal benign operation, not edge cases
requiring huge/crafted strings or attacker-chosen sizes.

================================================================
A. ALWAYS — MEMORY DAMAGE / UNDEFINED BEHAVIOR
================================================================
1. sqNamedPrims.c:56-57 — calloc(sizeof(ModuleEntry)+strlen(name)) then strcpy writes strlen+1; 1-byte heap overflow on every plugin load. [#3.26]
2. ffi/callbacks/callbacks.c:14-32 — stack-allocated CallbackInvocation registered in runner->callbackStack/global queue; sig_longjmp exit leaves dangling stack pointer after every same-thread callback. [#2.8]
3. ffi/typesPrimitives.c:174-188 — setHandler(receiver, structType) stores pointer before failed()/ffi_get_struct_offsets checks; any error path frees memory while receiver still holds the dangling handle. [#2.6]
4. threadSafeQueue.c:113-137 — queue->first and node->element read outside the mutex; lock-holder's free(node) leaves the other consumer walking freed memory on every concurrent dequeue (hit by every FFI workload). [#3.21]
5. SocketPluginImpl.c:1109-1123 — sqSocketDestroy frees PSP(s) after sqSocketAbortConnection queues a closeHandler against pss; AIO dispatch fires on freed memory. [#3.22]
6. ffi/utils.c:43-50 — readString returns an un-pinned image-memory pointer; GC between strlen() and the caller's strcpy invalidates the length and the address. [#5.27]
7. ffi/callbacks/callbacks.c:24-29 — runner->callbackStack chain updated without any lock; reentrant callbacks from multiple threads on the same Runner corrupt the linked list. [#5.4]
8. pathUtilities.c:233-237 — strrchr(name,'.') result stored in fileExtension, but the NULL guard tests the unrelated `extension` variable; strcmp(NULL,...) crashes on any directory entry without a dot (e.g. "Makefile"). [#4.2]
9. pathUtilities.c:163 — first[strlen(first)-1] reads first[-1] (one byte before the buffer) whenever first is "", reachable from parameters.c:210-212 fallback. [#4.3]
10. externalPrimitives.c:57,66 — module path assembled in a file-static moduleNameBuffer with no lock; concurrent loads scribble each other's path and dlopen/LoadLibrary sees a torn string. [#5.3]
11. debugUnix.c:88-95,122-162 — SIGSEGV/SIGBUS/SIGFPE handler calls fopen/vfprintf/backtrace_symbols_fd/ctime_r/semaphore_wait (none async-signal-safe) and uses SA_NODEFER so the handler can re-enter itself on every crash. [#5.1]
12. debugUnix.c:123,144,154 — sigaction.sa_mask never initialized via sigemptyset for term_handler_action and sigpipe_handler_action; kernel reads uninitialized stack to decide what to mask on every install. [#5.2]
13. debug.c:57-66 — glibc strerror_r returns a pointer that may not write to the supplied buffer; caller prints the buffer unconditionally, leaking uninitialized stack bytes on every error path. [#5.7]
14. SocketPluginImpl.c:2494-2496 — sqSocket lastError stored in a file-static, clobbered across sockets; every concurrent socket failure overwrites another socket's error state. [#5.17]
15. aioWin.c:457/465 — heap-interior alias into allHandles transiently equals a freshly malloc'd region; any early return between the two assignments leaves a free-of-interior or double-use hazard. [#1/§9 PARTIAL]
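
Three of the simpler patterns in list A (items 1, 8 and 9) can be sketched in isolation. The block below is only an illustration under stated assumptions: DemoEntry and the demo_* helpers are hypothetical names, not the VM's code, and the real ModuleEntry layout is not quoted in the report.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the VM's module record. The real ModuleEntry
 * layout is not quoted in the report, so this struct is an assumption. */
typedef struct DemoEntry {
    struct DemoEntry *next;
    char name[];  /* flexible array member: adds no bytes to sizeof */
} DemoEntry;

/* Item 1 pattern: reserve strlen(name) + 1 so the NUL terminator fits. */
static DemoEntry *demo_entry_new(const char *name) {
    size_t len = strlen(name);
    DemoEntry *entry = calloc(1, sizeof(DemoEntry) + len + 1);
    if (entry == NULL) {
        return NULL;
    }
    memcpy(entry->name, name, len + 1);  /* copies the terminator too */
    return entry;
}

/* Item 8 pattern: test the pointer strrchr returned before using it.
 * Returns 1 when the text from the last '.' onward equals `wanted`
 * (e.g. ".so"), otherwise 0. */
static int demo_has_extension(const char *name, const char *wanted) {
    const char *dot = strrchr(name, '.');
    if (dot == NULL) {
        return 0;  /* "Makefile" has no dot, so there is nothing to compare */
    }
    return strcmp(dot, wanted) == 0;
}

/* Item 9 pattern: check the length before indexing the last byte.
 * Returns 1 when `path` is non-empty and ends in '/', otherwise 0. */
static int demo_ends_with_slash(const char *path) {
    size_t len = strlen(path);
    return len > 0 && path[len - 1] == '/';
}
```

One caveat on item 1: if the real struct ends in a one-element array such as `char name[1]` rather than a flexible array member, `sizeof` already counts one byte for the terminator, so the arithmetic should be checked against the actual declaration before patching.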
================================================================
B. ALWAYS — SECURITY / CORRECTNESS (NOT MEMORY DAMAGE)
================================================================

TLS / SqueakSSL
---------------
1. sqUnixSSL.c:89-143 — SSL_CTX_set_verify never called; SSL_get_verify_result returns X509_V_OK by default → no certificate validation on any TLS connection. [#2.9]
2. sqUnixSSL.c:102-107 — SSLv23_method with only SSLv2/v3 disabled; TLS 1.0 and 1.1 still accepted on every handshake. [#2.9 / #6.3]
3. sqUnixSSL.c:115 — cipher list "!ADH:HIGH:MEDIUM:@STRENGTH" permits MEDIUM ciphers. [#2.9]
4. sqUnixSSL.c:107 — no SSL_OP_NO_COMPRESSION (CRIME), no SSL_OP_NO_RENEGOTIATION, no SSL_OP_CIPHER_SERVER_PREFERENCE. [#6.3]
5. sqWin32SSL.c:269-275,215,349-353 — sqExtractPeerName copies serverName verbatim into peerName instead of extracting the cert subject; epp.pwszServerName=NULL disables SChannel's hostname check, so image-side peerName==serverName is meaningless on every connection. [#3.19]
6. sqMacSSL.c:154-201,262-272,363-383 — kSSLSessionOptionBreakOnServerAuth disables auto verification; manual SecTrustEvaluate runs with no SSL policy carrying the hostname, so hostname is never checked. [#3.20]
7. sqWin32SSL.c:216-218 — SP_PROT_TLS1_0/1_1/1_2 enabled for both client and server roles. [#6.1]
8. sqMacSSL.c:154-164 — SSLSetProtocolVersionMin(ctx, kTLSProtocol1) sets minimum to TLS 1.0. [#6.2]
VM internals
------------
9. memoryUnix.c:66-89,109-111 — JIT pages mmap'd PROT_READ|PROT_WRITE|PROT_EXEC permanently; sqMakeMemoryExecutableFromTo / NotExecutable hooks are commented out, so W^X is defeated on Linux/FreeBSD. [#5.24]
10. debug.c:45 — error(char*) forwards the argument as the format string into the vfprintf-style logger; exported API contract leaks a %n/%s primitive to any future caller-controlled string. [#4.10]
11. ffi/typesPrimitives.c:170-172 — getHandler() returns the first slot of any oop with no class tag check; libffi consumes attacker-shaped ffi_type fields on every struct cif build, giving a controlled-dispatch primitive to anyone who can register an FFI struct. [#2.5]
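
Item 10 describes a format-string hazard: a message is passed where a format is expected. A minimal sketch of the usual remedy, handing the message to a fixed "%s" format, follows. The names demo_logf and demo_error are hypothetical stand-ins rather than the VM's functions, and the logger writes into a caller-supplied buffer only so that the result can be inspected.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical printf-style logger standing in for the VM's logging call.
 * Returns the number of characters the full message needs (vsnprintf). */
static int demo_logf(char *out, size_t cap, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int needed = vsnprintf(out, cap, fmt, ap);
    va_end(ap);
    return needed;
}

/* Safe wrapper: the caller's text is passed as data for a constant "%s"
 * format, so any '%' sequences inside it are copied literally instead of
 * being interpreted as conversions. */
static int demo_error(char *out, size_t cap, const char *message) {
    return demo_logf(out, cap, "%s", message);
}
```

With this shape a message such as "100%n done %s" is logged verbatim; the unsafe variant would be a wrapper that passes `message` itself in the format position.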
Build / supply chain (every build / every CI run)
-------------------------------------------------
12. Jenkinsfile:84,249 — fetch-and-execute installer via `wget … | bash` over plain HTTP. [#7]
13. scripts/runTests.sh:31 — `wget -O - https://get.pharo.org/64/80 | bash`, executed from PR workflows. [#7]
14. scripts/installCygwin.ps1:7-9 — Cygwin installer + mirror retrieved over plain HTTP. [#7]
15. cmake/importLibFFI.cmake / importLibGit2.cmake / importSDL2.cmake — dependencies pinned to mutable git tags, no commit-SHA. [#7]
16. macros.cmake:69-103 + every cmake/import*.cmake using files.pharo.org — DownloadProject calls omit URL_HASH for libgit2, libssh2, openssl, zlib, SDL2, cairo, pixman, libpng, freetype, fontconfig, harfbuzz, gcc-runtime. [#7]
17. cmake/importFreetype2.cmake:47-49 — direct savannah.gnu.org download with no URL_HASH. [#7]
18. docker/ubuntu-arm64/Dockerfile, docker/debian10-armv7/Dockerfile — base images unpinned (no @sha256 digest). [#7]
19. Jenkinsfile:97-403 — every upload uses `scp -o StrictHostKeyChecking=no` against files.pharo.org. [#7]
20. Jenkinsfile:138 + cmake/sign.cmake:8-11 — SIGN_CERT_PASSWORD passed through environment under a broad withCredentials() block. [#7]
21. .github/workflows/continuous-integration-workflow.yaml:2 — `on: [push, pull_request]` with no `permissions:` block; fork PRs can edit runTests.sh and execute with the workflow GITHUB_TOKEN scope. [#7]
22. .github/workflows/...:11,14,68 — EOL runners (ubuntu-18.04, windows-2016) and EOL actions (checkout@v1, upload-artifact@v1). [#7]
23. cmake/packaging.cmake:92 — CPACK_PACKAGE_CHECKSUM "SHA1". [#7]
24. scripts/installCygwin.ps1:35-48 — `cygwin -q` suppresses signature warnings during install. [#7]
25. CMakeLists.txt:206,266-296 + cmake/Linux.cmake:1 — no -D_FORTIFY_SOURCE=2, no -fstack-protector-strong, no -fPIE/-pie, no -Wformat-security, no -Wl,-z,relro / -z,now / -z,noexecstack on Linux release builds. [#7]
26. CMakeLists.txt:206,266-296 — -Wno-int-conversion and -Wno-pointer-sign actively silenced; both classes of warning catch real bugs. [#7]
27. cmake/Linux.cmake:1 — Linux rpath set to "." (relative to CWD) instead of "$ORIGIN". [#7]
28. CMakeLists.txt:206 — Windows/Cygwin builds lack /GS, /guard:cf, /DYNAMICBASE, /NXCOMPAT. [#7]

--
Nicolas Anquetil
Evref team -- Inria Lille
