[Pharo-dev] Re: claude.ai code review of pharo-v

Aaron Wohl via Pharo-dev Thu, 14 May 2026 00:40:31 -0700

Human sign-off would be a great system.  Unfortunately, the current setup 
requires humans to generate fixes for Pharo.   The discussion is to allow 
AI-generated proposed fixes.  I don't understand the copyright issue, so I will 
let someone who does explain why AI fixes are a copyright problem.


The usual anti-AI copyright issues are 1) you can't copyright the work of an AI 
and 2) AIs will always steal other people's work and lie about where it came 
from. 

----- Original message -----
From: Sven Van Caekenberghe <[email protected]>
To: Pharo Development List <[email protected]>
Cc: stephane ducasse <[email protected]>, [email protected], 
Aaron Wohl <[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Thursday, May 14, 2026 2:52 AM

AI Coding Assistants — The Linux Kernel documentation 
<https://docs.kernel.org/process/coding-assistants.html>

It is great that you did this and important. But each change has to be 
validated by a capable VM developer, who takes responsibility.

> On 14 May 2026, at 04:42, Aaron Wohl via Pharo-dev 
> <[email protected]> wrote:
> 
> It has been in the news that Anthropic delayed the release of Mythos to 
> invite groups to fix the AI's bugs.  The news reports suggest that the big 
> open-source projects are accepting AI bug fixes (not just bug reports).  If 
> the Linux kernel and Chrome are accepting AI code, how can they do it and 
> Pharo cannot?
> https://www.linuxfoundation.org/blog/project-glasswing-gives-maintainers-advanced-ai-to-secure-open-source
> I am not suggesting that AI code should be blindly accepted, but some of the 
> AI-found bugs and fixes are fine.  It is faster to have a human review fixes 
> than to generate them.
> 
> ----- Original message -----
> From: stephane ducasse <[email protected]>
> To: Aaron Wohl <[email protected]>
> Cc: [email protected], Pharo-dev <[email protected]>
> Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
> Date: Wednesday, May 13, 2026 3:26 PM
> 
> Thanks, we will check this.
> 
> 
>> On 13 May 2026, at 19:24, Aaron Wohl <[email protected]> wrote:
>> 
>> I had Claude create PRs for the current git main and for the version 12 you 
>> asked for.
>> I didn't submit them because they are AI-generated.  I have no interest in 
>> doing manual work that an AI can do.  On copyright: if not being able to 
>> copyright AI work is the issue, perhaps you could copyright the collection 
>> (of source lines)?
>> 
>> For pharo-12 there are 93 bug fixes posted here.
>> https://www.awohl.com/pharo-bugs-2026-05-13/
>> 
>> ----- Original message -----
>> From: stephane ducasse <[email protected]>
>> To: Aaron Wohl <[email protected]>
>> Cc: [email protected], Pharo Development List 
>> <[email protected]>
>> Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
>> Date: Wednesday, May 13, 2026 8:14 AM
>> 
>> Hi Aaron
>> 
>> Today I discussed this with Guille (who was, and still is, a bit sick and 
>> recovering) and did not have the opportunity to discuss it with Pablo (who 
>> is on vacation).
>> 
>> We are interested in pull requests that improve the security and code of the 
>> VM and plugins. 
>> So your analyses are definitely worthwhile. 
>> 
>> We would love to have PRs, and for us it would be great if you write them 
>> rather than an AI, so that we can control the copyright concerns. 
>> 
>> So thank you for your time and idea. 
>> Guille may contact you directly. 
>> 
>> S (this week we have 3 working days, and next week we also have some 
>> non-working days).
>> The month of May is a gruyere.
>> 
>> 
>>> On 12 May 2026, at 09:10, Aaron Wohl via Pharo-dev 
>>> <[email protected]> wrote:
>>> 
>>> I regularly have Claude.AI review my code (https://awohl.com), so I pointed 
>>> Claude at the Pharo-VM to see what it could find.  A lot of the issues it 
>>> found are things like strcpy without bounds checks on strings no one 
>>> would ever make that big. However, it found some issues that are always 
>>> triggered. 
>>> 
>>> Full list of issues
>>> https://github.com/avwohl/iospharo/blob/main/docs/pharo-vm-code-reivew-2026-05-11.md
>>> 
>>> I did not make an AI-generated patch list.  [email protected] 
>>> mentioned copyright concerns raised by [email protected] about accepting AI 
>>> code.  I assume this is due to the inability to copyright AI-generated work 
>>> and to AI's tendency to steal others' work and lie about its origins.   
>>> However, a lot of the fixes are so trivial, like off-by-one errors, that I 
>>> don't know how much of an issue it is. If you want me to PR fixes and/or 
>>> tests that trigger the bugs, for the full list or the always list, let me 
>>> know.
>>> 
>>> Many of the issues would only happen if someone were trying to break things 
>>> (MAXLEN symlinks, damaged image files). If one were of that mindset, they 
>>> could be called AI slop.  However, here is a short list of things that 
>>> always cause issues in everyday operation: for example, every plugin loaded 
>>> damages heap memory, and the nightly build of signed code disables SSL 
>>> checks, so a simple DNS hack could get malicious code signed.
>>> 
>>> Source: ~/pharo.md (review of /Users/wohl/esrc/pharo-vm @ pharo-10, 
>>> 2026-05-11)
>>> Filter: only items that fire in normal benign operation, not edge cases
>>> requiring huge/crafted strings or attacker-chosen sizes.
>>> 
>>> ================================================================
>>> A. ALWAYS — MEMORY DAMAGE / UNDEFINED BEHAVIOR
>>> ================================================================
>>> 
>>> 1. sqNamedPrims.c:56-57 — calloc(sizeof(ModuleEntry)+strlen(name)) then 
>>> strcpy writes strlen+1; 1-byte heap overflow on every plugin load. [#3.26]
>>> 2. ffi/callbacks/callbacks.c:14-32 — stack-allocated CallbackInvocation 
>>> registered in runner->callbackStack/global queue; sig_longjmp exit leaves 
>>> dangling stack pointer after every same-thread callback. [#2.8]
>>> 3. ffi/typesPrimitives.c:174-188 — setHandler(receiver, structType) stores 
>>> pointer before failed()/ffi_get_struct_offsets checks; any error path frees 
>>> memory while receiver still holds the dangling handle. [#2.6]
>>> 4. threadSafeQueue.c:113-137 — queue->first and node->element read outside 
>>> the mutex; lock-holder's free(node) leaves the other consumer walking freed 
>>> memory on every concurrent dequeue (hit by every FFI workload). [#3.21]
>>> 5. SocketPluginImpl.c:1109-1123 — sqSocketDestroy frees PSP(s) after 
>>> sqSocketAbortConnection queues a closeHandler against pss; AIO dispatch 
>>> fires on freed memory. [#3.22]
>>> 6. ffi/utils.c:43-50 — readString returns an un-pinned image-memory 
>>> pointer; GC between strlen() and the caller's strcpy invalidates the length 
>>> and the address. [#5.27]
>>> 7. ffi/callbacks/callbacks.c:24-29 — runner->callbackStack chain updated 
>>> without any lock; reentrant callbacks from multiple threads on the same 
>>> Runner corrupt the linked list. [#5.4]
>>> 8. pathUtilities.c:233-237 — strrchr(name,'.') result stored in 
>>> fileExtension, but the NULL guard tests the unrelated `extension` variable; 
>>> strcmp(NULL,...) crashes on any directory entry without a dot (e.g. 
>>> "Makefile").  [#4.2]
>>> 9. pathUtilities.c:163 — first[strlen(first)-1] reads first[-1] (one byte 
>>> before the buffer) whenever first is "", reachable from 
>>> parameters.c:210-212 fallback. [#4.3]
>>> 10. externalPrimitives.c:57,66 — module path assembled in a file-static 
>>> moduleNameBuffer with no lock; concurrent loads scribble each other's path 
>>> and dlopen/LoadLibrary sees a torn string. [#5.3]
>>> 11. debugUnix.c:88-95,122-162 — SIGSEGV/SIGBUS/SIGFPE handler calls 
>>> fopen/vfprintf/backtrace_symbols_fd/ctime_r/semaphore_wait (none 
>>> async-signal-safe) and uses SA_NODEFER so the handler can re-enter itself 
>>> on every crash. [#5.1]
>>> 12. debugUnix.c:123,144,154 — sigaction.sa_mask never initialized via 
>>> sigemptyset for term_handler_action and sigpipe_handler_action; kernel 
>>> reads uninitialized stack to decide what to mask on every install. [#5.2]
>>> 13. debug.c:57-66 — glibc strerror_r returns a pointer and may not write 
>>> to the supplied buffer; caller prints the buffer unconditionally, leaking 
>>> uninitialized stack bytes on every error path. [#5.7]
>>> 14. SocketPluginImpl.c:2494-2496 — sqSocket lastError stored in a 
>>> file-static, clobbered across sockets; every concurrent socket failure 
>>> overwrites another socket's error state. [#5.17]
>>> 15. aioWin.c:457/465 — heap-interior alias into allHandles transiently 
>>> equals a freshly malloc'd region; any early return between the two 
>>> assignments leaves a free-of-interior or double-use hazard. [#1/§9 PARTIAL]
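Items A.1 and A.8 can be sketched in isolation. What follows is a minimal illustration, not the Pharo VM source: the ModuleEntry layout (a flexible array member for the name) and the function names module_entry_new and has_extension are assumptions made for the example. If the real struct already reserves a byte for the name, the size arithmetic described in A.1 would differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the VM's ModuleEntry; the real layout is not
 * shown in the report. The name lives in a flexible array member. */
typedef struct ModuleEntry {
    struct ModuleEntry *next;
    void *handle;
    char name[];
} ModuleEntry;

/* A.1: reserve strlen(name) + 1 bytes so the terminating NUL fits. */
ModuleEntry *module_entry_new(const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);
    ModuleEntry *entry = calloc(1, sizeof(ModuleEntry) + len + 1);
    if (entry == NULL)
        return NULL;
    memcpy(entry->name, name, len + 1); /* copies the NUL as well */
    return entry;
}

/* A.8: test the pointer strrchr actually returned before comparing it.
 * Returns 1 when name ends in the given extension (including the dot). */
int has_extension(const char *name, const char *extension)
{
    const char *fileExtension = strrchr(name, '.');
    if (fileExtension == NULL)
        return 0; /* e.g. "Makefile": no dot, so no extension to compare */
    return strcmp(fileExtension, extension) == 0;
}
```

The caller owns the returned entry and releases it with free(). Both helpers only show the shape of the fix; the line numbers and surrounding logic in sqNamedPrims.c and pathUtilities.c would need to be checked against the actual source.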
>>> 
>>> 
>>> ================================================================
>>> B. ALWAYS — SECURITY / CORRECTNESS (NOT MEMORY DAMAGE)
>>> ================================================================
>>> 
>>> TLS / SqueakSSL
>>> ---------------
>>> 1. sqUnixSSL.c:89-143 — SSL_CTX_set_verify never called; 
>>> SSL_get_verify_result returns X509_V_OK by default → no certificate 
>>> validation on any TLS connection. [#2.9]
>>> 2. sqUnixSSL.c:102-107 — SSLv23_method with only SSLv2/v3 disabled; TLS 1.0 
>>> and 1.1 still accepted on every handshake. [#2.9 / #6.3]
>>> 3. sqUnixSSL.c:115 — cipher list "!ADH:HIGH:MEDIUM:@STRENGTH" permits 
>>> MEDIUM ciphers. [#2.9]
>>> 4. sqUnixSSL.c:107 — no SSL_OP_NO_COMPRESSION (CRIME), no 
>>> SSL_OP_NO_RENEGOTIATION, no SSL_OP_CIPHER_SERVER_PREFERENCE. [#6.3]
>>> 5. sqWin32SSL.c:269-275,215,349-353 — sqExtractPeerName copies serverName 
>>> verbatim into peerName instead of extracting the cert subject; 
>>> epp.pwszServerName=NULL disables SChannel's hostname check, so image-side 
>>> peerName==serverName is meaningless on every connection. [#3.19]
>>> 6. sqMacSSL.c:154-201,262-272,363-383 — kSSLSessionOptionBreakOnServerAuth 
>>> disables auto verification; manual SecTrustEvaluate runs with no SSL policy 
>>> carrying the hostname, so hostname is never checked. [#3.20]
>>> 7. sqWin32SSL.c:216-218 — SP_PROT_TLS1_0/1_1/1_2 enabled for both client 
>>> and server roles. [#6.1]
>>> 8. sqMacSSL.c:154-164 — SSLSetProtocolVersionMin(ctx, kTLSProtocol1) sets 
>>> minimum to TLS 1.0. [#6.2]
>>> 
>>> VM internals
>>> ------------
>>> 9. memoryUnix.c:66-89,109-111 — JIT pages mmap'd 
>>> PROT_READ|PROT_WRITE|PROT_EXEC permanently; sqMakeMemoryExecutableFromTo / 
>>> NotExecutable hooks are commented out, so W^X is defeated on Linux/FreeBSD. 
>>> [#5.24]
>>> 10. debug.c:45 — error(char*) forwards the argument as the format string 
>>> into the vfprintf-style logger; exported API contract leaks a %n/%s 
>>> primitive to any future caller-controlled string. [#4.10]
>>> 11. ffi/typesPrimitives.c:170-172 — getHandler() returns the first slot of 
>>> any oop with no class tag check; libffi consumes attacker-shaped ffi_type 
>>> fields on every struct cif build, giving a controlled-dispatch primitive to 
>>> anyone who can register an FFI struct. [#2.5]
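Item B.10 is the classic format-string pattern, and the usual remedy is to pass caller text as an argument to a fixed "%s" format. The sketch below assumes a hypothetical vfprintf-style logger (log_message) that writes into a buffer so the result can be inspected; neither function name comes from the Pharo VM.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical printf-style logger standing in for the VM's own. */
int log_message(char *out, size_t size, const char *format, ...)
{
    assert(out != NULL && size > 0);
    va_list args;
    va_start(args, format);
    int written = vsnprintf(out, size, format, args);
    va_end(args);
    return written;
}

/* Treat the caller's message as data, never as the format string, so any
 * '%' sequences it contains are copied literally instead of interpreted. */
int report_error(char *out, size_t size, const char *message)
{
    return log_message(out, size, "%s", message);
}
```

With this shape, a message such as "100%n done %s" is logged verbatim. Passing it directly as the format argument would instead make the logger read arguments that were never supplied.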
>>> 
>>> Build / supply chain (every build / every CI run)
>>> -------------------------------------------------
>>> 12. Jenkinsfile:84,249 — fetch-and-execute installer via `wget … | bash` 
>>> over plain HTTP. [#7]
>>> 13. scripts/runTests.sh:31 — `wget -O - https://get.pharo.org/64/80 | 
>>> bash`, executed from PR workflows. [#7]
>>> 14. scripts/installCygwin.ps1:7-9 — Cygwin installer + mirror retrieved 
>>> over plain HTTP. [#7]
>>> 15. cmake/importLibFFI.cmake / importLibGit2.cmake / importSDL2.cmake — 
>>> dependencies pinned to mutable git tags, no commit-SHA. [#7]
>>> 16. macros.cmake:69-103 + every cmake/import*.cmake using files.pharo.org — 
>>> DownloadProject calls omit URL_HASH for libgit2, libssh2, openssl, zlib, 
>>> SDL2, cairo, pixman, libpng, freetype, fontconfig, harfbuzz, gcc-runtime. 
>>> [#7]
>>> 17. cmake/importFreetype2.cmake:47-49 — direct savannah.gnu.org download 
>>> with no URL_HASH. [#7]
>>> 18. docker/ubuntu-arm64/Dockerfile, docker/debian10-armv7/Dockerfile — base 
>>> images unpinned (no @sha256 digest). [#7]
>>> 19. Jenkinsfile:97-403 — every upload uses `scp -o 
>>> StrictHostKeyChecking=no` against files.pharo.org. [#7]
>>> 20. Jenkinsfile:138 + cmake/sign.cmake:8-11 — SIGN_CERT_PASSWORD passed 
>>> through environment under a broad withCredentials() block. [#7]
>>> 21. .github/workflows/continuous-integration-workflow.yaml:2 — `on: [push, 
>>> pull_request]` with no `permissions:` block; fork PRs can edit runTests.sh 
>>> and execute with the workflow GITHUB_TOKEN scope. [#7]
>>> 22. .github/workflows/...:11,14,68 — EOL runners (ubuntu-18.04, 
>>> windows-2016) and EOL actions (checkout@v1, upload-artifact@v1). [#7]
>>> 23. cmake/packaging.cmake:92 — CPACK_PACKAGE_CHECKSUM "SHA1". [#7]
>>> 24. scripts/installCygwin.ps1:35-48 — `cygwin -q` suppresses signature 
>>> warnings during install. [#7]
>>> 25. CMakeLists.txt:206,266-296 + cmake/Linux.cmake:1 — no 
>>> -D_FORTIFY_SOURCE=2, no -fstack-protector-strong, no -fPIE/-pie, no 
>>> -Wformat-security, no -Wl,-z,relro / -z,now / -z,noexecstack on Linux 
>>> release builds. [#7]
>>> 26. CMakeLists.txt:206,266-296 — -Wno-int-conversion and -Wno-pointer-sign 
>>> actively silenced; both classes of warning catch real bugs. [#7]
>>> 27. cmake/Linux.cmake:1 — Linux rpath set to "." (relative to CWD) instead 
>>> of "$ORIGIN". [#7]
>>> 28. CMakeLists.txt:206 — Windows/Cygwin builds lack /GS, /guard:cf, 
>>> /DYNAMICBASE, /NXCOMPAT. [#7]
