Human sign-off would be a great system. Unfortunately, the current setup requires humans to generate the fixes for Pharo. The discussion is about whether to allow AI-generated proposed fixes. I don't understand the copyright issue, so I will let someone who does explain why AI fixes are a copyright problem.
The usual anti-AI copyright issues are 1) you can't copyright the work of an AI and 2) AIs will always steal other people's work and lie about where it came from.

----- Original message -----
From: Sven Van Caekenberghe <[email protected]>
To: Pharo Development List <[email protected]>
Cc: stephane ducasse <[email protected]>, [email protected], Aaron Wohl <[email protected]>
Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
Date: Thursday, May 14, 2026 2:52 AM

AI Coding Assistants - The Linux Kernel documentation
<https://docs.kernel.org/process/coding-assistants.html>

It is great and important that you did this. But each change has to be validated by a capable VM developer, who takes responsibility.

> On 14 May 2026, at 04:42, Aaron Wohl via Pharo-dev
> <[email protected]> wrote:
>
> It has been in the news that Anthropic delayed the release of Mythos to
> invite groups to fix the AI's bugs. The news reports suggest that the big
> open-source projects are accepting AI bug fixes (not just bug reports). If
> the Linux kernel and Chrome are accepting AI code, how can they do it and
> Pharo cannot?
> https://www.linuxfoundation.org/blog/project-glasswing-gives-maintainers-advanced-ai-to-secure-open-source
> I am not suggesting that AI code should be blindly accepted; some of the
> AI's bugs and fixes are fine. It is faster to have a human review fixes
> than to generate them.
>
> ----- Original message -----
> From: stephane ducasse <[email protected]>
> To: Aaron Wohl <[email protected]>
> Cc: [email protected], Pharo-dev <[email protected]>
> Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
> Date: Wednesday, May 13, 2026 3:26 PM
>
> thanks we will check this.
>
>
>> On 13 May 2026, at 19:24, Aaron Wohl <[email protected]> wrote:
>>
>> I had Claude create PRs for the current git main and for the version 12 you
>> asked for.
>> I didn't submit them because they are AI-generated. I have no interest in
>> doing manual work that an AI can do. On copyright: if not being able to
>> copyright AI work is the issue, perhaps you could copyright the collection
>> (of source lines)?
>>
>> For pharo-12 there are 93 bug fixes posted here.
>> https://www.awohl.com/pharo-bugs-2026-05-13/
>>
>> ----- Original message -----
>> From: stephane ducasse <[email protected]>
>> To: Aaron Wohl <[email protected]>
>> Cc: [email protected], Pharo Development List
>> <[email protected]>
>> Subject: Re: [Pharo-dev] claude.ai code review of pharo-v
>> Date: Wednesday, May 13, 2026 8:14 AM
>>
>> Hi aaron
>>
>> Today I talked with Guille (who was and still is a bit sick and
>> recovering) and did not have the opportunity to talk with Pablo (who is on
>> vacation).
>>
>> We are interested in pull requests that improve the security and code of the
>> VM and plugins.
>> So your analyses are definitely worthwhile.
>>
>> We would love to have PRs, and for us it is best if you make them and not
>> an AI, so that we can control the copyright concerns.
>>
>> So thank you for your time and idea.
>> Guille may contact you directly.
>>
>> S (this week we have 3 working days) and next week we also have some
>> non-working days.
>> The month of May is a gruyere.
>>
>>
>>> On 12 May 2026, at 09:10, Aaron Wohl via Pharo-dev
>>> <[email protected]> wrote:
>>>
>>> I regularly have Claude.AI review my code (https://awohl.com). I pointed
>>> Claude at the Pharo-VM to see what it could find. A lot of the issues it
>>> found are things like strcpy without bounds checks with strings no one
>>> would ever make that big. However, it found some issues that are always
>>> triggered.
>>>
>>> Full list of issues
>>> https://github.com/avwohl/iospharo/blob/main/docs/pharo-vm-code-reivew-2026-05-11.md
>>>
>>> I did not make an AI-generated patch list.
>>> [email protected] mentioned copyright concerns raised by
>>> [email protected] about accepting AI code. I assume this is due to the
>>> inability to copyright AI-generated work and to AI's tendency to steal
>>> others' work and lie about its origins. However, a lot of the fixes are
>>> so trivial, like off-by-one errors, that I don't know how much of an issue
>>> it is. If you want me to PR fixes and/or triggering tests for the full
>>> list or the always list, let me know.
>>>
>>> Many of the issues would only happen if someone were trying to break things
>>> (MAXLEN symlinks, damaged image files). If one were so inclined, it could
>>> be called AI slop. However, here is a short list of things that always
>>> cause issues in everyday operation: for example, every plugin load damages
>>> heap memory, or the nightly build of signed code disables SSL checks, so a
>>> simple DNS hack could get malicious code signed.
>>>
>>> Source: ~/pharo.md (review of /Users/wohl/esrc/pharo-vm @ pharo-10,
>>> 2026-05-11)
>>> Filter: only items that fire in normal benign operation, not edge cases
>>> requiring huge/crafted strings or attacker-chosen sizes.
>>>
>>> ================================================================
>>> A. ALWAYS - MEMORY DAMAGE / UNDEFINED BEHAVIOR
>>> ================================================================
>>>
>>> 1. sqNamedPrims.c:56-57 - calloc(sizeof(ModuleEntry)+strlen(name)) then
>>> strcpy writes strlen+1; 1-byte heap overflow on every plugin load. [#3.26]
>>> 2. ffi/callbacks/callbacks.c:14-32 - stack-allocated CallbackInvocation
>>> registered in runner->callbackStack/global queue; sig_longjmp exit leaves
>>> dangling stack pointer after every same-thread callback. [#2.8]
>>> 3. ffi/typesPrimitives.c:174-188 - setHandler(receiver, structType) stores
>>> pointer before failed()/ffi_get_struct_offsets checks; any error path frees
>>> memory while receiver still holds the dangling handle. [#2.6]
>>> 4. threadSafeQueue.c:113-137 - queue->first and node->element read outside
>>> the mutex; lock-holder's free(node) leaves the other consumer walking freed
>>> memory on every concurrent dequeue (hit by every FFI workload). [#3.21]
>>> 5. SocketPluginImpl.c:1109-1123 - sqSocketDestroy frees PSP(s) after
>>> sqSocketAbortConnection queues a closeHandler against pss; AIO dispatch
>>> fires on freed memory. [#3.22]
>>> 6. ffi/utils.c:43-50 - readString returns an un-pinned image-memory
>>> pointer; GC between strlen() and the caller's strcpy invalidates the length
>>> and the address. [#5.27]
>>> 7. ffi/callbacks/callbacks.c:24-29 - runner->callbackStack chain updated
>>> without any lock; reentrant callbacks from multiple threads on the same
>>> Runner corrupt the linked list. [#5.4]
>>> 8. pathUtilities.c:233-237 - strrchr(name,'.') result stored in
>>> fileExtension, but the NULL guard tests the unrelated `extension` variable;
>>> strcmp(NULL,...) crashes on any directory entry without a dot (e.g.
>>> "Makefile"). [#4.2]
>>> 9. pathUtilities.c:163 - first[strlen(first)-1] reads first[-1] (one byte
>>> before the buffer) whenever first is "", reachable from
>>> parameters.c:210-212 fallback. [#4.3]
>>> 10. externalPrimitives.c:57,66 - module path assembled in a file-static
>>> moduleNameBuffer with no lock; concurrent loads scribble each other's path
>>> and dlopen/LoadLibrary sees a torn string. [#5.3]
>>> 11. debugUnix.c:88-95,122-162 - SIGSEGV/SIGBUS/SIGFPE handler calls
>>> fopen/vfprintf/backtrace_symbols_fd/ctime_r/semaphore_wait (none
>>> async-signal-safe) and uses SA_NODEFER so the handler can re-enter itself
>>> on every crash. [#5.1]
>>> 12. debugUnix.c:123,144,154 - sigaction.sa_mask never initialized via
>>> sigemptyset for term_handler_action and sigpipe_handler_action; kernel
>>> reads uninitialized stack to decide what to mask on every install. [#5.2]
>>> 13. debug.c:57-66 - glibc strerror_r returns a pointer that may not write
>>> to the supplied buffer; caller prints the buffer unconditionally, leaking
>>> uninitialized stack bytes on every error path. [#5.7]
>>> 14. SocketPluginImpl.c:2494-2496 - sqSocket lastError stored in a
>>> file-static, clobbered across sockets; every concurrent socket failure
>>> overwrites another socket's error state. [#5.17]
>>> 15. aioWin.c:457/465 - heap-interior alias into allHandles transiently
>>> equals a freshly malloc'd region; any early return between the two
>>> assignments leaves a free-of-interior or double-use hazard. [#1/§9 PARTIAL]
>>>
>>>
>>> ================================================================
>>> B. ALWAYS - SECURITY / CORRECTNESS (NOT MEMORY DAMAGE)
>>> ================================================================
>>>
>>> TLS / SqueakSSL
>>> ---------------
>>> 1. sqUnixSSL.c:89-143 - SSL_CTX_set_verify never called;
>>> SSL_get_verify_result returns X509_V_OK by default → no certificate
>>> validation on any TLS connection. [#2.9]
>>> 2. sqUnixSSL.c:102-107 - SSLv23_method with only SSLv2/v3 disabled; TLS 1.0
>>> and 1.1 still accepted on every handshake. [#2.9 / #6.3]
>>> 3. sqUnixSSL.c:115 - cipher list "!ADH:HIGH:MEDIUM:@STRENGTH" permits
>>> MEDIUM ciphers. [#2.9]
>>> 4. sqUnixSSL.c:107 - no SSL_OP_NO_COMPRESSION (CRIME), no
>>> SSL_OP_NO_RENEGOTIATION, no SSL_OP_CIPHER_SERVER_PREFERENCE. [#6.3]
>>> 5. sqWin32SSL.c:269-275,215,349-353 - sqExtractPeerName copies serverName
>>> verbatim into peerName instead of extracting the cert subject;
>>> epp.pwszServerName=NULL disables SChannel's hostname check, so image-side
>>> peerName==serverName is meaningless on every connection. [#3.19]
>>> 6. sqMacSSL.c:154-201,262-272,363-383 - kSSLSessionOptionBreakOnServerAuth
>>> disables auto verification; manual SecTrustEvaluate runs with no SSL policy
>>> carrying the hostname, so hostname is never checked. [#3.20]
>>> 7. sqWin32SSL.c:216-218 - SP_PROT_TLS1_0/1_1/1_2 enabled for both client
>>> and server roles. [#6.1]
>>> 8. sqMacSSL.c:154-164 - SSLSetProtocolVersionMin(ctx, kTLSProtocol1) sets
>>> minimum to TLS 1.0. [#6.2]
>>>
>>> VM internals
>>> ------------
>>> 9. memoryUnix.c:66-89,109-111 - JIT pages mmap'd
>>> PROT_READ|PROT_WRITE|PROT_EXEC permanently; sqMakeMemoryExecutableFromTo /
>>> NotExecutable hooks are commented out, so W^X is defeated on Linux/FreeBSD.
>>> [#5.24]
>>> 10. debug.c:45 - error(char*) forwards the argument as the format string
>>> into the vfprintf-style logger; exported API contract leaks a %n/%s
>>> primitive to any future caller-controlled string. [#4.10]
>>> 11. ffi/typesPrimitives.c:170-172 - getHandler() returns the first slot of
>>> any oop with no class tag check; libffi consumes attacker-shaped ffi_type
>>> fields on every struct cif build, giving a controlled-dispatch primitive to
>>> anyone who can register an FFI struct. [#2.5]
>>>
>>> Build / supply chain (every build / every CI run)
>>> -------------------------------------------------
>>> 12. Jenkinsfile:84,249 - fetch-and-execute installer via `wget … | bash`
>>> over plain HTTP. [#7]
>>> 13. scripts/runTests.sh:31 - `wget -O - https://get.pharo.org/64/80 |
>>> bash`, executed from PR workflows. [#7]
>>> 14. scripts/installCygwin.ps1:7-9 - Cygwin installer + mirror retrieved
>>> over plain HTTP. [#7]
>>> 15. cmake/importLibFFI.cmake / importLibGit2.cmake / importSDL2.cmake -
>>> dependencies pinned to mutable git tags, no commit-SHA. [#7]
>>> 16. macros.cmake:69-103 + every cmake/import*.cmake using files.pharo.org -
>>> DownloadProject calls omit URL_HASH for libgit2, libssh2, openssl, zlib,
>>> SDL2, cairo, pixman, libpng, freetype, fontconfig, harfbuzz, gcc-runtime.
>>> [#7]
>>> 17. cmake/importFreetype2.cmake:47-49 - direct savannah.gnu.org download
>>> with no URL_HASH. [#7]
>>> 18. docker/ubuntu-arm64/Dockerfile, docker/debian10-armv7/Dockerfile - base
>>> images unpinned (no @sha256 digest). [#7]
>>> 19. Jenkinsfile:97-403 - every upload uses `scp -o
>>> StrictHostKeyChecking=no` against files.pharo.org. [#7]
>>> 20. Jenkinsfile:138 + cmake/sign.cmake:8-11 - SIGN_CERT_PASSWORD passed
>>> through environment under a broad withCredentials() block. [#7]
>>> 21. .github/workflows/continuous-integration-workflow.yaml:2 - `on: [push,
>>> pull_request]` with no `permissions:` block; fork PRs can edit runTests.sh
>>> and execute with the workflow GITHUB_TOKEN scope. [#7]
>>> 22. .github/workflows/...:11,14,68 - EOL runners (ubuntu-18.04,
>>> windows-2016) and EOL actions (checkout@v1, upload-artifact@v1). [#7]
>>> 23. cmake/packaging.cmake:92 - CPACK_PACKAGE_CHECKSUM "SHA1". [#7]
>>> 24. scripts/installCygwin.ps1:35-48 - `cygwin -q` suppresses signature
>>> warnings during install. [#7]
>>> 25. CMakeLists.txt:206,266-296 + cmake/Linux.cmake:1 - no
>>> -D_FORTIFY_SOURCE=2, no -fstack-protector-strong, no -fPIE/-pie, no
>>> -Wformat-security, no -Wl,-z,relro / -z,now / -z,noexecstack on Linux
>>> release builds. [#7]
>>> 26. CMakeLists.txt:206,266-296 - -Wno-int-conversion and -Wno-pointer-sign
>>> actively silenced; both classes of warning catch real bugs. [#7]
>>> 27. cmake/Linux.cmake:1 - Linux rpath set to "." (relative to CWD) instead
>>> of "$ORIGIN". [#7]
>>> 28. CMakeLists.txt:206 - Windows/Cygwin builds lack /GS, /guard:cf,
>>> /DYNAMICBASE, /NXCOMPAT. [#7]
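
To give a feel for how small the "trivial" fixes in A.1 and A.9 are, here is a minimal C sketch of the two patterns. It does not reproduce the Pharo VM source. The ModuleEntry layout (a flexible array member for the name), newModuleEntry, and endsWithSlash are hypothetical, made up for illustration only. Whether the real allocation in sqNamedPrims.c overflows depends on how the real struct declares its name field, which a reviewer would have to confirm. The sketch only shows allocating strlen + 1 bytes for an inline string, and checking for an empty string before indexing its last character.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical module record; NOT the VM's real ModuleEntry.
   The name is stored inline in a C99 flexible array member, so
   sizeof(ModuleEntry) does not include any bytes for it. */
typedef struct ModuleEntry {
    struct ModuleEntry *next;
    void *handle;
    char name[];
} ModuleEntry;

/* Allocate an entry with room for the name AND its terminating NUL.
   Returns NULL if allocation fails. */
ModuleEntry *newModuleEntry(const char *name)
{
    size_t len;
    ModuleEntry *entry;

    assert(name != NULL);
    len = strlen(name);
    /* sizeof(*entry) + len alone would leave no room for the NUL. */
    entry = calloc(1, sizeof(*entry) + len + 1);
    if (entry == NULL)
        return NULL;
    memcpy(entry->name, name, len + 1); /* copies the NUL as well */
    return entry;
}

/* Return 1 if s is non-empty and ends in '/', else 0.
   Checking len first avoids reading s[-1] when s is "". */
int endsWithSlash(const char *s)
{
    size_t len;

    assert(s != NULL);
    len = strlen(s);
    return len > 0 && s[len - 1] == '/';
}
```

With these definitions, newModuleEntry("FilePlugin") yields an entry whose name compares equal to "FilePlugin" (the caller releases it with free), and endsWithSlash("") returns 0 rather than reading one byte before the buffer.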
