Re: request for help: LLM-based quality assurance

Tim Rühsen Sat, 06 Jun 2026 09:37:20 -0700

Here are the results from my experiment with a local LLM.

Starting the server in one console:

$ llama.cpp/build/bin/llama-server -m ~/.cache/huggingface/hub/models--unsloth--Qwen3.6-35B-A3B-GGUF/snapshots/9280dd353ab587157920d5bd391ada414d84e552/Qwen3.6-35B-A3B-UD-IQ2_M.gguf --n-gpu-layers 99 --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0.00 --cache-type-v q4_0 --cache-type-k q4_0 --presence-penalty 1.0

0.00.131.117 I log_info: verbosity = 3 (adjust with the `-lv N` CLI arg)
0.00.131.120 I device_info:

...



Running the `pi` agent with:

(i="17dc60e624cd6fc3491f9cb002f760d60e66ce8b"; pi "Review commit $i, use the code-reviewer skill. Think deeply about all implications for the various platforms that gnulib targets. Gnulib is a portability wrapper. Write the review result into '$i.txt'" -p)


$ cat 17dc60e624cd6fc3491f9cb002f760d60e66ce8b.txt
###########
## Review Summary

This commit correctly fixes a platform-specific bug where uClibc-ng was incorrectly treated as glibc for the UTF-8 locale optimization path. The fix adds `!__UCLIBC__` to all conditions gating UTF-8-specific optimizations on `__GLIBC__ >= 2`. This is a critical correctness fix for embedded Linux systems using uClibc-ng.


## Critical

**lib/mbrtoc32.c and lib/mbrtowc.c — UTF-8 optimization path was incorrectly enabled for uClibc-ng**

Before this fix, the conditions `(GNULIB_WCHAR_SINGLE_LOCALE && __GLIBC__ >= 2)` would be true on uClibc-ng, because uClibc-ng defines `__GLIBC__`. However, uClibc-ng's internal `mbstate_t` representation differs from glibc's. The optimization code relies on knowing the exact layout of `mbstate_t` internals (e.g., `__count` field structure with specific bit layouts for input byte counting), which is glibc-specific.


On uClibc-ng, this would cause:
- Incorrect state tracking during multi-byte character processing
- Potential memory corruption or misaligned reads from `mbstate_t.__count` and related fields
- Silent incorrect return values or EILSEQ errors on valid UTF-8 input

The fix correctly excludes uClibc-ng from this optimization path, causing it to fall back to the generic path which works with any `mbstate_t` implementation.


## Suggestions

**lib/mbrtowc.c:114 — NULL-handling condition still uses `__GLIBC__ >= 2` without `!__UCLIBC__`**

The NULL-handling workaround at line 114 (the `if (s == NULL)` block) still uses `(GNULIB_WCHAR_SINGLE_LOCALE && __GLIBC__ >= 2)` without excluding uClibc-ng. While this is a different concern than the UTF-8 optimization (it's about detecting whether the system has a bug where it doesn't return `(size_t) -2` for empty input), it means uClibc-ng will still use the glibc-specific NULL-handling path.

This is not incorrect per se — the NULL-handling logic is more generic than the UTF-8 optimization — but it would be more consistent to also exclude uClibc-ng here, or explicitly document why uClibc-ng is intentionally included for this specific condition. If uClibc-ng does have the same empty-input bug as glibc, then it's fine; if not, this is unnecessary code.


```c
/* Before (line 114) */
# if (MBRTOWC_RETVAL_BUG || MBRTOWC_EMPTY_INPUT_BUG || MBRTOWC_INVALID_UTF8_BUG \
      || (GNULIB_WCHAR_SINGLE_LOCALE && __GLIBC__ >= 2))

/* Suggested: consider whether uClibc-ng needs this path */
# if (MBRTOWC_RETVAL_BUG || MBRTOWC_EMPTY_INPUT_BUG || MBRTOWC_INVALID_UTF8_BUG \
      || (GNULIB_WCHAR_SINGLE_LOCALE && __GLIBC__ >= 2 && !__UCLIBC__))
```

## Nitpicks

**Commit message typo** — "optimze" should be "optimize". Minor, but the commit message has a typo: `mbrtoc32, mbrtowc: do not optimze for uClibc-ng`.

**ChangeLog attribution** — The ChangeLog entry credits Waldemar Brodkorb as the reporter and Paul Eggert as the committer. This is correct GNU style. However, the ChangeLog header uses the format `YYYY-MM-DD Author Name <email>` with two spaces between date and name, while the body uses tab indentation. This is consistent with existing GNU coding conventions, so no issue here.


## What looks good

The fix is surgical and targeted — it only changes the specific conditions that gate UTF-8-specific optimizations on `__GLIBC__ >= 2`, adding `!__UCLIBC__` without touching any other logic. This minimizes the risk of introducing regressions. The pattern `&& !__UCLIBC__` is already well-established throughout gnulib (visible in `lib/wprintf-parse.h`, `lib/printf-parse.h`, `lib/string.in.h`, `lib/wchar.in.h`, etc.), so this commit follows existing conventions.


## Platform implications

- **uClibc-ng (ARM, MIPS, RISC-V embedded)**: This is the primary beneficiary. Embedded systems using uClibc-ng will now correctly fall through to the generic multibyte conversion path instead of using glibc-specific internal state assumptions.
- **glibc (Linux desktop/server)**: No change in behavior — the `!__UCLIBC__` condition is already implicitly true on glibc since `__UCLIBC__` is not defined.
- **musl**: No change — musl doesn't define `__GLIBC__` at all, so these conditions are false regardless.
- **BSD systems (FreeBSD, NetBSD, OpenBSD)**: No change — these don't define `__GLIBC__`.
- **Windows/MinGW**: No change — these don't define `__GLIBC__`.
- **Solaris, HP-UX, AIX**: No change — these don't define `__GLIBC__`.

The fix is narrowly scoped to the UTF-8 optimization path and has no negative impact on any other platform.

###########

Local models are of course not as powerful as commercial high-end models, but they may be good enough: at least better than nothing, and low-cost.

Caveat: running the same model on CPU (8 cores / 16 threads) is ~10x slower.

GitLab hosted runners: https://docs.gitlab.com/ci/runners/hosted_runners/linux/
For GitLab CI, I successfully use saas-linux-xlarge-amd64 for slow cross-builds and valgrind testing (for the wget2 CI).

The agent took ~2-3 minutes to finish. On my machine, the processing speed is ~200 t/s for input tokens and ~25 t/s for output tokens.

The density of the review depends heavily on the prompting (skills are just prompts plus a little metadata). Scripts can be added and their output analyzed by the LLM, so all kinds of extensions are possible.

Btw (in case you wonder), the `code-reviewer` skill is a very generic/random review skill, mostly optimized for web/JavaScript. It is not even worth copying here. We should craft a very specific one for gnulib, with all the rules we would like to have checked.


Regards, Tim
