This is an automated email from the ASF dual-hosted git repository.
potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow-steward.git
The following commit(s) were added to refs/heads/main by this push:
new c51ad1f6 feat(agent-isolation): claude-term-bg.sh — distinguish
blocked-on-decision from finished-idle (#503)
c51ad1f6 is described below
commit c51ad1f608dae7a32bd84cf26fea1f01667bc844
Author: Jarek Potiuk <[email protected]>
AuthorDate: Fri Jun 12 00:13:46 2026 +0200
feat(agent-isolation): claude-term-bg.sh — distinguish blocked-on-decision
from finished-idle (#503)
The terminal-tint helper previously tinted on every `Stop` event, so a
turn that merely *finished* (Claude idle until you start the next thing)
looked identical to a turn genuinely blocked on your decision. This
reworks the state machine to tint only when Claude actually wants you to
act, across six hooks:
Stop -> stop (heuristic: tint only if the final assistant
message reads as a question/request; a
completion stays calm. Needs python3/python,
else defaults calm)
PreToolUse -> wait (matcher AskUserQuestion — exact signal)
PostToolUse -> reset (matcher * — calm while working; clears the
tint the instant you act on a prompt)
Notification -> notify (tint permission prompts only; idle ping no-op)
UserPromptSubmit/SessionStart -> reset
PostToolUse (not PreToolUse) now carries the general reset so it can clear
a tint the permission prompt itself set; the idle ping is a deliberate
no-op so it can't wipe a pending question's tint after ~60s.
Docs updated in lockstep: the wiring table, install JSON, and install-skill
checklist in docs/setup/secure-agent-setup.md, plus the README table row.
Generated-by: Claude Code (Opus 4.8)
---
docs/setup/secure-agent-setup.md | 75 ++++++++++++-------
tools/agent-isolation/README.md | 2 +-
tools/agent-isolation/claude-term-bg.sh | 124 +++++++++++++++++++++++++++++---
3 files changed, 166 insertions(+), 35 deletions(-)
diff --git a/docs/setup/secure-agent-setup.md b/docs/setup/secure-agent-setup.md
index f5277c2c..745db14c 100644
--- a/docs/setup/secure-agent-setup.md
+++ b/docs/setup/secure-agent-setup.md
@@ -1305,23 +1305,39 @@ elsewhere. The framework ships
[`tools/agent-isolation/claude-term-bg.sh`](../../tools/agent-isolation/claude-term-bg.sh)
to make a **calm baseline the normal state and tint the background
only when Claude genuinely wants you to act** — never while it is
-working. The model is two states, wired across five hooks:
+working, and never when it merely *finished* a turn and is idle
+until you start the next thing. Those last two look identical at the
+`Stop` event, so the model leans on three "Claude is asking you for
+something" signals (two exact, one heuristic), wired across six
+hooks:
| Moment | Hook → action | Background |
|---|---|---|
-| Turn finished — your turn to respond | `Stop` → `wait` | tinted (muted
indigo `#2a1a3a`) |
+| Turn ended on a genuine question/request | `Stop` → `stop` | tinted (muted
indigo `#2a1a3a`) |
+| Turn ended on a completion ("Done.") | `Stop` → `stop` | calm |
+| Structured question posed | `PreToolUse` (matcher `AskUserQuestion`) →
`wait` | tinted |
| Blocked on a permission prompt | `Notification` → `notify` | tinted |
-| Actively working (running a tool) | `PreToolUse` → `reset` | calm |
-| Fresh/idle session, plain idle ping | `SessionStart` / `Notification` →
`reset`/`notify` | calm |
-| You submit a reply | `UserPromptSubmit` → `reset` | calm |
-
-Three details make the model behave:
-
-- **`PreToolUse` → `reset`** fires on every tool call, so the moment
- Claude resumes work — including right after you approve a
- permission prompt that had tinted the screen — it returns to calm.
- Without it, an approved-permission tint would linger until the
- turn ended.
+| Actively working / you just acted | `PostToolUse` (matcher `*`) → `reset` |
calm |
+| Plain 60-second idle ping | `Notification` → `notify` (no-op) | unchanged |
+| Fresh session, or you submit a reply | `SessionStart` / `UserPromptSubmit` →
`reset` | calm |
+
+Four details make the model behave:
+
+- **`Stop` → `stop`** is the only non-exact signal. It reads the
+ last assistant text message from the session transcript (the path
+ arrives on stdin in the `Stop` payload) and tints only when that
+ message ends as a question (`…?`) or with a strong trailing
+ request ("want me to", "would you like", "should I", "OK to",
+ "your call", …); a statement-shaped completion stays calm. Needs
+ `python3`/`python` on `PATH`; if absent, `stop` defaults to calm
+ and only the two exact signals tint.
+- **`PostToolUse` → `reset`** (not `PreToolUse`) clears the "you
+ just acted" tint. `PreToolUse` fires *before* the permission
+ prompt is shown, so it cannot clear a tint the prompt itself
+ sets; `PostToolUse` fires *after* the tool completes — the moment
+ your approval lets work resume — so it is what returns the screen
+ to calm. `PreToolUse` is reserved for the `AskUserQuestion` →
+ `wait` tint.
- **`SessionStart` → `reset`** clears any tint a *previous* session
left behind (OSC background changes persist in the terminal
across processes, so a session closed mid-wait would otherwise
@@ -1330,8 +1346,10 @@ Three details make the model behave:
- **`Notification` → `notify`** is selective: the same hook fires
both for permission prompts *and* the plain 60-second idle ping,
so the script reads the notification payload on stdin and tints
- only when the message is a permission/attention prompt; an idle
- session stays calm.
+ only for a permission/attention prompt. The idle ping is a
+ deliberate **no-op** (not a reset) — otherwise a turn that ended
+ on a genuine question, tinted by `stop`, would silently go calm
+ after a minute.
**Two mechanics make this work** (both are easy to get wrong):
@@ -1364,21 +1382,26 @@ cp
/path/to/airflow-steward/tools/agent-isolation/claude-term-bg.sh \
chmod +x ~/.claude/scripts/claude-term-bg.sh
```
-Wire it into `~/.claude/settings.json` under four hook events. If
+Wire it into `~/.claude/settings.json` under six hook events. If
you already have hooks on any of these events, add the command as
an extra entry rather than replacing the existing array. The
-`CLAUDE_RESET_BG=#000000` prefix makes the calm state a
-deterministic black (recommended — it sidesteps the OSC-111 reset
-gap described above); drop it to fall back to profile-default
-reset.
+`PreToolUse` entry uses the `AskUserQuestion` matcher (not `*`), so
+it tints only on a structured question; `PostToolUse` carries the
+general `reset`. The `CLAUDE_RESET_BG=#000000` prefix makes the
+calm state a deterministic black (recommended — it sidesteps the
+OSC-111 reset gap described above); drop it to fall back to
+profile-default reset.
```jsonc
{
"hooks": {
"Stop": [
- { "hooks": [ { "type": "command", "command":
"~/.claude/scripts/claude-term-bg.sh wait" } ] }
+ { "hooks": [ { "type": "command", "command": "CLAUDE_RESET_BG=#000000
~/.claude/scripts/claude-term-bg.sh stop" } ] }
],
"PreToolUse": [
+ { "matcher": "AskUserQuestion", "hooks": [ { "type": "command",
"command": "~/.claude/scripts/claude-term-bg.sh wait" } ] }
+ ],
+ "PostToolUse": [
{ "matcher": "*", "hooks": [ { "type": "command", "command":
"CLAUDE_RESET_BG=#000000 ~/.claude/scripts/claude-term-bg.sh reset" } ] }
],
"UserPromptSubmit": [
@@ -1817,11 +1840,13 @@ Then walk through:
on me (a pure quality-of-life signal, no security effect).
**Default no.** Only if I say yes: copy
`<airflow-steward>/tools/agent-isolation/claude-term-bg.sh`
- into `~/.claude/scripts/` and `chmod +x` it, then add five
+ into `~/.claude/scripts/` and `chmod +x` it, then add six
hooks to `~/.claude/settings.json`, merging into any existing
- arrays on those events — `Stop` → `claude-term-bg.sh wait`;
- `UserPromptSubmit`, `SessionStart`, and `PreToolUse` (matcher
- `*`) → `claude-term-bg.sh reset`; and `Notification` →
+ arrays on those events — `Stop` → `claude-term-bg.sh stop`
+ (heuristic tint on a question/request, calm on a completion);
+ `PreToolUse` (matcher `AskUserQuestion`) → `claude-term-bg.sh
+ wait`; `UserPromptSubmit`, `SessionStart`, and `PostToolUse`
+ (matcher `*`) → `claude-term-bg.sh reset`; and `Notification` →
`claude-term-bg.sh notify`. Ask whether I want the calm state
to be a deterministic black (prefix the reset/notify commands
with `CLAUDE_RESET_BG=#000000`) or the terminal's profile
diff --git a/tools/agent-isolation/README.md b/tools/agent-isolation/README.md
index 228f915f..8d44b694 100644
--- a/tools/agent-isolation/README.md
+++ b/tools/agent-isolation/README.md
@@ -34,7 +34,7 @@ versions.
| [`sandbox-error-hint.sh`](sandbox-error-hint.sh) | Claude Code `PostToolUse`
hook (Bash matcher). Scans the tool's stdout + stderr for the three known
sandbox-shaped error signatures (SSH agent / Yubikey unreachable, loopback
port-bind blocked, docker / podman socket denied) and prints a `[sandbox-hint]`
line pointing at the matching entry in
[`docs/setup/sandbox-troubleshooting.md`](../../docs/setup/sandbox-troubleshooting.md).
Fail-open: any unexpected JSON shape exits silent. Recomm [...]
| [`sandbox-status-line.sh`](sandbox-status-line.sh) | Claude Code
`statusLine` helper. Renders `<model> [sandbox]` (green) or `<model> [NO
SANDBOX]` (bold red) based on `sandbox.enabled` in the active settings —
project `settings.local.json` first, then project `settings.json`, then
user-scope, mirroring Claude Code's own precedence. Reflects in-session
`/sandbox` toggles (which persist to project `settings.local.json`).
Recommended user-scope. |
| [`sandbox-status-line-rich.sh`](sandbox-status-line-rich.sh) | Opt-in richer
alternative to `sandbox-status-line.sh`. Same sandbox-state detection, plus
folder name (hash-coloured), git branch + dirty + ahead/behind, per-branch PR
title (cached, gated by `gh`), and a yellow `[sandbox-auto]` tag for the
`autoAllowBashIfSandboxed` setting. Wire one *or* the other into
`statusLine.command`. |
-| [`claude-term-bg.sh`](claude-term-bg.sh) | **Opt-in quality-of-life helper
(not a security control).** Keeps a calm baseline background and tints it only
when Claude genuinely wants you to act (never while working), so a window
you've tabbed away from can't sit blocked unnoticed. Two states across five
hooks: `Stop` → `wait` (turn finished, tint); `UserPromptSubmit` +
`SessionStart` + `PreToolUse` → `reset` (calm — `PreToolUse` keeps it calm
while a tool runs and clears an approved-per [...]
+| [`claude-term-bg.sh`](claude-term-bg.sh) | **Opt-in quality-of-life helper
(not a security control).** Keeps a calm baseline background and tints it only
when Claude genuinely wants you to act (never while working, and never when it
merely *finished* a turn), so a window you've tabbed away from can't sit
blocked unnoticed. Distinguishes "blocked on a decision" from "finished and
idle" — which look identical at the `Stop` event — via three signals across six
hooks: `Stop` → `stop` (heur [...]
| [`sandbox-add-project-root.sh`](sandbox-add-project-root.sh) | Adds the
current adopter repo's project root (and, with `--all-worktrees`, every linked
git worktree's working dir) as an explicit absolute path to
`sandbox.filesystem.allowRead` and `allowWrite` in the project-local,
gitignored `<repo>/.claude/settings.local.json` — one entry per worktree, each
in that worktree's own settings file. Defensive against [issue
#197](https://github.com/apache/airflow-steward/issues/197) — `allo [...]
| [`git-global-post-checkout.sh`](git-global-post-checkout.sh) | Universal
`post-checkout` git hook installed at `~/.claude/git-hooks/post-checkout` when
the operator picks **whole-user** scope in `setup-isolated-setup-install`.
Activated by `git config --global core.hooksPath ~/.claude/git-hooks/` so every
`git checkout` / `git clone` / `git worktree add` across the host invokes it.
Two responsibilities (both best-effort + idempotent + `\|\| true`): (1) `setup
verify --auto-fix-symlinks [...]
diff --git a/tools/agent-isolation/claude-term-bg.sh
b/tools/agent-isolation/claude-term-bg.sh
index d8f730f1..0ef1d89d 100755
--- a/tools/agent-isolation/claude-term-bg.sh
+++ b/tools/agent-isolation/claude-term-bg.sh
@@ -20,16 +20,52 @@
# waiting on YOU, and keep it calm the rest of the time.
#
# This is a quality-of-life helper, NOT a security control: it makes the
-# "Claude is blocked on me" state impossible to miss in a window you've
-# tabbed away from. Wire it into five Claude Code hooks (user-scope):
+# "Claude is BLOCKED ON A DECISION I have to make" state impossible to miss in
+# a window you've tabbed away from — while deliberately NOT tinting when Claude
+# merely finished a task and is idle until you start the next thing. Those two
+# look identical at the `Stop` event, so the script distinguishes them with
+# three genuine "Claude is asking you for something" signals (two exact, one
+# heuristic) and stays calm for everything else. Wire it into six hooks
+# (user-scope):
#
-# "Stop" -> claude-term-bg.sh wait (turn finished — your
turn, tint)
+# "Stop" -> claude-term-bg.sh stop (turn ended: TINT only if
the final
+# assistant message is
genuinely a
+# question/request; stay
calm if it reads
+# as a completion —
heuristic, see below)
+# "PreToolUse" -> claude-term-bg.sh wait (matcher: AskUserQuestion
— a structured
+# question was posed;
EXACT signal)
+# "PostToolUse" -> claude-term-bg.sh reset (matcher: * — a tool
finished: calm while
+# working, and clears the
tint the instant
+# you approve a permission
prompt or answer
+# an AskUserQuestion)
+# "Notification" -> claude-term-bg.sh notify (TINT for permission
prompts only — EXACT;
+# the plain 60s idle ping
is a NO-OP so it
+# can't wipe a pending
question's tint)
# "UserPromptSubmit" -> claude-term-bg.sh reset (you replied — back to
calm)
# "SessionStart" -> claude-term-bg.sh reset (fresh session — clear
any stale tint)
-# "PreToolUse" -> claude-term-bg.sh reset (actively working — stay
calm; also
-# clears an
approved-permission tint)
-# "Notification" -> claude-term-bg.sh notify (tint for permission
prompts only;
-# the plain idle ping
stays calm)
+#
+# Two design notes that make the distinction hold:
+#
+# * Why PostToolUse (not PreToolUse) clears the "you just acted" tint:
+# PreToolUse fires BEFORE the permission prompt is shown, so it cannot
clear
+# a tint the prompt itself sets. PostToolUse fires AFTER the tool completes
+# — the moment your approval/selection lets work resume — so it is the hook
+# that returns the screen to calm once you have acted. (This is also why
the
+# general PreToolUse reset is gone: PreToolUse is now reserved for the
+# AskUserQuestion->wait tint, and PostToolUse alone keeps work calm.)
+# * Why the idle ping is a no-op (not a reset): the `Notification` hook also
+# fires the plain "waiting for your input" ping ~60s after a turn ends. If
+# that reset the background, a turn that ended on a genuine question
(tinted
+# by `stop`) would silently go calm after a minute. Leaving it untouched
+# preserves whatever state `stop`/`notify` decided.
+#
+# The `stop` heuristic (the only non-exact signal): it reads the LAST assistant
+# text message from the session transcript (path arrives on stdin in the Stop
+# hook payload) and tints when that message ends as a question ("...?") or with
+# a strong trailing request ("want me to", "would you like", "should I", "OK
+# to", "your call", ...). A statement-shaped completion ("Done.", "All set.")
+# stays calm. Needs python3/python on PATH; if absent, `stop` defaults to calm
+# (only the two EXACT signals tint).
#
# Colours are overridable via the environment (e.g. inline in the hook
command):
# CLAUDE_WAIT_BG background while waiting (default: #2a1a3a, a muted
indigo)
@@ -84,16 +120,86 @@ set_reset() {
fi
}
+# Classify the final assistant message of a transcript as a genuine
+# question/request ("wait") vs a completion ("calm"). Prints one word.
+# Heuristic, anchored on the trailing sentence so a long completion that
+# merely *mentions* a question earlier still reads as calm.
+classify_last_message() {
+ local transcript="$1" py
+ py=$(command -v python3 2>/dev/null || command -v python 2>/dev/null || true)
+ [ -n "$py" ] && [ -f "$transcript" ] || { echo calm; return; }
+ "$py" - "$transcript" <<'PY' 2>/dev/null || echo calm
+import json, re, sys
+last = ""
+try:
+ with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
+ for line in f:
+ line = line.strip()
+ if not line:
+ continue
+ try:
+ obj = json.loads(line)
+ except Exception:
+ continue
+ if not isinstance(obj, dict):
+ continue
+ msg = obj.get("message") if isinstance(obj.get("message"), dict)
else {}
+ role = msg.get("role") or (obj.get("type") if obj.get("type") ==
"assistant" else None)
+ if role != "assistant":
+ continue
+ content = msg.get("content", obj.get("content"))
+ txt = ""
+ if isinstance(content, list):
+ for b in content:
+ if isinstance(b, dict) and b.get("type") == "text":
+ txt += b.get("text", "")
+ elif isinstance(content, str):
+ txt = content
+ if txt.strip():
+ last = txt
+except Exception:
+ print("calm"); sys.exit(0)
+
+t = re.sub(r'[`*_>#\s]+$', '', last.strip()) # drop trailing markdown/space
+if not t:
+ print("calm"); sys.exit(0)
+tail = t.splitlines()[-1].strip()
+blob = t.lower()[-240:]
+ask = tail.endswith("?") or any(p in blob for p in (
+ "want me to", "would you like", "should i ", "shall i ", "do you want",
+ "ok to ", "okay to ", "your call", "which would you", "let me know which",
+ "let me know if you'd like", "want that too", "proceed with",
+))
+print("wait" if ask else "calm")
+PY
+}
+
case "$action" in
wait|set) set_wait ;;
reset|off) set_reset ;;
+ stop)
+ # Turn ended. Tint ONLY if the final assistant message is genuinely a
+ # question/request; a completion-shaped message stays calm. This is what
+ # separates "Claude is asking me something" from "Claude finished — I'll
+ # start the next thing whenever." Payload (with transcript_path) on stdin.
+ payload=$(cat 2>/dev/null)
+ transcript=$(printf '%s' "$payload" \
+ | sed -n
's/.*"transcript_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
+ | head -n1)
+ case "$transcript" in "~/"*) transcript="$HOME/${transcript#\~/}" ;; esac
+ case "$(classify_last_message "$transcript")" in
+ wait) set_wait ;;
+ *) set_reset ;;
+ esac
+ ;;
notify)
# Notification fires for permission prompts AND the plain 60s idle ping.
- # Tint only when it's something that genuinely wants the user to act.
+ # Tint for a genuine permission request; leave the background UNTOUCHED on
+ # the idle ping so it can't wipe a question-tint set by `stop`.
msg=$(cat 2>/dev/null)
case "$msg" in
*permission*|*approve*|*"needs your"*) set_wait ;;
- *) set_reset ;;
+ *) : ;; # idle ping / other — no-op
esac
;;
esac