Re: [PR] add Privacy-LLM gate-check validator [airflow-steward]

via GitHub Sun, 24 May 2026 17:31:43 -0700


justinmclean commented on code in PR #215:
URL: https://github.com/apache/airflow-steward/pull/215#discussion_r3295591107



##########
tools/skill-validator/src/skill_validator/__init__.py:
##########
@@ -570,6 +590,61 @@ def validate_principle_compliance(path: Path, text: str) -> Iterable[Violation]:
         )
 
 
+# ---------------------------------------------------------------------------
+# Privacy-LLM gate-check (write-skill/security-checklist.md § Pattern 6)
+# ---------------------------------------------------------------------------
+
+
+def validate_privacy_patterns(path: Path, text: str) -> Iterable[Violation]:
+    """Check Privacy-LLM gate-check convention from ``write-skill/security-checklist.md``.
+
+    **Pattern 6** *(SKILL.md only)*: skills whose ``mode`` implies processing
+    external / attacker-controlled content **and** that *read full issue bodies*
+    from the private ``<tracker>`` repo must invoke the Privacy-LLM gate-check
+    (``privacy-llm-check``) before making any outbound LLM call.
+
+    Three conditions must all be true to trigger the check:
+
+    1. The file is a ``SKILL.md`` entry point.
+    2. The ``mode`` frontmatter field is one of the external-content modes
+       (``Triage``, ``Mentoring``, ``Drafting``).
+    3. The skill both references ``<tracker>`` **and** contains ``gh issue view``
+       — the command that fetches full issue bodies (embargoed CVE detail,
+       reporter PII, etc.).  Skills that only write to / query metadata from
+       the tracker (create an issue, list milestones, search titles) are exempt
+       because they never pass private issue body content to the model.
+
+    All violations are **SOFT** — advisory, surfaced as warnings without
+    failing the run unless ``--strict`` is passed.
+    """
+    # Pattern 6 is only relevant for SKILL.md entry points.
+    if path.name != "SKILL.md":
+        return
+
+    fm = parse_frontmatter(text) or {}
+    mode = fm.get("mode", "")
+    if mode not in _EXTERNAL_CONTENT_MODES:
+        return
+
+    # Only flag skills that both reference the tracker AND read full issue bodies.
+    if _TRACKER_PLACEHOLDER not in text:
+        return
+    if _TRACKER_READ_PHRASE not in text:
+        return
+
+    if _PRIVACY_LLM_GATE_PHRASE not in text:

Review Comment:
   The validator now only counts `privacy-llm-check` when it appears in a fenced
   block under a real section whose heading starts with `Prerequisites`,
   `Preflight`, or `Step 0`. It also ignores fenced blocks under anti-example
   headings such as `Don't do this`, `Bad example`, `Wrong`, or `Anti-example`.

   I added regression coverage for:

   - a fenced command in `## History` does not satisfy the gate
   - a command after a `## Step 0` section, but under a sibling `## History`,
     does not satisfy the gate
   - `## Appendix: Step 0 from an older version` does not satisfy the gate
   - `## Step 0` / `### Bad example` does not satisfy the gate
   - real `Step 0` / `Prerequisites` fenced commands still satisfy the gate

   I also verified against the existing security skills; their gate calls are
   under `## Step 0 — Pre-flight check`, and `skill-validate` passes cleanly.
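
For readers following along, here is a minimal sketch of the kind of heading-aware scan the comment describes. It is not the code from the PR: the names `gate_is_satisfied`, `GATE_PREFIXES`, and `ANTI_PREFIXES` are hypothetical. The sketch also assumes that heading matching is case-insensitive, that a fenced block in a nested subsection of a gate section still counts, and that fences are plain backtick or tilde fences.

```python
import re

# Hypothetical names; the actual validator may structure this differently.
GATE_PHRASE = "privacy-llm-check"
GATE_PREFIXES = ("prerequisites", "preflight", "step 0")
ANTI_PREFIXES = ("don't do this", "bad example", "wrong", "anti-example")

_HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def gate_is_satisfied(text: str) -> bool:
    """Return True if the gate phrase sits in a fenced block under a gate heading."""
    stack = []  # (level, lowercased title) for the enclosing headings
    in_fence = False
    for line in text.splitlines():
        if line.strip().startswith(("```", "~~~")):
            in_fence = not in_fence  # opening or closing fence line
            continue
        if in_fence:
            if GATE_PHRASE in line:
                titles = [title for _, title in stack]
                gated = any(t.startswith(GATE_PREFIXES) for t in titles)
                anti = any(t.startswith(ANTI_PREFIXES) for t in titles)
                if gated and not anti:
                    return True
            continue  # never treat '#' lines inside a fence as headings
        match = _HEADING_RE.match(line)
        if match:
            level = len(match.group(1))
            # A heading closes every open section at the same or deeper level.
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, match.group(2).lower()))
    return False


FENCE = "`" * 3  # built at runtime so this example nests cleanly in markdown
good = "\n".join(["## Step 0", FENCE + "bash", "privacy-llm-check", FENCE])
bad = "\n".join(["## Step 0", "### Bad example", FENCE, "privacy-llm-check", FENCE])
print(gate_is_satisfied(good), gate_is_satisfied(bad))  # True False
```

Tracking a stack of open headings, rather than only the most recent one, is what lets a sibling `## History` cancel an earlier `## Step 0` while a `### Bad example` nested under `## Step 0` is still recognised as an anti-example.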



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

