[Metamath] Re: Introducing ProofScaffold: A Modular, AI-Friendly Toolchain for Metamath

Glauco Wed, 11 Feb 2026 05:07:50 -0800

Hi Mingli Yuan,

I've given Gemini your README.md and asked it to prove mp2 with your 
framework. Below is the Python code that Gemini returned.
Is it completely hallucinated, or is it close?


from logic.propositional.hilbert import System
from logic.propositional.hilbert.lemmas import Proof, ProofBuilder

def prove_mp2(sys: System) -> Proof:
    """
    Double Modus Ponens: φ, ψ, (φ → (ψ → χ)) ⊢ χ
    """
    # 1. Initialize the builder with the system and theorem name
    lb = ProofBuilder(sys, "mp2")

    # 2. Define Hypotheses (mp2.1, mp2.2, mp2.3)
    h_phi = lb.hyp("mp2.1", "φ")
    h_psi = lb.hyp("mp2.2", "ψ")
    h_chi_nested = lb.hyp("mp2.3", "φ → (ψ → χ)")

    # 3. Apply the first Modus Ponens (Step 4 in .mmp)
    # This uses h_phi and h_chi_nested to get (ψ → χ)
    step4 = lb.mp("s1", h_phi, h_chi_nested, note="Step 4: MP h_phi, 
h_chi_nested")

    # 4. Apply the second Modus Ponens (Step 5 in .mmp)
    # This uses h_psi and step4 to get χ
    step5 = lb.mp("s2", h_psi, step4, note="Step 5: MP h_psi, s1")

    # 5. Return the build targeting the final result
    return lb.build(step5)

BR
Glauco

Il giorno mercoledì 11 febbraio 2026 alle 08:05:05 UTC+1 [email protected] 
ha scritto:

> Dear Metamath Community,
>
> I am writing to share an open-source project we have been working on 
> called ProofScaffold (https://github.com/epistemic-frontier/proof-scaffold
> ).
>
> Our team has deep respect for set.mm and the rigorous foundation this 
> community has built. However, as we explore the intersection of Metamath 
> and Large Language Models (LLMs), we’ve encountered a persistent challenge: 
> feeding a 48MB monolithic file to an AI often leads to context dilution, 
> hallucinated imports, and noisy proof searches.
>
> To solve this, we built ProofScaffold. It acts as a package manager and 
> linker (written in Python) for Metamath. It allows us to split massive 
> databases into composable, compilable, and independently verifiable 
> packages with explicit dependencies and exports—much like Cargo or NPM, but 
> for formal math.
>
> Crucially, the Trust Computing Base (TCB) does not change. Python acts 
> strictly as an untrusted builder/linker. The final output is a standard, 
> flattened .mm transient monolith that is verified by metamath-exe or 
> metamath-knife.
>
> We believe this modularity is the key to unlocking AI's true potential in 
> formal mathematics:
>
> - Targeted Retrieval: By scoping the context to a specific package (e.g., 
> just fol or zf), we drastically reduce noise for the LLM.
>
> - Controlled Semantic Boundaries: Explicit exports provide the AI with a 
> strict subset of permissible symbols and axioms. This prevents hallucinated 
> reasoning, such as an AI accidentally employing the Axiom of Choice in a 
> strict ZF-only context.
>
> - Verifiable Incremental Loops: An AI can generate a proof, verify it 
> locally within the package, and map any verifier errors directly back to 
> the specific package contract (e.g., missing $f or label conflicts). This 
> makes AI self-correction much more reliable.
>
> - Curriculum Alignment: Modular packages naturally form a curriculum 
> (axioms → definitions → lemmas → theorems), providing high-quality gradient 
> data for training models, rather than overwhelming them with the entire 
> set.mm at once.
>
> We have successfully ported a prelude and are currently porting the logic 
> package (aligned with the |- conventions of set.mm). Our next step is to 
> further subdivide logic into prop, fol, eq, and setvar packages, and to 
> generate machine-readable interface manifests to help AI planners.
>
> A Question for the Community regarding PyPI Naming:
>
> To make this ecosystem easily accessible for AI researchers and engineers, 
> we plan to publish these modularized databases as Python packages on PyPI.
>
> I would like to ask if the community is comfortable with us using the 
> metamath- prefix for these packages (e.g., metamath-logic, metamath-zfc). I 
> want to be entirely respectful of the Metamath trademark/legacy and ensure 
> this doesn't cause confusion with official tools. If the community prefers 
> we use a different namespace (e.g., proof-scaffold-logic), we will gladly 
> do so.
>
> I would love to hear your thoughts, feedback, or critiques on this 
> approach.
>
> Best regards,
>
> Mingli Yuan
>

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/metamath/6caee51b-6405-4a23-837e-8d47339b5df7n%40googlegroups.com.

[Metamath] Re: Introducing ProofScaffold: A Modular, AI-Friendly Toolchain for Metamath

Reply via email to