Thanks for reaching out, Muteb.

I’m not familiar with HL7 (though a Google search reveals that it is a standard 
in healthcare) and ACBP is new to me (though I’m familiar with business 
rules/constraint languages), so forgive me if I’m a little slow to comprehend 
this.

I would like to figure out whether an integration between ACBP and Calcite 
would be a module within Calcite or an external project that uses various 
Calcite services (such as RelBuilder and the dialects).

Could the requirement be written in the form of a test case, something like 
this?

  String config = "…"; // a string in ACBP's configuration language
  String query = "…"; // a string in ACBP's query language
  SqlDialect dialect = …; // target dialect
  String sql = acbpToSql(config, query, dialect);
  assertThat(sql, is("…"));

I have assumed that ACBP has a configuration language (akin to SQL DDL, 
defining rules) and a runtime language (akin to SQL queries and DML), and that 
the translator is stateless. Let me know whether these assumptions are valid.

If the formulation as test cases is appropriate, I would like to see a few test 
cases, ranging from simple to moderately hard to implement.

Julian

PS If you are not subscribed yet, please subscribe ASAP. Until then, you can 
see replies in the email archive, 
https://lists.apache.org/thread/fcpko5d8orq5m3los0n65xhg3p1vn2rq.



> On Aug 19, 2025, at 3:27 PM, Muteb Alanazi <dotk...@outlook.com> wrote:
> 
> Hi Calcite devs,
> 
> I’m exploring Apache Calcite as the SQL generator for ACBP, a policy compiler 
> that turns categorical/boolean models into deterministic, auditable SQL 
> artifacts (views/CTAS) with formal checks (soundness/coverage/dedup).
> 
> Public overview (no private IP):
> - Overview: https://dotkboy-web.github.io/acbp/
> - Theorems: https://dotkboy-web.github.io/acbp/acbp_theorems.html
> - ACBP Equation: https://dotkboy-web.github.io/acbp/ACBP-Equation.html
> - DOI snapshot: https://zenodo.org/records/16891549
> 
> Why ACBP? (relevance to Calcite)
> - Realistic workload: CASE-heavy categorical logic over clean tables, no 
> vendor-specific UDFs → ideal for multi-dialect unparse coverage.
> - Deterministic semantics: decisions must be identical across dialects → 
> strong target for correctness/golden tests of RelToSql/dialects.
> - Governance-first: compiler proves properties (soundness/coverage/dedup) and 
> keeps decisions auditable; portability matters for users running multiple 
> engines.
> - Clean separation: ACBP owns the IR (categories/flags/rules); Calcite prints 
> dialect SQL. Any gaps can become small PRs in sql.dialect.
> - Give back: happy to contribute a tiny HL7 suite of golden tests (input plan 
> → expected strings) and open JIRAs for dialect gaps.
> 
> Why this helps HL7/HIS (benefits & what’s novel)
> - Operational: deterministic routing and noise control (silencing windows, 
> dedup, escalation ladders) encoded as SQL CASE/joins.
> - Governance/audit: soundness/coverage/dedup checks run in-DB; every action 
> has explainable predicates + policy version for replay.
> - Engineering: one IR, many dialects (PostgreSQL/BigQuery/ClickHouse/Spark) 
> via Calcite; DDL templated per dialect.
> - KPIs: fewer duplicate/low-value alerts, faster policy iteration with 
> guardrails, and stable sub-second P95 on “what to do now” boards.
> - Novelty: not “rules in healthcare” per se; rather, the combination of a 
> SQL-native policy compiler + formal coverage/soundness/dedup + decision-space 
> enumeration + multi-dialect generation is distinct from interface/rules 
> engines.
> 
> Goal
> Given an HL7-derived model (categories + flags + deterministic rules), build 
> a Calcite plan once and unparse to multiple SQL dialects to produce “decision 
> space” artifacts, preserving ACBP semantics.
> 
> What “ACBP semantics” means here
> - Categories (finite enums) + derived boolean flags drive a single 
> deterministic CASE that yields an action (throttle/alert/escalate/...).
> - The compiler guarantees governance properties and keeps decisions auditable 
> in-DB.
> - SELECT/CASE is the portable core; DDL (CTAS/REPLACE) is wrapped by small 
> templates per dialect.
> 
> HL7 → categories/flags (representative mapping)
> - ADT: message_type='ADT', trigger_event ∈ {A01,A04,A08,A03}, patient_class ∈ 
> {E,I,O}
>  Flags: is_admission := trigger_event IN ('A01','A04'); is_discharge := 
> trigger_event='A03'; is_emergency := patient_class='E'.
> - ORM: message_type='ORM', order_priority ∈ {STAT,ROUTINE}, order_type 
> (LAB/RX/PROC)
>  Flags: is_stat_order := order_priority='STAT'; is_critical_order := 
> order_type IN ('LAB','PROC').
> - ORU: message_type='ORU', obs_abn_flag (OBX-8: H/L/N), loinc_code
>  Flags: abnormal_result := obs_abn_flag IN ('H','L'); critical_analyte := 
> loinc_code IN ref_critical_loinc.
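[As an aside: the flag derivations above are simple enough to sanity-check in plain Java. This is my own illustration of the mapping, not ACBP code; the record type `Hl7Message` and the method names are invented for the sketch.]

```java
import java.util.Set;

// Illustrative sketch of the HL7 -> flags mapping described above.
// Hl7Message and the flag methods are my own names, not ACBP's.
public class FlagDerivation {
    record Hl7Message(String messageType, String triggerEvent,
                      String patientClass, String orderPriority,
                      String obsAbnFlag) {}

    // is_admission := trigger_event IN ('A01','A04')
    static boolean isAdmission(Hl7Message m) {
        return Set.of("A01", "A04").contains(m.triggerEvent());
    }

    // is_discharge := trigger_event = 'A03'
    static boolean isDischarge(Hl7Message m) {
        return "A03".equals(m.triggerEvent());
    }

    // is_emergency := patient_class = 'E' (PV1-2)
    static boolean isEmergency(Hl7Message m) {
        return "E".equals(m.patientClass());
    }

    // abnormal_result := obs_abn_flag IN ('H','L') (OBX-8)
    static boolean abnormalResult(Hl7Message m) {
        return Set.of("H", "L").contains(m.obsAbnFlag());
    }

    public static void main(String[] args) {
        Hl7Message adt = new Hl7Message("ADT", "A01", "E", null, null);
        System.out.println(isAdmission(adt) && isEmergency(adt)); // prints "true"
    }
}
```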
> 
> MWE (shape representative of ACBP output)
> -- Schema (selected fields; staging joins handled upstream)
> CREATE TABLE hl7_messages (
>  msg_id BIGINT,
>  event_ts TIMESTAMP,
>  message_type VARCHAR,     -- ADT, ORM, ORU
>  trigger_event VARCHAR,    -- A01, O01, R01, ...
>  patient_class VARCHAR,    -- PV1-2: E/I/O
>  order_priority VARCHAR,   -- STAT, ROUTINE
>  loinc_code VARCHAR,
>  obs_abn_flag VARCHAR      -- OBX-8: H/L/N
> );
> 
> -- Deterministic policy (CASE over categories/flags):
> SELECT
>  msg_id,
>  event_ts,
>  CASE
>    WHEN (message_type = 'ORM' AND order_priority = 'STAT'
>          AND obs_abn_flag IN ('H','L')) THEN 3                    -- escalate
>    WHEN (message_type = 'ORU' AND obs_abn_flag IN ('H','L')
>          AND loinc_code IN (SELECT code FROM ref_critical_loinc)) THEN 3
>    WHEN (message_type = 'ORU' AND obs_abn_flag IN ('H','L')) THEN 2 -- alert
>    WHEN (message_type = 'ADT' AND trigger_event IN ('A01','A04')
>          AND patient_class = 'E') THEN 2
>    ELSE 1                                                          -- throttle
>  END AS action_id
> FROM hl7_messages
> WHERE event_ts >= CURRENT_TIMESTAMP - INTERVAL '2' DAY;
> 
> Smallest viable integration surface
> 1) Build relational algebra with RelBuilder (scan/filter/project/CASE; no 
> non-portable ops).
> 2) Convert RelNode → SqlNode via RelToSqlConverter.
> 3) Unparse with SqlDialect + SqlPrettyWriter per target dialect.
> 4) Keep DDL templated per dialect (CTAS/OR REPLACE varies); the SELECT is the 
> portable core.
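[For concreteness, steps 1–3 might look roughly like the following against a recent Calcite. This is a hedged sketch, not a tested program: the schema wiring, the CASE construction via `SqlStdOperatorTable.CASE`, and the exact `visitRoot(...).asStatement()` call are my best reading of the API, and the real plan would carry the full column list and policy.]

```java
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.rel2sql.RelToSqlConverter;
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.rel.type.RelDataTypeFactory;
import org.apache.calcite.schema.SchemaPlus;
import org.apache.calcite.schema.impl.AbstractTable;
import org.apache.calcite.sql.SqlDialect;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.dialect.PostgresqlSqlDialect;
import org.apache.calcite.sql.fun.SqlStdOperatorTable;
import org.apache.calcite.sql.type.SqlTypeName;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;
import org.apache.calcite.tools.RelBuilder;

public class AcbpToSqlSketch {
  static String generate() {
    // Assumed wiring: register a minimal HL7_MESSAGES table so the scan
    // resolves. Column list abbreviated from the MWE schema.
    SchemaPlus root = Frameworks.createRootSchema(true);
    root.add("HL7_MESSAGES", new AbstractTable() {
      @Override public RelDataType getRowType(RelDataTypeFactory tf) {
        return tf.builder()
            .add("MSG_ID", SqlTypeName.BIGINT)
            .add("MESSAGE_TYPE", SqlTypeName.VARCHAR)
            .add("OBS_ABN_FLAG", SqlTypeName.VARCHAR)
            .build();
      }
    });
    FrameworkConfig config =
        Frameworks.newConfigBuilder().defaultSchema(root).build();

    // Step 1: build the plan with RelBuilder (scan/filter/project; the
    // deterministic CASE is an ordinary RexCall).
    RelBuilder b = RelBuilder.create(config);
    RelNode rel = b.scan("HL7_MESSAGES")
        .filter(b.call(SqlStdOperatorTable.EQUALS,
            b.field("MESSAGE_TYPE"), b.literal("ORU")))
        .project(
            b.field("MSG_ID"),
            b.alias(
                b.call(SqlStdOperatorTable.CASE,
                    b.call(SqlStdOperatorTable.EQUALS,
                        b.field("OBS_ABN_FLAG"), b.literal("H")),
                    b.literal(2),   // alert
                    b.literal(1)),  // throttle
                "ACTION_ID"))
        .build();

    // Steps 2-3: RelNode -> SqlNode, then unparse for the target dialect.
    SqlDialect dialect = PostgresqlSqlDialect.DEFAULT;
    SqlNode node = new RelToSqlConverter(dialect).visitRoot(rel).asStatement();
    return node.toSqlString(dialect).getSql();
  }

  public static void main(String[] args) {
    System.out.println(generate());
  }
}
```

Swapping `PostgresqlSqlDialect.DEFAULT` for another dialect constant is, as I understand it, all that changes per target.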
> 
> Notes on portability
> - I can express the time filter as DATEADD/TIMESTAMP_SUB if a dialect prefers 
> that form.
> - I avoid bitwise logic in the core SELECT; boolean flags are explicit 
> predicates.
> - Packed masks (if needed for analytics/storage) can be materialized as a 
> follow-up step per dialect.
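[Step 4 and the portability notes might reduce to something this small: keep the Calcite-unparsed SELECT portable and wrap it in tiny per-dialect DDL templates. The template strings below are illustrative guesses at each engine's CTAS form, not verified syntax.]

```java
import java.util.Map;

// Hedged sketch: per-dialect DDL templating around a portable SELECT.
// Template bodies are illustrative, not verified against each engine.
public class DdlTemplates {
    static final Map<String, String> CTAS = Map.of(
        "postgresql", "CREATE OR REPLACE VIEW %s AS%n%s",
        "bigquery",   "CREATE OR REPLACE TABLE %s AS%n%s",
        "clickhouse", "CREATE TABLE %s ENGINE = MergeTree ORDER BY tuple() AS%n%s",
        "spark",      "CREATE OR REPLACE TABLE %s USING parquet AS%n%s");

    static String wrap(String dialect, String name, String select) {
        String template = CTAS.get(dialect);
        if (template == null) {
            throw new IllegalArgumentException("no template for " + dialect);
        }
        return String.format(template, name, select);
    }

    public static void main(String[] args) {
        String select = "SELECT msg_id FROM hl7_messages";
        System.out.println(wrap("postgresql", "acbp_decisions", select));
    }
}
```

The dialect-specific time-filter forms (DATEADD/TIMESTAMP_SUB) could be handled the same way if they are not pushed into the Calcite plan.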
> 
> Questions
> 1) Is RelBuilder → RelToSqlConverter → SqlDialect the recommended path for 
> multi-dialect generation today? Any caveats?
> 2) For interval arithmetic, would you represent a generic 
> DATEADD/TIMESTAMP_SUB in the IR and let SqlDialect map it?
> 3) Guidance for ensuring CASE-heavy categorical logic unparses idiomatically 
> across Postgres/BigQuery/ClickHouse/Spark?
> 4) If we hit a dialect gap, what’s the smallest PR you prefer (tests + 
> changes in org.apache.calcite.sql.dialect.*)?
> 
> I can provide a tiny repo with:
> - input plan construction (RelBuilder),
> - expected SQL strings per dialect (golden tests),
> - and JIRAs for any concrete gaps we uncover.
> 
> Thanks,
> Muteb (DotK)
> Asia/Riyadh
> 
> Note: I’m awaiting subscription confirmation; please keep me CC’d on replies.
