+1. This is a great direction. Providing reusable, consistent metric definitions will really help bridge the semantic gap and make Spark more accessible for analytical and LLM-driven use cases.
On Thu, Nov 6, 2025 at 10:47, Wenchen Fan <[email protected]> wrote:

> Hi Anand,
>
> Thanks for the review! You are right that making metric views the single
> source of truth brings a better user experience, and also allows better
> governance and LLM integration. However, I think how to handle the
> integration is out of scope for Spark, and should be taken care of by the
> vendors who provide this infrastructure.
>
> Thanks,
> Wenchen
>
> On Thu, Nov 6, 2025 at 9:57 AM Anand Chinnakannan <[email protected]>
> wrote:
>
>> Hi Team,
>>
>> I’ve reviewed and updated our Metric View proposal (Q1–Q8) to include two
>> key enhancements:
>>
>> 🔐 Governance Integration: Metric Views are now treated as governed,
>> first-class catalog objects with access control, lineage tracking, and
>> versioning, ensuring metrics remain secure, auditable, and consistently
>> defined.
>>
>> 🤖 LLM Agent Integration: Added guidance on how LLMs and AI agents can
>> discover and query metric views through catalog metadata for consistent,
>> governed responses to natural-language queries.
>>
>> These updates align with our goal of making Metric Views the single
>> source of truth for analytical and AI-driven use cases.
>>
>> I’d love your input on these sections, especially around:
>>
>> 1. Any additional governance scenarios we should consider.
>> 2. LLM integration edge cases or optimization ideas.
>> 3. Suggestions for examples, syntax, or long-term roadmap points.
>>
>> Please feel free to add comments, edits, or examples directly in the
>> document, or share your thoughts in reply.
>> Your contributions will help us finalize a stronger, more complete
>> proposal for review.
>>
>> Thank you for your time and collaboration. Looking forward to your
>> insights!
>>
>> Best regards,
>> Anand Chinnakannan
>> Staff Data Scientist | Walmart
>> Executive MBA Candidate, Quantic School of Business & Technology
>> 📧 [email protected]
>>
>> On Thu, Nov 6, 2025, 10:27 AM Wenchen Fan <[email protected]> wrote:
>>
>>> Thanks for the proposal! I believe this is a very useful feature, as the
>>> other alternatives do not work well: people need to either define many
>>> similar views with different grouping columns and aggregate functions, or
>>> manually maintain a doc page describing the semantics of these metrics,
>>> which people must follow when writing queries to calculate them.
>>>
>>> Shall we start the vote next week if there are no objections?
>>>
>>> On Fri, Oct 31, 2025 at 2:30 PM Linhong Liu
>>> <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I would like to propose introducing "The metrics & semantic modeling in
>>>> Spark".
>>>>
>>>> This feature enables defining business metrics once and reusing them
>>>> across any breakdown, ensuring consistent outcomes and bridging the
>>>> semantic gap between business logic and data schemas to help LLMs generate
>>>> more precise results.
>>>>
>>>> Looking forward to your feedback!
>>>>
>>>> JIRA: SPARK-54119 <https://issues.apache.org/jira/browse/SPARK-54119>
>>>> SPIP docs:
>>>> https://docs.google.com/document/d/1xVTLijvDTJ90lZ_ujwzf9HvBJgWg0mY6cYM44Fcghl0/edit?tab=t.0#heading=h.4iogryr5qznc
>>>>
>>>> Thanks,
>>>> Linhong
