+1

This is a great direction. Providing reusable, consistent metric
definitions will really help bridge the semantic layer and make Spark more
accessible for analytical and LLM-driven use cases.

On Thu, Nov 6, 2025 at 10:47, Wenchen Fan <[email protected]> wrote:

> Hi Anand,
>
> Thanks for the review! You are right that making metric views the single
> source of truth brings a better user experience, and also allows better
> governance and LLM integration. However, I think how to handle the
> integration is out of scope for Spark, and should be taken care of by the
> vendors who provide this infrastructure.
>
> Thanks,
> Wenchen
>
> On Thu, Nov 6, 2025 at 9:57 AM Anand Chinnakannan <[email protected]>
> wrote:
>
>> Hi Team,
>>
>> I’ve reviewed and updated our Metric View proposal (Q1–Q8) to include two
>> key enhancements:
>>
>> 🔐 Governance Integration: Metric Views are now treated as governed,
>> first-class catalog objects with access control, lineage tracking, and
>> versioning — ensuring metrics remain secure, auditable, and consistently
>> defined.
>>
>> 🤖 LLM Agent Integration: Added guidance on how LLMs and AI agents can
>> discover and query metric views through catalog metadata for consistent,
>> governed responses to natural-language queries.
>>
>> These updates align with our goal of making Metric Views the single
>> source of truth for analytical and AI-driven use cases.
>>
>> I’d love your input on these sections — especially around:
>>
>> 1. Any additional governance scenarios we should consider.
>> 2. LLM integration edge cases or optimization ideas.
>> 3. Suggestions for examples, syntax, or long-term roadmap points.
>>
>> Please feel free to add comments, edits, or examples directly in the
>> document, or share your thoughts in reply.
>> Your contributions will help us finalize a stronger, more complete
>> proposal for review.
>>
>> Thank you for your time and collaboration — looking forward to your
>> insights!
>>
>> Best regards,
>> Anand Chinnakannan
>> Staff Data Scientist | Walmart
>> Executive MBA Candidate, Quantic School of Business & Technology
>> 📧 [email protected]
>>
>> On Thu, Nov 6, 2025, 10:27 AM Wenchen Fan <[email protected]> wrote:
>>
>>> Thanks for the proposal! I believe this is a very useful feature, as the
>>> other alternatives do not work well: people need to either define many
>>> similar views with different grouping columns and aggregate functions, or
>>> manually maintain a doc page to describe the semantics of these metrics that
>>> people need to follow when writing queries to calculate these metrics.
>>>
>>> Shall we start the vote next week if there are no objections?
>>>
>>> On Fri, Oct 31, 2025 at 2:30 PM Linhong Liu
>>> <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I would like to propose introducing "The metrics & semantic modeling in
>>>> Spark".
>>>>
>>>> This feature enables defining business metrics once and reusing them
>>>> across any breakdown, ensuring consistent outcomes and bridging the
>>>> semantic gap between business logic and data schemas to help LLMs generate
>>>> more precise results.
>>>>
>>>> Looking forward to your feedback!
>>>>
>>>> JIRA: SPARK-54119 <https://issues.apache.org/jira/browse/SPARK-54119>
>>>> SPIP docs:
>>>> https://docs.google.com/document/d/1xVTLijvDTJ90lZ_ujwzf9HvBJgWg0mY6cYM44Fcghl0/edit?tab=t.0#heading=h.4iogryr5qznc
>>>>
>>>> Thanks,
>>>> Linhong
>>>>
>>>
