This is an automated email from the ASF dual-hosted git repository.

jshao pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino-site.git


The following commit(s) were added to refs/heads/main by this push:
     new 4757a7c0a Add 2025 summary (#109)
4757a7c0a is described below

commit 4757a7c0a1b821f9c6d0ede343c15e919bec5013
Author: roryqi <[email protected]>
AuthorDate: Tue Jan 6 15:31:59 2026 +0800

    Add 2025 summary (#109)
    
    * Add 2025 summary
    
    * Fix minors
    
    * fix style
    
    * remove community users
---
 blog/2026-01-05-gravitino-2025-summary.mdx | 93 ++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/blog/2026-01-05-gravitino-2025-summary.mdx b/blog/2026-01-05-gravitino-2025-summary.mdx
new file mode 100644
index 000000000..ff41ba262
--- /dev/null
+++ b/blog/2026-01-05-gravitino-2025-summary.mdx
@@ -0,0 +1,93 @@
+---
+title: Apache Gravitino - 2025 Summary
+slug: gravitino-top-level-project
+tags: [apache,gravitino,ASF]
+---
+###
+
+### **Introduction**
+
+2025 was a landmark year for Apache Gravitino. The project not only graduated 
as a Top-Level Project (TLP) but also reached its first major stable release, 
version 1.0.0. Throughout the year, the community focused heavily on 
"Contextual Engineering" and "AI-native" metadata management, introducing 
groundbreaking features like the Model Context Protocol (MCP) server, the Lance 
REST service, and a metadata-driven action system. This article summarizes the 
milestones and achievements of Apa [...]
+
+###
+
+### **Timeline**
+
+Apache Gravitino officially **graduated as an Apache Top-Level Project on June 
3, 2025**, marking a significant maturity milestone.
+
+In 2025, the community released several key versions, including the major 
1.0.0 release and significant feature updates in 0.8.0-incubating, 
0.9.0-incubating, and 1.1.0.
+
+* **2025.01.24: Version 0.8.0-incubating released**
+  * Focused on strengthening AI support with the introduction of the **Model 
Catalog**.
+  * Introduced credential vending for Filesets and new connectors for Flink 
(Iceberg/Paimon).
+* **2025.05.07: Version 0.9.0-incubating released**
+  * Enhanced data governance with a new **Data Lineage interface** 
(OpenLineage compliant).
+  * Added the gcli script for a better CLI experience and improved security 
with privilege refinements.
+* **2025.09.24: Version 1.0.0 released**
+  * The first stable major release, themed "From Metadata Management to 
Contextual Engineering."
+  * Introduced the **Metadata-driven Action System** (including Statistics, 
Policies, and Jobs).
+  * Launched the **MCP (Model Context Protocol) Server**, enabling AI 
Agents/LLMs to interact directly with metadata.
+  * Implemented unified Role-Based Access Control (RBAC) across catalogs.
+* **2025.11.20: Version 1.0.1 released**
+  * A stability release featuring smarter job templates and improved Python 
client support.
+* **2025.12.19: Version 1.1.0 released**
+  * Added the **Lance REST service** to support vector data for AI workloads.
+  * Introduced a Generic Lakehouse Catalog and support for Hive 3 and 
multi-cluster HDFS filesets.
+  * Hardened security for the Iceberg REST service.
+
+### **Key Features & Improvements**
+
+In 2025, Gravitino evolved from a unified catalog to an active metadata 
control plane. Key technical achievements include:
+
+1. **AI & LLM Integration**: The project positioned itself as an AI-native 
catalog by introducing the **Model Catalog** for managing ML models and the 
**MCP Server** to connect AI agents with data context. The addition of the 
**Lance REST service** in v1.1.0 further solidified support for vector datasets.
+2. **Metadata-Driven Actions**: A new framework allowing users to define 
policies (e.g., TTL, compaction) and execute jobs based on metadata, moving 
beyond passive metadata storage.
+3. **Unified Governance & Security**: Full implementation of **RBAC**, 
credential vending for secure data access (S3/GCS/ADLS), and a unified 
authentication flow for Iceberg REST services.
+4. **Ecosystem Expansion**: Broadened support with new connectors (Generic 
Lakehouse, Hive 3, Flink, Paimon) and enhancements to the **GVFS (Gravitino 
Virtual File System)** for unified file management.
+
+### **Community**
+
+The Apache Gravitino community saw explosive growth in 2025, evolving from an 
incubator project into a Top-Level Project (TLP) backed by a rapidly expanding 
global ecosystem.
+
+* **Top-Level Graduation**: On **June 3, 2025**, the project officially 
graduated to an Apache Top-Level Project, a major milestone marking its 
maturity in community health, vendor-neutral governance, and production 
readiness.
+* **Community Growth (Year-over-Year)**:
+  * **Engagement**: GitHub stars increased by over **130%**, ending the year 
above **2,600**. Forks grew by approximately **150%**, reflecting a surge in 
community-led integrations and local developments.
+  * **Contributor Base**: The active developer community expanded by nearly 
**100%**. Recent major releases, such as version 1.1.0, featured contributions 
from **40+ unique developers** representing a wide variety of global 
organizations.
+  * **Development Velocity**: Development pace accelerated significantly, with 
the repository's lifetime total exceeding **3,300 commits**.
+  * **Post-Graduation Committer Growth**: On July 7, 2025, Chenxi Pan was added 
as a Committer. On December 15, 2025, Junda Yang and Yangyang Zhong were added 
as Committers.
+* **Global Presence**: The project established itself as the standard for 
federated metadata through featured presentations at **Community Over Code (NA 
& Asia)** and **QCon Shanghai**, gathering critical production feedback from 
global data engineering teams to shape the future roadmap.
+
+### **Industry Trends in Metadata Management (2026)**
+
+1. **Breaking Lakehouse Silos**: As organizations adopt multiple "open" table 
formats, the risk of "format lock-in" has replaced "vendor lock-in." The trend 
is toward **Universal Lakehouse** architectures that provide a single entry 
point for fragmented data silos.
+2. **The Multimodal AI Explosion**: AI workloads are moving beyond tabular 
data to include massive volumes of unstructured assets (images, video, audio). 
Traditional data stacks are being replaced by **AI-Native Multimodal Stacks** 
that can process complex data types with the same governance as SQL tables.
+3. **Emergence of Data Agents**: AI Agents are becoming the primary consumers 
of data. These agents require "Context Engineering"—a way to use metadata as an 
external brain to discover, understand, and act upon data autonomously.
+4. **Escalating AI Security Risks**: The high-speed nature of AI interactions 
makes traditional static security (RBAC) obsolete. The industry is moving 
toward **Identity-Centric Zero Trust** and **Fine-Grained ABAC** to prevent 
data leakage and ensure model safety.
+
+### **Future Work:**
+
+#### **1. Universal Lakehouse & Format Interoperability**
+
+To solve the data silo problem, Gravitino is expanding its reach to provide a 
unified management layer for the modern Lakehouse.
+
+* **Multi-Format Support**: We will provide first-class support for **Apache 
Iceberg**, **Delta Lake**, **Hudi**, and **Paimon**. By acting as a "Catalog of 
Catalogs," Gravitino allows users to manage multiple formats through a single 
interface, significantly reducing vendor lock-in and simplifying cross-format 
governance.
+
+#### **2. Multimodal Data Stack for the AI Era**
+
+Gravitino is evolving to empower a new generation of AI-native data stacks.
+
+* **Ecosystem Integration**: We will focus on deep integration with AI-centric 
engines like **Daft**, **Ray**, and **Lance**.
+* **Empowering New Scenarios**: By providing a unified metadata layer for 
these engines, Gravitino allows users to "reuse" existing data governance 
capabilities—like auditing and access control—for modern multimodal scenarios, 
giving the new AI data stack enterprise-grade maturity from day one.
+
+#### **3. Data Agent Orchestration (Metadata as the "Brain")**
+
+Gravitino will serve as the cognitive foundation for autonomous **Data 
Agents**.
+
+* **MCP Server & Action System**: Leveraging the **Model Context Protocol 
(MCP)** and our **Metadata Action System**, we are exploring scenario-based 
capabilities for Data Agents. This allows an AI agent to not only "see" the 
data but also "act" on it—such as performing a schema update or triggering a 
compaction job—using metadata as its reasoning context.
+
+#### **4. Advanced Security: KMS & ABAC**
+
+As security threats become more sophisticated in the AI era, Gravitino is 
implementing more granular and automated security controls.
+
+* **ABAC (Attribute-Based Access Control)**: We will implement an ABAC engine 
to enable fine-grained permissions. This allows access decisions to be made 
based on dynamic tags (e.g., Sensitivity=High) and environmental context rather 
than just static roles.
+* **KMS & Credential Management**: To protect data at rest and in transit, we 
are integrating with **Key Management Services (KMS)**.
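
To make the ABAC idea above concrete, here is a minimal sketch of a tag-driven access decision. It is an illustration only and not Gravitino's API: `AccessRequest`, `sensitivity_rule`, `is_allowed`, and the attribute names (`clearance`, `network`) are all hypothetical, chosen to mirror the `Sensitivity=High` example.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRequest:
    """Attributes gathered at decision time (all names are hypothetical)."""
    user_attrs: dict                                   # e.g. {"clearance": "high"}
    resource_tags: dict                                # e.g. {"Sensitivity": "High"}
    environment: dict = field(default_factory=dict)    # e.g. {"network": "corp"}


def sensitivity_rule(req: AccessRequest) -> bool:
    """Assumed policy: Sensitivity=High data needs high clearance on the corp network."""
    if req.resource_tags.get("Sensitivity") != "High":
        return True  # this rule does not restrict less sensitive data
    return (
        req.user_attrs.get("clearance") == "high"
        and req.environment.get("network") == "corp"
    )


def is_allowed(req: AccessRequest, rules) -> bool:
    """Grant access only when every rule passes; any failing rule denies."""
    return all(rule(req) for rule in rules)


if __name__ == "__main__":
    rules = [sensitivity_rule]
    request = AccessRequest(
        user_attrs={"clearance": "high"},
        resource_tags={"Sensitivity": "High"},
        environment={"network": "public"},
    )
    print(is_allowed(request, rules))
```

Unlike a static role check, the same user can be allowed or denied depending on the resource's tags and the request's environment, which is the distinction the ABAC bullet draws.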
+
