[
https://issues.apache.org/jira/browse/AMBARI-26532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18023090#comment-18023090
]
JungJungIn commented on AMBARI-26532:
-------------------------------------
Hi there — I’ve been working on a small PoC project called *MCP-Ambari-API*
(GitHub: https://github.com/call518/MCP-Ambari-API) which implements a
lightweight MCP-style interface on top of Ambari REST. Although it’s still in
early stages, it already covers a subset of functionality aligned with what
this JIRA proposes, so I wanted to share some experiences and suggestions.
*What MCP-Ambari-API currently supports (in my PoC):*
* Mapping Ambari read APIs (hosts, services, configs, metrics) to MCP
“Resources” endpoints
* Some write operations (e.g. update configurations, restart services) mapped
as MCP “Tools”
* Simple chaining/aggregation logic for small workflows
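As a rough illustration of the read/write split above (a minimal sketch, not code taken from MCP-Ambari-API), a read Resource and a write Tool could each translate into a plain Ambari REST request description. The names {{AmbariRequest}}, {{hosts_resource}}, {{set_service_state}} and {{restart_service}} are hypothetical. The sketch assumes the usual Ambari convention of changing service state with a PUT on the service resource and sending an {{X-Requested-By}} header on writes. Nothing is sent over the network.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List, Optional

API_ROOT = "/api/v1"  # Ambari REST prefix


@dataclass
class AmbariRequest:
    """Plain description of an HTTP call; this sketch never sends it."""
    method: str
    path: str
    headers: Dict[str, str] = field(default_factory=dict)
    body: Optional[str] = None


def hosts_resource(cluster: str) -> AmbariRequest:
    """Read-only lookup: the kind of call an MCP Resource could wrap."""
    return AmbariRequest("GET", f"{API_ROOT}/clusters/{cluster}/hosts")


def set_service_state(cluster: str, service: str, state: str,
                      context: str) -> AmbariRequest:
    """State-changing call: the kind of call an MCP Tool could wrap.

    Ambari stops a service by requesting state INSTALLED and starts it
    by requesting state STARTED.
    """
    if state not in ("INSTALLED", "STARTED"):
        raise ValueError(f"unsupported target state: {state}")
    payload = {
        "RequestInfo": {"context": context},
        "Body": {"ServiceInfo": {"state": state}},
    }
    return AmbariRequest(
        "PUT",
        f"{API_ROOT}/clusters/{cluster}/services/{service}",
        headers={"X-Requested-By": "ambari"},
        body=json.dumps(payload),
    )


def restart_service(cluster: str, service: str) -> List[AmbariRequest]:
    """A restart expressed as stop-then-start.

    A real caller would wait for the stop request to finish before
    issuing the start; that polling is left out of this sketch.
    """
    return [
        set_service_state(cluster, service, "INSTALLED", f"Stop {service} via MCP"),
        set_service_state(cluster, service, "STARTED", f"Start {service} via MCP"),
    ]
```

Keeping request construction separate from transport makes the mapping easy to unit-test and leaves authentication and RBAC checks to whichever layer actually sends the call.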
*Overlap / alignment with AMBARI-26532’s vision:*
* The read-only “Observer” role (expose cluster state, metrics, alert history)
is something I’ve partially implemented
* The “Operator” role (perform actions via MCP Tools) is also in scope
* The concept of translating conversational or agentic workflows into
orchestration over Ambari APIs is very much in the same spirit
*Gaps / limitations vs the full proposal in AMBARI-26532:*
* I don’t yet support natural language interpretation / LLM bridging
* No fully general “Prompts” abstraction (complex multi-step workflows)
designed yet
* No full conversational UI or autonomous agent loop
* Limited error handling, security, concurrency, transactionality
*Suggestions / lessons learned from building the PoC:*
# Design a clear mapping layer between MCP primitives (Resource / Tool /
Prompt) and Ambari REST endpoints. A lot of complexity lies in reconciling
Ambari’s semantics (configs, versions, service/component dependencies).
# For multi-step workflows (Prompts), it helps to support templated workflows
(with parameters) rather than fully dynamic planning in the first cut.
# The security / authentication boundary is critical. In my PoC, I had to
carefully gate write operations and respect Ambari’s RBAC.
# Metrics / alert data often need time-windowed querying and aggregation, so
consider what time-series or summarization primitives MCP “Resources” need.
# Robustness: retries, fallback logic, and partial rollbacks are important,
especially when chaining tools.
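The points on templated workflows, gated writes, retries and rollback could be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not code from MCP-Ambari-API or from any MCP SDK: {{Step}}, {{run_workflow}}, {{WriteNotAllowed}} and the tool names used with them are hypothetical, and {{call_tool}} stands in for whatever function actually invokes a tool.

```python
from dataclasses import dataclass
from string import Template
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    tool: str                    # hypothetical tool name
    args: Dict[str, str]         # values may contain $placeholders
    write: bool = False          # True if the step changes cluster state
    undo: Optional[str] = None   # hypothetical tool that reverts this step


class WriteNotAllowed(Exception):
    """Raised when a workflow contains write steps but writes are gated off."""


def run_workflow(steps: List[Step],
                 params: Dict[str, str],
                 call_tool: Callable[[str, Dict[str, str]], None],
                 allow_writes: bool = False,
                 max_attempts: int = 3) -> List[str]:
    # Gate up front so a forbidden workflow never runs partially.
    blocked = [s.tool for s in steps if s.write and not allow_writes]
    if blocked:
        raise WriteNotAllowed(f"write tools not permitted: {blocked}")

    completed: List[Tuple[Step, Dict[str, str]]] = []
    for step in steps:
        # Fill the template parameters for this step.
        args = {k: Template(v).substitute(params) for k, v in step.args.items()}
        last_error: Optional[Exception] = None
        # Simple bounded retry; assumes the tool is safe to call again.
        for _ in range(max_attempts):
            try:
                call_tool(step.tool, args)
                last_error = None
                break
            except Exception as exc:
                last_error = exc
        if last_error is not None:
            # Partial rollback: undo completed steps in reverse order.
            for prev, prev_args in reversed(completed):
                if prev.undo:
                    call_tool(prev.undo, prev_args)
            raise RuntimeError(f"step {step.tool!r} failed; rolled back") from last_error
        completed.append((step, args))
    return [s.tool for s, _ in completed]
```

A template such as "update a config, then restart the service" would then be a fixed list of {{Step}} objects with {{$service}} as a parameter. Blind retries only suit idempotent tools, so a real implementation would probably need per-tool retry policies.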
I’d be happy to contribute parts of MCP-Ambari-API (or collaborate) to the
official implementation of this feature. If maintainers are open, I can try a
pull request or prototype extension.
Thanks for opening this issue. It’s an exciting direction for making Ambari
more “agentic”.
> Add Model Context Protocol (MCP) Server for AI-Driven Cluster Management
> ------------------------------------------------------------------------
>
> Key: AMBARI-26532
> URL: https://issues.apache.org/jira/browse/AMBARI-26532
> Project: Ambari
> Issue Type: New Feature
> Reporter: Nikita Pande
> Assignee: Nikita Pande
> Priority: Major
>
> Integrating Ambari with MCP is not merely a technical exercise; it unlocks a
> new paradigm of cluster management, shifting from manual, UI-driven
> operations to conversational, automated, and ultimately autonomous control.
> This transformation enables a range of high-value use cases that can
> dramatically reduce operational overhead and democratize administrative
> expertise.
> * *Natural Language Diagnostics & Troubleshooting:* This is the most
> immediate and compelling use case. Administrators, regardless of their
> expertise level, can interact with the cluster in plain English to diagnose
> issues. Instead of navigating through multiple screens in the Ambari UI or
> crafting complex {{curl}} commands, they can simply ask questions. For
> instance:
> ** _"Why did the HDFS service health check fail on node '<nodeName>'?"_
> ** _"Show me all CRITICAL alerts from the last 24 hours related to YARN."_
> ** _"What is the current heap usage of the NameNode, and how does it compare
> to yesterday?"_ To answer these, an AI agent would leverage MCP {{Resources}}
> to fetch health reports, alert histories, and performance metrics from
> Ambari, then use its reasoning capabilities to synthesize a coherent,
> human-readable answer.
> * *Automated and Agentic Remediation:* Moving beyond diagnosis, this
> integration empowers AI agents to take corrective actions. This creates a
> "self-healing" capability for the cluster. An agent can be instructed to
> execute complex remediation workflows that involve a chain of actions and
> checks. For example:
>
> ** _"The NameNode is in standby. Investigate the logs for critical errors.
> If none are found within the last 15 minutes, attempt a restart and confirm
> it becomes active. Notify the support channel in the chat interface with the
> result."_ This workflow would require the agent to chain multiple MCP
> calls: get logs ({{Resource}}), analyze them (LLM reasoning), restart the
> service ({{Tool}}), and check its status ({{Resource}}), demonstrating a
> sophisticated, agentic process.
> * *Conversational Configuration and Security Audits:* Complex configuration
> changes and security hardening are often error-prone. A conversational
> interface simplifies these tasks significantly.
>
> ** _"Increase the YARN NodeManager memory to 32GB on all worker nodes and
> then perform a rolling restart of the YARN service."_
> ** _"Audit the cluster for security compliance. List all services that do
> not have Kerberos enabled and generate the sequence of API calls required to
> configure them."_ These commands would be translated by the agent into a
> series of {{updateServiceConfig}} and {{restartService}} tool calls, executed
> in the correct order.
> * *Declarative Provisioning via Conversation:* This use case represents an
> evolution of Ambari Blueprints, making cluster provisioning more accessible.
> An administrator could describe the desired cluster in high-level terms, and
> the AI agent would handle the low-level details of creating the Blueprint
> JSON.
>
> ** _"Provision a new 5-node test cluster using <stack name and version>. The
> cluster should include HDFS, YARN, and Spark. Designate 'master01' as the
> master node with the NameNode and ResourceManager, and the rest as worker
> nodes with DataNodes and NodeManagers."_ The agent would parse this request,
> generate the corresponding Blueprint, and use an MCP {{Tool}} to submit it to
> the Ambari API, initiating the cluster deployment.
> * *Proposed Solution:* This feature proposes the development and integration
> of a new, standalone *Ambari MCP Server*. This service will expose
> Ambari's rich management capabilities through the open and rapidly-adopted
> Model Context Protocol (MCP). By doing so, it will allow any MCP-compatible
> AI agent or host application (e.g., VS Code with Copilot, Claude Desktop) to
> securely discover and interact with the Ambari-managed cluster. The server
> will map Ambari's REST API endpoints to MCP's core primitives: state-changing
> operations will be exposed as {{Tools}}, read-only data queries as
> {{Resources}}, and complex, multi-step administrative tasks as
> {{Prompts}}. This will effectively transform Ambari from a passive
> management tool into an active, intelligent platform accessible via natural
> language and agentic workflows.
> *Key Benefits:*
> *Reduced Operational Overhead:* Enable administrators to diagnose issues,
> perform restarts, and modify configurations using simple, conversational
> commands, automating routine tasks.
> *Democratized Expertise:* Allow less experienced operators to perform complex
> administrative operations safely by leveraging pre-defined, reliable MCP
> Prompts that encapsulate expert workflows.
> *Enhanced Automation and Self-Healing:* Provide the foundation for building
> sophisticated, agentic systems that can proactively monitor cluster health,
> diagnose failures, and execute remediation plans autonomously.
> *Ecosystem Interoperability:* Position Ambari as a first-class citizen in the
> burgeoning ecosystem of AI development tools and agentic frameworks by
> adopting the MCP standard, ensuring its future relevance.
> *Roadmap:*
> *
> ** Read-Only Integration (The Observer) - Phase 1: Exposing all relevant
> cluster state, including service statuses, host information, component
> layouts, configurations, alert histories, and performance metrics.
> ** Actionable Tools (The Operator) - Phase 2: Enable direct, conversational
> control over the cluster. Administrators can now use the AI agent as a remote
> control for Ambari, issuing commands to operate the cluster.
> ** Abstracted Workflows (The Autonomous Agent) - Phase 3: Achieve true
> agentic behavior. This phase moves beyond simple command-and-control to a
> state where the AI can be delegated complex, long-running tasks, executing
> sophisticated strategies with minimal human intervention and unlocking the
> full potential of autonomous data platform management.
> Refer to [https://modelcontextprotocol.io/]