[jira] [Commented] (AMBARI-26532) Add Model Context Protocol (MCP) Server for AI-Driven Cluster Management

2026-02-10 Thread Nikita Pande (Jira)


[ 
https://issues.apache.org/jira/browse/AMBARI-26532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18057598#comment-18057598
 ] 

Nikita Pande commented on AMBARI-26532:
---

I've developed a prototype MCP server for Ambari in Go: 
[mcp-ambari|https://github.com/nikita15p/mcp-ambari]

*Key enhancements over a potential TypeScript alternative:*
- Built with the official MCP Go SDK for standards compliance and efficiency
- Separates read-only tools (e.g., status queries) from actionable ones (e.g., 
service operations) so that AI clients can be restricted to safe calls
- Supports stdio, TLS, and mTLS transports for flexible deployments
- Includes predefined workflows, prompts, resources, and tools tailored to 
Ambari operations

This aligns directly with the goals of AMBARI-26532 and aims to perform well on 
large-scale Hadoop clusters.
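
The split between read-only and actionable tools can be illustrated with a small registry that refuses mutating calls unless they are explicitly enabled. This is only a sketch of the idea in Python, not code from mcp-ambari (which is written in Go); `Tool`, `ToolRegistry`, `build_registry`, the two tool names, and the stub handlers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Tool:
    name: str
    read_only: bool  # True for status queries, False for service operations
    handler: Callable[..., dict]


class ToolRegistry:
    """Hypothetical registry: actionable tools are blocked unless enabled."""

    def __init__(self, allow_actions: bool = False) -> None:
        self.allow_actions = allow_actions
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def visible_tools(self) -> List[str]:
        # Advertise only the tools the current mode is allowed to call.
        return sorted(
            name for name, tool in self._tools.items()
            if tool.read_only or self.allow_actions
        )

    def call(self, name: str, **kwargs) -> dict:
        tool = self._tools[name]
        if not tool.read_only and not self.allow_actions:
            raise PermissionError(f"{name} is actionable and actions are disabled")
        return tool.handler(**kwargs)


def build_registry(allow_actions: bool) -> ToolRegistry:
    registry = ToolRegistry(allow_actions=allow_actions)
    # Stub handlers for illustration only; they do not contact Ambari.
    registry.register(Tool(
        "get_service_state", True,
        lambda service: {"service": service, "state": "UNKNOWN"},
    ))
    registry.register(Tool(
        "restart_service", False,
        lambda service: {"service": service, "requested": "RESTART"},
    ))
    return registry
```

With `build_registry(allow_actions=False)`, a client sees and can call only `get_service_state`; calling `restart_service` raises `PermissionError`.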

> Add Model Context Protocol (MCP) Server for AI-Driven Cluster Management
> 
>
> Key: AMBARI-26532
> URL: https://issues.apache.org/jira/browse/AMBARI-26532
> Project: Ambari
>  Issue Type: New Feature
>Reporter: Nikita Pande
>Assignee: Nikita Pande
>Priority: Major
>
> Integrating Ambari with MCP is not merely a technical exercise; it unlocks a 
> new paradigm of cluster management, shifting from manual, UI-driven 
> operations to conversational, automated, and ultimately autonomous control. 
> This transformation enables a range of high-value use cases that can 
> dramatically reduce operational overhead and democratize administrative 
> expertise.
>  * *Natural Language Diagnostics & Troubleshooting:* This is the most 
> immediate and compelling use case. Administrators, regardless of their 
> expertise level, can interact with the cluster in plain English to diagnose 
> issues. Instead of navigating through multiple screens in the Ambari UI or 
> crafting complex {{curl}} commands, they can simply ask questions. For 
> instance:  
>  **  _"Why did the HDFS service health check fail on node '?"_
>  ** _"Show me all CRITICAL alerts from the last 24 hours related to YARN."_
>  ** _"What is the current heap usage of the NameNode, and how does it compare 
> to yesterday?"_ To answer these, an AI agent would leverage MCP {{Resources}} 
> to fetch health reports, alert histories, and performance metrics from 
> Ambari, then use its reasoning capabilities to synthesize a coherent, 
> human-readable answer.
>  * *Automated and Agentic Remediation:* Moving beyond diagnosis, this 
> integration empowers AI agents to take corrective actions. This creates a 
> "self-healing" capability for the cluster. An agent can be instructed to 
> execute complex remediation workflows that involve a chain of actions and 
> checks. For example:  
>  
>  ** _"The NameNode is in standby. Investigate the logs for critical errors. 
> If none are found within the last 15 minutes, attempt a restart and confirm 
> it becomes active. Notify the support channel in chat interface with the 
> result."_ This workflow would require the agent to chain multiple MCP 
> {{Tool}} calls: get logs ({{{}Resource{}}}), analyze them (LLM reasoning), 
> restart the service ({{{}Tool{}}}), and check its status ({{{}Resource{}}}), 
> demonstrating a sophisticated, agentic process.
>  * *Conversational Configuration and Security Audits:* Complex configuration 
> changes and security hardening are often error-prone. A conversational 
> interface simplifies these tasks significantly.  
>  
>  ** _"Increase the YARN NodeManager memory to 32GB on all worker nodes and 
> then perform a rolling restart of the YARN service."_
>  ** _"Audit the cluster for security compliance. List all services that do 
> not have Kerberos enabled and generate the sequence of API calls required to 
> configure them."_ These commands would be translated by the agent into a 
> series of {{updateServiceConfig}} and {{restartService}} tool calls, executed 
> in the correct order.
>  * *Declarative Provisioning via Conversation:* This use case represents an 
> evolution of Ambari Blueprints, making cluster provisioning more accessible. 
> An administrator could describe the desired cluster in high-level terms, and 
> the AI agent would handle the low-level details of creating the Blueprint 
> JSON. 
>  
>  ** _"Provision a new 5-node test cluster using . The 
> cluster should include HDFS, YARN, and Spark. Designate 'master01' as the 
> master node with the NameNode and ResourceManager, and the rest as worker 
> nodes with DataNodes and NodeManagers."_ The agent would parse this request, 
> generate the corresponding Blueprint, and use an MCP {{Tool}} to submit it to 
> the Ambari API, initiating the cluster deployment.
>  * *Proposed Solution:* This feature proposes the development and integration 
> of a new, standalone {*}Ambari MCP Server{

[jira] [Commented] (AMBARI-26532) Add Model Context Protocol (MCP) Server for AI-Driven Cluster Management

2025-11-21 Thread Nikita Pande (Jira)


[ 
https://issues.apache.org/jira/browse/AMBARI-26532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18039797#comment-18039797
 ] 

Nikita Pande commented on AMBARI-26532:
---

Hi [~call518], thanks for showing interest in the feature :)
I have been working on a POC: 
[https://github.com/nikita15p/ambari-mcp-server|https://github.com/nikita15p/ambari-mcp-server].
It has been tested with Cline as the MCP client, where all the tools and 
resources are listed under MCP Servers.

Below is a table mapping each *MCP tool* to the *MCP resource* 
it operates on, following your request and the ambari-mcp-server code/doc 
structure:
||MCP Tool Name||MCP Resource Operated On||
|ambari_clusters_getclusters|Cluster Collection|
|ambari_clusters_getcluster|Single Cluster|
|ambari_clusters_createcluster|Cluster|
|ambari_services_getservices|Cluster Services|
|ambari_services_getservice|Single Service|
|ambari_services_getservicestate|Service State|
|ambari_services_startservice|Service|
|ambari_services_stopservice|Service|
|ambari_services_getserviceswithstaleconfigs|Services (Stale Configs)|
|ambari_services_gethostcomponentswithstaleconfigs|Host Components (Stale Configs)|
|ambari_services_restartservice|Service|
|ambari_services_restartcomponents|Component(s)|
|ambari_services_getrollingrestartstatus|Rolling Restart Status|
|ambari_services_enablemaintenancemode|Service/Component|
|ambari_services_disablemaintenancemode|Service/Component|
|ambari_services_runservicecheck|Service|
|ambari_services_isservicechecksupported|Service|
|ambari_services_getservicecheckstatus|Service Check|
|ambari_hosts_gethosts|Host Collection|
|ambari_hosts_gethost|Single Host|
|ambari_alerts_gettargets|Alert Targets|
|ambari_alerts_getalerts|Alerts|
|ambari_alerts_getalertsummary|Alert Summary|
|ambari_alerts_getalertdetails|Alert Definition|
|ambari_alerts_getalertdefinitions|Alert Definitions|
|ambari_alerts_updatealertdefinition|Alert Definition|
|ambari_alerts_getalertgroups|Alert Groups|
|ambari_alerts_createalertgroup|Alert Group|
|ambari_alerts_updatealertgroup|Alert Group|
|ambari_alerts_deletealertgroup|Alert Group|
|ambari_alerts_duplicatealertgroup|Alert Group|
|ambari_alerts_adddefinitiontogroup|Alert Group/Definition|
|ambari_alerts_removedefinitionfromgroup|Alert Group/Definition|
|ambari_alerts_getnotifications|Notification Targets|
|ambari_alerts_createnotification|Notification Target|
|ambari_alerts_updatenotification|Notification Target|
|ambari_alerts_deletenotification|Notification Target|
|ambari_alerts_addnotificationtogroup|Alert Group/Notification Target|
|ambari_alerts_removenotificationfromgroup|Alert Group/Notification Target|
|ambari_alerts_savealertsettings|Cluster Alert Settings|
 * {*}MCP Tools{*}: The functions/endpoints that the MCP client or AI can call 
via the server.

 * {*}MCP Resources{*}: The specific Ambari objects (clusters, hosts, services, 
components, alert configs, notification targets, etc.) on which those tools 
operate.
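
To make the tool-to-resource relationship concrete, here is a rough sketch of how a few of the tool names above could resolve to Ambari REST calls. It is not taken from the ambari-mcp-server code: `build_request`, `RequestSpec`, and the `cluster`/`service` parameters are hypothetical, and the paths and the `ServiceInfo` state payload reflect my understanding of Ambari's v1 REST API, so they should be checked against the Ambari version in use.

```python
from typing import Optional, TypedDict
from urllib.parse import quote

API_ROOT = "/api/v1/clusters"

# Target state per mutating tool (assumption: Ambari reports a stopped
# service as INSTALLED and a running one as STARTED).
_STATE_CHANGES = {
    "ambari_services_startservice": "STARTED",
    "ambari_services_stopservice": "INSTALLED",
}


class RequestSpec(TypedDict):
    method: str
    path: str
    body: Optional[dict]


def _seg(value: str) -> str:
    # URL-encode a single path segment and reject empty names.
    if not value:
        raise ValueError("cluster and service names must be non-empty")
    return quote(value, safe="")


def _service_path(cluster: str, service: str) -> str:
    return f"{API_ROOT}/{_seg(cluster)}/services/{_seg(service)}"


def build_request(tool: str, cluster: str = "", service: str = "") -> RequestSpec:
    """Describe the HTTP request for a tool without sending it."""
    if tool == "ambari_clusters_getclusters":
        return {"method": "GET", "path": API_ROOT, "body": None}
    if tool == "ambari_services_getservices":
        return {"method": "GET", "path": f"{API_ROOT}/{_seg(cluster)}/services", "body": None}
    if tool == "ambari_services_getservice":
        return {"method": "GET", "path": _service_path(cluster, service), "body": None}
    if tool in _STATE_CHANGES:
        return {
            "method": "PUT",
            "path": _service_path(cluster, service),
            "body": {
                "RequestInfo": {"context": f"{tool} via MCP"},
                "Body": {"ServiceInfo": {"state": _STATE_CHANGES[tool]}},
            },
        }
    raise ValueError(f"unmapped tool: {tool}")
```

Keeping the mapping as data that can be inspected before any request is sent also makes it easier to review which tools mutate cluster state.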

Limitations:

Currently, authentication details are managed and stored locally before being 
passed to the system. Our roadmap includes supporting the full set of 
authentication and authorization mechanisms that Ambari supports, such as LDAP, 
Kerberos, and Active Directory integration, to improve security and 
flexibility.
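
As an illustration of the current, locally managed approach, the sketch below builds HTTP Basic authentication headers from environment variables. It is an assumption about how such a setup could look, not the POC's actual code: the function name and the variable names `AMBARI_USER` and `AMBARI_PASSWORD` are hypothetical. The `X-Requested-By` header is included because, as far as I know, Ambari expects it on modifying requests.

```python
import base64
import os
from typing import Dict, Mapping, Optional


def ambari_auth_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build request headers from locally stored credentials (hypothetical names)."""
    env = os.environ if env is None else env
    user = env.get("AMBARI_USER")
    password = env.get("AMBARI_PASSWORD")
    if not user or not password:
        raise RuntimeError("AMBARI_USER and AMBARI_PASSWORD must be set")
    # HTTP Basic auth: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {
        "Authorization": f"Basic {token}",
        "X-Requested-By": "ambari-mcp",  # arbitrary client identifier
    }
```

Reading credentials at call time rather than hard-coding them keeps secrets out of the source tree, though it does not replace the LDAP or Kerberos integration described in the roadmap.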

 

 


[jira] [Commented] (AMBARI-26532) Add Model Context Protocol (MCP) Server for AI-Driven Cluster Management

2025-09-26 Thread JungJungIn (Jira)


[ 
https://issues.apache.org/jira/browse/AMBARI-26532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18023090#comment-18023090
 ] 

JungJungIn commented on AMBARI-26532:
-

Hi there — I’ve been working on a small PoC project called *MCP-Ambari-API* 
(GitHub: https://github.com/call518/MCP-Ambari-API) which implements a 
lightweight MCP-style interface on top of Ambari REST. Although it’s still in 
early stages, it already covers a subset of functionality aligned with what 
this JIRA proposes, so I wanted to share some experiences and suggestions.

*What MCP-Ambari-API currently supports (in my PoC):*
 * Mapping Ambari read APIs (hosts, services, configs, metrics) to MCP 
“Resources” endpoints

 * Some write operations (e.g. update configurations, restart services) mapped 
as MCP “Tools”

 * Simple chaining/aggregation logic for small workflows

*Overlap / alignment with AMBARI-26532’s vision:*
 * The read-only “Observer” role (expose cluster state, metrics, alert history) 
is something I’ve partially implemented

 * The “Operator” role (perform actions via MCP Tools) is also in scope

 * The concept of translating conversational or agentic workflows into 
orchestration over Ambari APIs is very much in the same spirit

*Gaps / limitations vs the full proposal in AMBARI-26532:*
 * I don’t yet support natural language interpretation / LLM bridging

 * No general “Prompts” abstraction for complex multi-step workflows

 * No full conversational UI or autonomous agent loop

 * Limited error handling, security, concurrency, transactionality

*Suggestions / lessons learned from building the PoC:*
 # Design a clear mapping layer between MCP primitives (Resource / Tool / 
Prompt) and Ambari REST endpoints. A lot of complexity lies in reconciling 
Ambari’s semantics (configs, versions, service/component dependencies).

 # For multi-step workflows (Prompts), it helps to support templated workflows 
(with parameters) rather than fully dynamic planning in the first cut.

 # Security / authentication boundary is critical. In my PoC, I had to 
carefully gate write operations and respect Ambari’s RBAC.

 # Metrics / alert data often need time-windowed querying and aggregation — 
consider what time-series or summarization primitives MCP “Resources” need.

 # Robustness: retries, fallback logic, partial rollbacks are important, 
especially when chaining tools.
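
Suggestions 2 and 5 can be sketched together: a workflow defined as a parameterized template, with each step retried on transient failures. This is a minimal illustration and not code from MCP-Ambari-API; `RESTART_WORKFLOW`, `render`, `call_with_retry`, `run_workflow`, and the `service` argument are hypothetical, and the tool names are borrowed from the ambari-mcp-server table earlier in this thread.

```python
import time
from string import Template
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Dict[str, str]]

# A fixed, parameterized workflow instead of fully dynamic planning.
RESTART_WORKFLOW: List[Step] = [
    ("ambari_services_getservicestate", {"service": "$service"}),
    ("ambari_services_restartservice", {"service": "$service"}),
    ("ambari_services_getservicestate", {"service": "$service"}),
]


def render(workflow: List[Step], params: Dict[str, str]) -> List[Step]:
    # substitute() raises KeyError on a missing parameter, so a bad
    # template fails before any tool has been called.
    return [
        (tool, {key: Template(value).substitute(params) for key, value in args.items()})
        for tool, args in workflow
    ]


def call_with_retry(fn: Callable[[], Any], attempts: int = 3, delay_s: float = 0.0) -> Any:
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            time.sleep(delay_s * (2 ** attempt))  # exponential backoff


def run_workflow(
    workflow: List[Step],
    params: Dict[str, str],
    call_tool: Callable[..., Any],
    attempts: int = 3,
) -> List[Any]:
    results = []
    for tool, args in render(workflow, params):
        results.append(call_with_retry(lambda: call_tool(tool, **args), attempts=attempts))
    return results
```

Only connection errors are retried here. Retrying a mutating tool after an ambiguous failure may not be safe, so a real implementation would probably check the service state before repeating a restart.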

I’d be happy to contribute parts of MCP-Ambari-API (or collaborate) to the 
official implementation of this feature. If maintainers are open, I can try a 
pull request or prototype extension.

Thanks for opening this issue. It’s an exciting direction for making Ambari 
more “agentic”.
