viirya commented on PR #55648:
URL: https://github.com/apache/spark/pull/55648#issuecomment-4377110887
Thanks @gaogaotiantian and @franciscoabsampaio for the feedback — agreed on
both counts.
I should have started with an SPIP. This is user-facing surface area that
introduces a new module, an optional third-party SDK dependency (`mcp`), and a
permanent set of tool-shape contracts that the project will own going forward.
Those are exactly the criteria the SPIP process is meant to gate.
Plan from here:
1. I'll file the SPIP on dev@ and JIRA, with this PR linked as a reference
implementation. A working draft is already written; I'll share it on the SPIP
thread so people can react to the design before deep code review starts here.
2. The PR itself is functionally ready -- I'll keep it open at its current
status so reviewers who want to look at the code alongside the SPIP discussion
can do so, but I'm not asking for review or merge-readiness sign-off until the
SPIP lands.
3. Once the SPIP is accepted (or revised), I'll address review comments here
(including documentation — there is a module README at
`python/pyspark/sql/mcp/README.md` today, but no user-facing Sphinx page yet,
which is on me to add), and rebase.
@franciscoabsampaio — quick note on `asyncio.to_thread`: it's there because
the MCP Python SDK's server loop is asyncio-native, and the Spark Connect
Python client API is synchronous (`df.collect()`, `df._explain_string(...)`,
etc.). To honour a `--query-timeout-seconds` budget without blocking the
asyncio event loop, the tool handlers offload the blocking Connect calls via
`asyncio.to_thread` and wrap them in `asyncio.wait_for`.
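For illustration only, here is a minimal sketch of that offload-plus-timeout pattern. It is not the code in the PR: `blocking_collect` and `run_with_budget` are hypothetical names, and `time.sleep` stands in for a synchronous Connect call such as `df.collect()`.

```python
import asyncio
import time


def blocking_collect(delay_seconds: float) -> list:
    """Hypothetical stand-in for a synchronous call such as df.collect()."""
    time.sleep(delay_seconds)
    return ["row-1", "row-2"]


async def run_with_budget(delay_seconds: float, timeout_seconds: float) -> dict:
    """Run the blocking call in a worker thread, bounded by a timeout."""
    try:
        # to_thread keeps the event loop free while the call blocks;
        # wait_for stops awaiting once the budget is exhausted.
        rows = await asyncio.wait_for(
            asyncio.to_thread(blocking_collect, delay_seconds),
            timeout=timeout_seconds,
        )
        return {"status": "ok", "rows": rows}
    except asyncio.TimeoutError:
        # The handler returns promptly, but the worker thread is not
        # interrupted and runs to completion in the background.
        return {"status": "timeout", "rows": []}
```

One caveat with this pattern in general: `asyncio.wait_for` only stops the handler from waiting. The thread started by `asyncio.to_thread` cannot be cancelled, so any server-side work would need to be interrupted by some separate mechanism.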