sandugood commented on code in PR #1513:
URL: https://github.com/apache/datafusion-ballista/pull/1513#discussion_r2946184117


##########
python/examples/getting_started.ipynb:
##########
@@ -0,0 +1,353 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<!---\n",
+    "  Licensed to the Apache Software Foundation (ASF) under one\n",
+    "  or more contributor license agreements.  See the NOTICE file\n",
+    "  distributed with this work for additional information\n",
+    "  regarding copyright ownership.  The ASF licenses this file\n",
+    "  to you under the Apache License, Version 2.0 (the\n",
+    "  \"License\"); you may not use this file except in compliance\n",
+    "  with the License.  You may obtain a copy of the License at\n",
+    "\n",
+    "    http://www.apache.org/licenses/LICENSE-2.0\n",
+    "\n",
+    "  Unless required by applicable law or agreed to in writing,\n",
+    "  software distributed under the License is distributed on an\n",
+    "  \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
+    "  KIND, either express or implied.  See the License for the\n",
+    "  specific language governing permissions and limitations\n",
+    "  under the License.\n",
+    "-->\n",
+    "\n",
+    "# Getting Started with PyBallista\n",
+    "\n",
+    "This notebook demonstrates how to get started with Ballista using Python.\n",
+    "\n",
+    "## Prerequisites\n",
+    "\n",
+    "1. Install PyBallista: `pip install ballista`\n",
+    "2. Have a Ballista cluster running (or use the built-in test cluster)\n",
+    "\n",
+    "## Overview\n",
+    "\n",
+    "Ballista is a distributed query engine built on Apache DataFusion. PyBallista provides:\n",
+    "\n",
+    "- **BallistaSessionContext**: Drop-in replacement for DataFusion's SessionContext\n",
+    "- **SQL Magic Commands**: Interactive SQL in Jupyter notebooks via `%sql` and `%%sql`\n",
+    "- **DataFrame API**: Full DataFrame API for data transformations\n",
+    "- **Rich HTML Display**: DataFrames render as styled HTML tables"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Method 1: Python API\n",
+    "\n",
+    "The most straightforward way to use Ballista is via the Python API."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import ballista\n",
+    "from ballista import BallistaSessionContext, setup_test_cluster\n",
+    "%load_ext autoreload\n",
+    "%autoreload 2\n",
+    "\n",
+    "# Check versions\n",
+    "print(f\"Ballista version: {ballista.__version__}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# For this demo, we'll use the built-in test cluster\n",
+    "# In production, you would connect to your Ballista scheduler:\n",
+    "# ctx = BallistaSessionContext(\"df://your-scheduler:50050\")\n",
+    "\n",
+    "# host, port = setup_test_cluster()\n",
+    "host, port = \"localhost\", \"39431\"\n",

Review Comment:
   Yes, that was a debug leftover. Fixed.



##########
python/python/ballista/extension.py:
##########
@@ -229,9 +231,319 @@ def write_parquet(
         df = self._to_internal_df()
         df.write_parquet(str(path), compression.value, compression_level)
 
+    def explain_visual(self, analyze: bool = False) -> "ExecutionPlanVisualization":
+        """
+        Generate a visual representation of the execution plan.
+
+        This method creates an SVG visualization of the query execution plan,
+        which can be displayed directly in Jupyter notebooks.
+
+        Args:
+            analyze: If True, includes runtime statistics from actual execution.
+
+        Returns:
+            ExecutionPlanVisualization: An object that renders as SVG in Jupyter.
+
+        Example:
+            >>> df = ctx.sql("SELECT * FROM orders WHERE amount > 100")
+            >>> df.explain_visual()  # Displays SVG in notebook
+            >>> viz = df.explain_visual(analyze=True)
+            >>> viz.save("plan.svg")  # Save to file
+        """
+        # Get the execution plan as a string representation
+        # Note: explain() prints but doesn't return a string, so we use logical_plan()
+        try:
+            plan = self.logical_plan()
+            plan_str = plan.display_indent()
+        except Exception:
+            # Fallback if logical_plan() fails
+            plan_str = "Unable to retrieve execution plan"
+        return ExecutionPlanVisualization(plan_str, analyze=analyze)
+
+    def collect_with_progress(
+        self,
+        callback: Optional[callable] = None,

Review Comment:
   Fixed. Thank you
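
The reviewer's remark that this reply answers is not quoted in this message, so the exact change is not visible here. Purely as an illustration, and assuming the concern was the `Optional[callable]` annotation (the builtin `callable` is a function, not a type), a callback parameter could be annotated with `typing.Callable` roughly as sketched below. The `(completed, total)` callback shape, the `ProgressCallback` alias, and the stand-in function body are hypothetical and not taken from the PR.

```python
from typing import Callable, List, Optional

# Hypothetical callback shape: receives (completed_steps, total_steps).
# The real signature used in the PR is not shown in this message.
ProgressCallback = Callable[[int, int], None]


def collect_with_progress_sketch(
    callback: Optional[ProgressCallback] = None,
    total: int = 3,
) -> List[int]:
    """Stand-in for a collect-with-progress method; no Ballista calls are made."""
    results: List[int] = []
    for step in range(1, total + 1):
        results.append(step)  # pretend one unit of work finished
        if callback is not None:
            callback(step, total)  # report progress only if a callback was given
    return results


if __name__ == "__main__":
    collect_with_progress_sketch(lambda done, total: print(f"{done}/{total}"))
```

With a precise `Callable[...]` annotation, a type checker can flag callbacks whose parameters do not match, which a bare `callable` annotation cannot express.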



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

