This is an automated email from the ASF dual-hosted git repository.

agrove pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion-comet.git


The following commit(s) were added to refs/heads/main by this push:
     new 837c256f docs: Various documentation improvements (#1005)
837c256f is described below

commit 837c256f0de16ea06b04bdc84503367b8a87be03
Author: Andy Grove <[email protected]>
AuthorDate: Tue Oct 8 15:16:12 2024 -0600

    docs: Various documentation improvements (#1005)
    
    * various documentation improvements
    
    * add direct download urls
---
 README.md                                          |   4 +-
 .../_static/images/CometNativeExecution.drawio.png | Bin 61017 -> 0 bytes
 .../_static/images/CometNativeParquetReader.drawio | 100 +++++++++++++++++++
 .../images/CometNativeParquetReader.drawio.svg     |   4 +
 .../images/CometNativeParquetScan.drawio.png       | Bin 75703 -> 0 bytes
 .../_static/images/CometOverviewDetailed.drawio    |  94 ++++++++++++++++++
 .../images/CometOverviewDetailed.drawio.svg        |   4 +
 docs/source/contributor-guide/plugin_overview.md   |   4 +-
 docs/source/index.rst                              |   2 +
 docs/source/user-guide/installation.md             | 107 +++++++--------------
 docs/source/user-guide/overview.md                 |  34 +++----
 docs/source/user-guide/source.md                   |  69 +++++++++++++
 12 files changed, 329 insertions(+), 93 deletions(-)

diff --git a/README.md b/README.md
index c318b053..1a6281a9 100644
--- a/README.md
+++ b/README.md
@@ -30,10 +30,12 @@ under the License.
 <img src="docs/source/_static/images/DataFusionComet-Logo-Light.png" 
width="512" alt="logo"/>
 
 Apache DataFusion Comet is a high-performance accelerator for Apache Spark, 
built on top of the powerful
-[Apache DataFusion](https://datafusion.apache.org) query engine. Comet is 
designed to significantly enhance the
+[Apache DataFusion] query engine. Comet is designed to significantly enhance 
the
 performance of Apache Spark workloads while leveraging commodity hardware and 
seamlessly integrating with the
 Spark ecosystem without requiring any code changes.
 
+[Apache DataFusion]: https://datafusion.apache.org
+
 # Benefits of Using Comet
 
 ## Run Spark Queries at DataFusion Speeds
diff --git a/docs/source/_static/images/CometNativeExecution.drawio.png 
b/docs/source/_static/images/CometNativeExecution.drawio.png
deleted file mode 100644
index ba122a1f..00000000
Binary files a/docs/source/_static/images/CometNativeExecution.drawio.png and 
/dev/null differ
diff --git a/docs/source/_static/images/CometNativeParquetReader.drawio 
b/docs/source/_static/images/CometNativeParquetReader.drawio
new file mode 100644
index 00000000..0c7304ef
--- /dev/null
+++ b/docs/source/_static/images/CometNativeParquetReader.drawio
@@ -0,0 +1,100 @@
+<mxfile host="app.diagrams.net" agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 
10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Safari/605.1.15" 
version="24.7.16">
+  <diagram name="Page-1" id="IdYZ_KFENTEXElLiOEKC">
+    <mxGraphModel dx="1133" dy="729" grid="1" gridSize="10" guides="1" 
tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" 
pageWidth="850" pageHeight="1100" math="0" shadow="0">
+      <root>
+        <mxCell id="0" />
+        <mxCell id="1" parent="0" />
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-30" value="Spark Executor" 
style="rounded=1;whiteSpace=wrap;html=1;dashed=1;verticalAlign=top;" vertex="1" 
parent="1">
+          <mxGeometry x="10" y="40" width="510" height="430" as="geometry" />
+        </mxCell>
+        <mxCell id="AH3lBTSLKK5181iXBnnY-2" value="JVM Code" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=top;" parent="1" 
vertex="1">
+          <mxGeometry x="30" y="70" width="210" height="380" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-24" 
style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=0.75;exitY=1;exitDx=0;exitDy=0;entryX=0.75;entryY=0;entryDx=0;entryDy=0;"
 edge="1" parent="1" source="t5OBkkhKOG6cYtw1sPyQ-18" 
target="wVAZ-YzccNhZugPFJvmi-13">
+          <mxGeometry relative="1" as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-18" value="Comet Parquet 
Reader&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;div&gt;IO
 and Decompression&lt;/div&gt;" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=top;" parent="1" 
vertex="1">
+          <mxGeometry x="45" y="110" width="180" height="100" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-1" value="Native Code" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=top;" vertex="1" 
parent="1">
+          <mxGeometry x="290" y="70" width="210" height="380" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-21" 
style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=0;exitY=0.75;exitDx=0;exitDy=0;entryX=1;entryY=0.75;entryDx=0;entryDy=0;"
 edge="1" parent="1" source="wVAZ-YzccNhZugPFJvmi-2" 
target="wVAZ-YzccNhZugPFJvmi-13">
+          <mxGeometry relative="1" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-2" value="Native Execution Plan" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=top;" vertex="1" 
parent="1">
+          <mxGeometry x="310" y="240" width="170" height="100" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-19" 
style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=0;exitY=0.75;exitDx=0;exitDy=0;entryX=1;entryY=0.75;entryDx=0;entryDy=0;"
 edge="1" parent="1" source="wVAZ-YzccNhZugPFJvmi-4" 
target="t5OBkkhKOG6cYtw1sPyQ-18">
+          <mxGeometry relative="1" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-4" value="Parquet Decoding" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=top;" vertex="1" 
parent="1">
+          <mxGeometry x="305" y="110" width="180" height="100" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-6" value="" 
style="shape=image;verticalLabelPosition=bottom;labelBackgroundColor=default;verticalAlign=top;aspect=fixed;imageAspect=0;image=data:image/svg+xml,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIGZpbGw9Im5vbmUiIHZpZXdCb3g9IjAgMCA4MDEgMTY4IiBoZWlnaHQ9IjE2OCIgd2lkdGg9IjgwMSI+JiN4YTs8ZyBjbGlwLXBhdGg9InVybCgjY2xpcDBfMV8xODEpIj4mI3hhOzxwYXRoIGZpbGw9InVybCgjcGFpbnQwX2xpbmVhcl8xXzE4MSkiIGQ9Ik03Ni4xMjk3IDE2OEM4OC40NTk3IDE2OCA5OS42MDk3IDE1
 [...]
+          <mxGeometry x="323.48" y="273.6" width="143.03" height="30" 
as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-7" value="" 
style="shape=image;verticalLabelPosition=bottom;labelBackgroundColor=default;verticalAlign=top;aspect=fixed;imageAspect=0;image=data:image/png,iVBORw0KGgoAAAANSUhEUgAABwgAAAOoCAMAAADyHlBJAAADAFBMVEUAAAABAQECAgIDAwMEBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUWFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJycoKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6Ojo7Ozs8
 [...]
+          <mxGeometry x="360" y="303.6" width="70" height="36.4" as="geometry" 
/>
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-10" value="" 
style="shape=flexArrow;endArrow=classic;html=1;rounded=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;entryX=0.5;entryY=0;entryDx=0;entryDy=0;"
 edge="1" parent="1">
+          <mxGeometry width="50" height="50" relative="1" as="geometry">
+            <mxPoint x="394.5" y="340" as="sourcePoint" />
+            <mxPoint x="394.5" y="370" as="targetPoint" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-11" value="Shuffle Files" 
style="shape=process;whiteSpace=wrap;html=1;backgroundOutline=1;fillColor=#f5f5f5;fontColor=#333333;strokeColor=#666666;"
 vertex="1" parent="1">
+          <mxGeometry x="310" y="370" width="170" height="50" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-20" 
style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=1;exitY=0.25;exitDx=0;exitDy=0;entryX=0;entryY=0.25;entryDx=0;entryDy=0;"
 edge="1" parent="1" source="wVAZ-YzccNhZugPFJvmi-13" 
target="wVAZ-YzccNhZugPFJvmi-2">
+          <mxGeometry relative="1" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-28" value="executePlan()" 
style="edgeLabel;html=1;align=center;verticalAlign=middle;resizable=0;points=[];"
 vertex="1" connectable="0" parent="wVAZ-YzccNhZugPFJvmi-20">
+          <mxGeometry x="-0.1059" y="2" relative="1" as="geometry">
+            <mxPoint y="11" as="offset" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-23" 
style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=0.25;exitY=0;exitDx=0;exitDy=0;entryX=0.25;entryY=1;entryDx=0;entryDy=0;"
 edge="1" parent="1" source="wVAZ-YzccNhZugPFJvmi-13" 
target="t5OBkkhKOG6cYtw1sPyQ-18">
+          <mxGeometry relative="1" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-25" 
style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=0.75;exitY=1;exitDx=0;exitDy=0;entryX=0.75;entryY=0;entryDx=0;entryDy=0;"
 edge="1" parent="1" source="wVAZ-YzccNhZugPFJvmi-13" 
target="wVAZ-YzccNhZugPFJvmi-14">
+          <mxGeometry relative="1" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-13" value="CometExecIterator" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=middle;" vertex="1" 
parent="1">
+          <mxGeometry x="45" y="240" width="180" height="100" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-22" 
style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=0.25;exitY=0;exitDx=0;exitDy=0;entryX=0.25;entryY=1;entryDx=0;entryDy=0;"
 edge="1" parent="1" source="wVAZ-YzccNhZugPFJvmi-14" 
target="wVAZ-YzccNhZugPFJvmi-13">
+          <mxGeometry relative="1" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-26" value="next()" 
style="edgeLabel;html=1;align=center;verticalAlign=middle;resizable=0;points=[];"
 vertex="1" connectable="0" parent="wVAZ-YzccNhZugPFJvmi-22">
+          <mxGeometry x="0.0667" y="1" relative="1" as="geometry">
+            <mxPoint x="21" as="offset" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-14" value="Spark Execution Logic" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=middle;" vertex="1" 
parent="1">
+          <mxGeometry x="45" y="370" width="180" height="40" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-15" value="" 
style="shape=image;verticalLabelPosition=bottom;labelBackgroundColor=default;verticalAlign=top;aspect=fixed;imageAspect=0;image=data:image/png,iVBORw0KGgoAAAANSUhEUgAABwgAAAOoCAMAAADyHlBJAAADAFBMVEUAAAABAQECAgIDAwMEBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUWFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJycoKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6Ojo7Ozs
 [...]
+          <mxGeometry x="360" y="173.60000000000002" width="70" height="36.4" 
as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-16" value="" 
style="shape=flexArrow;endArrow=classic;html=1;rounded=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;entryX=0.5;entryY=0;entryDx=0;entryDy=0;"
 edge="1" parent="1">
+          <mxGeometry width="50" height="50" relative="1" as="geometry">
+            <mxPoint x="394.5" y="210" as="sourcePoint" />
+            <mxPoint x="394.5" y="240" as="targetPoint" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-18" 
style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;exitX=1;exitY=0.25;exitDx=0;exitDy=0;entryX=0;entryY=0.25;entryDx=0;entryDy=0;"
 edge="1" parent="1" source="t5OBkkhKOG6cYtw1sPyQ-18" 
target="wVAZ-YzccNhZugPFJvmi-4">
+          <mxGeometry relative="1" as="geometry" />
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-29" value="decode()" 
style="edgeLabel;html=1;align=center;verticalAlign=middle;resizable=0;points=[];"
 vertex="1" connectable="0" parent="wVAZ-YzccNhZugPFJvmi-18">
+          <mxGeometry x="-0.025" y="-3" relative="1" as="geometry">
+            <mxPoint y="12" as="offset" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="wVAZ-YzccNhZugPFJvmi-27" value="next()" 
style="edgeLabel;html=1;align=center;verticalAlign=middle;resizable=0;points=[];"
 vertex="1" connectable="0" parent="1">
+          <mxGeometry x="110" y="220" as="geometry" />
+        </mxCell>
+      </root>
+    </mxGraphModel>
+  </diagram>
+</mxfile>
diff --git a/docs/source/_static/images/CometNativeParquetReader.drawio.svg 
b/docs/source/_static/images/CometNativeParquetReader.drawio.svg
new file mode 100644
index 00000000..0c1f93c7
--- /dev/null
+++ b/docs/source/_static/images/CometNativeParquetReader.drawio.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than draw.io -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" 
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="511px" 
height="431px" viewBox="-0.5 -0.5 511 431" content="&lt;mxfile 
host=&quot;app.diagrams.net&quot; agent=&quot;Mozilla/5.0 (Macintosh; Intel Mac 
OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 
Safari/605.1.15&quot; version=&quot;24.7.16&quot; scale=&quot;1&quot; 
border=&quot;0&quot;&gt;&#10;  &lt;diagram name=&quot;Page-1&quot; 
id=&quot;IdYZ_KFENTEXElLiOEKC&quo [...]
\ No newline at end of file
diff --git a/docs/source/_static/images/CometNativeParquetScan.drawio.png 
b/docs/source/_static/images/CometNativeParquetScan.drawio.png
deleted file mode 100644
index 712cbae4..00000000
Binary files a/docs/source/_static/images/CometNativeParquetScan.drawio.png and 
/dev/null differ
diff --git a/docs/source/_static/images/CometOverviewDetailed.drawio 
b/docs/source/_static/images/CometOverviewDetailed.drawio
new file mode 100644
index 00000000..ff7f4c59
--- /dev/null
+++ b/docs/source/_static/images/CometOverviewDetailed.drawio
@@ -0,0 +1,94 @@
+<mxfile host="app.diagrams.net" agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 
10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Safari/605.1.15" 
version="24.7.16">
+  <diagram name="Page-1" id="IdYZ_KFENTEXElLiOEKC">
+    <mxGraphModel dx="1193" dy="827" grid="1" gridSize="10" guides="1" 
tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" 
pageWidth="850" pageHeight="1100" math="0" shadow="0">
+      <root>
+        <mxCell id="0" />
+        <mxCell id="1" parent="0" />
+        <mxCell id="AH3lBTSLKK5181iXBnnY-2" value="Spark Executor" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=top;" parent="1" 
vertex="1">
+          <mxGeometry x="290" width="210" height="430" as="geometry" />
+        </mxCell>
+        <mxCell id="AH3lBTSLKK5181iXBnnY-16" value="Spark Driver" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=top;" parent="1" 
vertex="1">
+          <mxGeometry y="40" width="200" height="350" as="geometry" />
+        </mxCell>
+        <mxCell id="AH3lBTSLKK5181iXBnnY-17" value="" 
style="shape=image;verticalLabelPosition=bottom;labelBackgroundColor=default;verticalAlign=top;aspect=fixed;imageAspect=0;image=data:image/svg+xml,PHN2ZyB4bWxuczp4bGluaz0iaHR0cDovL3d3dy53My5vcmcvMTk5OS94bGluayIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIiBzdHlsZT0iZmlsbC1ydWxlOmV2ZW5vZGQ7Y2xpcC1ydWxlOmV2ZW5vZGQ7c3Ryb2tlLWxpbmVqb2luOnJvdW5kO3N0cm9rZS1taXRlcmxpbWl0OjI7IiB4bWw6c3BhY2U9InByZXNlcnZlIiB2ZXJzaW9uPSIxLjEiIHZpZXdCb3g9IjAgMCA
 [...]
+          <mxGeometry x="34.519999999999996" y="200" width="125.48" 
height="30" as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-1" value="Spark Logical Plan" 
style="rounded=1;whiteSpace=wrap;html=1;" vertex="1" parent="1">
+          <mxGeometry x="10" y="80" width="180" height="30" as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-2" value="Spark Physical Plan" 
style="rounded=1;whiteSpace=wrap;html=1;" vertex="1" parent="1">
+          <mxGeometry x="10" y="140" width="180" height="30" as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-3" value="Comet Physical Plan" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=top;" vertex="1" 
parent="1">
+          <mxGeometry x="10" y="260" width="180" height="100" as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-4" value="protobuf intermediate 
representation" 
style="shape=process;whiteSpace=wrap;html=1;backgroundOutline=1;fillColor=#f5f5f5;fontColor=#333333;strokeColor=#666666;"
 vertex="1" parent="1">
+          <mxGeometry x="40" y="290" width="120" height="50" as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-12" value="" 
style="shape=flexArrow;endArrow=classic;html=1;rounded=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;entryX=0.5;entryY=0;entryDx=0;entryDy=0;"
 edge="1" parent="1" source="t5OBkkhKOG6cYtw1sPyQ-1" 
target="t5OBkkhKOG6cYtw1sPyQ-2">
+          <mxGeometry width="50" height="50" relative="1" as="geometry">
+            <mxPoint x="270" y="270" as="sourcePoint" />
+            <mxPoint x="320" y="220" as="targetPoint" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-13" value="" 
style="shape=flexArrow;endArrow=classic;html=1;rounded=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;entryX=0.5;entryY=0;entryDx=0;entryDy=0;"
 edge="1" parent="1">
+          <mxGeometry width="50" height="50" relative="1" as="geometry">
+            <mxPoint x="96.75999999999999" y="170" as="sourcePoint" />
+            <mxPoint x="96.75999999999999" y="200" as="targetPoint" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-14" value="" 
style="shape=flexArrow;endArrow=classic;html=1;rounded=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;entryX=0.5;entryY=0;entryDx=0;entryDy=0;"
 edge="1" parent="1">
+          <mxGeometry width="50" height="50" relative="1" as="geometry">
+            <mxPoint x="96.75999999999999" y="230" as="sourcePoint" />
+            <mxPoint x="96.75999999999999" y="260" as="targetPoint" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-15" value="" 
style="shape=flexArrow;endArrow=classic;html=1;rounded=0;endWidth=28;endSize=9.67;width=11;fillColor=#000000;"
 edge="1" parent="1">
+          <mxGeometry width="50" height="50" relative="1" as="geometry">
+            <mxPoint x="200" y="204.5" as="sourcePoint" />
+            <mxPoint x="290" y="204.5" as="targetPoint" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-16" value="Native Execution Plan" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=top;" vertex="1" 
parent="1">
+          <mxGeometry x="310" y="230" width="170" height="100" as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-17" value="" 
style="shape=image;verticalLabelPosition=bottom;labelBackgroundColor=default;verticalAlign=top;aspect=fixed;imageAspect=0;image=data:image/svg+xml,PHN2ZyB4bWxuczp4bGluaz0iaHR0cDovL3d3dy53My5vcmcvMTk5OS94bGluayIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIiBzdHlsZT0iZmlsbC1ydWxlOmV2ZW5vZGQ7Y2xpcC1ydWxlOmV2ZW5vZGQ7c3Ryb2tlLWxpbmVqb2luOnJvdW5kO3N0cm9rZS1taXRlcmxpbWl0OjI7IiB4bWw6c3BhY2U9InByZXNlcnZlIiB2ZXJzaW9uPSIxLjEiIHZpZXdCb3g9IjAgMCA
 [...]
+          <mxGeometry x="332.26" y="170" width="125.48" height="30" 
as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-18" value="Comet Physical Plan" 
style="rounded=1;whiteSpace=wrap;html=1;verticalAlign=top;" vertex="1" 
parent="1">
+          <mxGeometry x="305" y="40" width="180" height="100" as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-19" value="protobuf intermediate 
representation" 
style="shape=process;whiteSpace=wrap;html=1;backgroundOutline=1;fillColor=#f5f5f5;fontColor=#333333;strokeColor=#666666;"
 vertex="1" parent="1">
+          <mxGeometry x="335" y="70" width="120" height="50" as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-20" value="" 
style="shape=image;verticalLabelPosition=bottom;labelBackgroundColor=default;verticalAlign=top;aspect=fixed;imageAspect=0;image=data:image/svg+xml,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIGZpbGw9Im5vbmUiIHZpZXdCb3g9IjAgMCA4MDEgMTY4IiBoZWlnaHQ9IjE2OCIgd2lkdGg9IjgwMSI+JiN4YTs8ZyBjbGlwLXBhdGg9InVybCgjY2xpcDBfMV8xODEpIj4mI3hhOzxwYXRoIGZpbGw9InVybCgjcGFpbnQwX2xpbmVhcl8xXzE4MSkiIGQ9Ik03Ni4xMjk3IDE2OEM4OC40NTk3IDE2OCA5OS42MDk3IDE
 [...]
+          <mxGeometry x="323.48" y="263.6" width="143.03" height="30" 
as="geometry" />
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-21" value="" 
style="shape=image;verticalLabelPosition=bottom;labelBackgroundColor=default;verticalAlign=top;aspect=fixed;imageAspect=0;image=data:image/png,iVBORw0KGgoAAAANSUhEUgAABwgAAAOoCAMAAADyHlBJAAADAFBMVEUAAAABAQECAgIDAwMEBAQFBQUGBgYHBwcICAgJCQkKCgoLCwsMDAwNDQ0ODg4PDw8QEBARERESEhITExMUFBQVFRUWFhYXFxcYGBgZGRkaGhobGxscHBwdHR0eHh4fHx8gICAhISEiIiIjIyMkJCQlJSUmJiYnJycoKCgpKSkqKiorKyssLCwtLS0uLi4vLy8wMDAxMTEyMjIzMzM0NDQ1NTU2NjY3Nzc4ODg5OTk6Ojo7Ozs
 [...]
+          <mxGeometry x="360" y="293.6" width="70" height="36.4" as="geometry" 
/>
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-22" value="" 
style="shape=flexArrow;endArrow=classic;html=1;rounded=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;entryX=0.5;entryY=0;entryDx=0;entryDy=0;"
 edge="1" parent="1">
+          <mxGeometry width="50" height="50" relative="1" as="geometry">
+            <mxPoint x="394.5" y="140" as="sourcePoint" />
+            <mxPoint x="394.5" y="170" as="targetPoint" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-23" value="" 
style="shape=flexArrow;endArrow=classic;html=1;rounded=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;entryX=0.5;entryY=0;entryDx=0;entryDy=0;"
 edge="1" parent="1" source="t5OBkkhKOG6cYtw1sPyQ-17" 
target="t5OBkkhKOG6cYtw1sPyQ-16">
+          <mxGeometry width="50" height="50" relative="1" as="geometry">
+            <mxPoint x="140" y="210" as="sourcePoint" />
+            <mxPoint x="140" y="240" as="targetPoint" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-24" value="" 
style="shape=flexArrow;endArrow=classic;html=1;rounded=0;exitX=0.5;exitY=1;exitDx=0;exitDy=0;entryX=0.5;entryY=0;entryDx=0;entryDy=0;"
 edge="1" parent="1">
+          <mxGeometry width="50" height="50" relative="1" as="geometry">
+            <mxPoint x="394.5" y="330" as="sourcePoint" />
+            <mxPoint x="394.5" y="360" as="targetPoint" />
+          </mxGeometry>
+        </mxCell>
+        <mxCell id="t5OBkkhKOG6cYtw1sPyQ-25" value="Shuffle Files" 
style="shape=process;whiteSpace=wrap;html=1;backgroundOutline=1;fillColor=#f5f5f5;fontColor=#333333;strokeColor=#666666;"
 vertex="1" parent="1">
+          <mxGeometry x="310" y="360" width="170" height="50" as="geometry" />
+        </mxCell>
+      </root>
+    </mxGraphModel>
+  </diagram>
+</mxfile>
diff --git a/docs/source/_static/images/CometOverviewDetailed.drawio.svg 
b/docs/source/_static/images/CometOverviewDetailed.drawio.svg
new file mode 100644
index 00000000..0f29083b
--- /dev/null
+++ b/docs/source/_static/images/CometOverviewDetailed.drawio.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than draw.io -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" 
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" 
xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="501px" 
height="431px" viewBox="-0.5 -0.5 501 431" content="&lt;mxfile 
host=&quot;app.diagrams.net&quot; agent=&quot;Mozilla/5.0 (Macintosh; Intel Mac 
OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 
Safari/605.1.15&quot; version=&quot;24.7.16&quot; scale=&quot;1&quot; 
border=&quot;0&quot;&gt;&#10;  &lt;diagram name=&quot;Page-1&quot; 
id=&quot;IdYZ_KFENTEXElLiOEKC&quo [...]
\ No newline at end of file
diff --git a/docs/source/contributor-guide/plugin_overview.md 
b/docs/source/contributor-guide/plugin_overview.md
index c7538290..a211ca6b 100644
--- a/docs/source/contributor-guide/plugin_overview.md
+++ b/docs/source/contributor-guide/plugin_overview.md
@@ -79,10 +79,10 @@ The leaf nodes in the physical plan are always `ScanExec` 
and these operators co
 prepared before the plan is executed. When `CometExecIterator` invokes 
`Native.executePlan` it passes the memory
 addresses of these Arrow arrays to the native code.
 
-![Diagram of Comet Native 
Execution](../../_static/images/CometNativeExecution.drawio.png)
+![Diagram of Comet Native 
Execution](../../_static/images/CometOverviewDetailed.drawio.svg)
 
 ## End to End Flow
 
 The following diagram shows the end-to-end flow.
 
-![Diagram of Comet Native Parquet 
Scan](../../_static/images/CometNativeParquetScan.drawio.png)
+![Diagram of Comet Native Parquet 
Scan](../../_static/images/CometNativeParquetReader.drawio.svg)
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 4bf5d9fd..39ad27a5 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -42,6 +42,8 @@ as a native runtime to achieve improvement in terms of query 
efficiency and quer
 
    Comet Overview <user-guide/overview>
    Installing Comet <user-guide/installation>
+   Building From Source <user-guide/source>
+   Kubernetes Guide <user-guide/kubernetes>
    Supported Data Sources <user-guide/datasources>
    Supported Data Types <user-guide/datatypes>
    Supported Operators <user-guide/operators>
diff --git a/docs/source/user-guide/installation.md 
b/docs/source/user-guide/installation.md
index dc4429b8..343b6586 100644
--- a/docs/source/user-guide/installation.md
+++ b/docs/source/user-guide/installation.md
@@ -19,73 +19,54 @@
 
 # Installing DataFusion Comet
 
+## Prerequisites
+
 Make sure the following requirements are met and software installed on your 
machine.
 
-## Supported Platforms
+### Supported Operating Systems
 
 - Linux
 - Apple OSX (Intel and Apple Silicon)
 
-## Requirements
+### Supported Spark Versions
 
-- [Apache Spark supported by 
Comet](overview.md#supported-apache-spark-versions)
-- JDK 8 and up
-- GLIBC 2.17 (Centos 7) and up
+Comet currently supports the following versions of Apache Spark:
 
-## Deploying to Kubernetes
+- 3.3.x (Java 8/11/17, Scala 2.12/2.13)
+- 3.4.x (Java 8/11/17, Scala 2.12/2.13)
+- 3.5.x (Java 8/11/17, Scala 2.12/2.13)
 
-See the [Comet Kubernetes Guide](kubernetes.md) guide.
-
-## Using a Published JAR File
+Experimental support is provided for the following versions of Apache Spark. 
It is intended for development/testing
+use only and should not be used in production yet.
 
-Pre-built jar files are available in Maven central at 
https://central.sonatype.com/namespace/org.apache.datafusion
+- 4.0.0-preview1 (Java 17/21, Scala 2.13)
 
-## Using a Published Source Release
-
-Official source releases can be downloaded from 
https://dist.apache.org/repos/dist/release/datafusion/
-
-```console
-# Pick the latest version
-export COMET_VERSION=0.3.0
-# Download the tarball
-curl -O 
"https://dist.apache.org/repos/dist/release/datafusion/datafusion-comet-$COMET_VERSION/apache-datafusion-comet-$COMET_VERSION.tar.gz"
-# Unpack
-tar -xzf apache-datafusion-comet-$COMET_VERSION.tar.gz
-cd apache-datafusion-comet-$COMET_VERSION
-```
+Note that Comet may not fully work with proprietary forks of Apache Spark such 
as the Spark versions offered by
+Cloud Service Providers.
 
-Build
-
-```console
-make release-nogit PROFILES="-Pspark-3.4"
-```
-
-## Building from the GitHub repository
+## Using a Published JAR File
 
-Clone the repository:
+Comet jar files are available in [Maven 
Central](https://central.sonatype.com/namespace/org.apache.datafusion).
 
-```console
-git clone https://github.com/apache/datafusion-comet.git
-```
+Here are the direct links for downloading the Comet jar file.
 
-Build Comet for a specific Spark version:
+- [Comet plugin for Spark 3.3 / Scala 
2.12](https://repo1.maven.org/maven2/org/apache/datafusion/comet-spark-spark3.3_2.12/0.3.0/comet-spark-spark3.3_2.12-0.3.0.jar)
+- [Comet plugin for Spark 3.3 / Scala 
2.13](https://repo1.maven.org/maven2/org/apache/datafusion/comet-spark-spark3.3_2.13/0.3.0/comet-spark-spark3.3_2.13-0.3.0.jar)
+- [Comet plugin for Spark 3.4 / Scala 
2.12](https://repo1.maven.org/maven2/org/apache/datafusion/comet-spark-spark3.4_2.12/0.3.0/comet-spark-spark3.4_2.12-0.3.0.jar)
+- [Comet plugin for Spark 3.4 / Scala 
2.13](https://repo1.maven.org/maven2/org/apache/datafusion/comet-spark-spark3.4_2.13/0.3.0/comet-spark-spark3.4_2.13-0.3.0.jar)
+- [Comet plugin for Spark 3.5 / Scala 
2.12](https://repo1.maven.org/maven2/org/apache/datafusion/comet-spark-spark3.5_2.12/0.3.0/comet-spark-spark3.5_2.12-0.3.0.jar)
+- [Comet plugin for Spark 3.5 / Scala 
2.13](https://repo1.maven.org/maven2/org/apache/datafusion/comet-spark-spark3.5_2.13/0.3.0/comet-spark-spark3.5_2.13-0.3.0.jar)
 
-```console
-cd datafusion-comet
-make release PROFILES="-Pspark-3.4"
-```
+## Building from source
 
-Note that the project builds for Scala 2.12 by default but can be built for 
Scala 2.13 using an additional profile:
+Refer to the [Building from Source] guide for instructions on building Comet 
from source, either from official
+source releases or from the latest code in the GitHub repository.
 
-```console
-make release PROFILES="-Pspark-3.4 -Pscala-2.13"
-```
+[Building from Source]: source.md
 
-To build Comet from the source distribution on an isolated environment without 
an access to `github.com` it is necessary to disable 
`git-commit-id-maven-plugin`, otherwise you will face errors that there is no 
access to the git during the build process. In that case you may use:
+## Deploying to Kubernetes
 
-```console
-make release-nogit PROFILES="-Pspark-3.4"
-```
+See the [Comet Kubernetes Guide](kubernetes.md).
 
 ## Run Spark Shell with Comet enabled
 
@@ -99,11 +80,10 @@ $SPARK_HOME/bin/spark-shell \
     --conf spark.driver.extraClassPath=$COMET_JAR \
     --conf spark.executor.extraClassPath=$COMET_JAR \
     --conf spark.plugins=org.apache.spark.CometPlugin \
-    --conf spark.comet.enabled=true \
-    --conf spark.comet.exec.enabled=true \
+    --conf 
spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
     --conf spark.comet.explainFallback.enabled=true \
-    --conf spark.driver.memory=1g \
-    --conf spark.executor.memory=1g
+    --conf spark.memory.offHeap.enabled=true \
+    --conf spark.memory.offHeap.size=16g
 ```
 
 ### Verify Comet enabled for Spark SQL query
@@ -142,20 +122,9 @@ WARN CometSparkSessionExtensions$CometExecRule: Comet 
cannot execute some parts
   - Execute InsertIntoHadoopFsRelationCommand is not supported
 ```
 
-### Enable Comet shuffle
+## Additional Configuration
 
-Comet shuffle feature is disabled by default. To enable it, please add related 
configs:
-
-```
---conf 
spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager
---conf spark.comet.exec.shuffle.enabled=true
-```
-
-Above configs enable Comet native shuffle which only supports hash partition 
and single partition.
-Comet native shuffle doesn't support complex types yet.
-
-Comet doesn't have official release yet so currently the only way to test it 
is to build jar and include it in your
-Spark application. Depending on your deployment mode you may also need to set 
the driver & executor class path(s) to
+Depending on your deployment mode you may also need to set the driver & 
executor class path(s) to
 explicitly contain Comet otherwise Spark may use a different class-loader for 
the Comet components than its internal
 components which will then fail at runtime. For example:
 
@@ -165,11 +134,7 @@ components which will then fail at runtime. For example:
 
 Some cluster managers may require additional configuration, see 
<https://spark.apache.org/docs/latest/cluster-overview.html>
 
-To enable columnar shuffle which supports all partitioning and basic complex 
types, one more config is required:
-
-```
---conf spark.comet.exec.shuffle.mode=jvm
-```
-
 ### Memory tuning
-In addition to Apache Spark memory configuration parameters the Comet 
introduces own parameters to configure memory allocation for native execution. 
More [Comet Memory Tuning](./tuning.md)
+
+In addition to Apache Spark memory configuration parameters, Comet introduces 
additional parameters to configure memory
+allocation for native execution. See [Comet Memory Tuning](./tuning.md) for 
details.
diff --git a/docs/source/user-guide/overview.md 
b/docs/source/user-guide/overview.md
index e386aec8..92dfe2bb 100644
--- a/docs/source/user-guide/overview.md
+++ b/docs/source/user-guide/overview.md
@@ -19,8 +19,14 @@
 
 # Comet Overview
 
-Comet runs Spark SQL queries using the native Apache DataFusion runtime, which 
is
-typically faster and more resource efficient than JVM based runtimes.
+Apache DataFusion Comet is a high-performance accelerator for Apache Spark, 
built on top of the powerful
+[Apache DataFusion] query engine. Comet is designed to significantly enhance 
the
+performance of Apache Spark workloads while leveraging commodity hardware and 
seamlessly integrating with the
+Spark ecosystem without requiring any code changes.
+
+[Apache DataFusion]: https://datafusion.apache.org
+
+The following diagram provides an overview of Comet's architecture.
 
 ![Comet Overview](../_static/images/comet-overview.png)
 
@@ -34,26 +40,10 @@ Comet aims to support:
 
 ## Architecture
 
-The following diagram illustrates the architecture of Comet:
+The following diagram shows how Comet integrates with Apache Spark.
 
 ![Comet System Diagram](../_static/images/comet-system-diagram.png)
 
-## Supported Apache Spark versions
-
-Comet currently supports the following versions of Apache Spark:
-
-- 3.3.x
-- 3.4.x
-- 3.5.x
-
-Experimental support is provided for the following versions of Apache Spark 
and is intended for development/testing 
-use only and should not be used in production yet.
-
-- 4.0.0-preview1
-
-Note that Comet may not fully work with proprietary forks of Apache Spark such 
as the Spark versions offered by 
-Cloud Service Providers. 
-
 ## Feature Parity with Apache Spark
 
 The project strives to keep feature parity with Apache Spark, that is,
@@ -65,3 +55,9 @@ features and fallback to Spark engine.
 To achieve this, besides unit tests within Comet itself, we also re-use
 Spark SQL tests and make sure they all pass with Comet extension
 enabled.
+
+## Getting Started
+
+Refer to the [Comet Installation Guide] to get started.
+
+[Comet Installation Guide]: installation.md
diff --git a/docs/source/user-guide/source.md b/docs/source/user-guide/source.md
new file mode 100644
index 00000000..71c9060c
--- /dev/null
+++ b/docs/source/user-guide/source.md
@@ -0,0 +1,69 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Building Comet From Source
+
+It is sometimes preferable to build from source for a specific platform.
+
+## Using a Published Source Release
+
+Official source releases can be downloaded from 
https://dist.apache.org/repos/dist/release/datafusion/
+
+```console
+# Pick the latest version
+export COMET_VERSION=0.3.0
+# Download the tarball
+curl -O "https://dist.apache.org/repos/dist/release/datafusion/datafusion-comet-$COMET_VERSION/apache-datafusion-comet-$COMET_VERSION.tar.gz"
+# Unpack
+tar -xzf apache-datafusion-comet-$COMET_VERSION.tar.gz
+cd apache-datafusion-comet-$COMET_VERSION
+```
+
+Build:
+
+```console
+make release-nogit PROFILES="-Pspark-3.4"
+```
+
+## Building from the GitHub repository
+
+Clone the repository:
+
+```console
+git clone https://github.com/apache/datafusion-comet.git
+```
+
+Build Comet for a specific Spark version:
+
+```console
+cd datafusion-comet
+make release PROFILES="-Pspark-3.4"
+```
+
+Note that the project builds for Scala 2.12 by default but can be built for 
Scala 2.13 using an additional profile:
+
+```console
+make release PROFILES="-Pspark-3.4 -Pscala-2.13"
+```
+
+To build Comet from the source distribution in an isolated environment without access to `github.com`, it is necessary to disable `git-commit-id-maven-plugin`; otherwise the build fails with errors because Git cannot be accessed during the build process. In that case, use:
+
+```console
+make release-nogit PROFILES="-Pspark-3.4"
+```
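
Editorial sketch (not part of the commit): the `make release` invocations documented above differ only in the `PROFILES` value. Assuming the profile names follow the `-Pspark-<version>` and `-Pscala-<version>` pattern shown in the diff, a hypothetical shell helper, here called `comet_profiles` (it does not exist in the Comet repository), could compose that value. The example prints the resulting command instead of running it.

```shell
# Hypothetical helper, not part of the Comet build: compose the PROFILES
# value from a Spark version and an optional Scala version.
comet_profiles() {
  spark_version="$1"
  scala_version="${2:-}"   # empty means "use the default Scala version"
  profiles="-Pspark-${spark_version}"
  if [ -n "$scala_version" ]; then
    profiles="${profiles} -Pscala-${scala_version}"
  fi
  printf '%s\n' "$profiles"
}

# Print the make invocation rather than executing it.
echo "make release PROFILES=\"$(comet_profiles 3.4 2.13)\""
```

Omitting the second argument yields only the Spark profile, which matches the default Scala 2.12 build described in the new `source.md`.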


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

