This is an automated email from the ASF dual-hosted git repository.
sunchao pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion-comet.git
The following commit(s) were added to refs/heads/main by this push:
new 1f53e25 Minor: Update README.md with system diagram (#148)
1f53e25 is described below
commit 1f53e25505b6c646acfbc813bcd6af06f1390cb0
Author: Andrew Lamb <[email protected]>
AuthorDate: Fri Mar 1 17:16:19 2024 -0500
Minor: Update README.md with system diagram (#148)
---
README.md | 9 ++++++++-
doc/comet-system-diagram.png | Bin 0 -> 30027 bytes
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/README.md b/README.md
index 5fb90be..572a9d2 100644
--- a/README.md
+++ b/README.md
@@ -22,13 +22,20 @@ under the License.
Comet is an Apache Spark plugin that uses [Apache Arrow
DataFusion](https://arrow.apache.org/datafusion/)
as native runtime to achieve improvement in terms of query efficiency and
query runtime.
-On a high level, Comet aims to support:
+Comet runs Spark SQL queries using the native DataFusion runtime, which is
+typically faster and more resource efficient than JVM based runtimes.
+
+<a href="doc/comet-overview.png"><img src="doc/comet-system-diagram.png" align="center" width="500" ></a>
+
+Comet aims to support:
- a native Parquet implementation, including both reader and writer
- full implementation of Spark operators, including Filter/Project/Aggregation/Join/Exchange etc.
- full implementation of Spark built-in expressions
- a UDF framework for users to migrate their existing UDF to native
+## Architecture
+
The following diagram illustrates the architecture of Comet:
<a href="doc/comet-overview.png"><img src="doc/comet-overview.png" align="center" height="600" width="750" ></a>
diff --git a/doc/comet-system-diagram.png b/doc/comet-system-diagram.png
new file mode 100644
index 0000000..e7c9075
Binary files /dev/null and b/doc/comet-system-diagram.png differ