[arrow-datafusion-python] branch main updated: Minor docs updates (#210)

agrove Wed, 22 Feb 2023 06:05:28 -0800

This is an automated email from the ASF dual-hosted git repository.

agrove pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion-python.git



The following commit(s) were added to refs/heads/main by this push:
     new 75dea3d  Minor docs updates (#210)
75dea3d is described below

commit 75dea3dbd530421821becc3642e43036d1a3c121
Author: Andy Grove <[email protected]>
AuthorDate: Wed Feb 22 07:05:17 2023 -0700

    Minor docs updates (#210)
    
    * Add cuDF to examples
    
    * lint
---
 README.md               | 29 +++++++++++++++--------------
 examples/README.md      | 19 ++++++++++---------
 examples/sql-on-cudf.py |  4 +---
 3 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/README.md b/README.md
index e78f613..d83b78c 100644
--- a/README.md
+++ b/README.md
@@ -28,26 +28,26 @@ DataFusion's Python bindings can be used as an end-user tool as well as providin
 
 ## Features
 
-- Execute queries using SQL or DataFrames against CSV, Parquet, and JSON data sources
-- Queries are optimized using DataFusion's query optimizer
-- Execute user-defined Python code from SQL
-- Exchange data with Pandas and other DataFrame libraries that support PyArrow
-- Serialize and deserialize query plans in Substrait format
-- Experimental support for executing SQL queries against Polars, Pandas and cuDF
+- Execute queries using SQL or DataFrames against CSV, Parquet, and JSON data sources.
+- Queries are optimized using DataFusion's query optimizer.
+- Execute user-defined Python code from SQL.
+- Exchange data with Pandas and other DataFrame libraries that support PyArrow.
+- Serialize and deserialize query plans in Substrait format.
+- Experimental support for transpiling SQL queries to DataFrame calls with Polars, Pandas, and cuDF.
 
 ## Comparison with other projects
 
-Here is a comparison with similar projects that may help understand when DataFusion might be suitable and unsuitable 
+Here is a comparison with similar projects that may help understand when DataFusion might be suitable and unsuitable
 for your needs:
 
-- [DuckDB](http://www.duckdb.org/) is an open source, in-process analytic database. Like DataFusion, it supports 
- very fast execution, both from its custom file format and directly from Parquet files. Unlike DataFusion, it is 
- written in C/C++ and it is primarily used directly by users as a serverless database and query system rather than 
- as a library for building such database systems.
+- [DuckDB](http://www.duckdb.org/) is an open source, in-process analytic database. Like DataFusion, it supports
+  very fast execution, both from its custom file format and directly from Parquet files. Unlike DataFusion, it is
+  written in C/C++ and it is primarily used directly by users as a serverless database and query system rather than
+  as a library for building such database systems.
 
-- [Polars](http://pola.rs/) is one of the fastest DataFrame libraries at the time of writing. Like DataFusion, it 
- is also written in Rust and uses the Apache Arrow memory model, but unlike DataFusion it does not provide full SQL 
- support, nor as many extension points.
+- [Polars](http://pola.rs/) is one of the fastest DataFrame libraries at the time of writing. Like DataFusion, it
+  is also written in Rust and uses the Apache Arrow memory model, but unlike DataFusion it does not provide full SQL
+  support, nor as many extension points.
 
 ## Example Usage
 
@@ -110,6 +110,7 @@ See [examples](examples/README.md) for more information.
 
 - [Executing SQL on Polars](./examples/sql-on-polars.py)
 - [Executing SQL on Pandas](./examples/sql-on-pandas.py)
+- [Executing SQL on cuDF](./examples/sql-on-cudf.py)
 
 ## How to install (from pip)
 
diff --git a/examples/README.md b/examples/README.md
index ce98600..2c4775e 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -29,21 +29,22 @@ Here is a direct link to the file used in the examples:
 
 ### Executing Queries with DataFusion
 
-- [Query a Parquet file using SQL](./examples/sql-parquet.py)
-- [Query a Parquet file using the DataFrame API](./examples/dataframe-parquet.py)
-- [Run a SQL query and store the results in a Pandas DataFrame](./examples/sql-to-pandas.py)
-- [Query PyArrow Data](./examples/query-pyarrow-data.py)
+- [Query a Parquet file using SQL](./sql-parquet.py)
+- [Query a Parquet file using the DataFrame API](./dataframe-parquet.py)
+- [Run a SQL query and store the results in a Pandas DataFrame](./sql-to-pandas.py)
+- [Query PyArrow Data](./query-pyarrow-data.py)
 
 ### Running User-Defined Python Code
 
-- [Register a Python UDF with DataFusion](./examples/python-udf.py)
-- [Register a Python UDAF with DataFusion](./examples/python-udaf.py)
+- [Register a Python UDF with DataFusion](./python-udf.py)
+- [Register a Python UDAF with DataFusion](./python-udaf.py)
 
 ### Substrait Support
 
-- [Serialize query plans using Substrait](./examples/substrait.py)
+- [Serialize query plans using Substrait](./substrait.py)
 
 ### Executing SQL against DataFrame Libraries (Experimental)
 
-- [Executing SQL on Polars](./examples/sql-on-polars.py)
-- [Executing SQL on Pandas](./examples/sql-on-pandas.py)
+- [Executing SQL on Polars](./sql-on-polars.py)
+- [Executing SQL on Pandas](./sql-on-pandas.py)
+- [Executing SQL on cuDF](./sql-on-cudf.py)
diff --git a/examples/sql-on-cudf.py b/examples/sql-on-cudf.py
index 407cb1f..999756f 100644
--- a/examples/sql-on-cudf.py
+++ b/examples/sql-on-cudf.py
@@ -19,8 +19,6 @@ from datafusion.cudf import SessionContext
 
 
 ctx = SessionContext()
-ctx.register_parquet(
-    "taxi", "/home/jeremy/Downloads/yellow_tripdata_2021-01.parquet"
-)
+ctx.register_parquet("taxi", "yellow_tripdata_2021-01.parquet")
 df = ctx.sql("select passenger_count from taxi")
 print(df)
