This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-1.7
in repository https://gitbox.apache.org/repos/asf/orc.git


The following commit(s) were added to refs/heads/branch-1.7 by this push:
     new 3e97a42  ORC-1112: Add `Using with Python` web page (#1039)
3e97a42 is described below

commit 3e97a425979252887c7ff3bbbf697934206357cd
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Tue Feb 1 16:24:43 2022 -0800

    ORC-1112: Add `Using with Python` web page (#1039)
    
    ### What changes were proposed in this pull request?
    
    This PR adds a `Using with Python` web page to the Apache ORC website for Python users in the community.
    
    ### Why are the changes needed?
    
    To help Python users use the `Apache Arrow` project with the latest `Apache ORC 1.7.x C++` release.
    
    ### How was this patch tested?
    
    Build the docs and check the generated website. The embedded code can be tested with `PyArrow 6.0.1 (latest)` and will be improved in `PyArrow 7.0` via [ARROW-15338: [Python] Add pyarrow.orc.read_table API](https://github.com/apache/arrow/commit/ff4b9bea56aeb2c48f19d6137dd2fbae59d618c7)
    
    <img width="581" alt="Screen Shot 2022-02-01 at 2 28 15 PM" src="https://user-images.githubusercontent.com/9700541/152062188-d9d3309a-9367-49dc-b8ea-0f4bac8d9919.png">
    
    <img width="100%" alt="Screen Shot 2022-02-01 at 2 29 23 PM" src="https://user-images.githubusercontent.com/9700541/152062356-934b366f-040b-4fa7-8beb-27ae786e028b.png">
    
    This closes #1027
    
    (cherry picked from commit 50ba8bb643a51132849ed61a7462ed5d229812e7)
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 site/_data/docs.yml   |  4 ++++
 site/_docs/pyarrow.md | 37 +++++++++++++++++++++++++++++++++++++
 site/index.html       |  1 +
 3 files changed, 42 insertions(+)

diff --git a/site/_data/docs.yml b/site/_data/docs.yml
index 855abbd..f4354aa 100644
--- a/site/_data/docs.yml
+++ b/site/_data/docs.yml
@@ -11,6 +11,10 @@
   - building
   - releases
 
+- title: Using in Python
+  docs:
+  - pyarrow
+
 - title: Using in Spark
   docs:
   - spark-ddl
diff --git a/site/_docs/pyarrow.md b/site/_docs/pyarrow.md
new file mode 100644
index 0000000..a598779
--- /dev/null
+++ b/site/_docs/pyarrow.md
@@ -0,0 +1,37 @@
+---
+layout: docs
+title: PyArrow
+permalink: /docs/pyarrow.html
+---
+
+## How to install
+
+Apache Arrow project's PyArrow is the recommended package.
+
+https://pypi.org/project/pyarrow/
+
+```
+pip3 install pyarrow
+pip3 install pandas
+```
+
+## How to write and read an ORC file
+
+```
+In [1]: import pandas as pd
+
+In [2]: import pyarrow as pa
+
+In [3]: import pyarrow.orc as orc
+
+In [4]: orc.write_table(pa.table({"col1": [1, 2, 3]}), "test.orc")
+
+In [5]: t = orc.ORCFile("test.orc").read()
+
+In [6]: t.to_pandas()
+Out[6]:
+   col1
+0     1
+1     2
+2     3
+```
diff --git a/site/index.html b/site/index.html
index 7b02453..a07f5a7 100644
--- a/site/index.html
+++ b/site/index.html
@@ -44,6 +44,7 @@ overview: true
     <div class="unit golden-large code">
       <p class="title">Quickstart Documentation</p>
       <ul class="shell">
+        <li><a href="docs/pyarrow.html">Using with Python</a></li>
         <li><a href="docs/spark-ddl.html">Using with Spark</a></li>
         <li><a href="docs/hive-ddl.html">Using with Hive</a></li>
         <li><a href="docs/mapred.html">Using with Hadoop MapRed</a></li>
