Repository: tajo
Updated Branches:
  refs/heads/master 616414a51 -> b0c0a390e


TAJO-1682: Write ORC document.

Closes #764

Signed-off-by: Jihoon Son <[email protected]>


Project: http://git-wip-us.apache.org/repos/asf/tajo/repo
Commit: http://git-wip-us.apache.org/repos/asf/tajo/commit/b0c0a390
Tree: http://git-wip-us.apache.org/repos/asf/tajo/tree/b0c0a390
Diff: http://git-wip-us.apache.org/repos/asf/tajo/diff/b0c0a390

Branch: refs/heads/master
Commit: b0c0a390e3e774c4004156ad0027cf8d3de4c876
Parents: 616414a
Author: Jongyoung Park <[email protected]>
Authored: Thu Sep 17 15:34:19 2015 +0900
Committer: Jihoon Son <[email protected]>
Committed: Thu Sep 17 15:34:19 2015 +0900

----------------------------------------------------------------------
 CHANGES                                         |  3 ++
 .../sphinx/table_management/file_formats.rst    |  1 +
 .../src/main/sphinx/table_management/orc.rst    | 47 ++++++++++++++++++++
 3 files changed, 51 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/tajo/blob/b0c0a390/CHANGES
----------------------------------------------------------------------
diff --git a/CHANGES b/CHANGES
index 70d99e3..c485a27 100644
--- a/CHANGES
+++ b/CHANGES
@@ -547,6 +547,9 @@ Release 0.11.0 - unreleased
   
   TASKS
 
+    TAJO-1682: Write ORC document. (Contributed by Jongyoung Park, 
+    Committed by jihoon)
+
     TAJO-1744: Porting bash shell scripts to Windows command shell scripts.
     (Contributed by YeonSu Han, Committed by jihoon)
 

http://git-wip-us.apache.org/repos/asf/tajo/blob/b0c0a390/tajo-docs/src/main/sphinx/table_management/file_formats.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/table_management/file_formats.rst b/tajo-docs/src/main/sphinx/table_management/file_formats.rst
index 0579497..7768920 100644
--- a/tajo-docs/src/main/sphinx/table_management/file_formats.rst
+++ b/tajo-docs/src/main/sphinx/table_management/file_formats.rst
@@ -10,4 +10,5 @@ Currently, Tajo provides four file formats as follows:
     text
     rcfile
     parquet
+    orc
     sequencefile
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/tajo/blob/b0c0a390/tajo-docs/src/main/sphinx/table_management/orc.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/table_management/orc.rst b/tajo-docs/src/main/sphinx/table_management/orc.rst
new file mode 100644
index 0000000..2733afc
--- /dev/null
+++ b/tajo-docs/src/main/sphinx/table_management/orc.rst
@@ -0,0 +1,47 @@
+***
+ORC
+***
+
+**ORC (Optimized Row Columnar)** is a columnar storage format from Hive. ORC improves performance for reading,
+writing, and processing data.
+For more details, please refer to `ORC Files <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC>`_ on the Hive wiki.
+
+===========================
+How to Create an ORC Table?
+===========================
+
+If you are not familiar with the ``CREATE TABLE`` statement, please refer to Data Definition Language :doc:`/sql_language/ddl`.
+
+In order to specify a certain file format for your table, you need to use the ``USING`` clause in your ``CREATE TABLE``
+statement. Below is an example statement for creating a table that uses ORC files.
+
+.. code-block:: sql
+
+  CREATE TABLE table1 (
+    id int,
+    name text,
+    score float,
+    type text
+  ) USING orc;
+
+===================
+Physical Properties
+===================
+
+Some table storage formats provide parameters for enabling or disabling features and adjusting physical parameters.
+The ``WITH`` clause in the ``CREATE TABLE`` statement allows users to set those parameters.
+
+Currently, ORC files provide the following physical properties.
+
+* ``orc.max.merge.distance``: When an ORC file is read, stripes that are closer together than this distance are merged and read at once. Default is 1MB.
+* ``orc.stripe.size``: The size of each stripe. Default is 64MB.
+* ``orc.compression.kind``: The compression algorithm used to compress data when it is written. It should be one of ``none``, ``snappy``, or ``zlib``. Default is ``none``.
+* ``orc.buffer.size``: The size of the write buffer. Default is 256KB.
+* ``orc.rowindex.stride``: The ORC index stride in number of rows. (The stride is the number of rows that an index entry represents.) Default is 10000.
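+
+As a minimal sketch, the properties above could be set in the ``WITH`` clause as shown below. The table name ``table2``
+and the chosen values are hypothetical, and the quoted ``'key'='value'`` form is an assumption based on how ``WITH`` is
+used for other Tajo file formats.
+
+.. code-block:: sql
+
+  -- Hypothetical table: snappy compression, one index entry per 20000 rows.
+  CREATE TABLE table2 (
+    id int,
+    name text
+  ) USING orc WITH (
+    'orc.compression.kind'='snappy',
+    'orc.rowindex.stride'='20000'
+  );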
+
+======================================
+Compatibility Issues with Apache Hive™
+======================================
+
+At the moment, Tajo only supports flat relational tables.
+We are currently working on adding support for nested schemas and non-scalar types (`TAJO-710 <https://issues.apache.org/jira/browse/TAJO-710>`_).
\ No newline at end of file
