TAJO-1068: Add SQL Query documentation.

Closes #155


Project: http://git-wip-us.apache.org/repos/asf/tajo/repo
Commit: http://git-wip-us.apache.org/repos/asf/tajo/commit/2d4b0847
Tree: http://git-wip-us.apache.org/repos/asf/tajo/tree/2d4b0847
Diff: http://git-wip-us.apache.org/repos/asf/tajo/diff/2d4b0847

Branch: refs/heads/block_iteration
Commit: 2d4b0847ff13dced41d47fd740b6c699bba0c945
Parents: 0356a52
Author: Hyunsik Choi <[email protected]>
Authored: Wed Oct 1 17:33:44 2014 -0700
Committer: Hyunsik Choi <[email protected]>
Committed: Wed Oct 1 17:33:44 2014 -0700

----------------------------------------------------------------------
 CHANGES                                         |   2 +
 .../src/main/sphinx/sql_language/data_model.rst |   8 +-
 .../src/main/sphinx/sql_language/queries.rst    | 232 ++++++++++++++++++-
 3 files changed, 231 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/tajo/blob/2d4b0847/CHANGES
----------------------------------------------------------------------
diff --git a/CHANGES b/CHANGES
index daabed0..1eea41d 100644
--- a/CHANGES
+++ b/CHANGES
@@ -440,6 +440,8 @@ Release 0.9.0 - unreleased
 
   TASKS
 
+    TAJO-1068: Add SQL Query documentation. (hyunsik)
+
     TAJO-1078: Update contributor list. (hyunsik)
 
     TAJO-1070: BSTIndexScanExec should not seek a negative offset. (jinho)

http://git-wip-us.apache.org/repos/asf/tajo/blob/2d4b0847/tajo-docs/src/main/sphinx/sql_language/data_model.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/sql_language/data_model.rst 
b/tajo-docs/src/main/sphinx/sql_language/data_model.rst
index e01c6d4..a0c5856 100644
--- a/tajo-docs/src/main/sphinx/sql_language/data_model.rst
+++ b/tajo-docs/src/main/sphinx/sql_language/data_model.rst
@@ -7,7 +7,7 @@ Data Types
 ===============
 
 
+-----------+----------------+----------------------------+-------------+---------------------------------------------------+--------------------------------------------------------------------------+
 
-| Supported | SQL Type Name  |  Alias                     | Size (byte) | 
Description                                       | Range                       
                                             |
+| Support   | SQL Type Name  |  Alias                     | Size (byte) | 
Description                                       | Range                       
                                             |
 
+===========+================+============================+=============+===================================================+==========================================================================+
 
 | O         | boolean        |  bool                      |  1          |      
                                             | true/false                       
                                        |
 
+-----------+----------------+----------------------------+-------------+---------------------------------------------------+--------------------------------------------------------------------------+
  
@@ -41,13 +41,13 @@ Data Types
 
+-----------+----------------+----------------------------+-------------+---------------------------------------------------+--------------------------------------------------------------------------+
 
 | O         | blob           |  bytea                     |             | 
variable-length binary string                     |                             
                                             |
 
+-----------+----------------+----------------------------+-------------+---------------------------------------------------+--------------------------------------------------------------------------+
 
-|           | date           |                            |             |      
                                             |                                  
                                        | 
+| O         | date           |                            |             |      
                                             |                                  
                                        | 
 
+-----------+----------------+----------------------------+-------------+---------------------------------------------------+--------------------------------------------------------------------------+
 
-|           | time           |                            |             |      
                                             |                                  
                                        | 
+| O         | time           |                            |             |      
                                             |                                  
                                        | 
 
+-----------+----------------+----------------------------+-------------+---------------------------------------------------+--------------------------------------------------------------------------+
 
 |           | timetz         |  time with time zone       |             |      
                                             |                                  
                                        |
 
+-----------+----------------+----------------------------+-------------+---------------------------------------------------+--------------------------------------------------------------------------+
 
-|           | timestamp      |                            |             |      
                                             |                                  
                                        |
+| O         | timestamp      |                            |             |      
                                             |                                  
                                        |
 
+-----------+----------------+----------------------------+-------------+---------------------------------------------------+--------------------------------------------------------------------------+
 
 |           | timestamptz    |                            |             |      
                                             |                                  
                                        |
 
+-----------+----------------+----------------------------+-------------+---------------------------------------------------+--------------------------------------------------------------------------+
 

http://git-wip-us.apache.org/repos/asf/tajo/blob/2d4b0847/tajo-docs/src/main/sphinx/sql_language/queries.rst
----------------------------------------------------------------------
diff --git a/tajo-docs/src/main/sphinx/sql_language/queries.rst 
b/tajo-docs/src/main/sphinx/sql_language/queries.rst
index 8a212a5..5adfdaa 100644
--- a/tajo-docs/src/main/sphinx/sql_language/queries.rst
+++ b/tajo-docs/src/main/sphinx/sql_language/queries.rst
@@ -11,7 +11,7 @@ Overview
 .. code-block:: sql
 
   SELECT [distinct [all]] * | <expression> [[AS] <alias>] [, ...]
-    [FROM <table name> [[AS] <table alias name>] [, ...]]
+    [FROM <table reference> [[AS] <table alias name>] [, ...]]
     [WHERE <condition>]
     [GROUP BY <expression> [, ...]]
     [HAVING <condition>]
@@ -23,16 +23,234 @@ Overview
 From Clause
 =====================
 
+*Synopsis*
+
+.. code-block:: sql
+
+  [FROM <table reference> [[AS] <table alias name>] [, ...]]
+
+
+The ``FROM`` clause specifies one or more other tables given in a 
comma-separated table reference list.
+A table reference can be a relation name , or a subquery, a table join, or 
complex combinations of them.
+
+-----------------------
+Table and Table Aliases
+-----------------------
+
+A temporary name can be given to tables and complex table references to be used
+for references to the derived table in the rest of the query. This is called a 
table alias.
+
+To create a a table alias, please use ``AS``:
+
+.. code-block:: sql
+
+  FROM table_reference AS alias
+
+or
+
+.. code-block:: sql
+
+  FROM table_reference alias
+
+The ``AS`` keyword can be omitted, and *Alias* can be any identifier.
+
+A typical application of table aliases is to give short names to long table 
references. For example:
+
+.. code-block:: sql
+
+  SELECT * FROM long_table_name_1234 s JOIN another_long_table_name_5678 a ON 
s.id = a.num;
+
+-------------
+Joined Tables
+-------------
+
+Tajo supports all kinds of join types.
+
+Join Types
+~~~~~~~~~~
+
+Cross Join
+^^^^^^^^^^
+
+.. code-block:: sql
+
+  FROM T1 CROSS JOIN T2
+
+Cross join, also called *Cartesian product*, results in every possible 
combination of rows from T1 and T2.
+
+``FROM T1 CROSS JOIN T2`` is equivalent to ``FROM T1, T2``.
+
+Qualified joins
+^^^^^^^^^^^^^^^
+
+Qualified joins implicitly or explicitly have join conditions. 
Inner/Outer/Natural Joins all are qualified joins.
+Except for natural join, ``ON`` or ``USING`` clause in each join is used to 
specify a join condition. 
+A join condition must include at least one boolean expression, and it can also 
include just filter conditions.
+
+**Inner Join**
+
+.. code-block:: sql
+
+  T1 [INNER] JOIN T2 ON boolean_expression
+  T1 [INNER] JOIN T2 USING (join column list)
+
+``INNER`` keyword is the default, and so ``INNER`` can be omitted when you use 
inner join.
+
+**Outer Join**
+
+.. code-block:: sql
+
+  T1 (LEFT|RIGHT|FULL) OUTER JOIN T2 ON boolean_expression
+  T1 (LEFT|RIGHT|FULL) OUTER JOIN T2 USING (join column list)
+
+One of ``LEFT``, ``RIGHT``, or ``FULL`` must be specified for outer joins. 
+Join conditions in outer join will have different behavior according to 
corresponding table references of join conditions.
+To know outer join behavior in more detail, please refer to 
+`Advanced outer join constructs 
<http://www.ibm.com/developerworks/data/library/techarticle/purcell/0201purcell.html>`_.
+
+**Natural Join**
+
+.. code-block:: sql
+
+  T1 NATURAL JOIN T2
+
+``NATURAL`` is a short form of ``USING``. It forms a ``USING`` list consisting 
of all common column names that appear in 
+both join tables. These common columns appear only once in the output table. 
If there are no common columns, 
+``NATURAL`` behaves like ``CROSS JOIN``.
+
+**Subqueries**
+
+Subqueries allow users to specify a derived table. It requires enclosing a SQL 
statement in parentheses and an alias name. 
+For example:
+
+.. code-block:: sql
+
+  FROM (SELECT * FROM table1) AS alias_name
 
 =====================
 Where Clause
 =====================
 
+The syntax of the WHERE Clause is
 
-=====================
-Groupby Clause
-=====================
+*Synopsis*
 
-=====================
-Select list
-=====================
\ No newline at end of file
+.. code-block:: sql
+
+  WHERE search_condition
+
+``search_condition`` can be any boolean expression. 
+In order to know additional predicates, please refer to 
:doc:`/sql_language/predicates`.
+
+==========================
+Groupby and Having Clauses
+==========================
+
+*Synopsis*
+
+.. code-block:: sql
+
+  SELECT select_list
+      FROM ...
+      [WHERE ...]
+      GROUP BY grouping_column_reference [, grouping_column_reference]...
+      [HAVING boolean_expression]
+
+The rows which passes ``WHERE`` filter may be subject to grouping, specified 
by ``GROUP BY`` clause.
+Grouping combines a set of rows having common values into one group, and then 
computes rows in the group with aggregation functions. ``HAVING`` clause can be 
used with only ``GROUP BY`` clause. It eliminates the unqualified result rows 
of grouping.
+
+``grouping_column_reference`` can be a column reference, a complex expression 
including scalar functions and arithmetic operations.
+
+.. code-block:: sql
+
+  SELECT l_orderkey, SUM(l_quantity) AS quantity FROM lineitem GROUP BY 
l_orderkey;
+
+  SELECT substr(l_shipdate,1,4) as year, SUM(l_orderkey) AS total2 FROM 
lineitem GROUP BY substr(l_shipdate,1,4);
+
+If a SQL statement includes ``GROUP BY`` clause, expressions in select list 
must be either grouping_column_reference or aggregation function. For example, 
the following example query is not allowed because ``l_orderkey`` does not 
occur in ``GROUP BY`` clause.
+
+.. code-block:: sql
+
+  SELECT l_orderkey, l_partkey, SUM(l_orderkey) AS total FROM lineitem GROUP 
BY l_partkey;
+
+Aggregation functions can be used with ``DISTINCT`` keywords. It forces an 
individual aggregate function to take only distinct values of the argument 
expression. ``DISTINCT`` keyword is used as follows:
+
+.. code-block:: sql
+
+  SELECT l_partkey, COUNT(distinct l_quantity), SUM(distinct l_extendedprice) 
AS total FROM lineitem GROUP BY l_partkey;
+
+==========================
+Orderby and Limit Clauses
+==========================
+
+*Synopsis*
+
+.. code-block:: sql
+
+  FROM ... ORDER BY <sort_expr> [(ASC|DESC)] [NULL (FIRST|LAST) [,...]
+
+``sort_expr`` can be a column reference, aliased column reference, or a 
complex expression. 
+``ASC`` indicates an ascending order of ``sort_expr`` values. ``DESC`` 
indicates a descending order of ``sort_expr`` values.
+``ASC`` is the default order.
+
+``NULLS FIRST`` and ``NULLS LAST`` options can be used to determine whether 
nulls values appear 
+before or after non-null values in the sort ordering. By default, null values 
are dealt as if larger than any non-null value; 
+that is, ``NULLS FIRST`` is the default for ``DESC`` order, and ``NULLS LAST`` 
otherwise.
+
+==========================
+Window Functions
+==========================
+
+A window function performs a calculation across multiple table rows that 
belong to some window frame.
+
+*Synopsis*
+
+.. code-block:: sql
+
+  SELECT ...., func(param) OVER ([PARTITION BY partition-expr [, ...]] [ORDER 
BY sort-expr [, ...]]), ....,  FROM
+
+The PARTITION BY list within OVER specifies dividing the rows into groups, or 
partitions, that share the same values of 
+the PARTITION BY expression(s). For each row, the window function is computed 
across the rows that fall into 
+the same partition as the current row.
+
+We will briefly explain some examples using window functions.
+
+---------
+Examples
+---------
+
+Multiple window functions can be used in a SQL statement as follows:
+
+.. code-block:: sql
+
+  SELECT l_orderkey, sum(l_discount) OVER (PARTITION BY l_orderkey), 
sum(l_quantity) OVER (PARTITION BY l_orderkey) FROM LINEITEM;
+
+If ``OVER()`` clause is empty as following, it makes all table rows into one 
window frame.
+
+.. code-block:: sql
+
+  SELECT salary, sum(salary) OVER () FROM empsalary;
+
+Also, ``ORDER BY`` clause can be used without ``PARTITION BY`` clause as 
follows:
+
+.. code-block:: sql
+
+  SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary;
+
+Also, all expressions and aggregation functions are allowed in ``ORDER BY`` 
clause as follows:
+
+.. code-block:: sql
+
+  select
+    l_orderkey,
+    count(*) as cnt,
+    row_number() over (partition by l_orderkey order by count(*) desc)
+    row_num
+  from
+    lineitem
+  group by
+    l_orderkey
+
+.. note::
+
+  Currently, Tajo does not support multiple different partition-expressions in 
one SQL statement.
\ No newline at end of file

Reply via email to