Modified: tajo/site/docs/devel/sql_language/ddl.html URL: http://svn.apache.org/viewvc/tajo/site/docs/devel/sql_language/ddl.html?rev=1649478&r1=1649477&r2=1649478&view=diff ============================================================================== --- tajo/site/docs/devel/sql_language/ddl.html (original) +++ tajo/site/docs/devel/sql_language/ddl.html Mon Jan 5 08:43:18 2015 @@ -61,11 +61,11 @@ <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="../getting_started.html">Getting Started</a><ul> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/prerequisites.html">Prerequisites</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/downloading_source.html">Dowload and unpack the source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/building.html">Build source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/local_setup.html">Setting up a local Tajo cluster</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/first_query.html">First query execution</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#first-query-execution">First query 
execution</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../configuration.html">Configuration</a><ul> @@ -186,7 +186,7 @@ <div class="highlight-sql"><div class="highlight"><pre><span class="k">CREATE</span> <span class="k">DATABASE</span> <span class="p">[</span><span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span><span class="p">]</span> <span class="o"><</span><span class="n">database_name</span><span class="o">></span> </pre></div> </div> -<p><tt class="docutils literal"><span class="pre">IF</span> <span class="pre">NOT</span> <span class="pre">EXISTS</span></tt> allows <tt class="docutils literal"><span class="pre">CREATE</span> <span class="pre">DATABASE</span></tt> statement to avoid an error which occurs when the database exists.</p> +<p><code class="docutils literal"><span class="pre">IF</span> <span class="pre">NOT</span> <span class="pre">EXISTS</span></code> allows <code class="docutils literal"><span class="pre">CREATE</span> <span class="pre">DATABASE</span></code> statement to avoid an error which occurs when the database exists.</p> </div> <div class="section" id="drop-database"> <h2>DROP DATABASE<a class="headerlink" href="#drop-database" title="Permalink to this headline">¶</a></h2> @@ -194,7 +194,7 @@ <div class="highlight-sql"><div class="highlight"><pre><span class="k">DROP</span> <span class="k">DATABASE</span> <span class="p">[</span><span class="n">IF</span> <span class="k">EXISTS</span><span class="p">]</span> <span class="o"><</span><span class="n">database_name</span><span class="o">></span> </pre></div> </div> -<p><tt class="docutils literal"><span class="pre">IF</span> <span class="pre">EXISTS</span></tt> allows <tt class="docutils literal"><span class="pre">DROP</span> <span class="pre">DATABASE</span></tt> statement to avoid an error which occurs when the database does not exist.</p> +<p><code class="docutils literal"><span class="pre">IF</span> <span 
class="pre">EXISTS</span></code> allows <code class="docutils literal"><span class="pre">DROP</span> <span class="pre">DATABASE</span></code> statement to avoid an error which occurs when the database does not exist.</p> </div> <div class="section" id="create-table"> <h2>CREATE TABLE<a class="headerlink" href="#create-table" title="Permalink to this headline">¶</a></h2> @@ -206,7 +206,7 @@ <span class="k">using</span> <span class="o"><</span><span class="n">storage_type</span><span class="o">></span> <span class="p">[</span><span class="k">with</span> <span class="p">(</span><span class="o"><</span><span class="k">key</span><span class="o">></span> <span class="o">=</span> <span class="o"><</span><span class="n">value</span><span class="o">></span><span class="p">,</span> <span class="p">...)]</span> <span class="k">LOCATION</span> <span class="s1">'<path>'</span> </pre></div> </div> -<p><tt class="docutils literal"><span class="pre">IF</span> <span class="pre">NOT</span> <span class="pre">EXISTS</span></tt> allows <tt class="docutils literal"><span class="pre">CREATE</span> <span class="pre">[EXTERNAL]</span> <span class="pre">TABLE</span></tt> statement to avoid an error which occurs when the table does not exist.</p> +<p><code class="docutils literal"><span class="pre">IF</span> <span class="pre">NOT</span> <span class="pre">EXISTS</span></code> allows <code class="docutils literal"><span class="pre">CREATE</span> <span class="pre">[EXTERNAL]</span> <span class="pre">TABLE</span></code> statement to avoid an error which occurs when the table does not exist.</p> <div class="section" id="compression"> <h3>Compression<a class="headerlink" href="#compression" title="Permalink to this headline">¶</a></h3> <p>If you want to add an external table that contains compressed data, you should give ‘compression.code’ parameter to CREATE TABLE statement.</p> @@ -238,7 +238,7 @@ <div class="highlight-sql"><div class="highlight"><pre><span class="k">DROP</span> <span 
class="k">TABLE</span> <span class="p">[</span><span class="n">IF</span> <span class="k">EXISTS</span><span class="p">]</span> <span class="o"><</span><span class="k">table_name</span><span class="o">></span> <span class="p">[</span><span class="n">PURGE</span><span class="p">]</span> </pre></div> </div> -<p><tt class="docutils literal"><span class="pre">IF</span> <span class="pre">EXISTS</span></tt> allows <tt class="docutils literal"><span class="pre">DROP</span> <span class="pre">DATABASE</span></tt> statement to avoid an error which occurs when the database does not exist. <tt class="docutils literal"><span class="pre">DROP</span> <span class="pre">TABLE</span></tt> statement removes a table from Tajo catalog, but it does not remove the contents. If <tt class="docutils literal"><span class="pre">PURGE</span></tt> option is given, <tt class="docutils literal"><span class="pre">DROP</span> <span class="pre">TABLE</span></tt> statement will eliminate the entry in the catalog as well as the contents.</p> +<p><code class="docutils literal"><span class="pre">IF</span> <span class="pre">EXISTS</span></code> allows <code class="docutils literal"><span class="pre">DROP</span> <span class="pre">DATABASE</span></code> statement to avoid an error which occurs when the database does not exist. <code class="docutils literal"><span class="pre">DROP</span> <span class="pre">TABLE</span></code> statement removes a table from Tajo catalog, but it does not remove the contents. If <code class="docutils literal"><span class="pre">PURGE</span></code> option is given, <code class="docutils literal"><span class="pre">DROP</span> <span class="pre">TABLE</span></code> statement will eliminate the entry in the catalog as well as the contents.</p> </div> </div>
Modified: tajo/site/docs/devel/sql_language/insert.html URL: http://svn.apache.org/viewvc/tajo/site/docs/devel/sql_language/insert.html?rev=1649478&r1=1649477&r2=1649478&view=diff ============================================================================== --- tajo/site/docs/devel/sql_language/insert.html (original) +++ tajo/site/docs/devel/sql_language/insert.html Mon Jan 5 08:43:18 2015 @@ -61,11 +61,11 @@ <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="../getting_started.html">Getting Started</a><ul> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/prerequisites.html">Prerequisites</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/downloading_source.html">Dowload and unpack the source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/building.html">Build source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/local_setup.html">Setting up a local Tajo cluster</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/first_query.html">First query execution</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#first-query-execution">First query 
execution</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../configuration.html">Configuration</a><ul> @@ -180,7 +180,7 @@ <div class="section" id="insert-overwrite-into"> <h1>INSERT (OVERWRITE) INTO<a class="headerlink" href="#insert-overwrite-into" title="Permalink to this headline">¶</a></h1> -<p>INSERT OVERWRITE statement overwrites a table data of an existing table or a data in a given directory. Tajo’s INSERT OVERWRITE statement follows <tt class="docutils literal"><span class="pre">INSERT</span> <span class="pre">INTO</span> <span class="pre">SELECT</span></tt> statement of SQL. The examples are as follows:</p> +<p>INSERT OVERWRITE statement overwrites a table data of an existing table or a data in a given directory. Tajo’s INSERT OVERWRITE statement follows <code class="docutils literal"><span class="pre">INSERT</span> <span class="pre">INTO</span> <span class="pre">SELECT</span></code> statement of SQL. The examples are as follows:</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">create</span> <span class="k">table</span> <span class="n">t1</span> <span class="p">(</span><span class="n">col1</span> <span class="nb">int8</span><span class="p">,</span> <span class="n">col2</span> <span class="n">int4</span><span class="p">,</span> <span class="n">col3</span> <span class="n">float8</span><span class="p">);</span> <span class="c1">-- when a target table schema and output schema are equivalent to each other</span> Modified: tajo/site/docs/devel/sql_language/predicates.html URL: http://svn.apache.org/viewvc/tajo/site/docs/devel/sql_language/predicates.html?rev=1649478&r1=1649477&r2=1649478&view=diff ============================================================================== --- tajo/site/docs/devel/sql_language/predicates.html (original) +++ tajo/site/docs/devel/sql_language/predicates.html Mon Jan 5 08:43:18 2015 @@ -61,11 +61,11 @@ <ul class="current"> <li class="toctree-l1"><a class="reference 
internal" href="../introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="../getting_started.html">Getting Started</a><ul> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/prerequisites.html">Prerequisites</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/downloading_source.html">Dowload and unpack the source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/building.html">Build source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/local_setup.html">Setting up a local Tajo cluster</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/first_query.html">First query execution</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#first-query-execution">First query execution</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../configuration.html">Configuration</a><ul> @@ -230,7 +230,7 @@ <span class="n">string</span> <span class="k">NOT</span> <span class="k">SIMILAR</span> <span class="k">TO</span> <span class="n">pattern</span> </pre></div> </div> -<p>It returns true or false depending on whether its pattern matches the given string. 
Also like LIKE, <tt class="docutils literal"><span class="pre">SIMILAR</span> <span class="pre">TO</span></tt> uses <tt class="docutils literal"><span class="pre">_</span></tt> and <tt class="docutils literal"><span class="pre">%</span></tt> as metacharacters denoting any single character and any string, respectively.</p> +<p>It returns true or false depending on whether its pattern matches the given string. Also like LIKE, <code class="docutils literal"><span class="pre">SIMILAR</span> <span class="pre">TO</span></code> uses <code class="docutils literal"><span class="pre">_</span></code> and <code class="docutils literal"><span class="pre">%</span></code> as metacharacters denoting any single character and any string, respectively.</p> <p>In addition to these metacharacters borrowed from LIKE, ‘SIMILAR TO’ supports more powerful pattern-matching metacharacters borrowed from regular expressions:</p> <table border="1" class="docutils"> <colgroup> @@ -278,7 +278,7 @@ </tr> </tbody> </table> -<p>Note that <cite>.`</cite> is not used as a metacharacter in <tt class="docutils literal"><span class="pre">SIMILAR</span> <span class="pre">TO</span></tt> operator.</p> +<p>Note that <cite>.`</cite> is not used as a metacharacter in <code class="docutils literal"><span class="pre">SIMILAR</span> <span class="pre">TO</span></code> operator.</p> </div> <div class="section" id="regular-expressions"> <h3>Regular expressions<a class="headerlink" href="#regular-expressions" title="Permalink to this headline">¶</a></h3> Modified: tajo/site/docs/devel/sql_language/queries.html URL: http://svn.apache.org/viewvc/tajo/site/docs/devel/sql_language/queries.html?rev=1649478&r1=1649477&r2=1649478&view=diff ============================================================================== --- tajo/site/docs/devel/sql_language/queries.html (original) +++ tajo/site/docs/devel/sql_language/queries.html Mon Jan 5 08:43:18 2015 @@ -61,11 +61,11 @@ <ul class="current"> <li class="toctree-l1"><a 
class="reference internal" href="../introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="../getting_started.html">Getting Started</a><ul> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/prerequisites.html">Prerequisites</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/downloading_source.html">Dowload and unpack the source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/building.html">Build source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/local_setup.html">Setting up a local Tajo cluster</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/first_query.html">First query execution</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#first-query-execution">First query execution</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../configuration.html">Configuration</a><ul> @@ -198,13 +198,13 @@ <div class="highlight-sql"><div class="highlight"><pre><span class="p">[</span><span class="k">FROM</span> <span class="o"><</span><span class="k">table</span> <span class="n">reference</span><span class="o">></span> <span class="p">[[</span><span class="k">AS</span><span class="p">]</span> <span 
class="o"><</span><span class="k">table</span> <span class="k">alias</span> <span class="n">name</span><span class="o">></span><span class="p">]</span> <span class="p">[,</span> <span class="p">...]]</span> </pre></div> </div> -<p>The <tt class="docutils literal"><span class="pre">FROM</span></tt> clause specifies one or more other tables given in a comma-separated table reference list. +<p>The <code class="docutils literal"><span class="pre">FROM</span></code> clause specifies one or more other tables given in a comma-separated table reference list. A table reference can be a relation name , or a subquery, a table join, or complex combinations of them.</p> <div class="section" id="table-and-table-aliases"> <h3>Table and Table Aliases<a class="headerlink" href="#table-and-table-aliases" title="Permalink to this headline">¶</a></h3> <p>A temporary name can be given to tables and complex table references to be used for references to the derived table in the rest of the query. This is called a table alias.</p> -<p>To create a a table alias, please use <tt class="docutils literal"><span class="pre">AS</span></tt>:</p> +<p>To create a a table alias, please use <code class="docutils literal"><span class="pre">AS</span></code>:</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">FROM</span> <span class="n">table_reference</span> <span class="k">AS</span> <span class="k">alias</span> </pre></div> </div> @@ -212,7 +212,7 @@ for references to the derived table in t <div class="highlight-sql"><div class="highlight"><pre><span class="k">FROM</span> <span class="n">table_reference</span> <span class="k">alias</span> </pre></div> </div> -<p>The <tt class="docutils literal"><span class="pre">AS</span></tt> keyword can be omitted, and <em>Alias</em> can be any identifier.</p> +<p>The <code class="docutils literal"><span class="pre">AS</span></code> keyword can be omitted, and <em>Alias</em> can be any identifier.</p> <p>A typical application of table aliases 
is to give short names to long table references. For example:</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">long_table_name_1234</span> <span class="n">s</span> <span class="k">JOIN</span> <span class="n">another_long_table_name_5678</span> <span class="n">a</span> <span class="k">ON</span> <span class="n">s</span><span class="p">.</span><span class="n">id</span> <span class="o">=</span> <span class="n">a</span><span class="p">.</span><span class="n">num</span><span class="p">;</span> </pre></div> @@ -229,25 +229,25 @@ for references to the derived table in t </pre></div> </div> <p>Cross join, also called <em>Cartesian product</em>, results in every possible combination of rows from T1 and T2.</p> -<p><tt class="docutils literal"><span class="pre">FROM</span> <span class="pre">T1</span> <span class="pre">CROSS</span> <span class="pre">JOIN</span> <span class="pre">T2</span></tt> is equivalent to <tt class="docutils literal"><span class="pre">FROM</span> <span class="pre">T1,</span> <span class="pre">T2</span></tt>.</p> +<p><code class="docutils literal"><span class="pre">FROM</span> <span class="pre">T1</span> <span class="pre">CROSS</span> <span class="pre">JOIN</span> <span class="pre">T2</span></code> is equivalent to <code class="docutils literal"><span class="pre">FROM</span> <span class="pre">T1,</span> <span class="pre">T2</span></code>.</p> </div> <div class="section" id="qualified-joins"> <h5>Qualified joins<a class="headerlink" href="#qualified-joins" title="Permalink to this headline">¶</a></h5> <p>Qualified joins implicitly or explicitly have join conditions. Inner/Outer/Natural Joins all are qualified joins. -Except for natural join, <tt class="docutils literal"><span class="pre">ON</span></tt> or <tt class="docutils literal"><span class="pre">USING</span></tt> clause in each join is used to specify a join condition. 
+Except for natural join, <code class="docutils literal"><span class="pre">ON</span></code> or <code class="docutils literal"><span class="pre">USING</span></code> clause in each join is used to specify a join condition. A join condition must include at least one boolean expression, and it can also include just filter conditions.</p> <p><strong>Inner Join</strong></p> <div class="highlight-sql"><div class="highlight"><pre><span class="n">T1</span> <span class="p">[</span><span class="k">INNER</span><span class="p">]</span> <span class="k">JOIN</span> <span class="n">T2</span> <span class="k">ON</span> <span class="n">boolean_expression</span> <span class="n">T1</span> <span class="p">[</span><span class="k">INNER</span><span class="p">]</span> <span class="k">JOIN</span> <span class="n">T2</span> <span class="k">USING</span> <span class="p">(</span><span class="k">join</span> <span class="k">column</span> <span class="n">list</span><span class="p">)</span> </pre></div> </div> -<p><tt class="docutils literal"><span class="pre">INNER</span></tt> keyword is the default, and so <tt class="docutils literal"><span class="pre">INNER</span></tt> can be omitted when you use inner join.</p> +<p><code class="docutils literal"><span class="pre">INNER</span></code> keyword is the default, and so <code class="docutils literal"><span class="pre">INNER</span></code> can be omitted when you use inner join.</p> <p><strong>Outer Join</strong></p> <div class="highlight-sql"><div class="highlight"><pre><span class="n">T1</span> <span class="p">(</span><span class="k">LEFT</span><span class="o">|</span><span class="k">RIGHT</span><span class="o">|</span><span class="k">FULL</span><span class="p">)</span> <span class="k">OUTER</span> <span class="k">JOIN</span> <span class="n">T2</span> <span class="k">ON</span> <span class="n">boolean_expression</span> <span class="n">T1</span> <span class="p">(</span><span class="k">LEFT</span><span class="o">|</span><span class="k">RIGHT</span><span 
class="o">|</span><span class="k">FULL</span><span class="p">)</span> <span class="k">OUTER</span> <span class="k">JOIN</span> <span class="n">T2</span> <span class="k">USING</span> <span class="p">(</span><span class="k">join</span> <span class="k">column</span> <span class="n">list</span><span class="p">)</span> </pre></div> </div> -<p>One of <tt class="docutils literal"><span class="pre">LEFT</span></tt>, <tt class="docutils literal"><span class="pre">RIGHT</span></tt>, or <tt class="docutils literal"><span class="pre">FULL</span></tt> must be specified for outer joins. +<p>One of <code class="docutils literal"><span class="pre">LEFT</span></code>, <code class="docutils literal"><span class="pre">RIGHT</span></code>, or <code class="docutils literal"><span class="pre">FULL</span></code> must be specified for outer joins. Join conditions in outer join will have different behavior according to corresponding table references of join conditions. To know outer join behavior in more detail, please refer to <a class="reference external" href="http://www.ibm.com/developerworks/data/library/techarticle/purcell/0201purcell.html">Advanced outer join constructs</a>.</p> @@ -255,9 +255,9 @@ To know outer join behavior in more deta <div class="highlight-sql"><div class="highlight"><pre><span class="n">T1</span> <span class="k">NATURAL</span> <span class="k">JOIN</span> <span class="n">T2</span> </pre></div> </div> -<p><tt class="docutils literal"><span class="pre">NATURAL</span></tt> is a short form of <tt class="docutils literal"><span class="pre">USING</span></tt>. It forms a <tt class="docutils literal"><span class="pre">USING</span></tt> list consisting of all common column names that appear in +<p><code class="docutils literal"><span class="pre">NATURAL</span></code> is a short form of <code class="docutils literal"><span class="pre">USING</span></code>. 
It forms a <code class="docutils literal"><span class="pre">USING</span></code> list consisting of all common column names that appear in both join tables. These common columns appear only once in the output table. If there are no common columns, -<tt class="docutils literal"><span class="pre">NATURAL</span></tt> behaves like <tt class="docutils literal"><span class="pre">CROSS</span> <span class="pre">JOIN</span></tt>.</p> +<code class="docutils literal"><span class="pre">NATURAL</span></code> behaves like <code class="docutils literal"><span class="pre">CROSS</span> <span class="pre">JOIN</span></code>.</p> <p><strong>Subqueries</strong></p> <p>Subqueries allow users to specify a derived table. It requires enclosing a SQL statement in parentheses and an alias name. For example:</p> @@ -275,7 +275,7 @@ For example:</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">WHERE</span> <span class="n">search_condition</span> </pre></div> </div> -<p><tt class="docutils literal"><span class="pre">search_condition</span></tt> can be any boolean expression. +<p><code class="docutils literal"><span class="pre">search_condition</span></code> can be any boolean expression. In order to know additional predicates, please refer to <a class="reference internal" href="predicates.html"><em>Predicates</em></a>.</p> </div> <div class="section" id="groupby-and-having-clauses"> @@ -288,19 +288,19 @@ In order to know additional predicates, <span class="p">[</span><span class="k">HAVING</span> <span class="n">boolean_expression</span><span class="p">]</span> </pre></div> </div> -<p>The rows which passes <tt class="docutils literal"><span class="pre">WHERE</span></tt> filter may be subject to grouping, specified by <tt class="docutils literal"><span class="pre">GROUP</span> <span class="pre">BY</span></tt> clause. -Grouping combines a set of rows having common values into one group, and then computes rows in the group with aggregation functions. 
<tt class="docutils literal"><span class="pre">HAVING</span></tt> clause can be used with only <tt class="docutils literal"><span class="pre">GROUP</span> <span class="pre">BY</span></tt> clause. It eliminates the unqualified result rows of grouping.</p> -<p><tt class="docutils literal"><span class="pre">grouping_column_reference</span></tt> can be a column reference, a complex expression including scalar functions and arithmetic operations.</p> +<p>The rows which passes <code class="docutils literal"><span class="pre">WHERE</span></code> filter may be subject to grouping, specified by <code class="docutils literal"><span class="pre">GROUP</span> <span class="pre">BY</span></code> clause. +Grouping combines a set of rows having common values into one group, and then computes rows in the group with aggregation functions. <code class="docutils literal"><span class="pre">HAVING</span></code> clause can be used with only <code class="docutils literal"><span class="pre">GROUP</span> <span class="pre">BY</span></code> clause. 
It eliminates the unqualified result rows of grouping.</p> +<p><code class="docutils literal"><span class="pre">grouping_column_reference</span></code> can be a column reference, a complex expression including scalar functions and arithmetic operations.</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">SELECT</span> <span class="n">l_orderkey</span><span class="p">,</span> <span class="k">SUM</span><span class="p">(</span><span class="n">l_quantity</span><span class="p">)</span> <span class="k">AS</span> <span class="n">quantity</span> <span class="k">FROM</span> <span class="n">lineitem</span> <span class="k">GROUP</span> <span class="k">BY</span> <span class="n">l_orderkey</span><span class="p">;</span> <span class="k">SELECT</span> <span class="n">substr</span><span class="p">(</span><span class="n">l_shipdate</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">4</span><span class="p">)</span> <span class="k">as</span> <span class="k">year</span><span class="p">,</span> <span class="k">SUM</span><span class="p">(</span><span class="n">l_orderkey</span><span class="p">)</span> <span class="k">AS</span> <span class="n">total2</span> <span class="k">FROM</span> <span class="n">lineitem</span> <span class="k">GROUP</span> <span class="k">BY</span> <span class="n">substr</span><span class="p">(</span><span class="n">l_shipdate</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">4</span><span class="p">);</span> </pre></div> </div> -<p>If a SQL statement includes <tt class="docutils literal"><span class="pre">GROUP</span> <span class="pre">BY</span></tt> clause, expressions in select list must be either grouping_column_reference or aggregation function. 
For example, the following example query is not allowed because <tt class="docutils literal"><span class="pre">l_orderkey</span></tt> does not occur in <tt class="docutils literal"><span class="pre">GROUP</span> <span class="pre">BY</span></tt> clause.</p> +<p>If a SQL statement includes <code class="docutils literal"><span class="pre">GROUP</span> <span class="pre">BY</span></code> clause, expressions in select list must be either grouping_column_reference or aggregation function. For example, the following example query is not allowed because <code class="docutils literal"><span class="pre">l_orderkey</span></code> does not occur in <code class="docutils literal"><span class="pre">GROUP</span> <span class="pre">BY</span></code> clause.</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">SELECT</span> <span class="n">l_orderkey</span><span class="p">,</span> <span class="n">l_partkey</span><span class="p">,</span> <span class="k">SUM</span><span class="p">(</span><span class="n">l_orderkey</span><span class="p">)</span> <span class="k">AS</span> <span class="n">total</span> <span class="k">FROM</span> <span class="n">lineitem</span> <span class="k">GROUP</span> <span class="k">BY</span> <span class="n">l_partkey</span><span class="p">;</span> </pre></div> </div> -<p>Aggregation functions can be used with <tt class="docutils literal"><span class="pre">DISTINCT</span></tt> keywords. It forces an individual aggregate function to take only distinct values of the argument expression. <tt class="docutils literal"><span class="pre">DISTINCT</span></tt> keyword is used as follows:</p> +<p>Aggregation functions can be used with <code class="docutils literal"><span class="pre">DISTINCT</span></code> keywords. It forces an individual aggregate function to take only distinct values of the argument expression. 
<code class="docutils literal"><span class="pre">DISTINCT</span></code> keyword is used as follows:</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">SELECT</span> <span class="n">l_partkey</span><span class="p">,</span> <span class="k">COUNT</span><span class="p">(</span><span class="k">distinct</span> <span class="n">l_quantity</span><span class="p">),</span> <span class="k">SUM</span><span class="p">(</span><span class="k">distinct</span> <span class="n">l_extendedprice</span><span class="p">)</span> <span class="k">AS</span> <span class="n">total</span> <span class="k">FROM</span> <span class="n">lineitem</span> <span class="k">GROUP</span> <span class="k">BY</span> <span class="n">l_partkey</span><span class="p">;</span> </pre></div> </div> @@ -311,12 +311,12 @@ Grouping combines a set of rows having c <div class="highlight-sql"><div class="highlight"><pre><span class="k">FROM</span> <span class="p">...</span> <span class="k">ORDER</span> <span class="k">BY</span> <span class="o"><</span><span class="n">sort_expr</span><span class="o">></span> <span class="p">[(</span><span class="k">ASC</span><span class="o">|</span><span class="k">DESC</span><span class="p">)]</span> <span class="p">[</span><span class="k">NULL</span> <span class="p">(</span><span class="k">FIRST</span><span class="o">|</span><span class="k">LAST</span><span class="p">)</span> <span class="p">[,...]</span> </pre></div> </div> -<p><tt class="docutils literal"><span class="pre">sort_expr</span></tt> can be a column reference, aliased column reference, or a complex expression. -<tt class="docutils literal"><span class="pre">ASC</span></tt> indicates an ascending order of <tt class="docutils literal"><span class="pre">sort_expr</span></tt> values. <tt class="docutils literal"><span class="pre">DESC</span></tt> indicates a descending order of <tt class="docutils literal"><span class="pre">sort_expr</span></tt> values. 
-<tt class="docutils literal"><span class="pre">ASC</span></tt> is the default order.</p> -<p><tt class="docutils literal"><span class="pre">NULLS</span> <span class="pre">FIRST</span></tt> and <tt class="docutils literal"><span class="pre">NULLS</span> <span class="pre">LAST</span></tt> options can be used to determine whether nulls values appear +<p><code class="docutils literal"><span class="pre">sort_expr</span></code> can be a column reference, aliased column reference, or a complex expression. +<code class="docutils literal"><span class="pre">ASC</span></code> indicates an ascending order of <code class="docutils literal"><span class="pre">sort_expr</span></code> values. <code class="docutils literal"><span class="pre">DESC</span></code> indicates a descending order of <code class="docutils literal"><span class="pre">sort_expr</span></code> values. +<code class="docutils literal"><span class="pre">ASC</span></code> is the default order.</p> +<p><code class="docutils literal"><span class="pre">NULLS</span> <span class="pre">FIRST</span></code> and <code class="docutils literal"><span class="pre">NULLS</span> <span class="pre">LAST</span></code> options can be used to determine whether nulls values appear before or after non-null values in the sort ordering. 
By default, null values are dealt as if larger than any non-null value; -that is, <tt class="docutils literal"><span class="pre">NULLS</span> <span class="pre">FIRST</span></tt> is the default for <tt class="docutils literal"><span class="pre">DESC</span></tt> order, and <tt class="docutils literal"><span class="pre">NULLS</span> <span class="pre">LAST</span></tt> otherwise.</p> +that is, <code class="docutils literal"><span class="pre">NULLS</span> <span class="pre">FIRST</span></code> is the default for <code class="docutils literal"><span class="pre">DESC</span></code> order, and <code class="docutils literal"><span class="pre">NULLS</span> <span class="pre">LAST</span></code> otherwise.</p> </div> <div class="section" id="window-functions"> <h2>Window Functions<a class="headerlink" href="#window-functions" title="Permalink to this headline">¶</a></h2> @@ -335,15 +335,15 @@ the same partition as the current row.</ <div class="highlight-sql"><div class="highlight"><pre><span class="k">SELECT</span> <span class="n">l_orderkey</span><span class="p">,</span> <span class="k">sum</span><span class="p">(</span><span class="n">l_discount</span><span class="p">)</span> <span class="n">OVER</span> <span class="p">(</span><span class="n">PARTITION</span> <span class="k">BY</span> <span class="n">l_orderkey</span><span class="p">),</span> <span class="k">sum</span><span class="p">(</span><span class="n">l_quantity</span><span class="p">)</span> <span class="n">OVER</span> <span class="p">(</span><span class="n">PARTITION</span> <span class="k">BY</span> <span class="n">l_orderkey</span><span class="p">)</span> <span class="k">FROM</span> <span class="n">LINEITEM</span><span class="p">;</span> </pre></div> </div> -<p>If <tt class="docutils literal"><span class="pre">OVER()</span></tt> clause is empty as following, it makes all table rows into one window frame.</p> +<p>If <code class="docutils literal"><span class="pre">OVER()</span></code> clause is empty as following, it 
makes all table rows into one window frame.</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">SELECT</span> <span class="n">salary</span><span class="p">,</span> <span class="k">sum</span><span class="p">(</span><span class="n">salary</span><span class="p">)</span> <span class="n">OVER</span> <span class="p">()</span> <span class="k">FROM</span> <span class="n">empsalary</span><span class="p">;</span> </pre></div> </div> -<p>Also, <tt class="docutils literal"><span class="pre">ORDER</span> <span class="pre">BY</span></tt> clause can be used without <tt class="docutils literal"><span class="pre">PARTITION</span> <span class="pre">BY</span></tt> clause as follows:</p> +<p>Also, <code class="docutils literal"><span class="pre">ORDER</span> <span class="pre">BY</span></code> clause can be used without <code class="docutils literal"><span class="pre">PARTITION</span> <span class="pre">BY</span></code> clause as follows:</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">SELECT</span> <span class="n">salary</span><span class="p">,</span> <span class="k">sum</span><span class="p">(</span><span class="n">salary</span><span class="p">)</span> <span class="n">OVER</span> <span class="p">(</span><span class="k">ORDER</span> <span class="k">BY</span> <span class="n">salary</span><span class="p">)</span> <span class="k">FROM</span> <span class="n">empsalary</span><span class="p">;</span> </pre></div> </div> -<p>Also, all expressions and aggregation functions are allowed in <tt class="docutils literal"><span class="pre">ORDER</span> <span class="pre">BY</span></tt> clause as follows:</p> +<p>Also, all expressions and aggregation functions are allowed in <code class="docutils literal"><span class="pre">ORDER</span> <span class="pre">BY</span></code> clause as follows:</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">select</span> <span class="n">l_orderkey</span><span class="p">,</span> <span 
class="k">count</span><span class="p">(</span><span class="o">*</span><span class="p">)</span> <span class="k">as</span> <span class="n">cnt</span><span class="p">,</span> Modified: tajo/site/docs/devel/sql_language/sql_expression.html URL: http://svn.apache.org/viewvc/tajo/site/docs/devel/sql_language/sql_expression.html?rev=1649478&r1=1649477&r2=1649478&view=diff ============================================================================== --- tajo/site/docs/devel/sql_language/sql_expression.html (original) +++ tajo/site/docs/devel/sql_language/sql_expression.html Mon Jan 5 08:43:18 2015 @@ -61,11 +61,11 @@ <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="../getting_started.html">Getting Started</a><ul> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/prerequisites.html">Prerequisites</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/downloading_source.html">Dowload and unpack the source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/building.html">Build source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/local_setup.html">Setting up a local Tajo cluster</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/first_query.html">First query execution</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" 
href="../getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#first-query-execution">First query execution</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../configuration.html">Configuration</a><ul> Modified: tajo/site/docs/devel/table_management.html URL: http://svn.apache.org/viewvc/tajo/site/docs/devel/table_management.html?rev=1649478&r1=1649477&r2=1649478&view=diff ============================================================================== --- tajo/site/docs/devel/table_management.html (original) +++ tajo/site/docs/devel/table_management.html Mon Jan 5 08:43:18 2015 @@ -60,11 +60,11 @@ <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="getting_started.html">Getting Started</a><ul> -<li class="toctree-l2"><a class="reference internal" href="getting_started/prerequisites.html">Prerequisites</a></li> -<li class="toctree-l2"><a class="reference internal" href="getting_started/downloading_source.html">Dowload and unpack the source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="getting_started/building.html">Build source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="getting_started/local_setup.html">Setting up a local Tajo cluster</a></li> -<li class="toctree-l2"><a class="reference internal" href="getting_started/first_query.html">First query execution</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" 
href="getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#first-query-execution">First query execution</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="configuration.html">Configuration</a><ul> Modified: tajo/site/docs/devel/table_management/compression.html URL: http://svn.apache.org/viewvc/tajo/site/docs/devel/table_management/compression.html?rev=1649478&r1=1649477&r2=1649478&view=diff ============================================================================== --- tajo/site/docs/devel/table_management/compression.html (original) +++ tajo/site/docs/devel/table_management/compression.html Mon Jan 5 08:43:18 2015 @@ -61,11 +61,11 @@ <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="../getting_started.html">Getting Started</a><ul> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/prerequisites.html">Prerequisites</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/downloading_source.html">Dowload and unpack the source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/building.html">Build source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/local_setup.html">Setting up a local Tajo cluster</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/first_query.html">First query execution</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" 
href="../getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#first-query-execution">First query execution</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../configuration.html">Configuration</a><ul> Modified: tajo/site/docs/devel/table_management/csv.html URL: http://svn.apache.org/viewvc/tajo/site/docs/devel/table_management/csv.html?rev=1649478&r1=1649477&r2=1649478&view=diff ============================================================================== --- tajo/site/docs/devel/table_management/csv.html (original) +++ tajo/site/docs/devel/table_management/csv.html Mon Jan 5 08:43:18 2015 @@ -61,11 +61,11 @@ <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="../getting_started.html">Getting Started</a><ul> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/prerequisites.html">Prerequisites</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/downloading_source.html">Dowload and unpack the source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/building.html">Build source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/local_setup.html">Setting up a local Tajo cluster</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/first_query.html">First query execution</a></li> +<li class="toctree-l2"><a 
class="reference internal" href="../getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#first-query-execution">First query execution</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../configuration.html">Configuration</a><ul> @@ -183,14 +183,14 @@ <div class="section" id="csv-textfile"> <h1>CSV (TextFile)<a class="headerlink" href="#csv-textfile" title="Permalink to this headline">¶</a></h1> <p>A character-separated values (CSV) file represents a tabular data set consisting of rows and columns. -Each row is a plan-text line. A line is usually broken by a character line feed <tt class="docutils literal"><span class="pre">\n</span></tt> or carriage-return <tt class="docutils literal"><span class="pre">\r</span></tt>. -The line feed <tt class="docutils literal"><span class="pre">\n</span></tt> is the default delimiter in Tajo. Each record consists of multiple fields, separated by -some other character or string, most commonly a literal vertical bar <tt class="docutils literal"><span class="pre">|</span></tt>, comma <tt class="docutils literal"><span class="pre">,</span></tt> or tab <tt class="docutils literal"><span class="pre">\t</span></tt>. +Each row is a plan-text line. A line is usually broken by a character line feed <code class="docutils literal"><span class="pre">\n</span></code> or carriage-return <code class="docutils literal"><span class="pre">\r</span></code>. 
+The line feed <code class="docutils literal"><span class="pre">\n</span></code> is the default delimiter in Tajo. Each record consists of multiple fields, separated by +some other character or string, most commonly a literal vertical bar <code class="docutils literal"><span class="pre">|</span></code>, comma <code class="docutils literal"><span class="pre">,</span></code> or tab <code class="docutils literal"><span class="pre">\t</span></code>. The vertical bar is used as the default field delimiter in Tajo.</p> <div class="section" id="how-to-create-a-csv-table"> <h2>How to Create a CSV Table ?<a class="headerlink" href="#how-to-create-a-csv-table" title="Permalink to this headline">¶</a></h2> -<p>If you are not familiar with the <tt class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></tt> statement, please refer to the Data Definition Language <a class="reference internal" href="../sql_language/ddl.html"><em>Data Definition Language</em></a>.</p> -<p>In order to specify a certain file format for your table, you need to use the <tt class="docutils literal"><span class="pre">USING</span></tt> clause in your <tt class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></tt> +<p>If you are not familiar with the <code class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></code> statement, please refer to the Data Definition Language <a class="reference internal" href="../sql_language/ddl.html"><em>Data Definition Language</em></a>.</p> +<p>In order to specify a certain file format for your table, you need to use the <code class="docutils literal"><span class="pre">USING</span></code> clause in your <code class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></code> statement. 
The below is an example statement for creating a table using CSV files.</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">table1</span> <span class="p">(</span> @@ -205,18 +205,18 @@ statement. The below is an example state <div class="section" id="physical-properties"> <h2>Physical Properties<a class="headerlink" href="#physical-properties" title="Permalink to this headline">¶</a></h2> <p>Some table storage formats provide parameters for enabling or disabling features and adjusting physical parameters. -The <tt class="docutils literal"><span class="pre">WITH</span></tt> clause in the CREATE TABLE statement allows users to set those parameters.</p> +The <code class="docutils literal"><span class="pre">WITH</span></code> clause in the CREATE TABLE statement allows users to set those parameters.</p> <p>Now, the CSV storage format provides the following physical properties.</p> <ul class="simple"> -<li><tt class="docutils literal"><span class="pre">text.delimiter</span></tt>: delimiter character. <tt class="docutils literal"><span class="pre">|</span></tt> or <tt class="docutils literal"><span class="pre">\u0001</span></tt> is usually used, and the default field delimiter is <tt class="docutils literal"><span class="pre">|</span></tt>.</li> -<li><tt class="docutils literal"><span class="pre">text.null</span></tt>: NULL character. The default NULL character is an empty string <tt class="docutils literal"><span class="pre">''</span></tt>. Hive’s default NULL character is <tt class="docutils literal"><span class="pre">'\\N'</span></tt>.</li> -<li><tt class="docutils literal"><span class="pre">compression.codec</span></tt>: Compression codec. You can enable compression feature and set specified compression algorithm. The compression algorithm used to compress files. 
The compression codec name should be the fully qualified class name inherited from <a class="reference external" href="https://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/compress/CompressionCodec.html">org.apache.hadoop.io.compress.CompressionCodec</a>. By default, compression is disabled.</li> -<li><tt class="docutils literal"><span class="pre">csvfile.serde</span></tt> (deprecated): custom (De)serializer class. <tt class="docutils literal"><span class="pre">org.apache.tajo.storage.TextSerializerDeserializer</span></tt> is the default (De)serializer class.</li> -<li><tt class="docutils literal"><span class="pre">timezone</span></tt>: the time zone that the table uses for writting. When table rows are read or written, <tt class="docutils literal"><span class="pre">`timestamp`</span></tt> and <tt class="docutils literal"><span class="pre">`time`</span></tt> column values are adjusted by this timezone if it is set. Time zone can be an abbreviation form like ‘PST’ or ‘DST’. Also, it accepts an offset-based form like ‘UTC+9’ or a location-based form like ‘Asia/Seoul’.</li> -<li><tt class="docutils literal"><span class="pre">text.error-tolerance.max-num</span></tt>: the maximum number of permissible parsing errors. This value should be an integer value. By default, <tt class="docutils literal"><span class="pre">text.error-tolerance.max-num</span></tt> is <tt class="docutils literal"><span class="pre">0</span></tt>. According to the value, parsing errors will be handled in different ways. -* If <tt class="docutils literal"><span class="pre">text.error-tolerance.max-num</span> <span class="pre"><</span> <span class="pre">0</span></tt>, all parsing errors are ignored. -* If <tt class="docutils literal"><span class="pre">text.error-tolerance.max-num</span> <span class="pre">==</span> <span class="pre">0</span></tt>, any parsing error is not allowed. If any error occurs, the query will be failed. 
(default) -* If <tt class="docutils literal"><span class="pre">text.error-tolerance.max-num</span> <span class="pre">></span> <span class="pre">0</span></tt>, the given number of parsing errors in each task will be pemissible.</li> +<li><code class="docutils literal"><span class="pre">text.delimiter</span></code>: delimiter character. <code class="docutils literal"><span class="pre">|</span></code> or <code class="docutils literal"><span class="pre">\u0001</span></code> is usually used, and the default field delimiter is <code class="docutils literal"><span class="pre">|</span></code>.</li> +<li><code class="docutils literal"><span class="pre">text.null</span></code>: NULL character. The default NULL character is an empty string <code class="docutils literal"><span class="pre">''</span></code>. Hive’s default NULL character is <code class="docutils literal"><span class="pre">'\\N'</span></code>.</li> +<li><code class="docutils literal"><span class="pre">compression.codec</span></code>: Compression codec. You can enable compression feature and set specified compression algorithm. The compression algorithm used to compress files. The compression codec name should be the fully qualified class name inherited from <a class="reference external" href="https://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/compress/CompressionCodec.html">org.apache.hadoop.io.compress.CompressionCodec</a>. By default, compression is disabled.</li> +<li><code class="docutils literal"><span class="pre">csvfile.serde</span></code> (deprecated): custom (De)serializer class. <code class="docutils literal"><span class="pre">org.apache.tajo.storage.TextSerializerDeserializer</span></code> is the default (De)serializer class.</li> +<li><code class="docutils literal"><span class="pre">timezone</span></code>: the time zone that the table uses for writting. 
When table rows are read or written, <code class="docutils literal"><span class="pre">`timestamp`</span></code> and <code class="docutils literal"><span class="pre">`time`</span></code> column values are adjusted by this timezone if it is set. Time zone can be an abbreviation form like ‘PST’ or ‘DST’. Also, it accepts an offset-based form like ‘UTC+9’ or a location-based form like ‘Asia/Seoul’.</li> +<li><code class="docutils literal"><span class="pre">text.error-tolerance.max-num</span></code>: the maximum number of permissible parsing errors. This value should be an integer value. By default, <code class="docutils literal"><span class="pre">text.error-tolerance.max-num</span></code> is <code class="docutils literal"><span class="pre">0</span></code>. According to the value, parsing errors will be handled in different ways. +* If <code class="docutils literal"><span class="pre">text.error-tolerance.max-num</span> <span class="pre"><</span> <span class="pre">0</span></code>, all parsing errors are ignored. +* If <code class="docutils literal"><span class="pre">text.error-tolerance.max-num</span> <span class="pre">==</span> <span class="pre">0</span></code>, any parsing error is not allowed. If any error occurs, the query will be failed. 
(default) +* If <code class="docutils literal"><span class="pre">text.error-tolerance.max-num</span> <span class="pre">></span> <span class="pre">0</span></code>, the given number of parsing errors in each task will be pemissible.</li> </ul> <p>The following example is to set a custom field delimiter, NULL character, and compression codec:</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">table1</span> <span class="p">(</span> @@ -231,7 +231,7 @@ The <tt class="docutils literal"><span c </div> <div class="admonition warning"> <p class="first admonition-title">Warning</p> -<p class="last">Be careful when using <tt class="docutils literal"><span class="pre">\n</span></tt> as the field delimiter because CSV uses <tt class="docutils literal"><span class="pre">\n</span></tt> as the line delimiter. +<p class="last">Be careful when using <code class="docutils literal"><span class="pre">\n</span></code> as the field delimiter because CSV uses <code class="docutils literal"><span class="pre">\n</span></code> as the line delimiter. At the moment, Tajo does not provide a way to specify the line delimiter.</p> </div> </div> @@ -240,7 +240,7 @@ At the moment, Tajo does not provide a w <p>The CSV storage format not only provides reading and writing interfaces for CSV data but also allows users to process custom plan-text file formats with user-defined (De)serializer classes. For example, with custom (de)serializers, Tajo can process JSON file formats or any specialized plan-text file formats.</p> -<p>In order to specify a custom (De)serializer, set a physical property <tt class="docutils literal"><span class="pre">csvfile.serde</span></tt>. +<p>In order to specify a custom (De)serializer, set a physical property <code class="docutils literal"><span class="pre">csvfile.serde</span></code>. 
The property value should be a fully qualified class name.</p> <p>For example:</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">table1</span> <span class="p">(</span> @@ -254,21 +254,21 @@ The property value should be a fully qua </div> <div class="section" id="null-value-handling-issues"> <h2>Null Value Handling Issues<a class="headerlink" href="#null-value-handling-issues" title="Permalink to this headline">¶</a></h2> -<p>In default, NULL character in CSV files is an empty string <tt class="docutils literal"><span class="pre">''</span></tt>. +<p>In default, NULL character in CSV files is an empty string <code class="docutils literal"><span class="pre">''</span></code>. In other words, an empty field is basically recognized as a NULL value in Tajo. -If a field domain is <tt class="docutils literal"><span class="pre">TEXT</span></tt>, an empty field is recognized as a string value <tt class="docutils literal"><span class="pre">''</span></tt> instead of NULL value. -Besides, You can also use your own NULL character by specifying a physical property <tt class="docutils literal"><span class="pre">text.null</span></tt>.</p> +If a field domain is <code class="docutils literal"><span class="pre">TEXT</span></code>, an empty field is recognized as a string value <code class="docutils literal"><span class="pre">''</span></code> instead of NULL value. +Besides, You can also use your own NULL character by specifying a physical property <code class="docutils literal"><span class="pre">text.null</span></code>.</p> </div> <div class="section" id="compatibility-issues-with-apache-hive"> <h2>Compatibility Issues with Apache Hive™<a class="headerlink" href="#compatibility-issues-with-apache-hive" title="Permalink to this headline">¶</a></h2> <p>CSV files generated in Tajo can be processed directly by Apache Hive™ without further processing. 
In this section, we explain some compatibility issue for users who use both Hive and Tajo.</p> <p>If you set a custom field delimiter, the CSV tables cannot be directly used in Hive. -In order to specify the custom field delimiter in Hive, you need to use <tt class="docutils literal"><span class="pre">ROW</span> <span class="pre">FORMAT</span> <span class="pre">DELIMITED</span> <span class="pre">FIELDS</span> <span class="pre">TERMINATED</span> <span class="pre">BY</span></tt> -clause in a Hive’s <tt class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></tt> statement as follows:</p> +In order to specify the custom field delimiter in Hive, you need to use <code class="docutils literal"><span class="pre">ROW</span> <span class="pre">FORMAT</span> <span class="pre">DELIMITED</span> <span class="pre">FIELDS</span> <span class="pre">TERMINATED</span> <span class="pre">BY</span></code> +clause in a Hive’s <code class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></code> statement as follows:</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">table1</span> <span class="p">(</span><span class="n">id</span> <span class="nb">int</span><span class="p">,</span> <span class="n">name</span> <span class="n">string</span><span class="p">,</span> <span class="n">score</span> <span class="nb">float</span><span class="p">,</span> <span class="k">type</span> <span class="n">string</span><span class="p">)</span> <span class="k">ROW</span> <span class="n">FORMAT</span> <span class="n">DELIMITED</span> <span class="n">FIELDS</span> <span class="n">TERMINATED</span> <span class="k">BY</span> <span class="s1">'|'</span> -<span class="n">STORED</span> <span class="k">AS</span> <span class="n">TEXTFILE</span> +<span class="n">STORED</span> <span class="k">AS</span> <span class="nb">TEXT</span> </pre></div> </div> <p>To the best of our 
knowledge, there is not way to specify a custom NULL character in Hive.</p> Modified: tajo/site/docs/devel/table_management/file_formats.html URL: http://svn.apache.org/viewvc/tajo/site/docs/devel/table_management/file_formats.html?rev=1649478&r1=1649477&r2=1649478&view=diff ============================================================================== --- tajo/site/docs/devel/table_management/file_formats.html (original) +++ tajo/site/docs/devel/table_management/file_formats.html Mon Jan 5 08:43:18 2015 @@ -61,11 +61,11 @@ <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="../getting_started.html">Getting Started</a><ul> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/prerequisites.html">Prerequisites</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/downloading_source.html">Dowload and unpack the source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/building.html">Build source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/local_setup.html">Setting up a local Tajo cluster</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/first_query.html">First query execution</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li 
class="toctree-l2"><a class="reference internal" href="../getting_started.html#first-query-execution">First query execution</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../configuration.html">Configuration</a><ul> Modified: tajo/site/docs/devel/table_management/parquet.html URL: http://svn.apache.org/viewvc/tajo/site/docs/devel/table_management/parquet.html?rev=1649478&r1=1649477&r2=1649478&view=diff ============================================================================== --- tajo/site/docs/devel/table_management/parquet.html (original) +++ tajo/site/docs/devel/table_management/parquet.html Mon Jan 5 08:43:18 2015 @@ -61,11 +61,11 @@ <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="../getting_started.html">Getting Started</a><ul> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/prerequisites.html">Prerequisites</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/downloading_source.html">Dowload and unpack the source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/building.html">Build source code</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/local_setup.html">Setting up a local Tajo cluster</a></li> -<li class="toctree-l2"><a class="reference internal" href="../getting_started/first_query.html">First query execution</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#build-source-code">Build source code</a></li> +<li 
class="toctree-l2"><a class="reference internal" href="../getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="../getting_started.html#first-query-execution">First query execution</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../configuration.html">Configuration</a><ul> @@ -188,8 +188,8 @@ regardless of the choice of data process For more details, please refer to <a class="reference external" href="http://parquet.io/">Parquet File Format</a>.</p> <div class="section" id="how-to-create-a-parquet-table"> <h2>How to Create a Parquet Table?<a class="headerlink" href="#how-to-create-a-parquet-table" title="Permalink to this headline">¶</a></h2> -<p>If you are not familiar with <tt class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></tt> statement, please refer to Data Definition Language <a class="reference internal" href="../sql_language/ddl.html"><em>Data Definition Language</em></a>.</p> -<p>In order to specify a certain file format for your table, you need to use the <tt class="docutils literal"><span class="pre">USING</span></tt> clause in your <tt class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></tt> +<p>If you are not familiar with <code class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></code> statement, please refer to Data Definition Language <a class="reference internal" href="../sql_language/ddl.html"><em>Data Definition Language</em></a>.</p> +<p>In order to specify a certain file format for your table, you need to use the <code class="docutils literal"><span class="pre">USING</span></code> clause in your <code class="docutils literal"><span class="pre">CREATE</span> <span class="pre">TABLE</span></code> statement. 
Below is an example statement for creating a table using Parquet files.</p> <div class="highlight-sql"><div class="highlight"><pre><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">table1</span> <span class="p">(</span> <span class="n">id</span> <span class="nb">int</span><span class="p">,</span> @@ -203,13 +203,13 @@ statement. Below is an example statement <div class="section" id="physical-properties"> <h2>Physical Properties<a class="headerlink" href="#physical-properties" title="Permalink to this headline">¶</a></h2> <p>Some table storage formats provide parameters for enabling or disabling features and adjusting physical parameters. -The <tt class="docutils literal"><span class="pre">WITH</span></tt> clause in the CREATE TABLE statement allows users to set those parameters.</p> +The <code class="docutils literal"><span class="pre">WITH</span></code> clause in the CREATE TABLE statement allows users to set those parameters.</p> <p>Currently, Parquet files provide the following physical properties.</p> <ul class="simple"> -<li><tt class="docutils literal"><span class="pre">parquet.block.size</span></tt>: The block size is the size of a row group being buffered in memory. This limits the memory usage when writing. Larger values will improve the I/O when reading but consume more memory when writing. Default size is 134217728 bytes (= 128 * 1024 * 1024).</li> -<li><tt class="docutils literal"><span class="pre">parquet.page.size</span></tt>: The page size is for compression. When reading, each page can be decompressed independently. A block is composed of pages. The page is the smallest unit that must be read fully to access a single record. If this value is too small, the compression will deteriorate. Default size is 1048576 bytes (= 1 * 1024 * 1024).</li> -<li><tt class="docutils literal"><span class="pre">parquet.compression</span></tt>: The compression algorithm used to compress pages. 
It should be one of <tt class="docutils literal"><span class="pre">uncompressed</span></tt>, <tt class="docutils literal"><span class="pre">snappy</span></tt>, <tt class="docutils literal"><span class="pre">gzip</span></tt>, <tt class="docutils literal"><span class="pre">lzo</span></tt>. Default is <tt class="docutils literal"><span class="pre">uncompressed</span></tt>.</li> -<li><tt class="docutils literal"><span class="pre">parquet.enable.dictionary</span></tt>: The boolean value is to enable/disable dictionary encoding. It should be one of either <tt class="docutils literal"><span class="pre">true</span></tt> or <tt class="docutils literal"><span class="pre">false</span></tt>. Default is <tt class="docutils literal"><span class="pre">true</span></tt>.</li> +<li><code class="docutils literal"><span class="pre">parquet.block.size</span></code>: The block size is the size of a row group being buffered in memory. This limits the memory usage when writing. Larger values will improve the I/O when reading but consume more memory when writing. Default size is 134217728 bytes (= 128 * 1024 * 1024).</li> +<li><code class="docutils literal"><span class="pre">parquet.page.size</span></code>: The page size is for compression. When reading, each page can be decompressed independently. A block is composed of pages. The page is the smallest unit that must be read fully to access a single record. If this value is too small, the compression will deteriorate. Default size is 1048576 bytes (= 1 * 1024 * 1024).</li> +<li><code class="docutils literal"><span class="pre">parquet.compression</span></code>: The compression algorithm used to compress pages. It should be one of <code class="docutils literal"><span class="pre">uncompressed</span></code>, <code class="docutils literal"><span class="pre">snappy</span></code>, <code class="docutils literal"><span class="pre">gzip</span></code>, <code class="docutils literal"><span class="pre">lzo</span></code>. 
Default is <code class="docutils literal"><span class="pre">uncompressed</span></code>.</li> +<li><code class="docutils literal"><span class="pre">parquet.enable.dictionary</span></code>: The boolean value is to enable/disable dictionary encoding. It should be one of either <code class="docutils literal"><span class="pre">true</span></code> or <code class="docutils literal"><span class="pre">false</span></code>. Default is <code class="docutils literal"><span class="pre">true</span></code>.</li> </ul> </div> <div class="section" id="compatibility-issues-with-apache-hive">
