Author: jihoonson
Date: Wed Apr 29 02:50:11 2015
New Revision: 1676659
URL: http://svn.apache.org/r1676659
Log:
Add missing files
Added:
tajo/site/docs/devel/_sources/functions/python.txt
tajo/site/docs/devel/functions/python.html
Added: tajo/site/docs/devel/_sources/functions/python.txt
URL:
http://svn.apache.org/viewvc/tajo/site/docs/devel/_sources/functions/python.txt?rev=1676659&view=auto
==============================================================================
--- tajo/site/docs/devel/_sources/functions/python.txt (added)
+++ tajo/site/docs/devel/_sources/functions/python.txt Wed Apr 29 02:50:11 2015
@@ -0,0 +1,159 @@
+******************************
+Python Functions
+******************************
+
+=======================
+User-defined Functions
+=======================
+
+-----------------------
+Function registration
+-----------------------
+
+To register Python UDFs, you must install the script files on all cluster nodes.
+After that, you can register your functions by specifying the paths to those script files in ``tajo-site.xml``. Here is an example of the configuration.
+
+.. code-block:: xml
+
+  <property>
+    <name>tajo.function.python.code-dir</name>
+    <value>/path/to/script1.py,/path/to/script2.py</value>
+  </property>
+
+Please note that you can specify multiple paths, separated by commas (``,``).
+Each file can contain multiple functions. Here is a typical example of a script file.
+
+.. code-block:: python
+
+  # /path/to/udf1.py
+
+  @output_type('int4')
+  def return_one():
+      return 1
+
+  @output_type("text")
+  def helloworld():
+      return 'Hello, World'
+
+  # No decorator - blob
+  def concat_py(str):
+      return str+str
+
+  @output_type('int4')
+  def sum_py(a,b):
+      return a+b
+
+If the configuration is set properly, every function in the script files is registered when the Tajo cluster starts up.
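
Before restarting the cluster, it can help to confirm which script paths a node's configuration actually lists. The following is a minimal sketch, not part of Tajo: the helper name ``python_udf_paths`` and the ``<configuration>`` root element in the sample are assumptions made for illustration. It reads the ``tajo.function.python.code-dir`` property and splits its value on commas.

```python
import xml.etree.ElementTree as ET

# Property name taken from the configuration example above.
PROPERTY_NAME = "tajo.function.python.code-dir"


def python_udf_paths(site_xml_text):
    """Return the script paths listed under the Python UDF property.

    Hypothetical helper for checking a configuration file by hand;
    Tajo itself does not provide this function.
    """
    root = ET.fromstring(site_xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == PROPERTY_NAME:
            value = prop.findtext("value", default="")
            # Paths are comma-delimited; ignore surrounding whitespace.
            return [p.strip() for p in value.split(",") if p.strip()]
    return []


# Assumed file layout: properties wrapped in a <configuration> root.
SAMPLE = """
<configuration>
  <property>
    <name>tajo.function.python.code-dir</name>
    <value>/path/to/script1.py,/path/to/script2.py</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    for path in python_udf_paths(SAMPLE):
        print(path)
```

Running the same check on every node is one way to spot a node whose configuration is missing a script path.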
+
+-----------------------
+Decorators and types
+-----------------------
+
+By default, every function has a return type of ``BLOB``.
+You can use Python decorators to declare output types for the script functions. Tajo reads the return type of each function from its decorator.
+
+* ``output_type``: Defines the return data type for a script UDF in a format
that Tajo can understand. The defined type must be one of the types supported
by Tajo. For supported types, please refer to :doc:`/sql_language/data_model`.
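
Because ``output_type`` is supplied by Tajo at registration time, a script that uses it cannot be imported as-is in a plain Python interpreter. The sketch below shows one way to exercise UDF logic locally. The ``output_type`` defined here is a hypothetical stand-in, not Tajo's implementation, and the attribute name ``declared_output_type`` and the helper ``declared_type`` are assumptions made for this example.

```python
def output_type(type_name):
    """Hypothetical stand-in for Tajo's decorator (local testing only).

    It records the declared type on the function and leaves the
    function's behavior unchanged.
    """
    def decorate(func):
        func.declared_output_type = type_name
        return func
    return decorate


def declared_type(func):
    """Return the declared type, or 'blob' when no decorator was used."""
    return getattr(func, "declared_output_type", "blob")


@output_type("int4")
def sum_py(a, b):
    return a + b


# No decorator: falls back to the default type described above.
def concat_py(s):
    return s + s


if __name__ == "__main__":
    print(declared_type(sum_py), sum_py(1, 2))
    print(declared_type(concat_py), concat_py("ab"))
```

The stand-in should be removed, or kept in a separate test module, before the script is installed on the cluster.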
+
+-----------------------
+Query example
+-----------------------
+
+Once the Python UDFs are successfully registered, you can use them like built-in functions.
+
+.. code-block:: sql
+
+  default> select concat_py(n_name)::text from nation where sum_py(n_regionkey,1) > 2;
+
+==============================================
+User-defined Aggregation Functions
+==============================================
+
+-----------------------
+Function registration
+-----------------------
+
+To define Python aggregation functions, write one Python class for each function.
+The following are typical examples of Python UDAFs.
+
+.. code-block:: python
+
+  # /path/to/udaf1.py
+
+  class AvgPy:
+      sum = 0
+      cnt = 0
+
+      def __init__(self):
+          self.reset()
+
+      def reset(self):
+          self.sum = 0
+          self.cnt = 0
+
+      # eval at the first stage
+      def eval(self, item):
+          self.sum += item
+          self.cnt += 1
+
+      # get intermediate result
+      def get_partial_result(self):
+          return [self.sum, self.cnt]
+
+      # merge intermediate results
+      def merge(self, list):
+          self.sum += list[0]
+          self.cnt += list[1]
+
+      # get final result
+      @output_type('float8')
+      def get_final_result(self):
+          return self.sum / float(self.cnt)
+
+
+  class CountPy:
+      cnt = 0
+
+      def __init__(self):
+          self.reset()
+
+      def reset(self):
+          self.cnt = 0
+
+      # eval at the first stage
+      def eval(self):
+          self.cnt += 1
+
+      # get intermediate result
+      def get_partial_result(self):
+          return self.cnt
+
+      # merge intermediate results
+      def merge(self, cnt):
+          self.cnt += cnt
+
+      # get final result
+      @output_type('int4')
+      def get_final_result(self):
+          return self.cnt
+
+
+These classes must provide the ``reset()``, ``eval()``, ``merge()``, ``get_partial_result()``, and ``get_final_result()`` methods.
+
+* ``reset()`` resets the aggregation state.
+* ``eval()`` evaluates input tuples in the first stage.
+* ``merge()`` merges intermediate results of the first stage.
+* ``get_partial_result()`` returns intermediate results of the first stage. Its output type must be the same as the input type of ``merge()``.
+* ``get_final_result()`` returns the final aggregation result.
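
The way these five methods fit together can be sketched with a small local driver. This is only an illustration of the two-stage contract described above, not how Tajo schedules work: the ``run_two_stage`` function is hypothetical, the ``output_type`` defined here is a no-op stand-in for the decorator Tajo provides, and the attribute is named ``total`` rather than ``sum`` purely for readability.

```python
def output_type(type_name):
    """No-op stand-in for Tajo's decorator (assumption, local use only)."""
    def decorate(func):
        return func
    return decorate


class AvgPy:
    def __init__(self):
        self.reset()

    def reset(self):
        self.total = 0
        self.cnt = 0

    # First stage: consume one input value.
    def eval(self, item):
        self.total += item
        self.cnt += 1

    # First stage: hand over the intermediate state.
    def get_partial_result(self):
        return [self.total, self.cnt]

    # Second stage: the argument has the shape returned by get_partial_result().
    def merge(self, partial):
        self.total += partial[0]
        self.cnt += partial[1]

    @output_type('float8')
    def get_final_result(self):
        return self.total / float(self.cnt)


def run_two_stage(udaf_class, partitions):
    """Hypothetical driver: one instance per partition, then one merger."""
    partials = []
    for rows in partitions:
        worker = udaf_class()
        for row in rows:
            worker.eval(row)
        partials.append(worker.get_partial_result())

    final = udaf_class()
    for partial in partials:
        final.merge(partial)
    return final.get_final_result()


if __name__ == "__main__":
    print(run_two_stage(AvgPy, [[1, 2, 3], [4, 5]]))
```

The driver makes the type requirement concrete: whatever ``get_partial_result()`` returns is passed unchanged to ``merge()``.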
+
+-----------------------
+Query example
+-----------------------
+
+Once the Python UDAFs are successfully registered, you can use them like built-in aggregation functions.
+
+.. code-block:: sql
+
+ default> select avgpy(n_nationkey), countpy() from nation;
+
+.. warning::
+
+ Currently, Python UDAFs cannot be used as window functions.
\ No newline at end of file
Added: tajo/site/docs/devel/functions/python.html
URL:
http://svn.apache.org/viewvc/tajo/site/docs/devel/functions/python.html?rev=1676659&view=auto
==============================================================================
--- tajo/site/docs/devel/functions/python.html (added)
+++ tajo/site/docs/devel/functions/python.html Wed Apr 29 02:50:11 2015
@@ -0,0 +1,404 @@
+
+
+<!DOCTYPE html>
+<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
+<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]-->
+<head>
+ <meta charset="utf-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+ <title>Python Functions — Apache Tajo 0.11.0 documentation</title>
+
+
+
+
+
+
+ <link
href='https://fonts.googleapis.com/css?family=Lato:400,700|Roboto+Slab:400,700|Inconsolata:400,700'
rel='stylesheet' type='text/css'>
+
+
+
+
+
+
+
+
+
+ <link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
+
+
+
+ <link rel="top" title="Apache Tajo 0.11.0 documentation"
href="../index.html"/>
+ <link rel="up" title="Functions" href="../functions.html"/>
+ <link rel="next" title="Table Management"
href="../table_management.html"/>
+ <link rel="prev" title="JSON Functions" href="json_func.html"/>
+
+
+ <script
src="https://cdnjs.cloudflare.com/ajax/libs/modernizr/2.6.2/modernizr.min.js"></script>
+
+</head>
+
+<body class="wy-body-for-nav" role="document">
+
+ <div class="wy-grid-for-nav">
+
+
+ <nav data-toggle="wy-nav-shift" class="wy-nav-side">
+ <div class="wy-side-nav-search">
+ <a href="../index.html" class="fa fa-home"> Apache Tajo</a>
+ <div role="search">
+ <form id ="rtd-search-form" class="wy-form" action="../search.html"
method="get">
+ <input type="text" name="q" placeholder="Search docs" />
+ <input type="hidden" name="check_keywords" value="yes" />
+ <input type="hidden" name="area" value="default" />
+ </form>
+</div>
+ </div>
+
+ <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation"
aria-label="main navigation">
+
+
+ <ul class="current">
+<li class="toctree-l1"><a class="reference internal"
href="../introduction.html">Introduction</a></li>
+<li class="toctree-l1"><a class="reference internal"
href="../getting_started.html">Getting Started</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../getting_started.html#prerequisites">Prerequisites</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../getting_started.html#dowload-and-unpack-the-source-code">Download and
unpack the source code</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../getting_started.html#build-source-code">Build source code</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../getting_started.html#setting-up-a-local-tajo-cluster">Setting up a
local Tajo cluster</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../getting_started.html#first-query-execution">First query
execution</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../configuration.html">Configuration</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../configuration/preliminary.html">Preliminary</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../configuration/cluster_setup.html">Cluster Setup</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../configuration/tajo_master_configuration.html">Tajo Master
Configuration</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../configuration/worker_configuration.html">Worker Configuration</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../configuration/catalog_configuration.html">Catalog
Configuration</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../configuration/ha_configuration.html">High Availability for
TajoMaster</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../configuration/service_config_defaults.html">Cluster Service
Configuration Defaults</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../configuration/tajo-site-xml.html">The tajo-site.xml File</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../configuration/catalog-site-xml.html">The catalog-site.xml File</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal" href="../tsql.html">Tajo
Shell (TSQL)</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../tsql/meta_command.html">Meta Commands</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../tsql/dfs_command.html">Executing HDFS commands</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../tsql/variables.html">Session Variables</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../tsql/admin_command.html">Administration Commands</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../tsql/intro.html">Introducing to TSQL</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../tsql/single_command.html">Executing a single command</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../tsql/execute_file.html">Executing Queries from Files</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../tsql/background_command.html">Executing as background process</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../sql_language.html">SQL Language</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../sql_language/data_model.html">Data Model</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../sql_language/ddl.html">Data Definition Language</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../sql_language/insert.html">INSERT (OVERWRITE) INTO</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../sql_language/queries.html">Queries</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../sql_language/sql_expression.html">SQL Expressions</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../sql_language/predicates.html">Predicates</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../time_zone.html">Time Zone</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../time_zone.html#server-cluster-time-zone">Server Cluster Time
Zone</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../time_zone.html#table-time-zone">Table Time Zone</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../time_zone.html#client-time-zone">Client Time Zone</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../time_zone.html#time-zone-id">Time Zone ID</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../time_zone.html#examples-of-time-zone">Examples of Time Zone</a></li>
+</ul>
+</li>
+<li class="toctree-l1 current"><a class="reference internal"
href="../functions.html">Functions</a><ul class="current">
+<li class="toctree-l2"><a class="reference internal"
href="../functions.html#built-in-functions">Built-in Functions</a></li>
+<li class="toctree-l2 current"><a class="reference internal"
href="../functions.html#user-defined-functions">User-defined Functions</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../table_management.html">Table Management</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../table_management/table_overview.html">Overview of Tajo Tables</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../table_management/file_formats.html">File Formats</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../table_management/compression.html">Compression</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../table_partitioning.html">Table Partitioning</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../partitioning/intro_to_partitioning.html">Introduction to
Partitioning</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../partitioning/column_partitioning.html">Column Partitioning</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../partitioning/range_partitioning.html">Range Partitioning</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../partitioning/hash_partitioning.html">Hash Partitioning</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../index_overview.html">Index (Experimental Feature)</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../index/types.html">Index Types</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../index/how_to_use.html">How to use index?</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../index/future_work.html">Future Works</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../backup_and_restore.html">Backup and Restore</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../backup_and_restore/catalog.html">Backup and Restore Catalog</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../hive_integration.html">Hive Integration</a></li>
+<li class="toctree-l1"><a class="reference internal"
href="../hbase_integration.html">HBase Integration</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../hbase_integration.html#create-table">CREATE TABLE</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../hbase_integration.html#drop-table">DROP TABLE</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../hbase_integration.html#insert-overwrite-into">INSERT (OVERWRITE)
INTO</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../hbase_integration.html#usage">Usage</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../swift_integration.html">OpenStack Swift Integration</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../swift_integration.html#swift-configuration">Swift
configuration</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../swift_integration.html#hadoop-configurations">Hadoop
configurations</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../swift_integration.html#tajo-configuration">Tajo configuration</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../swift_integration.html#querying-on-swift">Querying on Swift</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../jdbc_driver.html">Tajo JDBC Driver</a><ul>
+<li class="toctree-l2"><a class="reference internal"
href="../jdbc_driver.html#how-to-get-jdbc-driver">How to get JDBC
driver</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../jdbc_driver.html#setting-the-classpath">Setting the CLASSPATH</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../jdbc_driver.html#an-example-jdbc-client">An Example JDBC
Client</a></li>
+</ul>
+</li>
+<li class="toctree-l1"><a class="reference internal"
href="../tajo_client_api.html">Tajo Client API</a></li>
+<li class="toctree-l1"><a class="reference internal"
href="../faq.html">FAQ</a></li>
+</ul>
+
+
+ </div>
+
+ </nav>
+
+ <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
+
+
+ <nav class="wy-nav-top" role="navigation" aria-label="top navigation">
+ <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
+ <a href="../index.html">Apache Tajo</a>
+ </nav>
+
+
+
+ <div class="wy-nav-content">
+ <div class="rst-content">
+ <div role="navigation" aria-label="breadcrumbs navigation">
+ <ul class="wy-breadcrumbs">
+ <li><a href="../index.html">Docs</a> »</li>
+
+ <li><a href="../functions.html">Functions</a> »</li>
+
+ <li>Python Functions</li>
+ <li class="wy-breadcrumbs-aside">
+
+ <a href="../_sources/functions/python.txt" rel="nofollow"> View page
source</a>
+
+ </li>
+ </ul>
+ <hr/>
+</div>
+ <div role="main">
+
+ <div class="section" id="python-functions">
+<h1>Python Functions<a class="headerlink" href="#python-functions"
title="Permalink to this headline">¶</a></h1>
+<div class="section" id="user-defined-functions">
+<h2>User-defined Functions<a class="headerlink" href="#user-defined-functions"
title="Permalink to this headline">¶</a></h2>
+<div class="section" id="function-registration">
+<h3>Function registration<a class="headerlink" href="#function-registration"
title="Permalink to this headline">¶</a></h3>
+<p>To register Python UDFs, you must install the script files on all cluster nodes.
+After that, you can register your functions by specifying the paths to those
script files in <tt class="docutils literal"><span
class="pre">tajo-site.xml</span></tt>. Here is an example of the
configuration.</p>
+<div class="highlight-xml"><div class="highlight"><pre><span
class="nt">&lt;property&gt;</span>
+ <span class="nt">&lt;name&gt;</span>tajo.function.python.code-dir<span
class="nt">&lt;/name&gt;</span>
+ <span
class="nt">&lt;value&gt;</span>/path/to/script1.py,/path/to/script2.py<span
class="nt">&lt;/value&gt;</span>
+<span class="nt">&lt;/property&gt;</span>
+</pre></div>
+</div>
+<p>Please note that you can specify multiple paths with <tt class="docutils
literal"><span class="pre">','</span></tt> as a delimiter. Each file can
contain multiple functions. Here is a typical example of a script file.</p>
+<div class="highlight-python"><div class="highlight"><pre><span class="c">#
/path/to/udf1.py</span>
+
+<span class="nd">@output_type</span><span class="p">(</span><span
class="s">'int4'</span><span class="p">)</span>
+<span class="k">def</span> <span class="nf">return_one</span><span
class="p">():</span>
+ <span class="k">return</span> <span class="mi">1</span>
+
+<span class="nd">@output_type</span><span class="p">(</span><span
class="s">"text"</span><span class="p">)</span>
+<span class="k">def</span> <span class="nf">helloworld</span><span
class="p">():</span>
+ <span class="k">return</span> <span class="s">'Hello, World'</span>
+
+<span class="c"># No decorator - blob</span>
+<span class="k">def</span> <span class="nf">concat_py</span><span
class="p">(</span><span class="nb">str</span><span class="p">):</span>
+ <span class="k">return</span> <span class="nb">str</span><span
class="o">+</span><span class="nb">str</span>
+
+<span class="nd">@output_type</span><span class="p">(</span><span
class="s">'int4'</span><span class="p">)</span>
+<span class="k">def</span> <span class="nf">sum_py</span><span
class="p">(</span><span class="n">a</span><span class="p">,</span><span
class="n">b</span><span class="p">):</span>
+ <span class="k">return</span> <span class="n">a</span><span
class="o">+</span><span class="n">b</span>
+</pre></div>
+</div>
+<p>If the configuration is set properly, every function in the script files
is registered when the Tajo cluster starts up.</p>
+</div>
+<div class="section" id="decorators-and-types">
+<h3>Decorators and types<a class="headerlink" href="#decorators-and-types"
title="Permalink to this headline">¶</a></h3>
+<p>By default, every function has a return type of <tt class="docutils
literal"><span class="pre">BLOB</span></tt>.
+You can use Python decorators to declare output types for the script functions.
Tajo reads the return type of each function from its decorator.</p>
+<ul class="simple">
+<li><tt class="docutils literal"><span class="pre">output_type</span></tt>:
Defines the return data type for a script UDF in a format that Tajo can
understand. The defined type must be one of the types supported by Tajo. For
supported types, please refer to <a class="reference internal"
href="../sql_language/data_model.html"><em>Data Model</em></a>.</li>
+</ul>
+</div>
+<div class="section" id="query-example">
+<h3>Query example<a class="headerlink" href="#query-example" title="Permalink
to this headline">¶</a></h3>
+<p>Once the Python UDFs are successfully registered, you can use them like
built-in functions.</p>
+<div class="highlight-sql"><div class="highlight"><pre><span
class="k">default</span><span class="o">&gt;</span> <span
class="k">select</span> <span class="n">concat_py</span><span
class="p">(</span><span class="n">n_name</span><span class="p">)::</span><span
class="nb">text</span> <span class="k">from</span> <span
class="n">nation</span> <span class="k">where</span> <span
class="n">sum_py</span><span class="p">(</span><span
class="n">n_regionkey</span><span class="p">,</span><span
class="mi">1</span><span class="p">)</span> <span class="o">&gt;</span> <span
class="mi">2</span><span class="p">;</span>
+</pre></div>
+</div>
+</div>
+</div>
+<div class="section" id="user-defined-aggregation-functions">
+<h2>User-defined Aggregation Functions<a class="headerlink"
href="#user-defined-aggregation-functions" title="Permalink to this
headline">¶</a></h2>
+<div class="section" id="id1">
+<h3>Function registration<a class="headerlink" href="#id1" title="Permalink to
this headline">¶</a></h3>
+<p>To define Python aggregation functions, write one Python class for each
function.
+The following are typical examples of Python UDAFs.</p>
+<div class="highlight-python"><div class="highlight"><pre><span class="c">#
/path/to/udaf1.py</span>
+
+<span class="k">class</span> <span class="nc">AvgPy</span><span
class="p">:</span>
+ <span class="nb">sum</span> <span class="o">=</span> <span
class="mi">0</span>
+ <span class="n">cnt</span> <span class="o">=</span> <span class="mi">0</span>
+
+ <span class="k">def</span> <span class="nf">__init__</span><span
class="p">(</span><span class="bp">self</span><span class="p">):</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">reset</span><span class="p">()</span>
+
+ <span class="k">def</span> <span class="nf">reset</span><span
class="p">(</span><span class="bp">self</span><span class="p">):</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">sum</span> <span class="o">=</span> <span class="mi">0</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">cnt</span> <span class="o">=</span> <span class="mi">0</span>
+
+ <span class="c"># eval at the first stage</span>
+ <span class="k">def</span> <span class="nf">eval</span><span
class="p">(</span><span class="bp">self</span><span class="p">,</span> <span
class="n">item</span><span class="p">):</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">sum</span> <span class="o">+=</span> <span class="n">item</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">cnt</span> <span class="o">+=</span> <span class="mi">1</span>
+
+ <span class="c"># get intermediate result</span>
+ <span class="k">def</span> <span class="nf">get_partial_result</span><span
class="p">(</span><span class="bp">self</span><span class="p">):</span>
+ <span class="k">return</span> <span class="p">[</span><span
class="bp">self</span><span class="o">.</span><span class="n">sum</span><span
class="p">,</span> <span class="bp">self</span><span class="o">.</span><span
class="n">cnt</span><span class="p">]</span>
+
+ <span class="c"># merge intermediate results</span>
+ <span class="k">def</span> <span class="nf">merge</span><span
class="p">(</span><span class="bp">self</span><span class="p">,</span> <span
class="nb">list</span><span class="p">):</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">sum</span> <span class="o">+=</span> <span
class="nb">list</span><span class="p">[</span><span class="mi">0</span><span
class="p">]</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">cnt</span> <span class="o">+=</span> <span
class="nb">list</span><span class="p">[</span><span class="mi">1</span><span
class="p">]</span>
+
+ <span class="c"># get final result</span>
+ <span class="nd">@output_type</span><span class="p">(</span><span
class="s">'float8'</span><span class="p">)</span>
+ <span class="k">def</span> <span class="nf">get_final_result</span><span
class="p">(</span><span class="bp">self</span><span class="p">):</span>
+ <span class="k">return</span> <span class="bp">self</span><span
class="o">.</span><span class="n">sum</span> <span class="o">/</span> <span
class="nb">float</span><span class="p">(</span><span
class="bp">self</span><span class="o">.</span><span class="n">cnt</span><span
class="p">)</span>
+
+
+<span class="k">class</span> <span class="nc">CountPy</span><span
class="p">:</span>
+ <span class="n">cnt</span> <span class="o">=</span> <span class="mi">0</span>
+
+ <span class="k">def</span> <span class="nf">__init__</span><span
class="p">(</span><span class="bp">self</span><span class="p">):</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">reset</span><span class="p">()</span>
+
+ <span class="k">def</span> <span class="nf">reset</span><span
class="p">(</span><span class="bp">self</span><span class="p">):</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">cnt</span> <span class="o">=</span> <span class="mi">0</span>
+
+ <span class="c"># eval at the first stage</span>
+ <span class="k">def</span> <span class="nf">eval</span><span
class="p">(</span><span class="bp">self</span><span class="p">):</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">cnt</span> <span class="o">+=</span> <span class="mi">1</span>
+
+ <span class="c"># get intermediate result</span>
+ <span class="k">def</span> <span class="nf">get_partial_result</span><span
class="p">(</span><span class="bp">self</span><span class="p">):</span>
+ <span class="k">return</span> <span class="bp">self</span><span
class="o">.</span><span class="n">cnt</span>
+
+ <span class="c"># merge intermediate results</span>
+ <span class="k">def</span> <span class="nf">merge</span><span
class="p">(</span><span class="bp">self</span><span class="p">,</span> <span
class="n">cnt</span><span class="p">):</span>
+ <span class="bp">self</span><span class="o">.</span><span
class="n">cnt</span> <span class="o">+=</span> <span class="n">cnt</span>
+
+ <span class="c"># get final result</span>
+ <span class="nd">@output_type</span><span class="p">(</span><span
class="s">'int4'</span><span class="p">)</span>
+ <span class="k">def</span> <span class="nf">get_final_result</span><span
class="p">(</span><span class="bp">self</span><span class="p">):</span>
+ <span class="k">return</span> <span class="bp">self</span><span
class="o">.</span><span class="n">cnt</span>
+</pre></div>
+</div>
+<p>These classes must provide <tt class="docutils literal"><span
class="pre">reset()</span></tt>, <tt class="docutils literal"><span
class="pre">eval()</span></tt>, <tt class="docutils literal"><span
class="pre">merge()</span></tt>, <tt class="docutils literal"><span
class="pre">get_partial_result()</span></tt>, and <tt class="docutils
literal"><span class="pre">get_final_result()</span></tt> methods.</p>
+<ul class="simple">
+<li><tt class="docutils literal"><span class="pre">reset()</span></tt> resets
the aggregation state.</li>
+<li><tt class="docutils literal"><span class="pre">eval()</span></tt>
evaluates input tuples in the first stage.</li>
+<li><tt class="docutils literal"><span class="pre">merge()</span></tt> merges
intermediate results of the first stage.</li>
+<li><tt class="docutils literal"><span
class="pre">get_partial_result()</span></tt> returns intermediate results of
the first stage. Its output type must be the same as the input type of <tt
class="docutils literal"><span class="pre">merge()</span></tt>.</li>
+<li><tt class="docutils literal"><span
class="pre">get_final_result()</span></tt> returns the final aggregation
result.</li>
+</ul>
+</div>
+<div class="section" id="id2">
+<h3>Query example<a class="headerlink" href="#id2" title="Permalink to this
headline">¶</a></h3>
+<p>Once the Python UDAFs are successfully registered, you can use them like
built-in aggregation functions.</p>
+<div class="highlight-sql"><div class="highlight"><pre><span
class="k">default</span><span class="o">&gt;</span> <span
class="k">select</span> <span class="n">avgpy</span><span
class="p">(</span><span class="n">n_nationkey</span><span class="p">),</span>
<span class="n">countpy</span><span class="p">()</span> <span
class="k">from</span> <span class="n">nation</span><span class="p">;</span>
+</pre></div>
+</div>
+<div class="admonition warning">
+<p class="first admonition-title">Warning</p>
+<p class="last">Currently, Python UDAFs cannot be used as window functions.</p>
+</div>
+</div>
+</div>
+</div>
+
+
+ </div>
+ <footer>
+
+ <div class="rst-footer-buttons" role="navigation" aria-label="footer
navigation">
+
+ <a href="../table_management.html" class="btn btn-neutral float-right"
title="Table Management">Next <span class="fa
fa-arrow-circle-right"></span></a>
+
+
+ <a href="json_func.html" class="btn btn-neutral" title="JSON
Functions"><span class="fa fa-arrow-circle-left"></span> Previous</a>
+
+ </div>
+
+
+ <hr/>
+
+ <div role="contentinfo">
+ <p>
+ © Copyright 2014, Apache Tajo Team.
+ </p>
+ </div>
+
+ <a href="https://github.com/snide/sphinx_rtd_theme">Sphinx theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>
+</footer>
+ </div>
+ </div>
+
+ </section>
+
+ </div>
+
+
+
+
+
+ <script type="text/javascript">
+ var DOCUMENTATION_OPTIONS = {
+ URL_ROOT:'../',
+ VERSION:'0.11.0',
+ COLLAPSE_INDEX:false,
+ FILE_SUFFIX:'.html',
+ HAS_SOURCE: true
+ };
+ </script>
+ <script type="text/javascript" src="../_static/jquery.js"></script>
+ <script type="text/javascript" src="../_static/underscore.js"></script>
+ <script type="text/javascript" src="../_static/doctools.js"></script>
+
+
+
+
+
+ <script type="text/javascript" src="../_static/js/theme.js"></script>
+
+
+
+
+ <script type="text/javascript">
+ jQuery(function () {
+ SphinxRtdTheme.StickyNav.enable();
+ });
+ </script>
+
+
+</body>
+</html>
\ No newline at end of file