Added: tajo/site/docs/0.11.1/hbase_integration.html URL: http://svn.apache.org/viewvc/tajo/site/docs/0.11.1/hbase_integration.html?rev=1728394&view=auto ============================================================================== --- tajo/site/docs/0.11.1/hbase_integration.html (added) +++ tajo/site/docs/0.11.1/hbase_integration.html Thu Feb 4 00:29:05 2016 @@ -0,0 +1,420 @@ + + +<!DOCTYPE html> +<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]--> +<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]--> +<head> + <meta charset="utf-8"> + <meta name="viewport" content="width=device-width, initial-scale=1.0"> + + <title>HBase Integration — Apache Tajo 0.11.0 documentation</title> + + + + + + + <link href='https://fonts.googleapis.com/css?family=Lato:400,700|Roboto+Slab:400,700|Inconsolata:400,700' rel='stylesheet' type='text/css'> + + + + + + + + + + <link rel="stylesheet" href="_static/css/theme.css" type="text/css" /> + + + + <link rel="top" title="Apache Tajo 0.11.0 documentation" href="index.html"/> + <link rel="next" title="OpenStack Swift Integration" href="swift_integration.html"/> + <link rel="prev" title="Hive Integration" href="hive_integration.html"/> + + + <script src="https://cdnjs.cloudflare.com/ajax/libs/modernizr/2.6.2/modernizr.min.js"></script> + +</head> + +<body class="wy-body-for-nav" role="document"> + + <div class="wy-grid-for-nav"> + + + <nav data-toggle="wy-nav-shift" class="wy-nav-side"> + <div class="wy-side-nav-search"> + <a href="index.html" class="fa fa-home"> Apache Tajo</a> + <div role="search"> + <form id ="rtd-search-form" class="wy-form" action="search.html" method="get"> + <input type="text" name="q" placeholder="Search docs" /> + <input type="hidden" name="check_keywords" value="yes" /> + <input type="hidden" name="area" value="default" /> + </form> +</div> + </div> + + <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation"> + + + <ul class="current"> 
+<li class="toctree-l1"><a class="reference internal" href="introduction.html">Introduction</a></li> +<li class="toctree-l1"><a class="reference internal" href="getting_started.html">Getting Started</a><ul> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#first-query-execution">First query execution</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="configuration.html">Configuration</a><ul> +<li class="toctree-l2"><a class="reference internal" href="configuration/preliminary.html">Preliminary</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/cluster_setup.html">Cluster Setup</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/tajo_master_configuration.html">Tajo Master Configuration</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/worker_configuration.html">Worker Configuration</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/catalog_configuration.html">Catalog Configuration</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/ha_configuration.html">High Availability for TajoMaster</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/service_config_defaults.html">Cluster Service Configuration Defaults</a></li> +<li class="toctree-l2"><a class="reference internal" 
href="configuration/tajo-site-xml.html">The tajo-site.xml File</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/catalog-site-xml.html">The catalog-site.xml File</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/storage-site-json.html">The storage-site.json File</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="tsql.html">Tajo Shell (TSQL)</a><ul> +<li class="toctree-l2"><a class="reference internal" href="tsql/meta_command.html">Meta Commands</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/dfs_command.html">Executing HDFS commands</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/variables.html">Session Variables</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/admin_command.html">Administration Commands</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/intro.html">Introducing to TSQL</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/single_command.html">Executing a single command</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/execute_file.html">Executing Queries from Files</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/background_command.html">Executing as background process</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="sql_language.html">SQL Language</a><ul> +<li class="toctree-l2"><a class="reference internal" href="sql_language/data_model.html">Data Model</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/ddl.html">Data Definition Language</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/insert.html">INSERT (OVERWRITE) INTO</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/alter_table.html">ALTER TABLE</a></li> +<li class="toctree-l2"><a class="reference 
internal" href="sql_language/queries.html">Queries</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/joins.html">Joins</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/sql_expression.html">SQL Expressions</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/predicates.html">Predicates</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/explain.html">EXPLAIN</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="time_zone.html">Time Zone</a><ul> +<li class="toctree-l2"><a class="reference internal" href="time_zone.html#server-cluster-time-zone">Server Cluster Time Zone</a></li> +<li class="toctree-l2"><a class="reference internal" href="time_zone.html#table-time-zone">Table Time Zone</a></li> +<li class="toctree-l2"><a class="reference internal" href="time_zone.html#client-time-zone">Client Time Zone</a></li> +<li class="toctree-l2"><a class="reference internal" href="time_zone.html#time-zone-id">Time Zone ID</a></li> +<li class="toctree-l2"><a class="reference internal" href="time_zone.html#examples-of-time-zone">Examples of Time Zone</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="functions.html">Functions</a><ul> +<li class="toctree-l2"><a class="reference internal" href="functions.html#built-in-scalar-functions">Built-in Scalar Functions</a></li> +<li class="toctree-l2"><a class="reference internal" href="functions.html#built-in-aggregation-functions">Built-in Aggregation Functions</a></li> +<li class="toctree-l2"><a class="reference internal" href="functions.html#built-in-window-functions">Built-in Window Functions</a></li> +<li class="toctree-l2"><a class="reference internal" href="functions.html#user-defined-functions">User-defined Functions</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="table_management.html">Table Management</a><ul> +<li 
class="toctree-l2"><a class="reference internal" href="table_management/table_overview.html">Overview of Tajo Tables</a></li> +<li class="toctree-l2"><a class="reference internal" href="table_management/tablespaces.html">Tablespaces</a></li> +<li class="toctree-l2"><a class="reference internal" href="table_management/data_formats.html">Data Formats</a></li> +<li class="toctree-l2"><a class="reference internal" href="table_management/compression.html">Compression</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="table_partitioning.html">Table Partitioning</a><ul> +<li class="toctree-l2"><a class="reference internal" href="partitioning/intro_to_partitioning.html">Introduction to Partitioning</a></li> +<li class="toctree-l2"><a class="reference internal" href="partitioning/column_partitioning.html">Column Partitioning</a></li> +<li class="toctree-l2"><a class="reference internal" href="partitioning/range_partitioning.html">Range Partitioning</a></li> +<li class="toctree-l2"><a class="reference internal" href="partitioning/hash_partitioning.html">Hash Partitioning</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="storage_plugins.html">Storage Plugin</a><ul> +<li class="toctree-l2"><a class="reference internal" href="storage_plugins/overview.html">Storage Plugin Overview</a></li> +<li class="toctree-l2"><a class="reference internal" href="storage_plugins/postgresql.html">PostgreSQL Storage Handler</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="index_overview.html">Index (Experimental Feature)</a><ul> +<li class="toctree-l2"><a class="reference internal" href="index/types.html">Index Types</a></li> +<li class="toctree-l2"><a class="reference internal" href="index/how_to_use.html">How to use index?</a></li> +<li class="toctree-l2"><a class="reference internal" href="index/future_work.html">Future Works</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference 
internal" href="backup_and_restore.html">Backup and Restore</a><ul> +<li class="toctree-l2"><a class="reference internal" href="backup_and_restore/catalog.html">Backup and Restore Catalog</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="hive_integration.html">Hive Integration</a></li> +<li class="toctree-l1 current"><a class="current reference internal" href="">HBase Integration</a><ul> +<li class="toctree-l2"><a class="reference internal" href="#create-table">CREATE TABLE</a></li> +<li class="toctree-l2"><a class="reference internal" href="#drop-table">DROP TABLE</a></li> +<li class="toctree-l2"><a class="reference internal" href="#insert-overwrite-into">INSERT (OVERWRITE) INTO</a></li> +<li class="toctree-l2"><a class="reference internal" href="#usage">Usage</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="swift_integration.html">OpenStack Swift Integration</a><ul> +<li class="toctree-l2"><a class="reference internal" href="swift_integration.html#swift-configuration">Swift configuration</a></li> +<li class="toctree-l2"><a class="reference internal" href="swift_integration.html#hadoop-configurations">Hadoop configurations</a></li> +<li class="toctree-l2"><a class="reference internal" href="swift_integration.html#tajo-configuration">Tajo configuration</a></li> +<li class="toctree-l2"><a class="reference internal" href="swift_integration.html#querying-on-swift">Querying on Swift</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="jdbc_driver.html">Tajo JDBC Driver</a><ul> +<li class="toctree-l2"><a class="reference internal" href="jdbc_driver.html#how-to-get-jdbc-driver">How to get JDBC driver</a></li> +<li class="toctree-l2"><a class="reference internal" href="jdbc_driver.html#setting-the-classpath">Setting the CLASSPATH</a></li> +<li class="toctree-l2"><a class="reference internal" href="jdbc_driver.html#connecting-to-the-tajo-cluster-instance">Connecting to the Tajo 
cluster instance</a></li> +<li class="toctree-l2"><a class="reference internal" href="jdbc_driver.html#connection-parameters">Connection Parameters</a></li> +<li class="toctree-l2"><a class="reference internal" href="jdbc_driver.html#an-example-jdbc-client">An Example JDBC Client</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="tajo_client_api.html">Tajo Client API</a></li> +<li class="toctree-l1"><a class="reference internal" href="faq.html">FAQ</a></li> +</ul> + + + </div> + + </nav> + + <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"> + + + <nav class="wy-nav-top" role="navigation" aria-label="top navigation"> + <i data-toggle="wy-nav-top" class="fa fa-bars"></i> + <a href="index.html">Apache Tajo</a> + </nav> + + + + <div class="wy-nav-content"> + <div class="rst-content"> + <div role="navigation" aria-label="breadcrumbs navigation"> + <ul class="wy-breadcrumbs"> + <li><a href="index.html">Docs</a> »</li> + + <li>HBase Integration</li> + <li class="wy-breadcrumbs-aside"> + + <a href="_sources/hbase_integration.txt" rel="nofollow"> View page source</a> + + </li> + </ul> + <hr/> +</div> + <div role="main"> + + <div class="section" id="hbase-integration"> +<h1>HBase Integration<a class="headerlink" href="#hbase-integration" title="Permalink to this headline">¶</a></h1> +<p>Apache Tajo™ storage supports integration with Apache HBase™. 
+This integration allows Tajo to access all tables used in Apache HBase.</p> +<p>In order to use this feature, you need to add some settings to <code class="docutils literal"><span class="pre">conf/tajo-env.sh</span></code> and then add some properties to the table creation statement.</p> +<p>This section describes how to set up HBase integration.</p> +<p>First, you need to set the environment variable <code class="docutils literal"><span class="pre">HBASE_HOME</span></code> to your HBase home directory in <code class="docutils literal"><span class="pre">conf/tajo-env.sh</span></code> as follows:</p> +<div class="highlight-python"><div class="highlight"><pre>export HBASE_HOME=/path/to/your/hbase/directory +</pre></div> +</div> +<p>Once this variable is set, Tajo adds the HBase library files to its classpath.</p> +<p>Next, you must configure a tablespace for HBase. See <a class="reference internal" href="table_management/tablespaces.html"><em>Tablespaces</em></a> for more information.</p> +<div class="section" id="create-table"> +<h2>CREATE TABLE<a class="headerlink" href="#create-table" title="Permalink to this headline">¶</a></h2> +<p><em>Synopsis</em></p> +<div class="highlight-sql"><div class="highlight"><pre><span class="k">CREATE</span> <span class="p">[</span><span class="k">EXTERNAL</span><span class="p">]</span> <span class="k">TABLE</span> <span class="p">[</span><span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span><span class="p">]</span> <span class="o"><</span><span class="k">table_name</span><span class="o">></span> <span class="p">[(</span><span class="o"><</span><span class="k">column_name</span><span class="o">></span> <span class="o"><</span><span class="n">data_type</span><span class="o">></span><span class="p">,</span> <span class="p">...</span> <span class="p">)]</span> +<span class="k">USING</span> <span class="n">hbase</span> +<span class="k">WITH</span> <span class="p">(</span><span 
class="s1">'table'</span><span class="o">=</span><span class="s1">'<hbase_table_name>'</span> +<span class="p">,</span> <span class="s1">'columns'</span><span class="o">=</span><span class="s1">':key,<column_family_name>:<qualifier_name>, ...'</span> +<span class="p">,</span> <span class="s1">'hbase.zookeeper.quorum'</span><span class="o">=</span><span class="s1">'<zookeeper_address>'</span> +<span class="p">,</span> <span class="s1">'hbase.zookeeper.property.clientPort'</span><span class="o">=</span><span class="s1">'<zookeeper_client_port>'</span><span class="p">)</span> +<span class="p">[</span><span class="k">LOCATION</span> <span class="s1">'hbase:zk://<hostname>:<port>/'</span><span class="p">]</span> <span class="p">;</span> +</pre></div> +</div> +<p><code class="docutils literal"><span class="pre">IF</span> <span class="pre">NOT</span> <span class="pre">EXISTS</span></code> allows the <code class="docutils literal"><span class="pre">CREATE</span> <span class="pre">[EXTERNAL]</span> <span class="pre">TABLE</span></code> statement to avoid the error that occurs when the table already exists.</p> +<p>If you want to create an <code class="docutils literal"><span class="pre">EXTERNAL</span> <span class="pre">TABLE</span></code>, you must specify the <code class="docutils literal"><span class="pre">LOCATION</span></code> clause.</p> +<p>Options</p> +<ul class="simple"> +<li><code class="docutils literal"><span class="pre">table</span></code> : The name of the underlying HBase table. To create an external table, the table must already exist in HBase. Conversely, to create a managed table, the table must not yet exist in HBase.</li> +<li><code class="docutils literal"><span class="pre">columns</span></code> : :key denotes the HBase row key. The number of entries in columns must equal the number of columns in the Tajo table.</li> +<li><code class="docutils literal"><span class="pre">hbase.zookeeper.quorum</span></code> : The ZooKeeper quorum address. 
You can use different ZooKeeper clusters within the same Tajo database. If you don’t set the ZooKeeper address, Tajo falls back to the corresponding property in the hbase-site.xml file.</li> +<li><code class="docutils literal"><span class="pre">hbase.zookeeper.property.clientPort</span></code> : The ZooKeeper client port. If you don’t set the port, Tajo falls back to the corresponding property in the hbase-site.xml file.</li> +</ul> +</div> +<div class="section" id="drop-table"> +<h2>DROP TABLE<a class="headerlink" href="#drop-table" title="Permalink to this headline">¶</a></h2> +<p><em>Synopsis</em></p> +<div class="highlight-sql"><div class="highlight"><pre><span class="k">DROP</span> <span class="k">TABLE</span> <span class="p">[</span><span class="n">IF</span> <span class="k">EXISTS</span><span class="p">]</span> <span class="o"><</span><span class="k">table_name</span><span class="o">></span> <span class="p">[</span><span class="n">PURGE</span><span class="p">]</span> +</pre></div> +</div> +<p><code class="docutils literal"><span class="pre">IF</span> <span class="pre">EXISTS</span></code> allows the <code class="docutils literal"><span class="pre">DROP</span> <span class="pre">TABLE</span></code> statement to avoid the error that occurs when the table does not exist. The <code class="docutils literal"><span class="pre">DROP</span> <span class="pre">TABLE</span></code> statement removes a table from the Tajo catalog, but it does not remove the contents on the HBase cluster. If the <code class="docutils literal"><span class="pre">PURGE</span></code> option is given, the <code class="docutils literal"><span class="pre">DROP</span> <span class="pre">TABLE</span></code> statement removes the entry in the catalog as well as the contents on the HBase cluster.</p> +</div> +<div class="section" id="insert-overwrite-into"> +<h2>INSERT (OVERWRITE) INTO<a class="headerlink" href="#insert-overwrite-into" title="Permalink to this headline">¶</a></h2> +<p>The INSERT OVERWRITE statement overwrites the data of an existing table. 
Tajo’s INSERT OVERWRITE statement follows the <code class="docutils literal"><span class="pre">INSERT</span> <span class="pre">INTO</span> <span class="pre">SELECT</span></code> statement of SQL. The examples are as follows:</p> +<div class="highlight-sql"><div class="highlight"><pre><span class="c1">-- when the target table schema and the output schema are equivalent to each other</span> +<span class="k">INSERT</span> <span class="n">OVERWRITE</span> <span class="k">INTO</span> <span class="n">t1</span> <span class="k">SELECT</span> <span class="n">l_orderkey</span><span class="p">,</span> <span class="n">l_partkey</span><span class="p">,</span> <span class="n">l_quantity</span> <span class="k">FROM</span> <span class="n">lineitem</span><span class="p">;</span> +<span class="c1">-- or</span> +<span class="k">INSERT</span> <span class="n">OVERWRITE</span> <span class="k">INTO</span> <span class="n">t1</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">lineitem</span><span class="p">;</span> + +<span class="c1">-- when the output schema is smaller than the target table schema</span> +<span class="k">INSERT</span> <span class="n">OVERWRITE</span> <span class="k">INTO</span> <span class="n">t1</span> <span class="k">SELECT</span> <span class="n">l_orderkey</span> <span class="k">FROM</span> <span class="n">lineitem</span><span class="p">;</span> + +<span class="c1">-- when you want to specify certain target columns</span> +<span class="k">INSERT</span> <span class="n">OVERWRITE</span> <span class="k">INTO</span> <span class="n">t1</span> <span class="p">(</span><span class="n">col1</span><span class="p">,</span> <span class="n">col3</span><span class="p">)</span> <span class="k">SELECT</span> <span class="n">l_orderkey</span><span class="p">,</span> <span class="n">l_quantity</span> <span class="k">FROM</span> <span class="n">lineitem</span><span class="p">;</span> +</pre></div> +</div> +<div class="admonition note"> +<p 
class="first admonition-title">Note</p> +<p class="last">If you don’t set the row key option, you will not be able to use your table data, because Tajo needs key columns for sorting before it creates the result data.</p> +</div> +</div> +<div class="section" id="usage"> +<h2>Usage<a class="headerlink" href="#usage" title="Permalink to this headline">¶</a></h2> +<p>To make an HBase table accessible from Tajo, use the USING clause on CREATE TABLE. The following statement defines an external table, so the HBase table it refers to must exist:</p> +<div class="highlight-sql"><div class="highlight"><pre><span class="k">CREATE</span> <span class="k">EXTERNAL</span> <span class="k">TABLE</span> <span class="n">blog</span> <span class="p">(</span><span class="n">rowkey</span> <span class="nb">text</span><span class="p">,</span> <span class="n">author</span> <span class="nb">text</span><span class="p">,</span> <span class="n">register_date</span> <span class="nb">text</span><span class="p">,</span> <span class="n">title</span> <span class="nb">text</span><span class="p">)</span> +<span class="k">USING</span> <span class="n">hbase</span> <span class="k">WITH</span> <span class="p">(</span> + <span class="s1">'table'</span><span class="o">=</span><span class="s1">'blog'</span> + <span class="p">,</span> <span class="s1">'columns'</span><span class="o">=</span><span class="s1">':key,info:author,info:date,content:title'</span><span class="p">)</span> +<span class="k">LOCATION</span> <span class="s1">'hbase:zk://<hostname>:<port>/'</span><span class="p">;</span> +</pre></div> +</div> +<p>The HBase table used in this example can be created and populated in the HBase shell as follows:</p> +<div class="highlight-sql"><div class="highlight"><pre>$ hbase shell +create 'blog', {NAME=>'info'}, {NAME=>'content'} +put 'blog', 'hyunsik-02', 'content:title', 'Getting started with Tajo on your desktop' +put 'blog', 'hyunsik-02', 'info:author', 'Hyunsik Choi' +put 'blog', 'hyunsik-02', 'info:date', '2014-12-03' +put 'blog', 'blrunner-01', 'content:title', 'Apache 
Tajo: A Big Data Warehouse System on Hadoop' +put 'blog', 'blrunner-01', 'info:author', 'Jaehwa Jung' +put 'blog', 'blrunner-01', 'info:date', '2014-10-31' +put 'blog', 'jhkim-01', 'content:title', 'APACHE TAJO™ v0.9 HAS ARRIVED!' +put 'blog', 'jhkim-01', 'info:author', 'Jinho Kim' +put 'blog', 'jhkim-01', 'info:date', '2014-10-22' +</pre></div> +</div> +<p>Then create the table in Tajo and inspect its metadata with the <code class="docutils literal"><span class="pre">\d</span></code> meta command:</p> +<div class="highlight-sql"><div class="highlight"><pre>default> \d blog; + +table name: default.blog +table path: +store type: HBASE +number of rows: unknown +volume: 0 B +Options: + 'columns'=':key,info:author,info:date,content:title' + 'table'='blog' + +schema: +rowkey TEXT +author TEXT +register_date TEXT +title TEXT +</pre></div> +</div> +<p>Then query the table as follows:</p> +<div class="highlight-sql"><div class="highlight"><pre>default> SELECT * FROM blog; +rowkey, author, register_date, title +------------------------------- +blrunner-01, Jaehwa Jung, 2014-10-31, Apache Tajo: A Big Data Warehouse System on Hadoop +hyunsik-02, Hyunsik Choi, 2014-12-03, Getting started with Tajo on your desktop +jhkim-01, Jinho Kim, 2014-10-22, APACHE TAJO™ v0.9 HAS ARRIVED! 
+ +default> SELECT * FROM blog WHERE rowkey = 'blrunner-01'; +Progress: 100%, response time: 2.043 sec +rowkey, author, register_date, title +------------------------------- +blrunner-01, Jaehwa Jung, 2014-10-31, Apache Tajo: A Big Data Warehouse System on Hadoop +</pre></div> +</div> +<p>Here’s how to insert data into an HBase table:</p> +<div class="highlight-sql"><div class="highlight"><pre><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">blog_backup</span><span class="p">(</span><span class="n">rowkey</span> <span class="nb">text</span><span class="p">,</span> <span class="n">author</span> <span class="nb">text</span><span class="p">,</span> <span class="n">register_date</span> <span class="nb">text</span><span class="p">,</span> <span class="n">title</span> <span class="nb">text</span><span class="p">)</span> +<span class="k">USING</span> <span class="n">hbase</span> <span class="k">WITH</span> <span class="p">(</span> + <span class="s1">'table'</span><span class="o">=</span><span class="s1">'blog_backup'</span> + <span class="p">,</span> <span class="s1">'columns'</span><span class="o">=</span><span class="s1">':key,info:author,info:date,content:title'</span><span class="p">);</span> +<span class="k">INSERT</span> <span class="n">OVERWRITE</span> <span class="k">INTO</span> <span class="n">blog_backup</span> <span class="k">SELECT</span> <span class="o">*</span> <span class="k">FROM</span> <span class="n">blog</span><span class="p">;</span> +</pre></div> +</div> +<p>Use the HBase shell to verify that the data was actually loaded:</p> +<div class="highlight-sql"><div class="highlight"><pre>hbase(main):004:0> scan 'blog_backup' + ROW COLUMN+CELL + blrunner-01 column=content:title, timestamp=1421227531054, value=Apache Tajo: A Big Data Warehouse System on Hadoop + blrunner-01 column=info:author, timestamp=1421227531054, value=Jaehwa Jung + blrunner-01 column=info:date, timestamp=1421227531054, value=2014-10-31 + hyunsik-02 column=content:title, 
timestamp=1421227531054, value=Getting started with Tajo on your desktop + hyunsik-02 column=info:author, timestamp=1421227531054, value=Hyunsik Choi + hyunsik-02 column=info:date, timestamp=1421227531054, value=2014-12-03 + jhkim-01 column=content:title, timestamp=1421227531054, value=APACHE TAJO\xE2\x84\xA2 v0.9 HAS ARRIVED! + jhkim-01 column=info:author, timestamp=1421227531054, value=Jinho Kim + jhkim-01 column=info:date, timestamp=1421227531054, value=2014-10-22 +3 row(s) in 0.0470 seconds +</pre></div> +</div> +</div> +</div> + + + </div> + <footer> + + <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation"> + + <a href="swift_integration.html" class="btn btn-neutral float-right" title="OpenStack Swift Integration"/>Next <span class="fa fa-arrow-circle-right"></span></a> + + + <a href="hive_integration.html" class="btn btn-neutral" title="Hive Integration"><span class="fa fa-arrow-circle-left"></span> Previous</a> + + </div> + + + <hr/> + + <div role="contentinfo"> + <p> + © Copyright 2015, Apache Tajo Team. + </p> + </div> + + <a href="https://github.com/snide/sphinx_rtd_theme">Sphinx theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a> +</footer> + </div> + </div> + + </section> + + </div> + + + + + + <script type="text/javascript"> + var DOCUMENTATION_OPTIONS = { + URL_ROOT:'./', + VERSION:'0.11.0', + COLLAPSE_INDEX:false, + FILE_SUFFIX:'.html', + HAS_SOURCE: true + }; + </script> + <script type="text/javascript" src="_static/jquery.js"></script> + <script type="text/javascript" src="_static/underscore.js"></script> + <script type="text/javascript" src="_static/doctools.js"></script> + + + + + + <script type="text/javascript" src="_static/js/theme.js"></script> + + + + + <script type="text/javascript"> + jQuery(function () { + SphinxRtdTheme.StickyNav.enable(); + }); + </script> + + +</body> +</html> \ No newline at end of file
Added: tajo/site/docs/0.11.1/hive_integration.html URL: http://svn.apache.org/viewvc/tajo/site/docs/0.11.1/hive_integration.html?rev=1728394&view=auto ============================================================================== --- tajo/site/docs/0.11.1/hive_integration.html (added) +++ tajo/site/docs/0.11.1/hive_integration.html Thu Feb 4 00:29:05 2016 @@ -0,0 +1,317 @@ + + +<!DOCTYPE html> +<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]--> +<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]--> +<head> + <meta charset="utf-8"> + <meta name="viewport" content="width=device-width, initial-scale=1.0"> + + <title>Hive Integration — Apache Tajo 0.11.0 documentation</title> + + + + + + + <link href='https://fonts.googleapis.com/css?family=Lato:400,700|Roboto+Slab:400,700|Inconsolata:400,700' rel='stylesheet' type='text/css'> + + + + + + + + + + <link rel="stylesheet" href="_static/css/theme.css" type="text/css" /> + + + + <link rel="top" title="Apache Tajo 0.11.0 documentation" href="index.html"/> + <link rel="next" title="HBase Integration" href="hbase_integration.html"/> + <link rel="prev" title="Backup and Restore Catalog" href="backup_and_restore/catalog.html"/> + + + <script src="https://cdnjs.cloudflare.com/ajax/libs/modernizr/2.6.2/modernizr.min.js"></script> + +</head> + +<body class="wy-body-for-nav" role="document"> + + <div class="wy-grid-for-nav"> + + + <nav data-toggle="wy-nav-shift" class="wy-nav-side"> + <div class="wy-side-nav-search"> + <a href="index.html" class="fa fa-home"> Apache Tajo</a> + <div role="search"> + <form id ="rtd-search-form" class="wy-form" action="search.html" method="get"> + <input type="text" name="q" placeholder="Search docs" /> + <input type="hidden" name="check_keywords" value="yes" /> + <input type="hidden" name="area" value="default" /> + </form> +</div> + </div> + + <div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation"> + + + <ul 
class="current"> +<li class="toctree-l1"><a class="reference internal" href="introduction.html">Introduction</a></li> +<li class="toctree-l1"><a class="reference internal" href="getting_started.html">Getting Started</a><ul> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#prerequisites">Prerequisites</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#dowload-and-unpack-the-source-code">Dowload and unpack the source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#build-source-code">Build source code</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#setting-up-a-local-tajo-cluster">Setting up a local Tajo cluster</a></li> +<li class="toctree-l2"><a class="reference internal" href="getting_started.html#first-query-execution">First query execution</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="configuration.html">Configuration</a><ul> +<li class="toctree-l2"><a class="reference internal" href="configuration/preliminary.html">Preliminary</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/cluster_setup.html">Cluster Setup</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/tajo_master_configuration.html">Tajo Master Configuration</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/worker_configuration.html">Worker Configuration</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/catalog_configuration.html">Catalog Configuration</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/ha_configuration.html">High Availability for TajoMaster</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/service_config_defaults.html">Cluster Service Configuration Defaults</a></li> +<li class="toctree-l2"><a 
class="reference internal" href="configuration/tajo-site-xml.html">The tajo-site.xml File</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/catalog-site-xml.html">The catalog-site.xml File</a></li> +<li class="toctree-l2"><a class="reference internal" href="configuration/storage-site-json.html">The storage-site.json File</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="tsql.html">Tajo Shell (TSQL)</a><ul> +<li class="toctree-l2"><a class="reference internal" href="tsql/meta_command.html">Meta Commands</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/dfs_command.html">Executing HDFS commands</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/variables.html">Session Variables</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/admin_command.html">Administration Commands</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/intro.html">Introducing to TSQL</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/single_command.html">Executing a single command</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/execute_file.html">Executing Queries from Files</a></li> +<li class="toctree-l2"><a class="reference internal" href="tsql/background_command.html">Executing as background process</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="sql_language.html">SQL Language</a><ul> +<li class="toctree-l2"><a class="reference internal" href="sql_language/data_model.html">Data Model</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/ddl.html">Data Definition Language</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/insert.html">INSERT (OVERWRITE) INTO</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/alter_table.html">ALTER TABLE</a></li> +<li 
class="toctree-l2"><a class="reference internal" href="sql_language/queries.html">Queries</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/joins.html">Joins</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/sql_expression.html">SQL Expressions</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/predicates.html">Predicates</a></li> +<li class="toctree-l2"><a class="reference internal" href="sql_language/explain.html">EXPLAIN</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="time_zone.html">Time Zone</a><ul> +<li class="toctree-l2"><a class="reference internal" href="time_zone.html#server-cluster-time-zone">Server Cluster Time Zone</a></li> +<li class="toctree-l2"><a class="reference internal" href="time_zone.html#table-time-zone">Table Time Zone</a></li> +<li class="toctree-l2"><a class="reference internal" href="time_zone.html#client-time-zone">Client Time Zone</a></li> +<li class="toctree-l2"><a class="reference internal" href="time_zone.html#time-zone-id">Time Zone ID</a></li> +<li class="toctree-l2"><a class="reference internal" href="time_zone.html#examples-of-time-zone">Examples of Time Zone</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="functions.html">Functions</a><ul> +<li class="toctree-l2"><a class="reference internal" href="functions.html#built-in-scalar-functions">Built-in Scalar Functions</a></li> +<li class="toctree-l2"><a class="reference internal" href="functions.html#built-in-aggregation-functions">Built-in Aggregation Functions</a></li> +<li class="toctree-l2"><a class="reference internal" href="functions.html#built-in-window-functions">Built-in Window Functions</a></li> +<li class="toctree-l2"><a class="reference internal" href="functions.html#user-defined-functions">User-defined Functions</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" 
href="table_management.html">Table Management</a><ul> +<li class="toctree-l2"><a class="reference internal" href="table_management/table_overview.html">Overview of Tajo Tables</a></li> +<li class="toctree-l2"><a class="reference internal" href="table_management/tablespaces.html">Tablespaces</a></li> +<li class="toctree-l2"><a class="reference internal" href="table_management/data_formats.html">Data Formats</a></li> +<li class="toctree-l2"><a class="reference internal" href="table_management/compression.html">Compression</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="table_partitioning.html">Table Partitioning</a><ul> +<li class="toctree-l2"><a class="reference internal" href="partitioning/intro_to_partitioning.html">Introduction to Partitioning</a></li> +<li class="toctree-l2"><a class="reference internal" href="partitioning/column_partitioning.html">Column Partitioning</a></li> +<li class="toctree-l2"><a class="reference internal" href="partitioning/range_partitioning.html">Range Partitioning</a></li> +<li class="toctree-l2"><a class="reference internal" href="partitioning/hash_partitioning.html">Hash Partitioning</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="storage_plugins.html">Storage Plugin</a><ul> +<li class="toctree-l2"><a class="reference internal" href="storage_plugins/overview.html">Storage Plugin Overview</a></li> +<li class="toctree-l2"><a class="reference internal" href="storage_plugins/postgresql.html">PostgreSQL Storage Handler</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="index_overview.html">Index (Experimental Feature)</a><ul> +<li class="toctree-l2"><a class="reference internal" href="index/types.html">Index Types</a></li> +<li class="toctree-l2"><a class="reference internal" href="index/how_to_use.html">How to use index?</a></li> +<li class="toctree-l2"><a class="reference internal" href="index/future_work.html">Future Works</a></li> 
+</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="backup_and_restore.html">Backup and Restore</a><ul> +<li class="toctree-l2"><a class="reference internal" href="backup_and_restore/catalog.html">Backup and Restore Catalog</a></li> +</ul> +</li> +<li class="toctree-l1 current"><a class="current reference internal" href="">Hive Integration</a></li> +<li class="toctree-l1"><a class="reference internal" href="hbase_integration.html">HBase Integration</a><ul> +<li class="toctree-l2"><a class="reference internal" href="hbase_integration.html#create-table">CREATE TABLE</a></li> +<li class="toctree-l2"><a class="reference internal" href="hbase_integration.html#drop-table">DROP TABLE</a></li> +<li class="toctree-l2"><a class="reference internal" href="hbase_integration.html#insert-overwrite-into">INSERT (OVERWRITE) INTO</a></li> +<li class="toctree-l2"><a class="reference internal" href="hbase_integration.html#usage">Usage</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="swift_integration.html">OpenStack Swift Integration</a><ul> +<li class="toctree-l2"><a class="reference internal" href="swift_integration.html#swift-configuration">Swift configuration</a></li> +<li class="toctree-l2"><a class="reference internal" href="swift_integration.html#hadoop-configurations">Hadoop configurations</a></li> +<li class="toctree-l2"><a class="reference internal" href="swift_integration.html#tajo-configuration">Tajo configuration</a></li> +<li class="toctree-l2"><a class="reference internal" href="swift_integration.html#querying-on-swift">Querying on Swift</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="jdbc_driver.html">Tajo JDBC Driver</a><ul> +<li class="toctree-l2"><a class="reference internal" href="jdbc_driver.html#how-to-get-jdbc-driver">How to get JDBC driver</a></li> +<li class="toctree-l2"><a class="reference internal" href="jdbc_driver.html#setting-the-classpath">Setting the 
CLASSPATH</a></li> +<li class="toctree-l2"><a class="reference internal" href="jdbc_driver.html#connecting-to-the-tajo-cluster-instance">Connecting to the Tajo cluster instance</a></li> +<li class="toctree-l2"><a class="reference internal" href="jdbc_driver.html#connection-parameters">Connection Parameters</a></li> +<li class="toctree-l2"><a class="reference internal" href="jdbc_driver.html#an-example-jdbc-client">An Example JDBC Client</a></li> +</ul> +</li> +<li class="toctree-l1"><a class="reference internal" href="tajo_client_api.html">Tajo Client API</a></li> +<li class="toctree-l1"><a class="reference internal" href="faq.html">FAQ</a></li> +</ul> + + + </div> + + </nav> + + <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"> + + + <nav class="wy-nav-top" role="navigation" aria-label="top navigation"> + <i data-toggle="wy-nav-top" class="fa fa-bars"></i> + <a href="index.html">Apache Tajo</a> + </nav> + + + + <div class="wy-nav-content"> + <div class="rst-content"> + <div role="navigation" aria-label="breadcrumbs navigation"> + <ul class="wy-breadcrumbs"> + <li><a href="index.html">Docs</a> »</li> + + <li>Hive Integration</li> + <li class="wy-breadcrumbs-aside"> + + <a href="_sources/hive_integration.txt" rel="nofollow"> View page source</a> + + </li> + </ul> + <hr/> +</div> + <div role="main"> + + <div class="section" id="hive-integration"> +<h1>Hive Integration<a class="headerlink" href="#hive-integration" title="Permalink to this headline">¶</a></h1> +<p>Apache Tajo™ catalog supports HiveCatalogStore to integrate with Apache Hive™. +This integration allows Tajo to access all tables used in Apache Hive. 
+Depending on your purpose, you can execute either SQL queries or HiveQL queries on the +same tables managed in Apache Hive.</p> +<p>To use this feature, you need to build Tajo with a specific Maven profile +and then add some settings to <code class="docutils literal"><span class="pre">conf/tajo-env.sh</span></code> and <code class="docutils literal"><span class="pre">conf/catalog-site.xml</span></code>. +This section describes how to set up HiveMetaStore integration. +The setup should take no more than five minutes.</p> +<p>You need to set the environment variable <strong>HIVE_HOME</strong> to your Hive home directory in <code class="docutils literal"><span class="pre">conf/tajo-env.sh</span></code> as follows:</p> +<div class="highlight-sh"><div class="highlight"><pre><span class="nb">export</span> <span class="nv">HIVE_HOME</span><span class="o">=</span>/path/to/your/hive/directory +</pre></div> +</div> +<p>If you need to use JDBC to connect to HiveMetaStore, you have to prepare the MySQL JDBC driver. 
+Next, you should set the environment variable <strong>HIVE_JDBC_DRIVER_DIR</strong> to the path of the MySQL JDBC driver jar file in <code class="docutils literal"><span class="pre">conf/tajo-env.sh</span></code> as follows:</p> +<div class="highlight-sh"><div class="highlight"><pre><span class="nb">export</span> <span class="nv">HIVE_JDBC_DRIVER_DIR</span><span class="o">=</span>/path/to/your/mysql_jdbc_driver/mysql-connector-java-x.x.x-bin.jar +</pre></div> +</div> +<p>Finally, you should specify HiveCatalogStore as the Tajo catalog driver class in <code class="docutils literal"><span class="pre">conf/catalog-site.xml</span></code> as follows:</p> +<div class="highlight-xml"><div class="highlight"><pre><span class="nt">&lt;property&gt;</span> + <span class="nt">&lt;name&gt;</span>tajo.catalog.store.class<span class="nt">&lt;/name&gt;</span> + <span class="nt">&lt;value&gt;</span>org.apache.tajo.catalog.store.HiveCatalogStore<span class="nt">&lt;/value&gt;</span> +<span class="nt">&lt;/property&gt;</span> +</pre></div> +</div> +<div class="admonition note"> +<p class="first admonition-title">Note</p> +<p>Hive stores a list of partitions for each table in its metastore. 
When new partitions are +added directly to HDFS, HiveMetastore can’t recognize these partitions until the user executes +<code class="docutils literal"><span class="pre">ALTER</span> <span class="pre">TABLE</span> <span class="pre">table_name</span> <span class="pre">ADD</span> <span class="pre">PARTITION</span></code> commands on each of the newly added partitions or the +<code class="docutils literal"><span class="pre">MSCK</span> <span class="pre">REPAIR</span> <span class="pre">TABLE</span> <span class="pre">table_name</span></code> command.</p> +<p>However, Tajo currently doesn’t provide an <code class="docutils literal"><span class="pre">ADD</span> <span class="pre">PARTITION</span></code> command, and Hive doesn’t provide an API for +responding to the <code class="docutils literal"><span class="pre">MSCK</span> <span class="pre">REPAIR</span> <span class="pre">TABLE</span></code> command. Thus, if you insert data into a Hive partitioned +table and want to scan the updated partitions through Tajo, you must run the following command on Hive +(see <a class="reference external" href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RecoverPartitions(MSCKREPAIRTABLE)">Hive doc</a> +for more details of the command):</p> +<div class="last highlight-sql"><div class="highlight"><pre><span class="n">MSCK</span> <span class="n">REPAIR</span> <span class="k">TABLE</span> <span class="p">[</span><span class="k">table_name</span><span class="p">];</span> +</pre></div> +</div> +</div> +</div> + + + </div> + <footer> + + <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation"> + + <a href="hbase_integration.html" class="btn btn-neutral float-right" title="HBase Integration"/>Next <span class="fa fa-arrow-circle-right"></span></a> + + + <a href="backup_and_restore/catalog.html" class="btn btn-neutral" title="Backup and Restore Catalog"><span class="fa fa-arrow-circle-left"></span> Previous</a> + + </div> + + + <hr/> + + <div 
role="contentinfo"> + <p> + © Copyright 2015, Apache Tajo Team. + </p> + </div> + + <a href="https://github.com/snide/sphinx_rtd_theme">Sphinx theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a> +</footer> + </div> + </div> + + </section> + + </div> + + + + + + <script type="text/javascript"> + var DOCUMENTATION_OPTIONS = { + URL_ROOT:'./', + VERSION:'0.11.0', + COLLAPSE_INDEX:false, + FILE_SUFFIX:'.html', + HAS_SOURCE: true + }; + </script> + <script type="text/javascript" src="_static/jquery.js"></script> + <script type="text/javascript" src="_static/underscore.js"></script> + <script type="text/javascript" src="_static/doctools.js"></script> + + + + + + <script type="text/javascript" src="_static/js/theme.js"></script> + + + + + <script type="text/javascript"> + jQuery(function () { + SphinxRtdTheme.StickyNav.enable(); + }); + </script> + + +</body> +</html> \ No newline at end of file
