http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_langref_unsupported.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_langref_unsupported.html b/docs/build/html/topics/impala_langref_unsupported.html new file mode 100644 index 0000000..66a4b19 --- /dev/null +++ b/docs/build/html/topics/impala_langref_unsupported.html @@ -0,0 +1,329 @@ +<!DOCTYPE html + SYSTEM "about:legacy-compat"> +<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_langref.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="langref_hiveql_delta"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>SQL Differences Between Impala and Hive</title></head><body id="langref_hiveql_delta"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + + <h1 class="title topictitle1" id="ariaid-title1">SQL Differences Between Impala and Hive</h1> + + + <div class="body conbody"> + + <p class="p"> + + + Impala's SQL syntax follows the SQL-92 standard, and includes many industry extensions in areas such as + built-in functions. See <a class="xref" href="impala_porting.html#porting">Porting SQL from Other Database Systems to Impala</a> for a general discussion of adapting SQL + code from a variety of database systems to Impala. 
+ </p> + + <p class="p"> + Because Impala and Hive share the same metastore database and their tables are often used interchangeably, + the following section covers differences between Impala and Hive in detail. + </p> + + <p class="p toc inpage"></p> + </div> + + <nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref.html">Impala SQL Language Reference</a></div></div></nav><article class="topic concept nested1" aria-labelledby="ariaid-title2" id="langref_hiveql_delta__langref_hiveql_unsupported"> + + <h2 class="title topictitle2" id="ariaid-title2">HiveQL Features not Available in Impala</h2> + + <div class="body conbody"> + + <p class="p"> + The current release of Impala does not support the following SQL features that you might be familiar with + from HiveQL: + </p> + + + + <ul class="ul"> + + + <li class="li"> + Extensibility mechanisms such as <code class="ph codeph">TRANSFORM</code>, custom file formats, or custom SerDes. + </li> + + <li class="li"> + The <code class="ph codeph">DATE</code> data type. + </li> + + <li class="li"> + XML and JSON functions. + </li> + + <li class="li"> + Certain aggregate functions from HiveQL: <code class="ph codeph">covar_pop</code>, <code class="ph codeph">covar_samp</code>, + <code class="ph codeph">corr</code>, <code class="ph codeph">percentile</code>, <code class="ph codeph">percentile_approx</code>, + <code class="ph codeph">histogram_numeric</code>, <code class="ph codeph">collect_set</code>; Impala supports the set of aggregate + functions listed in <a class="xref" href="impala_aggregate_functions.html#aggregate_functions">Impala Aggregate Functions</a> and analytic + functions listed in <a class="xref" href="impala_analytic_functions.html#analytic_functions">Impala Analytic Functions</a>. + </li> + + <li class="li"> + Sampling. + </li> + + <li class="li"> + Lateral views. 
In <span class="keyword">Impala 2.3</span> and higher, Impala supports queries on complex types + (<code class="ph codeph">STRUCT</code>, <code class="ph codeph">ARRAY</code>, or <code class="ph codeph">MAP</code>), using join notation + rather than the <code class="ph codeph">EXPLODE()</code> function. + See <a class="xref" href="impala_complex_types.html#complex_types">Complex Types (Impala 2.3 or higher only)</a> for details about Impala support for complex types. + </li> + + <li class="li"> + Multiple <code class="ph codeph">DISTINCT</code> clauses per query, although Impala includes some workarounds for this + limitation. + <div class="note note note_note"><span class="note__title notetitle">Note:</span> + <p class="p"> + By default, Impala allows only a single <code class="ph codeph">COUNT(DISTINCT <var class="keyword varname">columns</var>)</code> + expression in each query. + </p> + <p class="p"> + If you do not need an exact count, you can produce an estimate of the number of distinct values for a column by + specifying <code class="ph codeph">NDV(<var class="keyword varname">column</var>)</code>; a query can contain multiple instances of + <code class="ph codeph">NDV(<var class="keyword varname">column</var>)</code>. To make Impala automatically rewrite + <code class="ph codeph">COUNT(DISTINCT)</code> expressions to <code class="ph codeph">NDV()</code>, enable the + <code class="ph codeph">APPX_COUNT_DISTINCT</code> query option. 
+ </p> + <p class="p"> + To produce the same result as multiple <code class="ph codeph">COUNT(DISTINCT)</code> expressions, you can use the + following technique for queries involving a single table: + </p> +<pre class="pre codeblock"><code>select v1.c1 result1, v2.c1 result2 from + (select count(distinct col1) as c1 from t1) v1 + cross join + (select count(distinct col2) as c1 from t1) v2; +</code></pre> + <p class="p"> + Because <code class="ph codeph">CROSS JOIN</code> is an expensive operation, prefer to use the <code class="ph codeph">NDV()</code> + technique wherever practical. + </p> + </div> + </li> + </ul> + + <div class="p"> + User-defined functions (UDFs) are supported starting in Impala 1.2. See <a class="xref" href="impala_udf.html#udfs">Impala User-Defined Functions (UDFs)</a> + for full details on Impala UDFs. + <ul class="ul"> + <li class="li"> + <p class="p"> + Impala supports high-performance UDFs written in C++, as well as reusing some Java-based Hive UDFs. + </p> + </li> + + <li class="li"> + <p class="p"> + Impala supports scalar UDFs and user-defined aggregate functions (UDAFs). Impala does not currently + support user-defined table generating functions (UDTFs). + </p> + </li> + + <li class="li"> + <p class="p"> + Only Impala-supported column types are supported in Java-based UDFs. + </p> + </li> + + <li class="li"> + <p class="p"> + The Hive <code class="ph codeph">current_user()</code> function cannot be + called from a Java UDF through Impala. 
+ </p> + </li> + </ul> + </div> + + <p class="p"> + Impala does not currently support these HiveQL statements: + </p> + + <ul class="ul"> + <li class="li"> + <code class="ph codeph">ANALYZE TABLE</code> (the Impala equivalent is <code class="ph codeph">COMPUTE STATS</code>) + </li> + + <li class="li"> + <code class="ph codeph">DESCRIBE COLUMN</code> + </li> + + <li class="li"> + <code class="ph codeph">DESCRIBE DATABASE</code> + </li> + + <li class="li"> + <code class="ph codeph">EXPORT TABLE</code> + </li> + + <li class="li"> + <code class="ph codeph">IMPORT TABLE</code> + </li> + + <li class="li"> + <code class="ph codeph">SHOW TABLE EXTENDED</code> + </li> + + <li class="li"> + <code class="ph codeph">SHOW INDEXES</code> + </li> + + <li class="li"> + <code class="ph codeph">SHOW COLUMNS</code> + </li> + + <li class="li"> + <code class="ph codeph">INSERT OVERWRITE DIRECTORY</code>; use <code class="ph codeph">INSERT OVERWRITE <var class="keyword varname">table_name</var></code> + or <code class="ph codeph">CREATE TABLE AS SELECT</code> to materialize query results into the HDFS directory associated + with an Impala table. + </li> + </ul> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title3" id="langref_hiveql_delta__langref_hiveql_semantics"> + + <h2 class="title topictitle2" id="ariaid-title3">Semantic Differences Between Impala and HiveQL Features</h2> + + <div class="body conbody"> + + <p class="p"> + This section covers instances where Impala and Hive have similar functionality, sometimes including the + same syntax, but there are differences in the runtime semantics of those features. 
+ </p> + + <p class="p"> + <strong class="ph b">Security:</strong> + </p> + + <p class="p"> + Impala utilizes the <a class="xref" href="http://sentry.incubator.apache.org/" target="_blank">Apache + Sentry </a> authorization framework, which provides fine-grained role-based access control + to protect data against unauthorized access or tampering. + </p> + + <p class="p"> + The Hive component now includes Sentry-enabled <code class="ph codeph">GRANT</code>, + <code class="ph codeph">REVOKE</code>, and <code class="ph codeph">CREATE/DROP ROLE</code> statements. Earlier Hive releases had a + privilege system with <code class="ph codeph">GRANT</code> and <code class="ph codeph">REVOKE</code> statements that were primarily + intended to prevent accidental deletion of data, rather than a security mechanism to protect against + malicious users. + </p> + + <p class="p"> + Impala can make use of privileges set up through Hive <code class="ph codeph">GRANT</code> and <code class="ph codeph">REVOKE</code> statements. + Impala has its own <code class="ph codeph">GRANT</code> and <code class="ph codeph">REVOKE</code> statements in Impala 2.0 and higher. + See <a class="xref" href="impala_authorization.html#authorization">Enabling Sentry Authorization for Impala</a> for the details of authorization in Impala, including + how to switch from the original policy file-based privilege model to the Sentry service using privileges + stored in the metastore database. 
+ </p> + + <p class="p"> + <strong class="ph b">SQL statements and clauses:</strong> + </p> + + <p class="p"> + The semantics of Impala SQL statements differ from HiveQL in some cases where they use similar SQL + statement and clause names: + </p> + + <ul class="ul"> + <li class="li"> + Impala uses different syntax and names for query hints: <code class="ph codeph">[SHUFFLE]</code> and + <code class="ph codeph">[NOSHUFFLE]</code> rather than <code class="ph codeph">MapJoin</code> or <code class="ph codeph">StreamJoin</code>. See + <a class="xref" href="impala_joins.html#joins">Joins in Impala SELECT Statements</a> for the Impala details. + </li> + + <li class="li"> + Impala does not expose MapReduce-specific features of <code class="ph codeph">SORT BY</code>, <code class="ph codeph">DISTRIBUTE + BY</code>, or <code class="ph codeph">CLUSTER BY</code>. + </li> + + <li class="li"> + Impala does not require queries to include a <code class="ph codeph">FROM</code> clause. + </li> + </ul> + + <p class="p"> + <strong class="ph b">Data types:</strong> + </p> + + <ul class="ul"> + <li class="li"> + Impala supports a limited set of implicit casts. This can help avoid undesired results from unexpected + casting behavior. + <ul class="ul"> + <li class="li"> + Impala does not implicitly cast between string and numeric or Boolean types. Always use + <code class="ph codeph">CAST()</code> for these conversions. + </li> + + <li class="li"> + Impala does perform implicit casts among the numeric types when going from a smaller or less precise + type to a larger or more precise one. 
For example, Impala will implicitly convert a + <code class="ph codeph">SMALLINT</code> to a <code class="ph codeph">BIGINT</code> or <code class="ph codeph">FLOAT</code>, but to convert from + <code class="ph codeph">DOUBLE</code> to <code class="ph codeph">FLOAT</code> or <code class="ph codeph">INT</code> to <code class="ph codeph">TINYINT</code> + requires a call to <code class="ph codeph">CAST()</code> in the query. + </li> + + <li class="li"> + Impala does perform implicit casts from string to timestamp. Impala has a restricted set of literal + formats for the <code class="ph codeph">TIMESTAMP</code> data type and the <code class="ph codeph">from_unixtime()</code> format + string; see <a class="xref" href="impala_timestamp.html#timestamp">TIMESTAMP Data Type</a> for details. + </li> + </ul> + <p class="p"> + See <a class="xref" href="impala_datatypes.html#datatypes">Data Types</a> for full details on implicit and explicit casting for + all types, and <a class="xref" href="impala_conversion_functions.html#conversion_functions">Impala Type Conversion Functions</a> for details about + the <code class="ph codeph">CAST()</code> function. + </p> + </li> + + <li class="li"> + Impala does not store or interpret timestamps using the local timezone, to avoid undesired results from + unexpected time zone issues. Timestamps are stored and interpreted relative to UTC. This difference can + produce different results for some calls to similarly named date/time functions between Impala and Hive. + See <a class="xref" href="impala_datetime_functions.html#datetime_functions">Impala Date and Time Functions</a> for details about the Impala + functions. 
See <a class="xref" href="impala_timestamp.html#timestamp">TIMESTAMP Data Type</a> for a discussion of how Impala handles + time zones, and configuration options you can use to make Impala match the Hive behavior more closely + when dealing with Parquet-encoded <code class="ph codeph">TIMESTAMP</code> data or when converting between + the local time zone and UTC. + </li> + + <li class="li"> + The Impala <code class="ph codeph">TIMESTAMP</code> type can represent dates ranging from 1400-01-01 to 9999-12-31. + This is different from the Hive date range, which is 0000-01-01 to 9999-12-31. + </li> + + <li class="li"> + <p class="p"> + Impala does not return column overflows as <code class="ph codeph">NULL</code>, so that customers can distinguish + between <code class="ph codeph">NULL</code> data and overflow conditions similar to how they do so with traditional + database systems. Impala returns the largest or smallest value in the range for the type. For example, + valid values for a <code class="ph codeph">tinyint</code> range from -128 to 127. In Impala, a <code class="ph codeph">tinyint</code> + with a value of -200 returns -128 rather than <code class="ph codeph">NULL</code>. A <code class="ph codeph">tinyint</code> with a + value of 200 returns 127. + </p> + </li> + + </ul> + + <p class="p"> + <strong class="ph b">Miscellaneous features:</strong> + </p> + + <ul class="ul"> + <li class="li"> + Impala does not provide virtual columns. + </li> + + <li class="li"> + Impala does not expose locking. + </li> + + <li class="li"> + Impala does not expose some configuration properties. + </li> + </ul> + </div> + </article> +</article></main></body></html> \ No newline at end of file
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_ldap.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_ldap.html b/docs/build/html/topics/impala_ldap.html new file mode 100644 index 0000000..e4aaf52 --- /dev/null +++ b/docs/build/html/topics/impala_ldap.html @@ -0,0 +1,294 @@ +<!DOCTYPE html + SYSTEM "about:legacy-compat"> +<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_authentication.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="ldap"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Enabling LDAP Authentication for Impala</title></head><body id="ldap"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + + <h1 class="title topictitle1" id="ariaid-title1">Enabling LDAP Authentication for Impala</h1> + + + <div class="body conbody"> + + + + <p class="p"> Authentication is the process of allowing only specified named users to + access the server (in this case, the Impala server). This feature is + crucial for any production deployment, to prevent misuse, tampering, or + excessive load on the server. Impala uses LDAP for authentication, + verifying the credentials of each user who connects through + <span class="keyword cmdname">impala-shell</span>, Hue, a Business Intelligence tool, JDBC + or ODBC application, and so on. 
</p> + + <div class="note note note_note"><span class="note__title notetitle">Note:</span> + Regardless of the authentication mechanism used, Impala always creates HDFS directories and data files + owned by the same user (typically <code class="ph codeph">impala</code>). To implement user-level access to different + databases, tables, columns, partitions, and so on, use the Sentry authorization feature, as explained in + <a class="xref" href="../shared/../topics/impala_authorization.html#authorization">Enabling Sentry Authorization for Impala</a>. + </div> + + <p class="p"> + An alternative form of authentication you can use is Kerberos, described in + <a class="xref" href="impala_kerberos.html#kerberos">Enabling Kerberos Authentication for Impala</a>. + </p> + + <p class="p toc inpage"></p> + + </div> + + <nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_authentication.html">Impala Authentication</a></div></div></nav><article class="topic concept nested1" aria-labelledby="ariaid-title2" id="ldap__ldap_prereqs"> + + <h2 class="title topictitle2" id="ariaid-title2">Requirements for Using Impala with LDAP</h2> + + + <div class="body conbody"> + + <p class="p"> + Authentication against LDAP servers is available in Impala 1.2.2 and higher. Impala 1.4.0 adds support for + secure LDAP authentication through SSL and TLS. + </p> + + <p class="p"> + The Impala LDAP support lets you use Impala with systems such as Active Directory that use LDAP behind the + scenes. + </p> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title3" id="ldap__ldap_client_server"> + + <h2 class="title topictitle2" id="ariaid-title3">Client-Server Considerations for LDAP</h2> + + <div class="body conbody"> + + <p class="p"> + Only client->Impala connections can be authenticated by LDAP. 
+ </p> + + <p class="p"> You must use the Kerberos authentication mechanism for connections + between internal Impala components, such as between the + <span class="keyword cmdname">impalad</span>, <span class="keyword cmdname">statestored</span>, and + <span class="keyword cmdname">catalogd</span> daemons. See <a class="xref" href="impala_kerberos.html#kerberos">Enabling Kerberos Authentication for Impala</a> on how to set up Kerberos for + Impala. </p> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title4" id="ldap__ldap_config"> + + <h2 class="title topictitle2" id="ariaid-title4">Server-Side LDAP Setup</h2> + + <div class="body conbody"> + + <p class="p"> + These requirements apply on the server side when configuring and starting Impala: + </p> + + <p class="p"> + To enable LDAP authentication, set the following startup options for <span class="keyword cmdname">impalad</span>: + </p> + + <ul class="ul"> + <li class="li"> + <code class="ph codeph">--enable_ldap_auth</code> enables LDAP-based authentication between the client and Impala. + </li> + + <li class="li"> + <code class="ph codeph">--ldap_uri</code> sets the URI of the LDAP server to use. Typically, the URI is prefixed with + <code class="ph codeph">ldap://</code>. In Impala 1.4.0 and higher, you can specify secure SSL-based LDAP transport by + using the prefix <code class="ph codeph">ldaps://</code>. The URI can optionally specify the port, for example: + <code class="ph codeph">ldap://ldap_server.example.com:389</code> or + <code class="ph codeph">ldaps://ldap_server.example.com:636</code>. (389 and 636 are the default ports for non-SSL and + SSL LDAP connections, respectively.) 
+ </li> + + + + <li class="li"> + For <code class="ph codeph">ldaps://</code> connections secured by SSL, + <code class="ph codeph">--ldap_ca_certificate="<var class="keyword varname">/path/to/certificate/pem</var>"</code> specifies the + location of the certificate in standard <code class="ph codeph">.PEM</code> format. Store this certificate on the local + filesystem, in a location that only the <code class="ph codeph">impala</code> user and other trusted users can read. + </li> + + + </ul> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title5" id="ldap__ldap_bind_strings"> + + <h2 class="title topictitle2" id="ariaid-title5">Support for Custom Bind Strings</h2> + + <div class="body conbody"> + + <p class="p"> + When Impala connects to LDAP it issues a bind call to the LDAP server to authenticate as the connected + user. Impala clients, including the Impala shell, provide the short name of the user to Impala. This is + necessary so that Impala can use Sentry for role-based access, which uses short names. + </p> + + <p class="p"> + However, LDAP servers often require more complex, structured usernames for authentication. Impala supports + three ways of transforming the short name (for example, <code class="ph codeph">'henry'</code>) to a more complicated + string. If necessary, specify one of the following configuration options + when starting the <span class="keyword cmdname">impalad</span> daemon on each DataNode: + </p> + + <ul class="ul"> + <li class="li"> + <code class="ph codeph">--ldap_domain</code>: Replaces the username with a string + <code class="ph codeph"><var class="keyword varname">username</var>@<var class="keyword varname">ldap_domain</var></code>. + </li> + + <li class="li"> + <code class="ph codeph">--ldap_baseDN</code>: Replaces the username with a <span class="q">"distinguished name"</span> (DN) of the form: + <code class="ph codeph">uid=<var class="keyword varname">userid</var>,ldap_baseDN</code>. 
(This is equivalent to a Hive option.) + </li> + + <li class="li"> + <code class="ph codeph">--ldap_bind_pattern</code>: This is the most general option, and replaces the username with the + string <var class="keyword varname">ldap_bind_pattern</var> where all instances of the string <code class="ph codeph">#UID</code> are + replaced with <var class="keyword varname">userid</var>. For example, an <code class="ph codeph">ldap_bind_pattern</code> of + <code class="ph codeph">"user=#UID,OU=foo,CN=bar"</code> with a username of <code class="ph codeph">henry</code> will construct a + bind name of <code class="ph codeph">"user=henry,OU=foo,CN=bar"</code>. + </li> + </ul> + + <p class="p"> + These options are mutually exclusive; Impala does not start if more than one of these options is specified. + </p> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title6" id="ldap__ldap_security"> + + <h2 class="title topictitle2" id="ariaid-title6">Secure LDAP Connections</h2> + + <div class="body conbody"> + + <p class="p"> + To avoid sending credentials over the wire in cleartext, you must configure a secure connection both between + the client and Impala and between Impala and the LDAP server. The secure connection can use SSL or + TLS. + </p> + + <p class="p"> + <strong class="ph b">Secure LDAP connections through SSL:</strong> + </p> + + <p class="p"> + For SSL-enabled LDAP connections, specify a prefix of <code class="ph codeph">ldaps://</code> instead of + <code class="ph codeph">ldap://</code>. Also, the default port for SSL-enabled LDAP connections is 636 instead of 389. + </p> + + <p class="p"> + <strong class="ph b">Secure LDAP connections through TLS:</strong> + </p> + + <p class="p"> + <a class="xref" href="http://en.wikipedia.org/wiki/Transport_Layer_Security" target="_blank">TLS</a>, + the successor to the SSL protocol, is supported by most modern LDAP servers. 
Unlike SSL connections, TLS + connections can be made on the same server port as non-TLS connections. To secure all connections using + TLS, specify the following flags as startup options to the <span class="keyword cmdname">impalad</span> daemon: + </p> + + <ul class="ul"> + <li class="li"> + <code class="ph codeph">--ldap_tls</code> tells Impala to start a TLS connection to the LDAP server, and to fail + authentication if it cannot be done. + </li> + + <li class="li"> + <code class="ph codeph">--ldap_ca_certificate="<var class="keyword varname">/path/to/certificate/pem</var>"</code> specifies the + location of the certificate in standard <code class="ph codeph">.PEM</code> format. Store this certificate on the local + filesystem, in a location that only the <code class="ph codeph">impala</code> user and other trusted users can read. + </li> + </ul> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title7" id="ldap__ldap_impala_shell"> + + <h2 class="title topictitle2" id="ariaid-title7">LDAP Authentication for impala-shell Interpreter</h2> + + <div class="body conbody"> + + <p class="p"> + To connect to Impala using LDAP authentication, you specify command-line options to the + <span class="keyword cmdname">impala-shell</span> command interpreter and enter the password when prompted: + </p> + + <ul class="ul"> + <li class="li"> + <code class="ph codeph">-l</code> enables LDAP authentication. + </li> + + <li class="li"> + <code class="ph codeph">-u</code> sets the user. Per Active Directory, the user is the short username, not the full + LDAP distinguished name. If your LDAP settings include a search base, use the + <code class="ph codeph">--ldap_bind_pattern</code> on the <span class="keyword cmdname">impalad</span> daemon to translate the short user + name from <span class="keyword cmdname">impala-shell</span> automatically to the fully qualified name. 
+ + </li> + + <li class="li"> + <span class="keyword cmdname">impala-shell</span> automatically prompts for the password. + </li> + </ul> + + <p class="p"> + For the full list of available <span class="keyword cmdname">impala-shell</span> options, see + <a class="xref" href="impala_shell_options.html#shell_options">impala-shell Configuration Options</a>. + </p> + + <p class="p"> + <strong class="ph b">LDAP authentication for JDBC applications:</strong> See <a class="xref" href="impala_jdbc.html#impala_jdbc">Configuring Impala to Work with JDBC</a> for the + format to use with the JDBC connection string for servers using LDAP authentication. + </p> + </div> + </article> + <article class="topic concept nested1" aria-labelledby="ariaid-title8" id="ldap__ldap_impala_hue"> + <h2 class="title topictitle2" id="ariaid-title8">Enabling LDAP for Impala in Hue</h2> + + <div class="body conbody"> + <section class="section" id="ldap_impala_hue__ldap_impala_hue_cmdline"><h3 class="title sectiontitle">Enabling LDAP for Impala in Hue Using the Command Line</h3> + + <div class="p">LDAP authentication for the Impala app in Hue can be enabled by + setting the following properties under the <code class="ph codeph">[impala]</code> + section in <code class="ph codeph">hue.ini</code>. 
<table class="table" id="ldap_impala_hue__ldap_impala_hue_configs"><caption></caption><colgroup><col style="width:33.33333333333333%"><col style="width:66.66666666666666%"></colgroup><tbody class="tbody"> + <tr class="row"> + <td class="entry nocellnorowborder"><code class="ph codeph">auth_username</code></td> + <td class="entry nocellnorowborder">LDAP username of the Hue user to be authenticated.</td> + </tr> + <tr class="row"> + <td class="entry nocellnorowborder"><code class="ph codeph">auth_password</code></td> + <td class="entry nocellnorowborder"> + <p class="p">LDAP password of the Hue user to be authenticated.</p> + </td> + </tr> + </tbody></table>Impala uses these login details only to authenticate to + LDAP. The Impala service trusts Hue to have already validated the user + being impersonated, rather than simply passing on the credentials.</div> + </section> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title9" id="ldap__ldap_delegation"> + <h2 class="title topictitle2" id="ariaid-title9">Enabling Impala Delegation for LDAP Users</h2> + <div class="body conbody"> + <p class="p"> + See <a class="xref" href="impala_delegation.html#delegation">Configuring Impala Delegation for Hue and BI Tools</a> for details about the delegation feature + that lets certain users submit queries using the credentials of other users. + </p> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title10" id="ldap__ldap_restrictions"> + + <h2 class="title topictitle2" id="ariaid-title10">LDAP Restrictions for Impala</h2> + + <div class="body conbody"> + + <p class="p"> + The LDAP support is preliminary. It has currently been tested only against Active Directory. 
+ </p> + </div> + </article> +</article></main></body></html> \ No newline at end of file http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_limit.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_limit.html b/docs/build/html/topics/impala_limit.html new file mode 100644 index 0000000..a4e94d0 --- /dev/null +++ b/docs/build/html/topics/impala_limit.html @@ -0,0 +1,168 @@ +<!DOCTYPE html + SYSTEM "about:legacy-compat"> +<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_select.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="limit"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>LIMIT Clause</title></head><body id="limit"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + + <h1 class="title topictitle1" id="ariaid-title1">LIMIT Clause</h1> + + + <div class="body conbody"> + + <p class="p"> + The <code class="ph codeph">LIMIT</code> clause in a <code class="ph codeph">SELECT</code> query sets a maximum number of rows for the + result set. Pre-selecting the maximum size of the result set helps Impala to optimize memory usage while + processing a distributed query. 
+ </p> + + <p class="p"> + <strong class="ph b">Syntax:</strong> + </p> + +<pre class="pre codeblock"><code>LIMIT <var class="keyword varname">constant_integer_expression</var></code></pre> + + <p class="p"> + The argument to the <code class="ph codeph">LIMIT</code> clause must evaluate to a constant value. It can be a numeric + literal, or another kind of numeric expression involving operators, casts, and function return values. You + cannot refer to a column or use a subquery. + </p> + + <p class="p"> + <strong class="ph b">Usage notes:</strong> + </p> + + <p class="p"> + This clause is useful in contexts such as: + </p> + + <ul class="ul"> + <li class="li"> + To return exactly N items from a top-N query, such as the 10 highest-rated items in a shopping category or + the 50 hostnames that refer the most traffic to a web site. + </li> + + <li class="li"> + To demonstrate some sample values from a table or a particular query. (To display some arbitrary items, use + a query with no <code class="ph codeph">ORDER BY</code> clause. An <code class="ph codeph">ORDER BY</code> clause causes additional + memory and/or disk usage during the query.) + </li> + + <li class="li"> + To keep queries from returning huge result sets by accident if a table is larger than expected, or a + <code class="ph codeph">WHERE</code> clause matches more rows than expected. + </li> + </ul> + + <p class="p"> + Originally, the value for the <code class="ph codeph">LIMIT</code> clause had to be a numeric literal. In Impala 1.2.1 and + higher, it can be a numeric expression. + </p> + + <p class="p"> + Prior to Impala 1.4.0, Impala required any query including an + <code class="ph codeph"><a class="xref" href="../shared/../topics/impala_order_by.html#order_by">ORDER BY</a></code> clause to also use a + <code class="ph codeph"><a class="xref" href="../shared/../topics/impala_limit.html#limit">LIMIT</a></code> clause. 
In Impala 1.4.0 and + higher, the <code class="ph codeph">LIMIT</code> clause is optional for <code class="ph codeph">ORDER BY</code> queries. In cases where + sorting a huge result set requires enough memory to exceed the Impala memory limit for a particular node, + Impala automatically uses a temporary disk work area to perform the sort operation. + </p> + + <p class="p"> + See <a class="xref" href="impala_order_by.html#order_by">ORDER BY Clause</a> for details. + </p> + + <p class="p"> + In Impala 1.2.1 and higher, you can combine a <code class="ph codeph">LIMIT</code> clause with an <code class="ph codeph">OFFSET</code> + clause to produce a small result set that is different from a top-N query, for example, to return items 11 + through 20. This technique can be used to simulate <span class="q">"paged"</span> results. Because Impala queries typically + involve substantial amounts of I/O, use this technique only for compatibility in cases where you cannot + rewrite the application logic. For best performance and scalability, wherever practical, query as many + items as you expect to need, cache them on the application side, and display small groups of results to + users using application logic. + </p> + + <p class="p"> + <strong class="ph b">Restrictions:</strong> + </p> + + <p class="p"> + Correlated subqueries used in <code class="ph codeph">EXISTS</code> and <code class="ph codeph">IN</code> operators cannot include a + <code class="ph codeph">LIMIT</code> clause. + </p> + + <p class="p"> + <strong class="ph b">Examples:</strong> + </p> + + <p class="p"> + The following example shows how the <code class="ph codeph">LIMIT</code> clause caps the size of the result set, with the + limit being applied after any other clauses such as <code class="ph codeph">WHERE</code>. 
+ </p> + +<pre class="pre codeblock"><code>[localhost:21000] > create database limits; +[localhost:21000] > use limits; +[localhost:21000] > create table numbers (x int); +[localhost:21000] > insert into numbers values (1), (3), (4), (5), (2); +Inserted 5 rows in 1.34s +[localhost:21000] > select x from numbers limit 100; ++---+ +| x | ++---+ +| 1 | +| 3 | +| 4 | +| 5 | +| 2 | ++---+ +Returned 5 row(s) in 0.26s +[localhost:21000] > select x from numbers limit 3; ++---+ +| x | ++---+ +| 1 | +| 3 | +| 4 | ++---+ +Returned 3 row(s) in 0.27s +[localhost:21000] > select x from numbers where x > 2 limit 2; ++---+ +| x | ++---+ +| 3 | +| 4 | ++---+ +Returned 2 row(s) in 0.27s</code></pre> + + <p class="p"> + For top-N and bottom-N queries, you use the <code class="ph codeph">ORDER BY</code> and <code class="ph codeph">LIMIT</code> clauses + together: + </p> + +<pre class="pre codeblock"><code>[localhost:21000] > select x as "Top 3" from numbers order by x desc limit 3; ++-------+ +| top 3 | ++-------+ +| 5 | +| 4 | +| 3 | ++-------+ +[localhost:21000] > select x as "Bottom 3" from numbers order by x limit 3; ++----------+ +| bottom 3 | ++----------+ +| 1 | +| 2 | +| 3 | ++----------+ +</code></pre> + + <p class="p"> + You can use constant values besides integer literals as the <code class="ph codeph">LIMIT</code> argument: + </p> + +<pre class="pre codeblock"><code>-- Other expressions that yield constant integer values work too. +SELECT x FROM t1 LIMIT 1e6; -- Limit is one million. +SELECT x FROM t1 LIMIT length('hello world'); -- Limit is 11. +SELECT x FROM t1 LIMIT 2+2; -- Limit is 4. +SELECT x FROM t1 LIMIT cast(truncate(9.9) AS INT); -- Limit is 9. 
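+-- A minimal sketch, not part of the original examples: LIMIT combined with OFFSET
+-- (Impala 1.2.1 and higher) to fetch one "page" of a sorted result set. It assumes the
+-- numbers table created earlier in this topic, and pairs OFFSET with ORDER BY so that
+-- the page boundaries are well defined.
+SELECT x FROM numbers ORDER BY x LIMIT 2 OFFSET 2; -- Skips 1 and 2; returns 3 and 4.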
+</code></pre> + </div> +<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_select.html">SELECT Statement</a></div></div></nav></article></main></body></html> \ No newline at end of file http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_lineage.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_lineage.html b/docs/build/html/topics/impala_lineage.html new file mode 100644 index 0000000..c3581e5 --- /dev/null +++ b/docs/build/html/topics/impala_lineage.html @@ -0,0 +1,91 @@ +<!DOCTYPE html + SYSTEM "about:legacy-compat"> +<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_security.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="lineage"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Viewing Lineage Information for Impala Data</title></head><body id="lineage"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + + <h1 class="title topictitle1" id="ariaid-title1">Viewing Lineage Information for Impala Data</h1> + + + + <div class="body conbody"> + + <p class="p"> + + + <dfn class="term">Lineage</dfn> is a feature that helps you track where data originated, and how + data propagates through the system through SQL statements such as + <code class="ph codeph">SELECT</code>, <code class="ph codeph">INSERT</code>, and <code class="ph 
codeph">CREATE + TABLE AS SELECT</code>. + </p> + <p class="p"> + This type of tracking is important in high-security configurations, especially in + highly regulated industries such as healthcare, pharmaceuticals, financial services, and + intelligence. For such kinds of sensitive data, it is important to know all + the places in the system that contain that data or other data derived from it; to verify who has accessed + that data; and to be able to double-check that the data used to make a decision was processed correctly and + not tampered with. + </p> + + <section class="section" id="lineage__column_lineage"><h2 class="title sectiontitle">Column Lineage</h2> + + + + <p class="p"> + <dfn class="term">Column lineage</dfn> tracks information in fine detail, at the level of + particular columns rather than entire tables. + </p> + + <p class="p"> + For example, if you have a table with information derived from web logs, you might copy that data into + other tables as part of the ETL process. The ETL operations might involve transformations through + expressions and function calls, and rearranging the columns into more or fewer tables + (<dfn class="term">normalizing</dfn> or <dfn class="term">denormalizing</dfn> the data). Then for reporting, you might issue + queries against multiple tables and views. In this example, column lineage helps you determine that data + that entered the system as <code class="ph codeph">RAW_LOGS.FIELD1</code> was then turned into + <code class="ph codeph">WEBSITE_REPORTS.IP_ADDRESS</code> through an <code class="ph codeph">INSERT ... SELECT</code> statement. Or, + conversely, you could start with a reporting query against a view, and trace the origin of the data in a + field such as <code class="ph codeph">TOP_10_VISITORS.USER_ID</code> back to the underlying table and even further back + to the point where the data was first loaded into Impala.
+ </p> + + <p class="p"> + When you have tables where you need to track or control access to sensitive information at the column + level, see <a class="xref" href="impala_authorization.html#authorization">Enabling Sentry Authorization for Impala</a> for how to implement column-level + security. You set up authorization using the Sentry framework, create views that refer to specific sets of + columns, and then assign authorization privileges to those views rather than the underlying tables. + </p> + + </section> + + <section class="section" id="lineage__lineage_data"><h2 class="title sectiontitle">Lineage Data for Impala</h2> + + + + <p class="p"> + The lineage feature is enabled by default. When lineage logging is enabled, the serialized column lineage + graph is computed for each query and stored in a specialized log file in JSON format. + </p> + + <p class="p"> + Impala records queries in the lineage log if they complete successfully, or fail due to authorization + errors. For write operations such as <code class="ph codeph">INSERT</code> and <code class="ph codeph">CREATE TABLE AS SELECT</code>, + the statement is recorded in the lineage log only if it successfully completes. Therefore, the lineage + feature tracks data that was accessed by successful queries, or that unsuccessful queries attempted to + access before being blocked by an authorization failure. These kinds of queries represent data + that really was accessed, or where the attempted access could represent malicious activity. + </p> + + <p class="p"> + Impala does not record queries in the lineage log if they fail due to syntax errors, or if they fail or are + cancelled before they reach the stage of requesting rows from the result set. + </p> + + <p class="p"> + To enable or disable this feature, set or remove the <code class="ph codeph">-lineage_event_log_dir</code> + configuration option for the <span class="keyword cmdname">impalad</span> daemon.
+ </p> + + </section> + + </div> + +<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_security.html">Impala Security</a></div></div></nav></article></main></body></html> \ No newline at end of file http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_literals.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_literals.html b/docs/build/html/topics/impala_literals.html new file mode 100644 index 0000000..cd16389 --- /dev/null +++ b/docs/build/html/topics/impala_literals.html @@ -0,0 +1,427 @@ +<!DOCTYPE html + SYSTEM "about:legacy-compat"> +<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_langref.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="literals"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Literals</title></head><body id="literals"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + + <h1 class="title topictitle1" id="ariaid-title1">Literals</h1> + + + <div class="body conbody"> + + <p class="p"> + + Each of the Impala data types has corresponding notation for literal values of that type. You specify literal + values in SQL statements, such as in the <code class="ph codeph">SELECT</code> list or <code class="ph codeph">WHERE</code> clause of a + query, or as an argument to a function call. 
See <a class="xref" href="impala_datatypes.html#datatypes">Data Types</a> for a complete + list of types, ranges, and conversion rules. + </p> + + <p class="p toc inpage"></p> + </div> + + <nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref.html">Impala SQL Language Reference</a></div></div></nav><article class="topic concept nested1" aria-labelledby="ariaid-title2" id="literals__numeric_literals"> + + <h2 class="title topictitle2" id="ariaid-title2">Numeric Literals</h2> + + <div class="body conbody"> + + <p class="p"> + + To write literals for the integer types (<code class="ph codeph">TINYINT</code>, <code class="ph codeph">SMALLINT</code>, + <code class="ph codeph">INT</code>, and <code class="ph codeph">BIGINT</code>), use a sequence of digits with optional leading zeros. + </p> + + <p class="p"> + To write literals for the floating-point types (<code class="ph codeph">DECIMAL</code>, + <code class="ph codeph">FLOAT</code>, and <code class="ph codeph">DOUBLE</code>), use a sequence of digits with an optional decimal + point (<code class="ph codeph">.</code> character). To preserve accuracy during arithmetic expressions, Impala interprets + floating-point literals as the <code class="ph codeph">DECIMAL</code> type with the smallest appropriate precision and + scale, until required by the context to convert the result to <code class="ph codeph">FLOAT</code> or + <code class="ph codeph">DOUBLE</code>. + </p> + + <p class="p"> + Integer values are promoted to floating-point when necessary, based on the context. + </p> + + <p class="p"> + You can also use exponential notation by including an <code class="ph codeph">e</code> character. For example, + <code class="ph codeph">1e6</code> is 1 times 10 to the power of 6 (1 million). A number in exponential notation is + always interpreted as floating-point. 
+ </p> + + <p class="p"> + When Impala encounters a numeric literal, it considers the type to be the <span class="q">"smallest"</span> that can + accurately represent the value. The type is promoted to larger or more accurate types if necessary, based + on subsequent parts of an expression. + </p> + <p class="p"> + For example, you can see by the types Impala defines for the following table columns + how it interprets the corresponding numeric literals: + </p> +<pre class="pre codeblock"><code>[localhost:21000] > create table ten as select 10 as x; ++-------------------+ +| summary | ++-------------------+ +| Inserted 1 row(s) | ++-------------------+ +[localhost:21000] > desc ten; ++------+---------+---------+ +| name | type | comment | ++------+---------+---------+ +| x | tinyint | | ++------+---------+---------+ + +[localhost:21000] > create table four_k as select 4096 as x; ++-------------------+ +| summary | ++-------------------+ +| Inserted 1 row(s) | ++-------------------+ +[localhost:21000] > desc four_k; ++------+----------+---------+ +| name | type | comment | ++------+----------+---------+ +| x | smallint | | ++------+----------+---------+ + +[localhost:21000] > create table one_point_five as select 1.5 as x; ++-------------------+ +| summary | ++-------------------+ +| Inserted 1 row(s) | ++-------------------+ +[localhost:21000] > desc one_point_five; ++------+--------------+---------+ +| name | type | comment | ++------+--------------+---------+ +| x | decimal(2,1) | | ++------+--------------+---------+ + +[localhost:21000] > create table one_point_three_three_three as select 1.333 as x; ++-------------------+ +| summary | ++-------------------+ +| Inserted 1 row(s) | ++-------------------+ +[localhost:21000] > desc one_point_three_three_three; ++------+--------------+---------+ +| name | type | comment | ++------+--------------+---------+ +| x | decimal(4,3) | | ++------+--------------+---------+ +</code></pre> + </div> + </article> + + <article 
class="topic concept nested1" aria-labelledby="ariaid-title3" id="literals__string_literals"> + + <h2 class="title topictitle2" id="ariaid-title3">String Literals</h2> + + <div class="body conbody"> + + <p class="p"> + + String literals are quoted using either single or double quotation marks. You can use either kind of quotes + for string literals, even both kinds for different literals within the same statement. + </p> + + <p class="p"> + Quoted literals are considered to be of type <code class="ph codeph">STRING</code>. To use quoted literals in contexts + requiring a <code class="ph codeph">CHAR</code> or <code class="ph codeph">VARCHAR</code> value, <code class="ph codeph">CAST()</code> the literal to + a <code class="ph codeph">CHAR</code> or <code class="ph codeph">VARCHAR</code> of the appropriate length. + </p> + + <p class="p"> + <strong class="ph b">Escaping special characters:</strong> + </p> + + <p class="p"> + To encode special characters within a string literal, precede them with the backslash (<code class="ph codeph">\</code>) + escape character: + </p> + + <ul class="ul"> + <li class="li"> + <code class="ph codeph">\t</code> represents a tab. + </li> + + <li class="li"> + <code class="ph codeph">\n</code> represents a newline or linefeed. This might cause extra line breaks in + <span class="keyword cmdname">impala-shell</span> output. + </li> + + <li class="li"> + <code class="ph codeph">\r</code> represents a carriage return. This might cause unusual formatting (making it appear + that some content is overwritten) in <span class="keyword cmdname">impala-shell</span> output. + </li> + + <li class="li"> + <code class="ph codeph">\b</code> represents a backspace. This might cause unusual formatting (making it appear that + some content is overwritten) in <span class="keyword cmdname">impala-shell</span> output. 
+ </li> + + <li class="li"> + <code class="ph codeph">\0</code> represents an ASCII <code class="ph codeph">nul</code> character (not the same as a SQL + <code class="ph codeph">NULL</code>). This might not be visible in <span class="keyword cmdname">impala-shell</span> output. + </li> + + <li class="li"> + <code class="ph codeph">\Z</code> represents a DOS end-of-file character. This might not be visible in + <span class="keyword cmdname">impala-shell</span> output. + </li> + + <li class="li"> + <code class="ph codeph">\%</code> and <code class="ph codeph">\_</code> can be used to escape wildcard characters within the string + passed to the <code class="ph codeph">LIKE</code> operator. + </li> + + <li class="li"> + <code class="ph codeph">\</code> followed by 3 octal digits represents the ASCII code of a single character; for + example, <code class="ph codeph">\101</code> is ASCII 65, the character <code class="ph codeph">A</code>. + </li> + + <li class="li"> + Use two consecutive backslashes (<code class="ph codeph">\\</code>) to prevent the backslash from being interpreted as + an escape character. + </li> + + <li class="li"> + Use the backslash to escape single or double quotation mark characters within a string literal, if the + literal is enclosed by the same type of quotation mark. + </li> + + <li class="li"> + If the character following the <code class="ph codeph">\</code> does not represent the start of a recognized escape + sequence, the character is passed through unchanged. + </li> + </ul> + + <p class="p"> + <strong class="ph b">Quotes within quotes:</strong> + </p> + + <p class="p"> + To include a single quotation character within a string value, enclose the literal with either single or + double quotation marks, and optionally escape the single quote as a <code class="ph codeph">\'</code> sequence. Earlier + releases required escaping a single quote inside double quotes. 
Continue using escape sequences in this + case if you also need to run your SQL code on older versions of Impala. + </p> + + <p class="p"> + To include a double quotation character within a string value, enclose the literal with single quotation + marks, no escaping is necessary in this case. Or, enclose the literal with double quotation marks and + escape the double quote as a <code class="ph codeph">\"</code> sequence. + </p> + +<pre class="pre codeblock"><code>[localhost:21000] > select "What\'s happening?" as single_within_double, + > 'I\'m not sure.' as single_within_single, + > "Homer wrote \"The Iliad\"." as double_within_double, + > 'Homer also wrote "The Odyssey".' as double_within_single; ++----------------------+----------------------+--------------------------+---------------------------------+ +| single_within_double | single_within_single | double_within_double | double_within_single | ++----------------------+----------------------+--------------------------+---------------------------------+ +| What's happening? | I'm not sure. | Homer wrote "The Iliad". | Homer also wrote "The Odyssey". | ++----------------------+----------------------+--------------------------+---------------------------------+ +</code></pre> + + <p class="p"> + <strong class="ph b">Field terminator character in CREATE TABLE:</strong> + </p> + + <div class="note note note_note"><span class="note__title notetitle">Note:</span> + The <code class="ph codeph">CREATE TABLE</code> clauses <code class="ph codeph">FIELDS TERMINATED BY</code>, <code class="ph codeph">ESCAPED + BY</code>, and <code class="ph codeph">LINES TERMINATED BY</code> have special rules for the string literal used for + their argument, because they all require a single character. 
You can use a regular character surrounded by + single or double quotation marks, an octal sequence such as <code class="ph codeph">'\054'</code> (representing a comma), + or an integer in the range '-127'..'128' (with quotation marks but no backslash), which is interpreted as a + single-byte ASCII character. Negative values are subtracted from 256; for example, <code class="ph codeph">FIELDS + TERMINATED BY '-2'</code> sets the field delimiter to ASCII code 254, the <span class="q">"Icelandic Thorn"</span> + character used as a delimiter by some data formats. + </div> + + <p class="p"> + <strong class="ph b">impala-shell considerations:</strong> + </p> + + <p class="p"> + When dealing with output that includes non-ASCII or non-printable characters such as linefeeds and + backspaces, use the <span class="keyword cmdname">impala-shell</span> options to save to a file, turn off pretty printing, or + both rather than relying on how the output appears visually. See + <a class="xref" href="impala_shell_options.html#shell_options">impala-shell Configuration Options</a> for a list of <span class="keyword cmdname">impala-shell</span> + options. + </p> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title4" id="literals__boolean_literals"> + + <h2 class="title topictitle2" id="ariaid-title4">Boolean Literals</h2> + + <div class="body conbody"> + + <p class="p"> + For <code class="ph codeph">BOOLEAN</code> values, the literals are <code class="ph codeph">TRUE</code> and <code class="ph codeph">FALSE</code>, + with no quotation marks and case-insensitive. 
+ </p> + + <p class="p"> + <strong class="ph b">Examples:</strong> + </p> + +<pre class="pre codeblock"><code>select true; +select * from t1 where assertion = false; +select case bool_col when true then 'yes' when false then 'no' else 'null' end from t1;</code></pre> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title5" id="literals__timestamp_literals"> + + <h2 class="title topictitle2" id="ariaid-title5">Timestamp Literals</h2> + + <div class="body conbody"> + + <p class="p"> + Impala automatically converts <code class="ph codeph">STRING</code> literals of the correct format into + <code class="ph codeph">TIMESTAMP</code> values. Timestamp values are accepted in the format + <code class="ph codeph">"yyyy-MM-dd HH:mm:ss.SSSSSS"</code>, and can consist of just the date, or just the time, with or + without the fractional second portion. For example, you can specify <code class="ph codeph">TIMESTAMP</code> values such as + <code class="ph codeph">'1966-07-30'</code>, <code class="ph codeph">'08:30:00'</code>, or <code class="ph codeph">'1985-09-25 17:45:30.005'</code>. + <span class="ph">Casting an integer or floating-point value <code class="ph codeph">N</code> to + <code class="ph codeph">TIMESTAMP</code> produces a value that is <code class="ph codeph">N</code> seconds past the start of the epoch + date (January 1, 1970). By default, the result value represents a date and time in the UTC time zone. + If the setting <code class="ph codeph">-use_local_tz_for_unix_timestamp_conversions=true</code> is in effect, + the resulting <code class="ph codeph">TIMESTAMP</code> represents a date and time in the local time zone.</span> + </p> + + <p class="p"> + You can also use <code class="ph codeph">INTERVAL</code> expressions to add or subtract from timestamp literal values, + such as <code class="ph codeph">'1966-07-30' + INTERVAL 5 YEARS + INTERVAL 3 DAYS</code>.
See + <a class="xref" href="impala_timestamp.html#timestamp">TIMESTAMP Data Type</a> for details. + </p> + + <p class="p"> + Depending on your data pipeline, you might receive date and time data as text, in a notation that does not + exactly match the format for Impala <code class="ph codeph">TIMESTAMP</code> literals. + See <a class="xref" href="impala_datetime_functions.html#datetime_functions">Impala Date and Time Functions</a> for functions that can convert + between a variety of string literals (including different field order, separators, and timezone notation) + and equivalent <code class="ph codeph">TIMESTAMP</code> or numeric values. + </p> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title6" id="literals__null"> + + <h2 class="title topictitle2" id="ariaid-title6">NULL</h2> + + <div class="body conbody"> + + <p class="p"> + + The notion of <code class="ph codeph">NULL</code> values is familiar from all kinds of database systems, but each SQL + dialect can have its own behavior and restrictions on <code class="ph codeph">NULL</code> values. For Big Data + processing, the precise semantics of <code class="ph codeph">NULL</code> values are significant: any misunderstanding + could lead to inaccurate results or misformatted data that could be time-consuming to correct for large + data sets. + </p> + + <ul class="ul"> + <li class="li"> + <code class="ph codeph">NULL</code> is a different value than an empty string. The empty string is represented by a + string literal with nothing inside, <code class="ph codeph">""</code> or <code class="ph codeph">''</code>. + </li> + + <li class="li"> + In a delimited text file, the <code class="ph codeph">NULL</code> value is represented by the special token + <code class="ph codeph">\N</code>.
+ </li> + + <li class="li"> + When Impala inserts data into a partitioned table, and the value of one of the partitioning columns is + <code class="ph codeph">NULL</code> or the empty string, the data is placed in a special partition that holds only + these two kinds of values. When these values are returned in a query, the result is <code class="ph codeph">NULL</code> + whether the value was originally <code class="ph codeph">NULL</code> or an empty string. This behavior is compatible + with the way Hive treats <code class="ph codeph">NULL</code> values in partitioned tables. Hive does not allow empty + strings as partition keys, and it returns a string value such as + <code class="ph codeph">__HIVE_DEFAULT_PARTITION__</code> instead of <code class="ph codeph">NULL</code> when such values are + returned from a query. For example: +<pre class="pre codeblock"><code>create table t1 (i int) partitioned by (x int, y string); +-- Select an INT column from another table, with all rows going into a special HDFS subdirectory +-- named __HIVE_DEFAULT_PARTITION__. Depending on whether one or both of the partitioning keys +-- are null, this special directory name occurs at different levels of the physical data directory +-- for the table. +insert into t1 partition(x=NULL, y=NULL) select c1 from some_other_table; +insert into t1 partition(x, y=NULL) select c1, c2 from some_other_table; +insert into t1 partition(x=NULL, y) select c1, c3 from some_other_table;</code></pre> + </li> + + <li class="li"> + There is no <code class="ph codeph">NOT NULL</code> clause when defining a column to prevent <code class="ph codeph">NULL</code> + values in that column. + </li> + + <li class="li"> + There is no <code class="ph codeph">DEFAULT</code> clause to specify a non-<code class="ph codeph">NULL</code> default value. 
+ </li> + + <li class="li"> + If an <code class="ph codeph">INSERT</code> operation mentions some columns but not others, the unmentioned columns + contain <code class="ph codeph">NULL</code> for all inserted rows. + </li> + + <li class="li"> + <p class="p"> + In Impala 1.2.1 and higher, all <code class="ph codeph">NULL</code> values come at the end of the result set for + <code class="ph codeph">ORDER BY ... ASC</code> queries, and at the beginning of the result set for <code class="ph codeph">ORDER BY ... + DESC</code> queries. In effect, <code class="ph codeph">NULL</code> is considered greater than all other values for + sorting purposes. The original Impala behavior always put <code class="ph codeph">NULL</code> values at the end, even for + <code class="ph codeph">ORDER BY ... DESC</code> queries. The new behavior in Impala 1.2.1 makes Impala more compatible + with other popular database systems. In Impala 1.2.1 and higher, you can override or specify the sorting + behavior for <code class="ph codeph">NULL</code> by adding the clause <code class="ph codeph">NULLS FIRST</code> or <code class="ph codeph">NULLS + LAST</code> at the end of the <code class="ph codeph">ORDER BY</code> clause. + </p> + <div class="note note note_note"><span class="note__title notetitle">Note:</span> + + Because the <code class="ph codeph">NULLS FIRST</code> and <code class="ph codeph">NULLS LAST</code> keywords are not currently + available in Hive queries, any views you create using those keywords will not be available through + Hive. + </div> + </li> + + <li class="li"> + In all other contexts besides sorting with <code class="ph codeph">ORDER BY</code>, comparing a <code class="ph codeph">NULL</code> + to anything else returns <code class="ph codeph">NULL</code>, making the comparison meaningless. 
For example, + <code class="ph codeph">10 > NULL</code> produces <code class="ph codeph">NULL</code>, <code class="ph codeph">10 < NULL</code> also produces + <code class="ph codeph">NULL</code>, <code class="ph codeph">5 BETWEEN 1 AND NULL</code> produces <code class="ph codeph">NULL</code>, and so on. + </li> + </ul> + + <p class="p"> + Several built-in functions serve as shorthand for evaluating expressions and returning + <code class="ph codeph">NULL</code>, 0, or some other substitution value depending on the expression result: + <code class="ph codeph">ifnull()</code>, <code class="ph codeph">isnull()</code>, <code class="ph codeph">nvl()</code>, <code class="ph codeph">nullif()</code>, + <code class="ph codeph">nullifzero()</code>, and <code class="ph codeph">zeroifnull()</code>. See + <a class="xref" href="impala_conditional_functions.html#conditional_functions">Impala Conditional Functions</a> for details. + </p> + + <p class="p"> + <strong class="ph b">Kudu considerations:</strong> + </p> + <p class="p"> + Columns in Kudu tables have an attribute that specifies whether or not they can contain + <code class="ph codeph">NULL</code> values. A column with a <code class="ph codeph">NULL</code> attribute can contain + nulls. A column with a <code class="ph codeph">NOT NULL</code> attribute cannot contain any nulls, and + an <code class="ph codeph">INSERT</code>, <code class="ph codeph">UPDATE</code>, or <code class="ph codeph">UPSERT</code> statement + will skip any row that attempts to store a null in a column designated as <code class="ph codeph">NOT NULL</code>. + Kudu tables default to the <code class="ph codeph">NULL</code> setting for each column, except columns that + are part of the primary key. + </p> + <p class="p"> + In addition to columns with the <code class="ph codeph">NOT NULL</code> attribute, Kudu tables also have + restrictions on <code class="ph codeph">NULL</code> values in columns that are part of the primary key for + a table. 
No column that is part of the primary key in a Kudu table can contain any + <code class="ph codeph">NULL</code> values. + </p> + + </div> + </article> +</article></main></body></html> \ No newline at end of file http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_live_progress.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_live_progress.html b/docs/build/html/topics/impala_live_progress.html new file mode 100644 index 0000000..40d6631 --- /dev/null +++ b/docs/build/html/topics/impala_live_progress.html @@ -0,0 +1,131 @@ +<!DOCTYPE html + SYSTEM "about:legacy-compat"> +<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_query_options.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="live_progress"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>LIVE_PROGRESS Query Option (Impala 2.3 or higher only)</title></head><body id="live_progress"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + + <h1 class="title topictitle1" id="ariaid-title1">LIVE_PROGRESS Query Option (<span class="keyword">Impala 2.3</span> or higher only)</h1> + + + + <div class="body conbody"> + + <p class="p"> + + For queries submitted through the <span class="keyword cmdname">impala-shell</span> command, + displays an interactive progress bar showing roughly what percentage of + processing has been completed. 
When the query finishes, the progress bar is erased + from the <span class="keyword cmdname">impala-shell</span> console output. + </p> + + <p class="p"> + </p> + + <p class="p"> + <strong class="ph b">Type:</strong> Boolean; recognized values are 1 and 0, or <code class="ph codeph">true</code> and <code class="ph codeph">false</code>; + any other value is interpreted as <code class="ph codeph">false</code> + </p> + <p class="p"> + <strong class="ph b">Default:</strong> <code class="ph codeph">false</code> (shown as 0 in the output of the <code class="ph codeph">SET</code> statement) + </p> + + <p class="p"> + <strong class="ph b">Command-line equivalent:</strong> + </p> + <p class="p"> + You can enable this query option within <span class="keyword cmdname">impala-shell</span> + by starting the shell with the <code class="ph codeph">--live_progress</code> + command-line option. + You can still turn this setting off and on again within the shell through the + <code class="ph codeph">SET</code> command. + </p> + + <p class="p"> + <strong class="ph b">Usage notes:</strong> + </p> + <p class="p"> + The output from this query option is printed to standard error. The output is displayed only in interactive mode, + that is, not when the <code class="ph codeph">-q</code> or <code class="ph codeph">-f</code> options are used. + </p> + <p class="p"> + For a more detailed way of tracking the progress of an interactive query through + all phases of processing, see <a class="xref" href="impala_live_summary.html#live_summary">LIVE_SUMMARY Query Option (Impala 2.3 or higher only)</a>. + </p> + + <p class="p"> + <strong class="ph b">Restrictions:</strong> + </p> + <p class="p"> + Because the percentage complete figure is calculated using the number of + issued and completed <span class="q">"scan ranges"</span>, which occur while reading the table + data, the progress bar might reach 100% before the query is entirely finished. 
+ For example, the query might do work to perform aggregations after all the + table data has been read. If many of your queries fall into this category, + consider using the <code class="ph codeph">LIVE_SUMMARY</code> option instead for + more granular progress reporting. + </p> + <p class="p"> + The <code class="ph codeph">LIVE_PROGRESS</code> and <code class="ph codeph">LIVE_SUMMARY</code> query options + currently do not produce any output during <code class="ph codeph">COMPUTE STATS</code> operations. + </p> + <div class="p"> + Because the <code class="ph codeph">LIVE_PROGRESS</code> and <code class="ph codeph">LIVE_SUMMARY</code> query options + are available only within the <span class="keyword cmdname">impala-shell</span> interpreter: + <ul class="ul"> + <li class="li"> + <p class="p"> + You cannot change these query options through the SQL <code class="ph codeph">SET</code> + statement using the JDBC or ODBC interfaces. The <code class="ph codeph">SET</code> + command in <span class="keyword cmdname">impala-shell</span> recognizes these names as + shell-only options. + </p> + </li> + <li class="li"> + <p class="p"> + Be careful when using <span class="keyword cmdname">impala-shell</span> on a pre-<span class="keyword">Impala 2.3</span> + system to connect to a system running <span class="keyword">Impala 2.3</span> or higher. + The older <span class="keyword cmdname">impala-shell</span> does not recognize these + query option names. Upgrade <span class="keyword cmdname">impala-shell</span> on the + systems where you intend to use these query options. + </p> + </li> + <li class="li"> + <p class="p"> + Likewise, the <span class="keyword cmdname">impala-shell</span> command relies on + some information only available in <span class="keyword">Impala 2.3</span> and higher + to prepare live progress reports and query summaries. 
The + <code class="ph codeph">LIVE_PROGRESS</code> and <code class="ph codeph">LIVE_SUMMARY</code> + query options have no effect when <span class="keyword cmdname">impala-shell</span> connects + to a cluster running an older version of Impala. + </p> + </li> + </ul> + </div> + + <p class="p"> + <strong class="ph b">Added in:</strong> <span class="keyword">Impala 2.3.0</span> + </p> + + <p class="p"> + <strong class="ph b">Examples:</strong> + </p> +<pre class="pre codeblock"><code>[localhost:21000] > set live_progress=true; +LIVE_PROGRESS set to true +[localhost:21000] > select count(*) from customer; ++----------+ +| count(*) | ++----------+ +| 150000 | ++----------+ +[localhost:21000] > select count(*) from customer t1 cross join customer t2; +[################################### ] 50% +[######################################################################] 100% + + +</code></pre> + + <p class="p"> + To see how the <code class="ph codeph">LIVE_PROGRESS</code> and <code class="ph codeph">LIVE_SUMMARY</code> query options + work in real time, see <a class="xref" href="https://asciinema.org/a/1rv7qippo0fe7h5k1b6k4nexk" target="_blank">this animated demo</a>. + </p> + + </div> +<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_query_options.html">Query Options for the SET Statement</a></div></div></nav></article></main></body></html> \ No newline at end of file
