http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/661921b2/docs/topics/impala_kudu.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_kudu.xml b/docs/topics/impala_kudu.xml index ef90e63..cf08671 100644 --- a/docs/topics/impala_kudu.xml +++ b/docs/topics/impala_kudu.xml @@ -20,7 +20,7 @@ under the License. <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> <concept id="impala_kudu" rev="kudu"> - <title>Using Impala to Query Kudu Tables</title> + <title id="kudu">Using Impala to Query Kudu Tables</title> <prolog> <metadata> @@ -36,19 +36,30 @@ under the License. <p> <indexterm audience="hidden">Kudu</indexterm> - You can use Impala to query Kudu tables. This capability allows convenient access to a storage system that is - tuned for different kinds of workloads than the default with Impala. The default Impala tables use data files - stored on HDFS, which are ideal for bulk loads and queries using full-table scans. In contrast, Kudu can do - efficient queries for data organized either in data warehouse style (with full table scans) or for OLTP-style - workloads (with key-based lookups for single rows or small ranges of values). + You can use Impala to query tables stored by Apache Kudu. This capability + allows convenient access to a storage system that is tuned for different kinds of + workloads than the default with Impala. </p> <p> - Certain Impala SQL statements, such as <codeph>UPDATE</codeph> and <codeph>DELETE</codeph>, only work with - Kudu tables. These operations were impractical from a performance perspective to perform at large scale on - HDFS data, or on HBase tables. + By default, Impala tables are stored on HDFS using data files with various file formats. + HDFS files are ideal for bulk loads (append operations) and queries using full-table scans, + but do not support in-place updates or deletes. 
Kudu is an alternative storage engine used + by Impala which can do both in-place updates (for mixed read/write workloads) and fast scans + (for data-warehouse/analytic operations). Using Kudu tables with Impala can simplify the + ETL pipeline by avoiding extra steps to segregate and reorganize newly arrived data. </p> + <p> + Certain Impala SQL statements and clauses, such as <codeph>DELETE</codeph>, + <codeph>UPDATE</codeph>, <codeph>UPSERT</codeph>, and <codeph>PRIMARY KEY</codeph> work + only with Kudu tables. Other statements and clauses, such as <codeph>LOAD DATA</codeph>, + <codeph>TRUNCATE TABLE</codeph>, and <codeph>INSERT OVERWRITE</codeph>, are not applicable + to Kudu tables. + </p> + + <p outputclass="toc inpage"/> + </conbody> <concept id="kudu_benefits"> @@ -58,50 +69,1097 @@ under the License. <conbody> <p> - The combination of Kudu and Impala works best for tables where scan performance is important, but data - arrives continuously, in small batches, or needs to be updated without being completely replaced. In these - scenarios (such as for streaming data), it might be impractical to use Parquet tables because Parquet works - best with multi-megabyte data files, requiring substantial overhead to replace or reorganize data files to - accomodate frequent additions or changes to data. Impala can query Kudu tables with scan performance close - to that of Parquet, and Impala can also perform update or delete operations without replacing the entire - table contents. You can also use the Kudu API to do ingestion or transformation operations outside of - Impala, and Impala can query the current data at any time. + The combination of Kudu and Impala works best for tables where scan performance is + important, but data arrives continuously, in small batches, or needs to be updated + without being completely replaced. HDFS-backed tables can require substantial overhead + to replace or reorganize data files as new data arrives. 
Impala can perform efficient + lookups and scans within Kudu tables, and Impala can also perform update or + delete operations efficiently. You can also use the Kudu Java, C++, and Python APIs to + do ingestion or transformation operations outside of Impala, and Impala can query the + current data at any time. </p> </conbody> </concept> - <concept id="kudu_primary_key"> + <concept id="kudu_config"> - <title>Primary Key Columns for Kudu Tables</title> + <title>Configuring Impala for Use with Kudu</title> <conbody> <p> - Kudu tables introduce the notion of primary keys to Impala for the first time. The primary key is made up - of one or more columns, whose values are combined and used as a lookup key during queries. These columns - cannot contain any <codeph>NULL</codeph> values or any duplicate values, and can never be updated. For a - partitioned Kudu table, all the partition key columns must come from the set of primary key columns. + The <codeph>-kudu_master_hosts</codeph> configuration property must be set correctly + for the <cmdname>impalad</cmdname> daemon, for <codeph>CREATE TABLE ... STORED AS + KUDU</codeph> statements to connect to the appropriate Kudu server. Typically, the + required value for this setting is <codeph><varname>kudu_host</varname>:7051</codeph>. + In a high-availability Kudu deployment, specify the names of multiple Kudu hosts separated by commas. </p> <p> - Impala itself still does not have the notion of unique or non-<codeph>NULL</codeph> constraints. These - restrictions on the primary key columns are enforced on the Kudu side. + If the <codeph>-kudu_master_hosts</codeph> configuration property is not set, you can + still associate the appropriate value for each table by specifying a + <codeph>TBLPROPERTIES('kudu.master_addresses')</codeph> clause in the <codeph>CREATE TABLE</codeph> statement or + changing the <codeph>TBLPROPERTIES('kudu.master_addresses')</codeph> value with an <codeph>ALTER TABLE</codeph> + statement. 
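+ </p>
+
+ <p>
+ For example, the following sketch shows the property being set at table
+ creation time and changed later. (The host names here are hypothetical
+ placeholders.)
+ </p>
+
+<codeblock>
+-- Associate a new table with a specific Kudu cluster.
+CREATE TABLE kudu_tbl (id BIGINT PRIMARY KEY, s STRING)
+  PARTITION BY HASH(id) PARTITIONS 2 STORED AS KUDU
+  TBLPROPERTIES('kudu.master_addresses'='kudu-master-1.example.com:7051');
+
+-- Point the existing table at a different set of Kudu masters.
+ALTER TABLE kudu_tbl SET TBLPROPERTIES(
+  'kudu.master_addresses'='kudu-master-1.example.com:7051,kudu-master-2.example.com:7051');
+</codeblock>
+
+ <p>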
</p> + </conbody> + + <concept id="kudu_topology"> + + <title>Cluster Topology for Kudu Tables</title> + + <conbody> + + <p> + With HDFS-backed tables, you are typically concerned with the number of DataNodes in + the cluster, how many and how large HDFS data files are read during a query, and + therefore the amount of work performed by each DataNode and the network communication + to combine intermediate results and produce the final result set. + </p> + + <p> + With Kudu tables, the topology considerations are different, because: + </p> + + <ul> + <li> + <p> + The underlying storage is managed and organized by Kudu, not represented as HDFS + data files. + </p> + </li> + + <li> + <p> + Kudu handles some of the underlying mechanics of partitioning the data. You can specify + the partitioning scheme with combinations of hash and range partitioning, so that you can + decide how much effort to expend to manage the partitions as new data arrives. For example, + you can construct partitions that apply to date ranges rather than a separate partition for each + day or each hour. + </p> + </li> + + <li> + <p> + Data is physically divided based on units of storage called <term>tablets</term>. Tablets are + stored by <term>tablet servers</term>. Each tablet server can store multiple tablets, + and each tablet is replicated across multiple tablet servers, managed automatically by Kudu. + Where practical, colocate the tablet servers on the same hosts as the DataNodes, although that is not required. + </p> + </li> + </ul> + + <p> + One consideration for the cluster topology is that the number of replicas for a Kudu table + must be odd. 
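A sketch of how a replication
+ factor might be specified at table creation time, assuming the
+ <codeph>kudu.num_tablet_replicas</codeph> table property is available in your
+ version:
+ </p>
+
+<codeblock>
+-- An odd replication factor such as 3 or 5 is required; an even value is rejected.
+CREATE TABLE odd_replicas (id BIGINT PRIMARY KEY, s STRING)
+  PARTITION BY HASH(id) PARTITIONS 2 STORED AS KUDU
+  TBLPROPERTIES('kudu.num_tablet_replicas'='5');
+</codeblock>
+
+ <p>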
+ </p> + + </conbody> + + </concept> + + </concept> + + <concept id="kudu_ddl"> + + <title>Impala DDL Enhancements for Kudu Tables (CREATE TABLE and ALTER TABLE)</title> + + <prolog> + <metadata> + <data name="Category" value="DDL"/> + </metadata> + </prolog> + + <conbody> + <p> - The primary key columns must be the first ones specified in the <codeph>CREATE TABLE</codeph> statement. - You specify which column or columns make up the primary key in the table properties, rather than through - attributes in the column list. + You can use the Impala <codeph>CREATE TABLE</codeph> and <codeph>ALTER TABLE</codeph> + statements to create and fine-tune the characteristics of Kudu tables. Because Kudu + tables have features and properties that do not apply to other kinds of Impala tables, + familiarize yourself with Kudu-related concepts and syntax first. + For the general syntax of the <codeph>CREATE TABLE</codeph> + statement for Kudu tables, see <xref keyref="create_table"/>. + </p> + + <p outputclass="toc inpage"/> + + </conbody> + + <concept id="kudu_primary_key"> + + <title>Primary Key Columns for Kudu Tables</title> + + <conbody> + + <p> + Kudu tables introduce the notion of primary keys to Impala for the first time. The + primary key is made up of one or more columns, whose values are combined and used as a + lookup key during queries. The tuple represented by these columns must be unique and cannot contain any + <codeph>NULL</codeph> values, and can never be updated once inserted. For a + Kudu table, all the partition key columns must come from the set of + primary key columns. + </p> + + <p> + The primary key has both physical and logical aspects: + </p> + + <ul> + <li> + <p> + On the physical side, it is used to map the data values to particular tablets for fast retrieval. + Because the tuples formed by the primary key values are unique, the primary key columns are typically + highly selective. 
+ </p> + </li> + <li> + <p> + On the logical side, the uniqueness constraint allows you to avoid duplicate data in a table. + For example, if an <codeph>INSERT</codeph> operation fails partway through, only some of the + new rows might be present in the table. You can re-run the same <codeph>INSERT</codeph>, and + only the missing rows will be added. Or if data in the table is stale, you can run an + <codeph>UPSERT</codeph> statement that brings the data up to date, without the possibility + of creating duplicate copies of existing rows. + </p> + </li> + </ul> + + <note> + <p> + Impala only allows <codeph>PRIMARY KEY</codeph> clauses and <codeph>NOT NULL</codeph> + constraints on columns for Kudu tables. These constraints are enforced on the Kudu side. + </p> + </note> + + </conbody> + + </concept> + + <concept id="kudu_column_attributes" rev="IMPALA-3726"> + + <title>Kudu-Specific Column Attributes for CREATE TABLE</title> + + <conbody> + + <p> + For the general syntax of the <codeph>CREATE TABLE</codeph> + statement for Kudu tables, see <xref keyref="create_table"/>. + The following sections provide more detail for some of the + Kudu-specific keywords you can use in column definitions. + </p> + + <p> + The column list in a <codeph>CREATE TABLE</codeph> statement can include the following + attributes, which only apply to Kudu tables: + </p> + +<codeblock> + PRIMARY KEY +| [NOT] NULL +| ENCODING <varname>codec</varname> +| COMPRESSION <varname>algorithm</varname> +| DEFAULT <varname>constant_expression</varname> +| BLOCK_SIZE <varname>number</varname> +</codeblock> + + <p outputclass="toc inpage"> + See the following sections for details about each column attribute. + </p> + + </conbody> + + <concept id="kudu_primary_key_attribute"> + + <title>PRIMARY KEY Attribute</title> + + <conbody> + + <p> + The primary key for a Kudu table is a column, or set of columns, that uniquely + identifies every row. 
The primary key value also is used as the natural sort order + for the values from the table. The primary key value for each row is based on the + combination of values for the columns. + </p> + + <p conref="../shared/impala_common.xml#common/pk_implies_not_null"/> + + <p> + The primary key columns must be the first ones specified in the <codeph>CREATE + TABLE</codeph> statement. For a single-column primary key, you can include a + <codeph>PRIMARY KEY</codeph> attribute inline with the column definition. For a + multi-column primary key, you include a <codeph>PRIMARY KEY (<varname>c1</varname>, + <varname>c2</varname>, ...)</codeph> clause as a separate entry at the end of the + column list. + </p> + + <p> + You can specify the <codeph>PRIMARY KEY</codeph> attribute either inline in a single + column definition, or as a separate clause at the end of the column list: + </p> + +<codeblock> +CREATE TABLE pk_inline +( + col1 BIGINT PRIMARY KEY, + col2 STRING, + col3 BOOLEAN +) PARTITION BY HASH(col1) PARTITIONS 2 STORED AS KUDU; + +CREATE TABLE pk_at_end +( + col1 BIGINT, + col2 STRING, + col3 BOOLEAN, + PRIMARY KEY (col1) +) PARTITION BY HASH(col1) PARTITIONS 2 STORED AS KUDU; +</codeblock> + + <p> + When the primary key is a single column, these two forms are equivalent. 
If the + primary key consists of more than one column, you must specify the primary key using + a separate entry in the column list: + </p> + +<codeblock> +CREATE TABLE pk_multiple_columns +( + col1 BIGINT, + col2 STRING, + col3 BOOLEAN, + <b>PRIMARY KEY (col1, col2)</b> +) PARTITION BY HASH(col2) PARTITIONS 2 STORED AS KUDU; +</codeblock> + + <p> + The <codeph>SHOW CREATE TABLE</codeph> statement always represents the + <codeph>PRIMARY KEY</codeph> specification as a separate item in the column list: + </p> + +<codeblock> +CREATE TABLE inline_pk_rewritten (id BIGINT <b>PRIMARY KEY</b>, s STRING) + PARTITION BY HASH(id) PARTITIONS 2 STORED AS KUDU; + +SHOW CREATE TABLE inline_pk_rewritten; ++------------------------------------------------------------------------------+ +| result | ++------------------------------------------------------------------------------+ +| CREATE TABLE user.inline_pk_rewritten ( | +| id BIGINT NOT NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION, | +| s STRING NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION, | +| <b>PRIMARY KEY (id)</b> | +| ) | +| PARTITION BY HASH (id) PARTITIONS 2 | +| STORED AS KUDU | +| TBLPROPERTIES ('kudu.master_addresses'='host.example.com') | ++------------------------------------------------------------------------------+ +</codeblock> + + <p> + The notion of primary key only applies to Kudu tables. Every Kudu table requires a + primary key. The primary key consists of one or more columns. You must specify any + primary key columns first in the column list. + </p> + + <p> + The contents of the primary key columns cannot be changed by an + <codeph>UPDATE</codeph> or <codeph>UPSERT</codeph> statement. Including too many + columns in the primary key (more than 5 or 6) can also reduce the performance of + write operations. Therefore, pick the most selective and most frequently + tested non-null columns for the primary key specification. 
+ If a column must always have a value, but that value + might change later, leave it out of the primary key and use a <codeph>NOT + NULL</codeph> clause for that column instead. If an existing row has an + incorrect or outdated key column value, delete the old row and insert an entirely + new row with the correct primary key. + </p> + + </conbody> + + </concept> + + <concept id="kudu_not_null_attribute"> + + <title>NULL | NOT NULL Attribute</title> + + <conbody> + + <p> + For Kudu tables, you can specify which columns can contain nulls or not. This + constraint offers an extra level of consistency enforcement for Kudu tables. If an + application requires a field to always be specified, include a <codeph>NOT + NULL</codeph> clause in the corresponding column definition, and Kudu prevents rows + from being inserted with a <codeph>NULL</codeph> in that column. + </p> + + <p> + For example, a table containing geographic information might require the latitude + and longitude coordinates to always be specified. Other attributes might be allowed + to be <codeph>NULL</codeph>. For example, a location might not have a designated + place name, its altitude might be unimportant, and its population might be initially + unknown, to be filled in later. + </p> + + <p conref="../shared/impala_common.xml#common/pk_implies_not_null"/> + + <p> + For non-Kudu tables, Impala allows any column to contain <codeph>NULL</codeph> + values, because it is not practical to enforce a <q>not null</q> constraint on HDFS + data files that could be prepared using external tools and ETL processes. 
+ </p> + +<codeblock> +CREATE TABLE required_columns +( + id BIGINT PRIMARY KEY, + latitude DOUBLE NOT NULL, + longitude DOUBLE NOT NULL, + place_name STRING, + altitude DOUBLE, + population BIGINT +) PARTITION BY HASH(id) PARTITIONS 2 STORED AS KUDU; +</codeblock> + + <p> + During performance optimization, Kudu can use the knowledge that nulls are not + allowed to skip certain checks on each input row, speeding up queries and join + operations. Therefore, specify <codeph>NOT NULL</codeph> constraints when + appropriate. + </p> + + <p> + The <codeph>NULL</codeph> clause is the default condition for all columns that are not + part of the primary key. You can omit it, or specify it to clarify that you have made a + conscious design decision to allow nulls in a column. + </p> + + <p> + Because primary key columns cannot contain any <codeph>NULL</codeph> values, the + <codeph>NOT NULL</codeph> clause is not required for the primary key columns, + but you might still specify it to make your code self-describing. + </p> + + </conbody> + + </concept> + + <concept id="kudu_default_attribute"> + + <title>DEFAULT Attribute</title> + + <conbody> + + <p> + You can specify a default value for columns in Kudu tables. The default value can be + any constant expression, for example, a combination of literal values, arithmetic + and string operations. It cannot contain references to columns or non-deterministic + function calls. + </p> + + <p> + The following example shows different kinds of expressions for the + <codeph>DEFAULT</codeph> clause. The requirement to use a constant value means that + you can fill in a placeholder value such as <codeph>NULL</codeph>, empty string, + 0, -1, <codeph>'N/A'</codeph> and so on, but you cannot reference other columns or + non-deterministic function calls.
Therefore, you cannot use <codeph>DEFAULT</codeph> to do things such as + automatically making an uppercase copy of a string value, storing Boolean values based + on tests of other columns, or adding or subtracting one from another column representing a sequence number. + </p> + +<codeblock> +CREATE TABLE default_vals +( + id BIGINT PRIMARY KEY, + name STRING NOT NULL DEFAULT 'unknown', + address STRING DEFAULT upper('no fixed address'), + age INT DEFAULT -1, + earthling BOOLEAN DEFAULT TRUE, + planet_of_origin STRING DEFAULT 'Earth', + optional_col STRING DEFAULT NULL +) PARTITION BY HASH(id) PARTITIONS 2 STORED AS KUDU; +</codeblock> + + <note> + <p> + When designing an entirely new schema, prefer to use <codeph>NULL</codeph> as the + placeholder for any unknown or missing values, because that is the universal convention + among database systems. Null values can be stored efficiently, and easily checked with the + <codeph>IS NULL</codeph> or <codeph>IS NOT NULL</codeph> operators. The <codeph>DEFAULT</codeph> + attribute is appropriate when ingesting data that already has an established convention for + representing unknown or missing values, or where the vast majority of rows have some common + non-null value. + </p> + </note> + + </conbody> + + </concept> + + <concept id="kudu_encoding_attribute"> + + <title>ENCODING Attribute</title> + + <conbody> + + <p> + Each column in a Kudu table can optionally use an encoding, a low-overhead form of + compression that reduces the size on disk, at the cost of additional CPU cycles to + reconstruct the original values during queries. Typically, highly compressible data + benefits from the reduced I/O to read the data back from disk. By default, each + column uses the <q>plain</q> encoding where the data is stored unchanged.
+ </p> + + <p> + The encoding keywords that Impala recognizes are: + + <ul> + <li> + <p> + <codeph>AUTO_ENCODING</codeph>: use the default encoding based on the column + type; currently always the same as <codeph>PLAIN_ENCODING</codeph>, but subject to + change in the future. + </p> + </li> + <li> + <p> + <codeph>PLAIN_ENCODING</codeph>: leave the value in its original binary format. + </p> + </li> + <!-- GROUP_VARINT is internal use only, not documenting that although it shows up + in parser error messages. --> + <li> + <p> + <codeph>RLE</codeph>: compress repeated values (when sorted in primary key + order) by including a count. + </p> + </li> + <li> + <p> + <codeph>DICT_ENCODING</codeph>: when the number of different string values is + low, replace the original string with a numeric ID. + </p> + </li> + <li> + <p> + <codeph>BIT_SHUFFLE</codeph>: rearrange the bits of the values to efficiently + compress sequences of values that are identical or vary only slightly based + on primary key order. The resulting encoded data is also compressed with LZ4. + </p> + </li> + <li> + <p> + <codeph>PREFIX_ENCODING</codeph>: compress common prefixes in string values; mainly for use internally within Kudu. + </p> + </li> + </ul> + </p> + +<!-- +UNKNOWN, AUTO_ENCODING, PLAIN_ENCODING, PREFIX_ENCODING, GROUP_VARINT, RLE, DICT_ENCODING, BIT_SHUFFLE + +No joy trying keywords UNKNOWN, or GROUP_VARINT with TINYINT and BIGINT. +--> + + <p> + The following example shows the Impala keywords representing the encoding types. + (The Impala keywords match the symbolic names used within Kudu.) + For usage guidelines on the different kinds of encoding, see + <xref href="https://kudu.apache.org/docs/schema_design.html" scope="external" format="html">the Kudu documentation</xref>. 
+ The <codeph>DESCRIBE</codeph> output shows how the encoding is reported after + the table is created, and that omitting the encoding (in this case, for the + <codeph>id</codeph> column) is the same as specifying <codeph>AUTO_ENCODING</codeph>. + </p> + +<codeblock> +CREATE TABLE various_encodings +( + id BIGINT PRIMARY KEY, + c1 BIGINT ENCODING PLAIN_ENCODING, + c2 BIGINT ENCODING AUTO_ENCODING, + c3 TINYINT ENCODING BIT_SHUFFLE, + c4 DOUBLE ENCODING BIT_SHUFFLE, + c5 BOOLEAN ENCODING RLE, + c6 STRING ENCODING DICT_ENCODING, + c7 STRING ENCODING PREFIX_ENCODING +) PARTITION BY HASH(id) PARTITIONS 2 STORED AS KUDU; + +-- Some columns are omitted from the output for readability. +describe various_encodings; ++------+---------+-------------+----------+-----------------+ +| name | type | primary_key | nullable | encoding | ++------+---------+-------------+----------+-----------------+ +| id | bigint | true | false | AUTO_ENCODING | +| c1 | bigint | false | true | PLAIN_ENCODING | +| c2 | bigint | false | true | AUTO_ENCODING | +| c3 | tinyint | false | true | BIT_SHUFFLE | +| c4 | double | false | true | BIT_SHUFFLE | +| c5 | boolean | false | true | RLE | +| c6 | string | false | true | DICT_ENCODING | +| c7 | string | false | true | PREFIX_ENCODING | ++------+---------+-------------+----------+-----------------+ +</codeblock> + + </conbody> + + </concept> + + <concept id="kudu_compression_attribute"> + + <title>COMPRESSION Attribute</title> + + <conbody> + + <p> + You can specify a compression algorithm to use for each column in a Kudu table. This + attribute imposes more CPU overhead when retrieving the values than the + <codeph>ENCODING</codeph> attribute does. Therefore, use it primarily for columns with + long strings that do not benefit much from the less-expensive <codeph>ENCODING</codeph> + attribute. + </p> + + <p> + The choices for <codeph>COMPRESSION</codeph> are <codeph>LZ4</codeph>, + <codeph>SNAPPY</codeph>, and <codeph>ZLIB</codeph>.
+ </p> + + <note> + <p> + Columns that use the <codeph>BIT_SHUFFLE</codeph> encoding are already compressed + using <codeph>LZ4</codeph>, and so typically do not need any additional + <codeph>COMPRESSION</codeph> attribute. + </p> + </note> + + <p> + The following example shows design considerations for several + <codeph>STRING</codeph> columns with different distribution characteristics, leading + to choices for both the <codeph>ENCODING</codeph> and <codeph>COMPRESSION</codeph> + attributes. The <codeph>user_id</codeph> values come from a limited set of repeated strings, + therefore this column is a good candidate for dictionary encoding. The + <codeph>post_id</codeph> column contains an ascending sequence of integers, where + several leading bits are likely to be all zeroes, therefore this column is a good + candidate for bitshuffle encoding. The <codeph>body</codeph> + column and the corresponding columns for translated versions tend to be long unique + strings that are not practical to use with any of the encoding schemes, therefore + they employ the <codeph>COMPRESSION</codeph> attribute instead. The ideal compression + codec in each case would require some experimentation to determine how much space + savings it provided and how much CPU overhead it added, based on real-world data.
+ </p> + +<codeblock> +CREATE TABLE blog_posts +( + user_id STRING ENCODING DICT_ENCODING, + post_id BIGINT ENCODING BIT_SHUFFLE, + subject STRING ENCODING PLAIN_ENCODING, + body STRING COMPRESSION LZ4, + spanish_translation STRING COMPRESSION SNAPPY, + esperanto_translation STRING COMPRESSION ZLIB, + PRIMARY KEY (user_id, post_id) +) PARTITION BY HASH(user_id, post_id) PARTITIONS 2 STORED AS KUDU; +</codeblock> + + </conbody> + + </concept> + + <concept id="kudu_block_size_attribute"> + + <title>BLOCK_SIZE Attribute</title> + + <conbody> + + <p> + Although Kudu does not use HDFS files internally, and thus is not affected by + the HDFS block size, it does have an underlying unit of I/O called the + <term>block size</term>. The <codeph>BLOCK_SIZE</codeph> attribute lets you set the + block size for any column. + </p> + + <p> + The block size attribute is a relatively advanced feature. Refer to + <xref href="https://kudu.apache.org/docs/index.html" scope="external" format="html">the Kudu documentation</xref> + for usage details. + </p> + +<!-- Commenting out this example for the time being. +<codeblock> +CREATE TABLE performance_for_benchmark_xyz +( + id BIGINT PRIMARY KEY, + col1 BIGINT BLOCK_SIZE 4096, + col2 STRING BLOCK_SIZE 16384, + col3 SMALLINT BLOCK_SIZE 2048 +) PARTITION BY HASH(id) PARTITIONS 2 STORED AS KUDU; +</codeblock> +--> + + </conbody> + + </concept> + + </concept> + + <concept id="kudu_partitioning"> + + <title>Partitioning for Kudu Tables</title> + + <conbody> + + <p> + Kudu tables use special mechanisms to distribute data among the underlying + tablet servers. Although we refer to such tables as partitioned tables, they are + distinguished from traditional Impala partitioned tables by use of different clauses + on the <codeph>CREATE TABLE</codeph> statement. 
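+ </p>
+
+ <p>
+ As a quick illustration of the difference (the table and column names here are
+ hypothetical):
+ </p>
+
+<codeblock>
+-- HDFS-backed table: each distinct value of the partition column
+-- creates a separate partition.
+CREATE TABLE events_hdfs (id BIGINT, details STRING)
+  PARTITIONED BY (event_date STRING)
+  STORED AS PARQUET;
+
+-- Kudu table: the partitioning scheme is declared up front.
+CREATE TABLE events_kudu (id BIGINT PRIMARY KEY, event_date STRING, details STRING)
+  PARTITION BY HASH(id) PARTITIONS 2
+  STORED AS KUDU;
+</codeblock>
+
+ <p>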
Kudu tables use + <codeph>PARTITION BY</codeph>, <codeph>HASH</codeph>, <codeph>RANGE</codeph>, and + range specification clauses rather than the <codeph>PARTITIONED BY</codeph> clause + for HDFS-backed tables, which specifies only a column name and creates a new partition for each + different value. + </p> + + <p> + For background information and architectural details about the Kudu partitioning + mechanism, see + <xref href="https://kudu.apache.org/kudu.pdf" scope="external" format="html">the Kudu white paper, section 3.2</xref>. + </p> + +<!-- Hiding but leaving in place for the moment, in case the white paper discussion isn't enough. + <p> + With Kudu tables, all of the columns involved in these clauses must be primary key + columns. These clauses let you specify different ways to divide the data for each + column, or even for different value ranges within a column. This flexibility lets you + avoid problems with uneven distribution of data, where the partitioning scheme for + HDFS tables might result in some partitions being much larger than others. By setting + up an effective partitioning scheme for a Kudu table, you can ensure that the work for + a query can be parallelized evenly across the hosts in a cluster. + </p> +--> + + <note> + <p> + The Impala DDL syntax for Kudu tables is different than in early Kudu versions, + which used an experimental fork of the Impala code. For example, the + <codeph>DISTRIBUTE BY</codeph> clause is now <codeph>PARTITION BY</codeph>, the + <codeph>INTO <varname>n</varname> BUCKETS</codeph> clause is now + <codeph>PARTITIONS <varname>n</varname></codeph> and the range partitioning syntax + is reworked to replace the <codeph>SPLIT ROWS</codeph> clause with more expressive + syntax involving comparison operators. 
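A sketch of the same
+ distribution clause in the old and new syntax:
+ </p>
+
+<codeblock>
+-- Old experimental syntax (no longer accepted):
+--   CREATE TABLE t ... DISTRIBUTE BY HASH(id) INTO 16 BUCKETS;
+-- Current syntax:
+CREATE TABLE t (id BIGINT PRIMARY KEY, s STRING)
+  PARTITION BY HASH(id) PARTITIONS 16 STORED AS KUDU;
+</codeblock>
+
+ <p>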
+ </p> + </note> + + <p outputclass="toc inpage"/> + + </conbody> + + <concept id="kudu_hash_partitioning"> + <title>Hash Partitioning</title> + <conbody> + + <p> + Hash partitioning is the simplest type of partitioning for Kudu tables. + For hash-partitioned Kudu tables, inserted rows are divided up between a fixed number + of <q>buckets</q> by applying a hash function to the values of the columns specified + in the <codeph>HASH</codeph> clause. + Hashing ensures that rows with similar values are evenly distributed, instead of + clumping together all in the same bucket. Spreading new rows across the buckets this + way lets insertion operations work in parallel across multiple tablet servers. + Separating the hashed values can impose additional overhead on queries, where + queries with range-based predicates might have to read multiple tablets to retrieve + all the relevant values. + </p> + +<codeblock> +-- 1M rows with 50 hash partitions = approximately 20,000 rows per partition. +-- The values in each partition are not sequential, but rather based on a hash function. +-- Rows 1, 99999, and 123456 might be in the same partition. +CREATE TABLE million_rows (id string primary key, s string) + PARTITION BY HASH(id) PARTITIONS 50 + STORED AS KUDU; + +-- Because the ID values are unique, we expect the rows to be roughly +-- evenly distributed between the buckets in the destination table. +INSERT INTO million_rows SELECT * FROM billion_rows ORDER BY id LIMIT 1e6; +</codeblock> + + <note> + <p> + The largest number of buckets that you can create with a <codeph>PARTITIONS</codeph> + clause varies depending on the number of tablet servers in the cluster, while the smallest is 2. + For simplicity, some of the simple <codeph>CREATE TABLE</codeph> statements throughout this section + use <codeph>PARTITIONS 2</codeph> to illustrate the minimum requirements for a Kudu table. + For large tables, prefer to use roughly 10 partitions per server in the cluster. 
+ </p> + </note> + + </conbody> + </concept> + + <concept id="kudu_range_partitioning"> + <title>Range Partitioning</title> + <conbody> + + <p> + Range partitioning lets you specify partitioning precisely, based on single values or ranges + of values within one or more columns. You add one or more <codeph>RANGE</codeph> clauses to the + <codeph>CREATE TABLE</codeph> statement, following the <codeph>PARTITION BY</codeph> + clause. + </p> + + <p> + Range-partitioned Kudu tables use one or more range clauses, which include a + combination of constant expressions, <codeph>VALUE</codeph> or <codeph>VALUES</codeph> + keywords, and comparison operators. (This syntax replaces the <codeph>SPLIT + ROWS</codeph> clause used with early Kudu versions.) + For the full syntax, see <xref keyref="create_table"/>. + </p> + +<codeblock><![CDATA[ +-- 50 buckets, all for IDs beginning with a lowercase letter. +-- Having only a single range enforces the allowed range of values +-- but does not add any extra parallelism. +create table million_rows_one_range (id string primary key, s string) + partition by hash(id) partitions 50, + range (partition 'a' <= values < '{') + stored as kudu; + +-- 50 buckets for IDs beginning with a lowercase letter +-- plus 50 buckets for IDs beginning with an uppercase letter. +-- Total number of buckets = number in the PARTITIONS clause x number of ranges. +-- We are still enforcing constraints on the primary key values +-- allowed in the table, and the 2 ranges provide better parallelism +-- as rows are inserted or the table is scanned. +create table million_rows_two_ranges (id string primary key, s string) + partition by hash(id) partitions 50, + range (partition 'a' <= values < '{', partition 'A' <= values < '[') + stored as kudu; + +-- Same as previous table, with an extra range covering the single key value '00000'. 
+create table million_rows_three_ranges (id string primary key, s string) + partition by hash(id) partitions 50, + range (partition 'a' <= values < '{', partition 'A' <= values < '[', partition value = '00000') + stored as kudu; + +-- The range partitioning can be displayed with a SHOW command in impala-shell. +show range partitions million_rows_three_ranges; ++---------------------+ +| RANGE (id) | ++---------------------+ +| VALUE = "00000" | +| "A" <= VALUES < "[" | +| "a" <= VALUES < "{" | ++---------------------+ +]]> +</codeblock> + + <note> + <p> + When defining ranges, be careful to avoid <q>fencepost errors</q> where values at the + extreme ends might be included or omitted by accident. For example, in the tables defined + in the preceding code listings, the range <codeph><![CDATA["a" <= VALUES < "{"]]></codeph> ensures that + any values starting with <codeph>z</codeph>, such as <codeph>za</codeph> or <codeph>zzz</codeph> + or <codeph>zzz-ZZZ</codeph>, are all included, by using a less-than operator for the smallest + value after all the values starting with <codeph>z</codeph>. + </p> + </note> + + <p> + For range-partitioned Kudu tables, an appropriate range must exist before a data value can be created in the table. + Any <codeph>INSERT</codeph>, <codeph>UPDATE</codeph>, or <codeph>UPSERT</codeph> statements fail if they try to + create column values that fall outside the specified ranges. The error checking for ranges is performed on the + Kudu side; Impala passes the specified range information to Kudu, and passes back any error or warning if the + ranges are not valid. (A nonsensical range specification causes an error for a DDL statement, but only a warning + for a DML statement.) 
+ </p>
+
+ <p>
+ Ranges can be non-contiguous:
+ </p>
+
+<codeblock><![CDATA[
+partition by range (year) (partition 1885 <= values <= 1889, partition 1893 <= values <= 1897)
+
+partition by range (letter_grade) (partition value = 'A', partition value = 'B',
+  partition value = 'C', partition value = 'D', partition value = 'F')
+]]>
+</codeblock>
+
+ <p>
+ The <codeph>ALTER TABLE</codeph> statement with the <codeph>ADD RANGE PARTITION</codeph> or
+ <codeph>DROP RANGE PARTITION</codeph> clauses can be used to add or remove ranges from an
+ existing Kudu table.
+ </p>
+
+<codeblock><![CDATA[
+ALTER TABLE foo ADD RANGE PARTITION 30 <= VALUES < 50;
+ALTER TABLE foo DROP RANGE PARTITION 1 <= VALUES < 5;
+]]>
+</codeblock>
+
+ <p>
+ When a range is added, the new range must not overlap with any of the previous ranges;
+ that is, it can only fill in gaps between the previous ranges.
+ </p>
+
+<codeblock><![CDATA[
+alter table test_scores add range partition value = 'E';
+
+alter table year_ranges add range partition 1890 <= values < 1893;
+]]>
+</codeblock>
+
+ <p>
+ When a range is removed, all the associated rows in the table are deleted. (This
+ is true whether the table is internal or external.)
+ </p>
+
+<codeblock><![CDATA[
+alter table test_scores drop range partition value = 'E';
+
+alter table year_ranges drop range partition 1890 <= values < 1893;
+]]>
+</codeblock>
+
+ <p>
+ Kudu tables can also use a combination of hash and range partitioning.
+ </p>
+
+<codeblock><![CDATA[
+partition by hash (school) partitions 10,
+  range (letter_grade) (partition value = 'A', partition value = 'B',
+    partition value = 'C', partition value = 'D', partition value = 'F')
+]]>
+</codeblock>
+
+ </conbody>
+ </concept>
+
+ <concept id="kudu_partitioning_misc">
+ <title>Working with Partitioning in Kudu Tables</title>
+ <conbody>
+
+ <p>
+ To see the current partitioning scheme for a Kudu table, you can use the <codeph>SHOW
+ CREATE TABLE</codeph> statement or the <codeph>SHOW PARTITIONS</codeph> statement. The
+ <codeph>CREATE TABLE</codeph> syntax displayed by this statement includes the
+ hash clauses, range clauses, or both, reflecting the original table structure plus any
+ subsequent <codeph>ALTER TABLE</codeph> statements that changed the table structure.
+ </p>
+
+ <p>
+ To see the underlying buckets and partitions for a Kudu table, use the
+ <codeph>SHOW TABLE STATS</codeph> or <codeph>SHOW PARTITIONS</codeph> statement.
+ </p>
+
+ </conbody>
+ </concept>
+
+ </concept>
+
+ <concept id="kudu_timestamps">
+
+ <title>Handling Date, Time, or Timestamp Data with Kudu</title>
+
+ <conbody>
+
+ <p>
+ Because currently a Kudu table cannot have a column with the Impala
+ <codeph>TIMESTAMP</codeph> type, expect to store date/time information as the number
+ of seconds, milliseconds, or microseconds since the Unix epoch date of January 1,
+ 1970. Specify the column as <codeph>BIGINT</codeph> in the Impala <codeph>CREATE
+ TABLE</codeph> statement, corresponding to an 8-byte integer (an
+ <codeph>int64</codeph>) in the underlying Kudu table. Then use Impala date/time
+ conversion functions as necessary to produce a numeric, <codeph>TIMESTAMP</codeph>,
+ or <codeph>STRING</codeph> value depending on the context.
+ </p>
+
+ <p>
+ For example, the <codeph>unix_timestamp()</codeph> function returns an integer result
+ representing the number of seconds past the epoch.
The <codeph>now()</codeph> function
+ produces a <codeph>TIMESTAMP</codeph> representing the current date and time, which can
+ be passed as an argument to <codeph>unix_timestamp()</codeph>. And string literals
+ representing dates and date/times can be cast to <codeph>TIMESTAMP</codeph>, and from there
+ converted to numeric values. The following examples show how you might store a date/time
+ column as <codeph>BIGINT</codeph> in a Kudu table, but still use string literals and
+ <codeph>TIMESTAMP</codeph> values for convenience.
+ </p>
+
+<codeblock><![CDATA[
+-- now() returns a TIMESTAMP and shows the format for string literals you can cast to TIMESTAMP.
+select now();
++-------------------------------+
+| now()                         |
++-------------------------------+
+| 2017-01-25 23:50:10.132385000 |
++-------------------------------+
+
+-- unix_timestamp() accepts either a TIMESTAMP or an equivalent string literal.
+select unix_timestamp(now());
++-----------------------+
+| unix_timestamp(now()) |
++-----------------------+
+| 1485386670            |
++-----------------------+
+
+select unix_timestamp('2017-01-01');
++------------------------------+
+| unix_timestamp('2017-01-01') |
++------------------------------+
+| 1483228800                   |
++------------------------------+
+
+-- Make a table representing a date/time value as BIGINT.
+-- Construct 1 range partition and 20 associated hash partitions for each year.
+-- Use date/time conversion functions to express the ranges as human-readable dates.
+create table time_series(id bigint, when_exactly bigint, event string, primary key (id, when_exactly))
+  partition by hash (id) partitions 20,
+  range (when_exactly)
+  (
+    partition unix_timestamp('2015-01-01') <= values < unix_timestamp('2016-01-01'),
+    partition unix_timestamp('2016-01-01') <= values < unix_timestamp('2017-01-01'),
+    partition unix_timestamp('2017-01-01') <= values < unix_timestamp('2018-01-01')
+  )
+  stored as kudu;
+
+-- On insert, we can transform a human-readable date/time into a numeric value.
+insert into time_series values (12345, unix_timestamp('2017-01-25 23:24:56'), 'Working on doc examples');
+
+-- On retrieval, we can examine the numeric date/time value or turn it back into a string for readability.
+select id, when_exactly, from_unixtime(when_exactly) as 'human-readable date/time', event
+  from time_series order by when_exactly limit 100;
++-------+--------------+--------------------------+-------------------------+
+| id    | when_exactly | human-readable date/time | event                   |
++-------+--------------+--------------------------+-------------------------+
+| 12345 | 1485386696   | 2017-01-25 23:24:56      | Working on doc examples |
++-------+--------------+--------------------------+-------------------------+
+]]>
+</codeblock>
+
+ <note>
+ <p>
+ If you do high-precision arithmetic involving numeric date/time values, such as
+ dividing millisecond values by 1000 or microsecond values by 1 million, always
+ cast the integer numerator to a <codeph>DECIMAL</codeph> with sufficient precision
+ and scale to avoid any rounding or loss of precision.
+ </p>
+ </note>
+
+<codeblock><![CDATA[
+-- 1 million and 1 microseconds = 1.000001 seconds.
+select microseconds, + cast (microseconds as decimal(20,7)) / 1e6 as fractional_seconds + from table_with_microsecond_column; ++--------------+----------------------+ +| microseconds | fractional_seconds | ++--------------+----------------------+ +| 1000001 | 1.000001000000000000 | ++--------------+----------------------+ +]]> +</codeblock> + + </conbody> + + </concept> + + <concept id="kudu_metadata"> + + <title>How Impala Handles Kudu Metadata</title> + + <conbody> + + <p conref="../shared/impala_common.xml#common/kudu_metadata_intro"/> + <p conref="../shared/impala_common.xml#common/kudu_metadata_details"/> + + <p> + Because Kudu manages the metadata for its own tables separately from the metastore + database, there is a table name stored in the metastore database for Impala to use, + and a table name on the Kudu side, and these names can be modified independently + through <codeph>ALTER TABLE</codeph> statements. + </p> + + <p> + To avoid potential name conflicts, the prefix <codeph>impala::</codeph> + and the Impala database name are encoded into the underlying Kudu + table name: + </p> + +<codeblock><![CDATA[ +create database some_database; +use some_database; + +create table table_name_demo (x int primary key, y int) + partition by hash (x) partitions 2 stored as kudu; + +describe formatted table_name_demo; +... +kudu.table_name | impala::some_database.table_name_demo +]]> +</codeblock> + + <p> + See <xref keyref="kudu_tables"/> for examples of how to change the name of + the Impala table in the metastore database, the name of the underlying Kudu + table, or both. + </p> + + </conbody> + + </concept> + + </concept> + + <concept id="kudu_etl"> + + <title>Loading Data into Kudu Tables</title> + + <conbody> + + <p> + Kudu tables are well-suited to use cases where data arrives continuously, in small or + moderate volumes. To bring data into Kudu tables, use the Impala <codeph>INSERT</codeph> + and <codeph>UPSERT</codeph> statements. 
The <codeph>LOAD DATA</codeph> statement does + not apply to Kudu tables. </p> <p> - Kudu can do extra optimizations for queries that refer to the primary key columns in the - <codeph>WHERE</codeph> clause. It is not crucial though to include the primary key columns in the - <codeph>WHERE</codeph> clause of every query. The benefit is mainly for partitioned tables, - which divide the data among various tablet servers based on the distribution of - data values in some or all of the primary key columns. + Because Kudu manages its own storage layer that is optimized for smaller block sizes than + HDFS, and performs its own housekeeping to keep data evenly distributed, it is not + subject to the <q>many small files</q> issue and does not need explicit reorganization + and compaction as the data grows over time. The partitions within a Kudu table can be + specified to cover a variety of possible data distributions, instead of hardcoding a new + partition for each new day, hour, and so on, which can lead to inefficient, + hard-to-scale, and hard-to-manage partition schemes with HDFS tables. + </p> + + <p> + Your strategy for performing ETL or bulk updates on Kudu tables should take into account + the limitations on consistency for DML operations. + </p> + + <p> + Make <codeph>INSERT</codeph>, <codeph>UPDATE</codeph>, and <codeph>UPSERT</codeph> + operations <term>idempotent</term>: that is, able to be applied multiple times and still + produce an identical result. + </p> + + <p> + If a bulk operation is in danger of exceeding capacity limits due to timeouts or high + memory usage, split it into a series of smaller operations. + </p> + + <p> + Avoid running concurrent ETL operations where the end results depend on precise + ordering. In particular, do not rely on an <codeph>INSERT ... 
SELECT</codeph> statement + that selects from the same table into which it is inserting, unless you include extra + conditions in the <codeph>WHERE</codeph> clause to avoid reading the newly inserted rows + within the same statement. + </p> + + <p> + Because relationships between tables cannot be enforced by Impala and Kudu, and cannot + be committed or rolled back together, do not expect transactional semantics for + multi-table operations. </p> </conbody> @@ -110,52 +1168,149 @@ under the License. <concept id="kudu_dml"> - <title>Impala DML Support for Kudu Tables</title> + <title>Impala DML Support for Kudu Tables (INSERT, UPDATE, DELETE, UPSERT)</title> + + <prolog> + <metadata> + <data name="Category" value="DML"/> + </metadata> + </prolog> + + <conbody> + + <p> + Impala supports certain DML statements for Kudu tables only. The <codeph>UPDATE</codeph> + and <codeph>DELETE</codeph> statements let you modify data within Kudu tables without + rewriting substantial amounts of table data. The <codeph>UPSERT</codeph> statement acts + as a combination of <codeph>INSERT</codeph> and <codeph>UPDATE</codeph>, inserting rows + where the primary key does not already exist, and updating the non-primary key columns + where the primary key does already exist in the table. + </p> + + <p> + The <codeph>INSERT</codeph> statement for Kudu tables honors the unique and <codeph>NOT + NULL</codeph> requirements for the primary key columns. + </p> + + <p> + Because Impala and Kudu do not support transactions, the effects of any + <codeph>INSERT</codeph>, <codeph>UPDATE</codeph>, or <codeph>DELETE</codeph> statement + are immediately visible. For example, you cannot do a sequence of + <codeph>UPDATE</codeph> statements and only make the changes visible after all the + statements are finished. Also, if a DML statement fails partway through, any rows that + were already inserted, deleted, or changed remain in the table; there is no rollback + mechanism to undo the changes. 
+ </p> + + <p> + In particular, an <codeph>INSERT ... SELECT</codeph> statement that refers to the table + being inserted into might insert more rows than expected, because the + <codeph>SELECT</codeph> part of the statement sees some of the new rows being inserted + and processes them again. + </p> + + <note> + <p> + The <codeph>LOAD DATA</codeph> statement, which involves manipulation of HDFS data files, + does not apply to Kudu tables. + </p> + </note> + + </conbody> + + </concept> + + <concept id="kudu_consistency"> + + <title>Consistency Considerations for Kudu Tables</title> <conbody> <p> - Impala supports certain DML statements for Kudu tables only. The <codeph>UPDATE</codeph> and - <codeph>DELETE</codeph> statements let you modify data within Kudu tables without rewriting substantial - amounts of table data. + Kudu tables have consistency characteristics such as uniqueness, controlled by the + primary key columns, and non-nullable columns. The emphasis for consistency is on + preventing duplicate or incomplete data from being stored in a table. </p> <p> - The <codeph>INSERT</codeph> statement for Kudu tables honors the unique and non-<codeph>NULL</codeph> - requirements for the primary key columns. + Currently, Kudu does not enforce strong consistency for order of operations, total + success or total failure of a multi-row statement, or data that is read while a write + operation is in progress. Changes are applied atomically to each row, but not applied + as a single unit to all rows affected by a multi-row DML statement. That is, Kudu does + not currently have atomic multi-row statements or isolation between statements. </p> <p> - Because Impala and Kudu do not support transactions, the effects of any <codeph>INSERT</codeph>, - <codeph>UPDATE</codeph>, or <codeph>DELETE</codeph> statement are immediately visible. 
For example, you
- cannot do a sequence of <codeph>UPDATE</codeph> statements and only make the change visible after all the
- statements are finished. Also, if a DML statement fails partway through, any rows that were already
- inserted, deleted, or changed remain in the table; there is no rollback mechanism to undo the changes.
+ If some rows are rejected during a DML operation because of duplicate
+ primary key values, violations of <codeph>NOT NULL</codeph> constraints, and so on, the statement
+ succeeds with a warning. Impala still inserts, deletes, or updates the other rows that
+ are not affected by the constraint violation.
+ </p>
+
+ <p>
+ Consequently, the number of rows affected by a DML operation on a Kudu table might be
+ different from what you expect.
+ </p>
+
+ <p>
+ Because there is no strong consistency guarantee for information being inserted into,
+ deleted from, or updated across multiple tables simultaneously, consider denormalizing
+ the data where practical. That is, if you run separate <codeph>INSERT</codeph>
+ statements to insert related rows into two different tables, one <codeph>INSERT</codeph>
+ might fail while the other succeeds, leaving the data in an inconsistent state. Even if
+ both inserts succeed, a join query might happen during the interval between the
+ completion of the first and second statements, and the query would encounter incomplete,
+ inconsistent data. Denormalizing the data into a single wide table can reduce the
+ possibility of inconsistency due to multi-table operations.
+ </p>
+
+ <p>
+ Information about the number of rows affected by a DML operation is reported in
+ <cmdname>impala-shell</cmdname> output, and in the <codeph>PROFILE</codeph> output, but
+ is not currently reported to HiveServer2 clients such as JDBC or ODBC applications.
</p> </conbody> </concept> - <concept id="kudu_partitioning"> + <concept id="kudu_security"> - <title>Partitioning for Kudu Tables</title> + <title>Security Considerations for Kudu Tables</title> <conbody> <p> - Kudu tables use special mechanisms to evenly distribute data among the underlying tablet servers. Although - we refer to such tables as partitioned tables, they are distinguished from traditional Impala partitioned - tables by use of different clauses on the <codeph>CREATE TABLE</codeph> statement. Partitioned Kudu tables - use <codeph>PARTITION BY</codeph>, <codeph>HASH</codeph>, and <codeph>RANGE</codeph> clauses rather than - the traditional <codeph>PARTITIONED BY</codeph> clause. All of the columns involved in these clauses must - be primary key columns. These clauses let you specify different ways to divide the data for each column, - or even for different value ranges within a column. This flexibility lets you avoid problems with uneven - distribution of data, where the partitioning scheme for HDFS tables might result in some partitions being - much larger than others. By setting up an effective partitioning scheme for a Kudu table, you can ensure - that the work for a query can be parallelized evenly across the hosts in a cluster. + Security for Kudu tables involves: </p> + <ul> + <li> + <p> + Sentry authorization. + </p> + <p conref="../shared/impala_common.xml#common/kudu_sentry_limitations"/> + </li> + + <li> + <p> + Lineage tracking. + </p> + </li> + + <li> + <p> + Auditing. + </p> + </li> + + <li> + <p> + Redaction of sensitive information from log files. + </p> + </li> + </ul> + </conbody> </concept> @@ -167,18 +1322,66 @@ under the License. <conbody> <p> - For queries involving Kudu tables, Impala can delegate much of the work of filtering the result set to - Kudu, avoiding some of the I/O involved in full table scans of tables containing HDFS data files. 
This type - of optimization is especially effective for partitioned Kudu tables, where the Impala query - <codeph>WHERE</codeph> clause refers to one or more primary key columns that are also used as partition key - columns. For example, if a partitioned Kudu table uses a <codeph>HASH</codeph> clause for - <codeph>col1</codeph> and a <codeph>RANGE</codeph> clause for <codeph>col2</codeph>, a query using a clause - such as <codeph>WHERE col1 IN (1,2,3) AND col2 > 100</codeph> can determine exactly which tablet servers - contain relevant data, and therefore parallelize the query very efficiently. + For queries involving Kudu tables, Impala can delegate much of the work of filtering the + result set to Kudu, avoiding some of the I/O involved in full table scans of tables + containing HDFS data files. This type of optimization is especially effective for + partitioned Kudu tables, where the Impala query <codeph>WHERE</codeph> clause refers to + one or more primary key columns that are also used as partition key columns. For + example, if a partitioned Kudu table uses a <codeph>HASH</codeph> clause for + <codeph>col1</codeph> and a <codeph>RANGE</codeph> clause for <codeph>col2</codeph>, a + query using a clause such as <codeph>WHERE col1 IN (1,2,3) AND col2 > 100</codeph> + can determine exactly which tablet servers contain relevant data, and therefore + parallelize the query very efficiently. </p> + <p> + See <xref keyref="explain"/> for examples of evaluating the effectiveness of + the predicate pushdown for a specific query against a Kudu table. + </p> + + <!-- Hide until subtopics are ready to display. --> + <p outputclass="toc inpage" audience="hidden"/> + </conbody> + <concept id="kudu_vs_parquet" audience="hidden"> + <!-- To do: if there is enough real-world experience in future to have a + substantive discussion of this subject, revisit this topic and + consider unhiding it. 
--> + + <title>How Kudu Works with Column-Oriented Operations</title> + + <conbody> + + <p> + For immutable data, Impala is often used with Parquet tables due to the efficiency of + the column-oriented Parquet layout. This section describes how Kudu stores and + retrieves columnar data, to help you understand performance and storage considerations + of Kudu tables as compared with Parquet tables. + </p> + + </conbody> + + </concept> + + <concept id="kudu_memory" audience="hidden"> + <!-- To do: if there is enough real-world experience in future to have a + substantive discussion of this subject, revisit this topic and + consider unhiding it. --> + + <title>Memory Usage for Operations on Kudu Tables</title> + + <conbody> + + <p> + The Apache Kudu architecture, topology, and data storage techniques result in + different patterns of memory usage for Impala statements than with HDFS-backed tables. + </p> + + </conbody> + + </concept> + </concept> </concept>
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/661921b2/docs/topics/impala_literals.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_literals.xml b/docs/topics/impala_literals.xml index d0d475c..22404d9 100644 --- a/docs/topics/impala_literals.xml +++ b/docs/topics/impala_literals.xml @@ -397,6 +397,24 @@ insert into t1 partition(x=NULL, y) select c1, c3 from some_other_table;</codeb <codeph>nullifzero()</codeph>, and <codeph>zeroifnull()</codeph>. See <xref href="impala_conditional_functions.xml#conditional_functions"/> for details. </p> + + <p conref="../shared/impala_common.xml#common/kudu_blurb"/> + <p rev="kudu"> + Columns in Kudu tables have an attribute that specifies whether or not they can contain + <codeph>NULL</codeph> values. A column with a <codeph>NULL</codeph> attribute can contain + nulls. A column with a <codeph>NOT NULL</codeph> attribute cannot contain any nulls, and + an <codeph>INSERT</codeph>, <codeph>UPDATE</codeph>, or <codeph>UPSERT</codeph> statement + will skip any row that attempts to store a null in a column designated as <codeph>NOT NULL</codeph>. + Kudu tables default to the <codeph>NULL</codeph> setting for each column, except columns that + are part of the primary key. + </p> + <p rev="kudu"> + In addition to columns with the <codeph>NOT NULL</codeph> attribute, Kudu tables also have + restrictions on <codeph>NULL</codeph> values in columns that are part of the primary key for + a table. No column that is part of the primary key in a Kudu table can contain any + <codeph>NULL</codeph> values. 
+ </p> + </conbody> </concept> </concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/661921b2/docs/topics/impala_map.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_map.xml b/docs/topics/impala_map.xml index 3d03129..6fb697b 100644 --- a/docs/topics/impala_map.xml +++ b/docs/topics/impala_map.xml @@ -85,6 +85,9 @@ type ::= <varname>primitive_type</varname> | <varname>complex_type</varname> <li/> </ul> + <p conref="../shared/impala_common.xml#common/kudu_blurb"/> + <p conref="../shared/impala_common.xml#common/kudu_unsupported_data_type"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> <note conref="../shared/impala_common.xml#common/complex_type_schema_pointer"/> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/661921b2/docs/topics/impala_partitioning.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_partitioning.xml b/docs/topics/impala_partitioning.xml index 4d723ad..1f70d2b 100644 --- a/docs/topics/impala_partitioning.xml +++ b/docs/topics/impala_partitioning.xml @@ -575,7 +575,7 @@ SELECT COUNT(*) FROM sales_table WHERE year IN (2005, 2010, 2015); </concept> - <concept rev="kudu" id="partition_kudu" audience="hidden"> + <concept rev="kudu 2.8.0" id="partition_kudu"> <title>Using Partitioning with Kudu Tables</title> @@ -593,6 +593,12 @@ SELECT COUNT(*) FROM sales_table WHERE year IN (2005, 2010, 2015); columns. </p> + <p> + See <xref href="impala_kudu.xml#kudu_partitioning"/> for + details and examples of the partitioning techniques + for Kudu tables. 
+ </p> + </conbody> </concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/661921b2/docs/topics/impala_refresh.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_refresh.xml b/docs/topics/impala_refresh.xml index 8244aa4..7897ecd 100644 --- a/docs/topics/impala_refresh.xml +++ b/docs/topics/impala_refresh.xml @@ -333,6 +333,11 @@ ERROR: AnalysisException: Items in partition spec must exactly match the partiti <p conref="../shared/impala_common.xml#common/s3_metadata"/> <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/> + + <p rev="kudu" conref="../shared/impala_common.xml#common/kudu_blurb"/> + <p conref="../shared/impala_common.xml#common/kudu_metadata_intro"/> + <p conref="../shared/impala_common.xml#common/kudu_metadata_details"/> + <p conref="../shared/impala_common.xml#common/related_info"/> <p> <xref href="impala_hadoop.xml#intro_metastore"/>, http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/661921b2/docs/topics/impala_reserved_words.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_reserved_words.xml b/docs/topics/impala_reserved_words.xml index 423fd43..440120c 100644 --- a/docs/topics/impala_reserved_words.xml +++ b/docs/topics/impala_reserved_words.xml @@ -82,7 +82,9 @@ avro between bigint <ph rev="1.4.0">binary</ph> +<ph rev="kudu">blocksize</ph> boolean +<!-- <ph rev="kudu">buckets</ph> --> by <ph rev="1.4.0">cached</ph> <ph rev="2.3.0">cascade</ph> @@ -95,6 +97,7 @@ change column columns comment +<ph rev="kudu">compression</ph> compute create cross @@ -105,15 +108,18 @@ databases date datetime decimal -<ph rev="2.6.0">delete</ph> +<ph rev="kudu">default</ph> +<ph rev="kudu">delete</ph> delimited desc describe distinct +<!-- <ph rev="kudu">distribute</ph> --> div double drop else +<ph rev="kudu">encoding</ph> end escaped exists @@ -136,10 +142,10 @@ function functions <ph rev="2.1.0">grant</ph> group 
-<ph rev="2.6.0">hash</ph> +<ph rev="kudu">hash</ph> having if -<ph rev="2.6.0">ignore</ph> +<!-- <ph rev="kudu">ignore</ph> --> <ph rev="2.5.0">ilike</ph> in <ph rev="2.1.0">incremental</ph> @@ -210,6 +216,7 @@ serdeproperties set show smallint +<!-- <ph rev="kudu">split</ph> --> stats stored straight_join @@ -229,8 +236,9 @@ true <ph rev="2.0.0">unbounded</ph> <ph rev="1.4.0">uncached</ph> union -<ph rev="2.6.0">update</ph> +<ph rev="kudu">update</ph> <ph rev="1.2.1">update_fn</ph> +<ph rev="kudu">upsert</ph> use using values http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/661921b2/docs/topics/impala_revoke.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_revoke.xml b/docs/topics/impala_revoke.xml index 97a912d..b7a0908 100644 --- a/docs/topics/impala_revoke.xml +++ b/docs/topics/impala_revoke.xml @@ -108,6 +108,9 @@ object_type ::= TABLE | DATABASE | SERVER | URI <p conref="../shared/impala_common.xml#common/permissions_blurb_no"/> + <p rev="2.8.0" conref="../shared/impala_common.xml#common/kudu_blurb"/> + <p conref="../shared/impala_common.xml#common/kudu_sentry_limitations"/> + <p conref="../shared/impala_common.xml#common/related_info"/> <p> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/661921b2/docs/topics/impala_show.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_show.xml b/docs/topics/impala_show.xml index 84e9c0b..4e9e7fc 100644 --- a/docs/topics/impala_show.xml +++ b/docs/topics/impala_show.xml @@ -28,6 +28,7 @@ under the License. 
<data name="Category" value="Developers"/> <data name="Category" value="Data Analysts"/> <data name="Category" value="Reports"/> + <data name="Category" value="Kudu"/> </metadata> </prolog> @@ -49,7 +50,8 @@ SHOW TABLES [IN <varname>database_name</varname>] [[LIKE] '<varname>pattern</var <ph rev="1.2.1">SHOW TABLE STATS [<varname>database_name</varname>.]<varname>table_name</varname></ph> <ph rev="1.2.1">SHOW COLUMN STATS [<varname>database_name</varname>.]<varname>table_name</varname></ph> <ph rev="1.4.0">SHOW PARTITIONS [<varname>database_name</varname>.]<varname>table_name</varname></ph> -SHOW FILES IN [<varname>database_name</varname>.]<varname>table_name</varname> <ph rev="IMPALA-1654">[PARTITION (<varname>key_col_expression</varname> [, <varname>key_col_expression</varname>]</ph>] +<ph rev="1.4.0">SHOW <ph rev="kudu">[RANGE]</ph> PARTITIONS [<varname>database_name</varname>.]<varname>table_name</varname></ph> +SHOW FILES IN [<varname>database_name</varname>.]<varname>table_name</varname> <ph rev="IMPALA-1654">[PARTITION (<varname>key_col_expression</varname> [, <varname>key_col_expression</varname>]</ph>] <ph rev="2.0.0">SHOW ROLES SHOW CURRENT ROLES @@ -129,7 +131,8 @@ show files in sample_table partition (month like 'J%'); <note> This statement applies to tables and partitions stored on HDFS, or in the Amazon Simple Storage System (S3). It does not apply to views. - It does not apply to tables mapped onto HBase, because HBase does not use the same file-based storage layout. + It does not apply to tables mapped onto HBase <ph rev="kudu">or Kudu</ph>, + because those data management systems do not use the same file-based storage layout. 
</note> <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> @@ -742,6 +745,61 @@ show tables like '*dim*|t*'; <p conref="../shared/impala_common.xml#common/permissions_blurb_no"/> + <p rev="kudu"> + For Kudu tables: + </p> + + <ul rev="kudu"> + <li> + <p> + The column specifications include attributes such as <codeph>NULL</codeph>, + <codeph>NOT NULL</codeph>, <codeph>ENCODING</codeph>, and <codeph>COMPRESSION</codeph>. + If you do not specify those attributes in the original <codeph>CREATE TABLE</codeph> statement, + the <codeph>SHOW CREATE TABLE</codeph> output displays the defaults that were used. + </p> + </li> + <li> + <p> + The specifications of any <codeph>RANGE</codeph> clauses are not displayed in full. + To see the definition of the range clauses for a Kudu table, use the <codeph>SHOW RANGE PARTITIONS</codeph> statement. + </p> + </li> + <li> + <p> + The <codeph>TBLPROPERTIES</codeph> output reflects the Kudu master address + and the internal Kudu name associated with the Impala table. 
+        </p>
+      </li>
+    </ul>
+
+<codeblock rev="kudu">
+show create table numeric_grades_default_letter;
++------------------------------------------------------------------------------------------------+
+| result                                                                                         |
++------------------------------------------------------------------------------------------------+
+| CREATE TABLE user.numeric_grades_default_letter (                                              |
+|   score TINYINT NOT NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION,               |
+|   letter_grade STRING NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION DEFAULT '-', |
+|   student STRING NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION,                  |
+|   PRIMARY KEY (score)                                                                          |
+| )                                                                                              |
+| PARTITION BY <b>RANGE (score) (...)</b>                                                        |
+| STORED AS KUDU                                                                                 |
+| TBLPROPERTIES ('kudu.master_addresses'='vd0342.example.com:7051',                              |
+| 'kudu.table_name'='impala::USER.numeric_grades_default_letter')                                |
++------------------------------------------------------------------------------------------------+
+
+show range partitions numeric_grades_default_letter;
++--------------------+
+| RANGE (score)      |
++--------------------+
+| 0 <= VALUES < 50   |
+| 50 <= VALUES < 65  |
+| 65 <= VALUES < 80  |
+| 80 <= VALUES < 100 |
++--------------------+
+</codeblock>
+
     <p conref="../shared/impala_common.xml#common/example_blurb"/>
     <p>
@@ -855,6 +913,39 @@ show create table show_create_table_demo;
     <p conref="../shared/impala_common.xml#common/show_security"/>
+    <p conref="../shared/impala_common.xml#common/kudu_blurb"/>
+
+    <p rev="kudu IMPALA-2830">
+      Because Kudu tables do not have characteristics derived from HDFS, such
+      as number of files, file format, and HDFS cache status, the output of
+      <codeph>SHOW TABLE STATS</codeph> reflects different characteristics
+      that apply to Kudu tables. If the Kudu table is created with the
+      clause <codeph>PARTITIONS 20</codeph>, then the result set of
+      <codeph>SHOW TABLE STATS</codeph> consists of 20 rows, each representing
+      one of the numbered partitions.
+      For example:
+    </p>
+
+<codeblock rev="kudu IMPALA-2830">
+show table stats kudu_table;
++--------+-----------+----------+-----------------------+------------+
+| # Rows | Start Key | Stop Key | Leader Replica        | # Replicas |
++--------+-----------+----------+-----------------------+------------+
+| -1     |           | 00000001 | host.example.com:7050 | 3          |
+| -1     | 00000001  | 00000002 | host.example.com:7050 | 3          |
+| -1     | 00000002  | 00000003 | host.example.com:7050 | 3          |
+| -1     | 00000003  | 00000004 | host.example.com:7050 | 3          |
+| -1     | 00000004  | 00000005 | host.example.com:7050 | 3          |
+...
+</codeblock>
+
+    <p rev="IMPALA-2830">
+      Impala does not compute the number of rows for each partition for
+      Kudu tables. Therefore, you do not need to re-run <codeph>COMPUTE STATS</codeph>
+      when you see -1 in the <codeph># Rows</codeph> column of the output from
+      <codeph>SHOW TABLE STATS</codeph>. That column always shows -1 for
+      all Kudu tables.
+    </p>
+
     <p conref="../shared/impala_common.xml#common/example_blurb"/>
     <p>
@@ -959,6 +1050,14 @@ show table stats store_sales;
     <p conref="../shared/impala_common.xml#common/show_security"/>
+    <p rev="kudu IMPALA-2830">
+      The output of <codeph>SHOW COLUMN STATS</codeph> includes
+      the relevant information for Kudu tables as well.
+      Column statistics that originate in the
+      underlying Kudu storage layer are also represented in the
+      metastore database that Impala uses.
+    </p>
+
     <p conref="../shared/impala_common.xml#common/example_blurb"/>
     <p>
@@ -1145,8 +1244,31 @@ show column stats store_sales;
     <p conref="../shared/impala_common.xml#common/show_security"/>
+    <p conref="../shared/impala_common.xml#common/kudu_blurb"/>
+
+    <p rev="kudu IMPALA-4403">
+      The optional <codeph>RANGE</codeph> clause applies only to Kudu tables. It displays only the partitions
+      defined by the <codeph>RANGE</codeph> clause of <codeph>CREATE TABLE</codeph> or <codeph>ALTER TABLE</codeph>.
+    </p>
+
+    <p rev="kudu IMPALA-4403">
+      Although you can specify <codeph><</codeph> or
+      <codeph><=</codeph> comparison operators when defining
+      range partitions for Kudu tables, Kudu rewrites them if necessary
+      to represent each range as
+      <codeph><varname>low_bound</varname> <= VALUES < <varname>high_bound</varname></codeph>.
+      This rewriting might involve incrementing one of the boundary values
+      or appending a <codeph>\0</codeph> for string values, so that the
+      partition covers the same range as originally specified.
+    </p>
+
     <p conref="../shared/impala_common.xml#common/example_blurb"/>
+    <p>
+      The following example shows the output for a Parquet, text, or other
+      HDFS-backed table partitioned on the <codeph>year</codeph> column:
+    </p>
+
     <codeblock rev="1.4.0">[localhost:21000] > show partitions census;
 +-------+-------+--------+------+---------+
 | year  | #Rows | #Files | Size | Format  |
@@ -1162,6 +1284,53 @@ show column stats store_sales;
 +-------+-------+--------+------+---------+
 </codeblock>
+    <p rev="kudu IMPALA-4403">
+      The following example shows the output for a Kudu table
+      using the hash partitioning mechanism. The number of
+      rows in the result set corresponds to the value used
+      in the <codeph>PARTITIONS <varname>N</varname></codeph>
+      clause of <codeph>CREATE TABLE</codeph>.
+    </p>
+
+<codeblock rev="kudu IMPALA-4403"><![CDATA[
+show partitions million_rows_hash;
+
++--------+-----------+----------+-----------------------+--
+| # Rows | Start Key | Stop Key | Leader Replica        | # Replicas
++--------+-----------+----------+-----------------------+--
+| -1     |           | 00000001 | n236.example.com:7050 | 3
+| -1     | 00000001  | 00000002 | n236.example.com:7050 | 3
+| -1     | 00000002  | 00000003 | n336.example.com:7050 | 3
+| -1     | 00000003  | 00000004 | n238.example.com:7050 | 3
+| -1     | 00000004  | 00000005 | n338.example.com:7050 | 3
+....
+| -1     | 0000002E  | 0000002F | n240.example.com:7050 | 3
+| -1     | 0000002F  | 00000030 | n336.example.com:7050 | 3
+| -1     | 00000030  | 00000031 | n240.example.com:7050 | 3
+| -1     | 00000031  |          | n334.example.com:7050 | 3
++--------+-----------+----------+-----------------------+--
+Fetched 50 row(s) in 0.05s
+]]>
+</codeblock>
+
+    <p rev="kudu IMPALA-4403">
+      The following example shows the output for a Kudu table
+      using the range partitioning mechanism:
+    </p>
+
+<codeblock rev="kudu IMPALA-4403"><![CDATA[
+show range partitions million_rows_range;
++-----------------------+
+| RANGE (id)            |
++-----------------------+
+| VALUES < "A"          |
+| "A" <= VALUES < "["   |
+| "a" <= VALUES < "{"   |
+| "{" <= VALUES < "~\0" |
++-----------------------+
+]]>
+</codeblock>
+
     <p conref="../shared/impala_common.xml#common/permissions_blurb"/>
     <p rev="CDH-19187">
       The user ID that the <cmdname>impalad</cmdname> daemon runs under,

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/661921b2/docs/topics/impala_struct.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_struct.xml b/docs/topics/impala_struct.xml
index d103d95..b4289ab 100644
--- a/docs/topics/impala_struct.xml
+++ b/docs/topics/impala_struct.xml
@@ -112,6 +112,9 @@ type ::= <varname>primitive_type</varname> | <varname>complex_type</varname>
       <li/>
     </ul>
+    <p conref="../shared/impala_common.xml#common/kudu_blurb"/>
+    <p conref="../shared/impala_common.xml#common/kudu_unsupported_data_type"/>
+
     <p conref="../shared/impala_common.xml#common/example_blurb"/>
     <note conref="../shared/impala_common.xml#common/complex_type_schema_pointer"/>
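The <codeph>kudu_unsupported_data_type</codeph> conref added above points to the restriction that Kudu tables accept only scalar column types. As an illustrative sketch only (the table and column names here are invented, and the exact error text varies by Impala version), a statement such as the following is rejected for a Kudu table:

<codeblock rev="kudu">
-- Hypothetical example: fails because Kudu tables accept only scalar
-- column types; complex types such as STRUCT, ARRAY, and MAP are not
-- supported for columns of a table STORED AS KUDU.
CREATE TABLE struct_demo (
  id BIGINT,
  details STRUCT<name:STRING, score:INT>,
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 4
STORED AS KUDU;
</codeblock>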
