http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_decimal.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_decimal.xml b/docs/topics/impala_decimal.xml index c0c98d9..b566860 100644 --- a/docs/topics/impala_decimal.xml +++ b/docs/topics/impala_decimal.xml @@ -3,7 +3,7 @@ <concept rev="1.4.0" id="decimal"> <title>DECIMAL Data Type (CDH 5.1 or higher only)</title> - <titlealts><navtitle>DECIMAL (CDH 5.1 or higher only)</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>DECIMAL</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -106,18 +106,19 @@ if all the digits of the input values were 9s and the absolute values were added together. </p> <!-- Seems like buggy output from this first query, so hiding the example for the time being. --> -<codeblock audience="Cloudera">[localhost:21000] > select 50000.5 + 12.444, precision(50000.5 + 12.444), scale(50000.5 + 12.444); +<codeblock audience="Cloudera"><![CDATA[[localhost:21000] > select 50000.5 + 12.444, precision(50000.5 + 12.444), scale(50000.5 + 12.444); +------------------+-----------------------------+-------------------------+ | 50000.5 + 12.444 | precision(50000.5 + 12.444) | scale(50000.5 + 12.444) | +------------------+-----------------------------+-------------------------+ | 50012.944 | 9 | 3 | +------------------+-----------------------------+-------------------------+ -[localhost:21000] > select 99999.9 + 99.999, precision(99999.9 + 99.999), scale(99999.9 + 99.999); +[localhost:21000] > select 99999.9 + 99.999, precision(99999.9 + 99.999), scale(99999.9 + 99.999); +------------------+-----------------------------+-------------------------+ | 99999.9 + 99.999 | precision(99999.9 + 99.999) | scale(99999.9 + 99.999) | +------------------+-----------------------------+-------------------------+ | 100099.899 | 9 | 3 | +------------------+-----------------------------+-------------------------+ +]]> </codeblock> </li> @@ -163,33 +164,8 @@ <ul> <li> Using the <codeph>DECIMAL</codeph> type is only supported under CDH 5.1.0 and higher. -<!-- - Although Impala-created tables containing <codeph>DECIMAL</codeph> columns are - readable in CDH 5.1, <codeph>DECIMAL</codeph> data is not interoperable with - other Hadoop components in CDH 4, and some Impala operations such as - <codeph>COMPUTE STATS</codeph> are not possible on such tables in CDH 4. - If you create a Parquet table with a <codeph>DECIMAL</codeph> - column under CDH 4, Impala issues a warning because the data files might not be readable from other CDH 4 components. ---> </li> -<!-- - <li> - The <codeph>DECIMAL</codeph> data type is a relatively new addition to the - Parquet file format. To read Impala-created Parquet files containing - <codeph>DECIMAL</codeph> columns from another Hadoop component such as - MapReduce, Pig, or Hive, use CDH 5.1 or higher, or the equivalent levels of the relevant components and Parquet - JARs from CDH 5.1. - If you create a Parquet table with a <codeph>DECIMAL</codeph> - column under CDH 4, Impala issues a warning because the data files might not be readable from other CDH 4 components. - </li> - - <li> - In particular, Impala-created tables with <codeph>DECIMAL</codeph> columns are - not readable by Hive under CDH 4. - </li> ---> - <li> Use the <codeph>DECIMAL</codeph> data type in Impala for applications where you used the <codeph>NUMBER</codeph> data type in Oracle. 
The Impala <codeph>DECIMAL</codeph> type does not support the @@ -218,25 +194,26 @@ to the same type in the context of <codeph>UNION</codeph> queries and <codeph>INSERT</codeph> statements: </p> -<codeblock>[localhost:21000] > select cast(1 as int) as x union select cast(1.5 as decimal(9,4)) as x; +<codeblock><![CDATA[[localhost:21000] > select cast(1 as int) as x union select cast(1.5 as decimal(9,4)) as x; +----------------+ | x | +----------------+ | 1.5000 | | 1.0000 | +----------------+ -[localhost:21000] > create table int_vs_decimal as select cast(1 as int) as x union select cast(1.5 as decimal(9,4)) as x; +[localhost:21000] > create table int_vs_decimal as select cast(1 as int) as x union select cast(1.5 as decimal(9,4)) as x; +-------------------+ | summary | +-------------------+ | Inserted 2 row(s) | +-------------------+ -[localhost:21000] > desc int_vs_decimal; +[localhost:21000] > desc int_vs_decimal; +------+---------------+---------+ | name | type | comment | +------+---------------+---------+ | x | decimal(14,4) | | +------+---------------+---------+ +]]> </codeblock> <p> @@ -253,19 +230,20 @@ result of <codeph>NULL</codeph> and displays a runtime warning. </p> -<codeblock>[localhost:21000] > select cast(1.239 as decimal(3,2)); +<codeblock><![CDATA[[localhost:21000] > select cast(1.239 as decimal(3,2)); +-----------------------------+ | cast(1.239 as decimal(3,2)) | +-----------------------------+ | 1.23 | +-----------------------------+ -[localhost:21000] > select cast(1234 as decimal(3)); +[localhost:21000] > select cast(1234 as decimal(3)); +----------------------------+ | cast(1234 as decimal(3,0)) | +----------------------------+ | NULL | +----------------------------+ WARNINGS: Expression overflowed, returning NULL +]]> </codeblock> <p> @@ -284,15 +262,15 @@ WARNINGS: Expression overflowed, returning NULL bytes, only use precision of 10 or higher when actually needed. </p> -<codeblock>[localhost:21000] > create table decimals_9_0 (x decimal); -[localhost:21000] > insert into decimals_9_0 values (1), (2), (4), (8), (16), (1024), (32768), (65536), (1000000); +<codeblock><![CDATA[[localhost:21000] > create table decimals_9_0 (x decimal); +[localhost:21000] > insert into decimals_9_0 values (1), (2), (4), (8), (16), (1024), (32768), (65536), (1000000); ERROR: AnalysisException: Possible loss of precision for target table 'decimal_testing.decimals_9_0'. 
Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x' -[localhost:21000] > insert into decimals_9_0 values (cast(1 as decimal)), (cast(2 as decimal)), (cast(4 as decimal)), (cast(8 as decimal)), (cast(16 as decimal)), (cast(1024 as decimal)), (cast(32768 as decimal)), (cast(65536 as decimal)), (cast(1000000 as decimal)); +[localhost:21000] > insert into decimals_9_0 values (cast(1 as decimal)), (cast(2 as decimal)), (cast(4 as decimal)), (cast(8 as decimal)), (cast(16 as decimal)), (cast(1024 as decimal)), (cast(32768 as decimal)), (cast(65536 as decimal)), (cast(1000000 as decimal)); -[localhost:21000] > create table decimals_10_0 (x decimal(10,0)); -[localhost:21000] > insert into decimals_10_0 values (1), (2), (4), (8), (16), (1024), (32768), (65536), (1000000); -[localhost:21000] > +[localhost:21000] > create table decimals_10_0 (x decimal(10,0)); +[localhost:21000] > insert into decimals_10_0 values (1), (2), (4), (8), (16), (1024), (32768), (65536), (1000000); +]]> </codeblock> <p> @@ -317,12 +295,10 @@ Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x' <ul> <li> - <p> - The result of an aggregate function such as <codeph>MAX()</codeph>, <codeph>SUM()</codeph>, or - <codeph>AVG()</codeph> on <codeph>DECIMAL</codeph> values is promoted to a scale of 38, with the same - precision as the underlying column. Thus, the result can represent the largest possible value at that - particular precision. - </p> + <p> The result of the <codeph>SUM()</codeph> aggregate function on + <codeph>DECIMAL</codeph> values is promoted to a precision of 38, + with the same scale as the underlying column. Thus, the result can + represent the largest possible value at that particular precision. </p> </li> <li> @@ -343,60 +319,61 @@ Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x' point if needed. Any trailing zeros after the decimal point in the <codeph>STRING</codeph> value must fit within the number of digits specified by the precision. </p> -<codeblock>[localhost:21000] > select cast('100' as decimal); -- Small integer value fits within 9 digits of scale. +<codeblock><![CDATA[[localhost:21000] > select cast('100' as decimal); -- Small integer value fits within 9 digits of precision. +-----------------------------+ | cast('100' as decimal(9,0)) | +-----------------------------+ | 100 | +-----------------------------+ -[localhost:21000] > select cast('100' as decimal(3,0)); -- Small integer value fits within 3 digits of scale. +[localhost:21000] > select cast('100' as decimal(3,0)); -- Small integer value fits within 3 digits of precision. +-----------------------------+ | cast('100' as decimal(3,0)) | +-----------------------------+ | 100 | +-----------------------------+ -[localhost:21000] > select cast('100' as decimal(2,0)); -- 2 digits of scale is not enough! +[localhost:21000] > select cast('100' as decimal(2,0)); -- 2 digits of precision is not enough! +-----------------------------+ | cast('100' as decimal(2,0)) | +-----------------------------+ | NULL | +-----------------------------+ -[localhost:21000] > select cast('100' as decimal(3,1)); -- (3,1) = 2 digits left of the decimal point, 1 to the right. Not enough. +[localhost:21000] > select cast('100' as decimal(3,1)); -- (3,1) = 2 digits left of the decimal point, 1 to the right. Not enough. 
+-----------------------------+ | cast('100' as decimal(3,1)) | +-----------------------------+ | NULL | +-----------------------------+ -[localhost:21000] > select cast('100' as decimal(4,1)); -- 4 digits total, 1 to the right of the decimal point. +[localhost:21000] > select cast('100' as decimal(4,1)); -- 4 digits total, 1 to the right of the decimal point. +-----------------------------+ | cast('100' as decimal(4,1)) | +-----------------------------+ | 100.0 | +-----------------------------+ -[localhost:21000] > select cast('98.6' as decimal(3,1)); -- (3,1) can hold a 3 digit number with 1 fractional digit. +[localhost:21000] > select cast('98.6' as decimal(3,1)); -- (3,1) can hold a 3 digit number with 1 fractional digit. +------------------------------+ | cast('98.6' as decimal(3,1)) | +------------------------------+ | 98.6 | +------------------------------+ -[localhost:21000] > select cast('98.6' as decimal(15,1)); -- Larger scale allows bigger numbers but still only 1 fractional digit. +[localhost:21000] > select cast('98.6' as decimal(15,1)); -- Larger precision allows bigger numbers but still only 1 fractional digit. +-------------------------------+ | cast('98.6' as decimal(15,1)) | +-------------------------------+ | 98.6 | +-------------------------------+ -[localhost:21000] > select cast('98.6' as decimal(15,5)); -- Larger precision allows more fractional digits, outputs trailing zeros. +[localhost:21000] > select cast('98.6' as decimal(15,5)); -- Larger scale allows more fractional digits, outputs trailing zeros. +-------------------------------+ | cast('98.6' as decimal(15,5)) | +-------------------------------+ | 98.60000 | +-------------------------------+ -[localhost:21000] > select cast('98.60000' as decimal(15,1)); -- Trailing zeros in the string must fit within 'scale' digits (1 in this case). +[localhost:21000] > select cast('98.60000' as decimal(15,1)); -- Trailing zeros in the string must fit within 'scale' digits (1 in this case). +-----------------------------------+ | cast('98.60000' as decimal(15,1)) | +-----------------------------------+ | NULL | +-----------------------------------+ +]]> </codeblock> </li> @@ -495,42 +472,43 @@ Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x' </p> <!-- According to Nong, it's a bug that so many integer digits can be converted to a DECIMAL value with small (s,p) spec. So expect to re-do this example. 
--> -<codeblock>[localhost:21000] > select cast(1 as decimal(1,0)); +<codeblock><![CDATA[[localhost:21000] > select cast(1 as decimal(1,0)); +-------------------------+ | cast(1 as decimal(1,0)) | +-------------------------+ | 1 | +-------------------------+ -[localhost:21000] > select cast(9 as decimal(1,0)); +[localhost:21000] > select cast(9 as decimal(1,0)); +-------------------------+ | cast(9 as decimal(1,0)) | +-------------------------+ | 9 | +-------------------------+ -[localhost:21000] > select cast(10 as decimal(1,0)); +[localhost:21000] > select cast(10 as decimal(1,0)); +--------------------------+ | cast(10 as decimal(1,0)) | +--------------------------+ | 10 | +--------------------------+ -[localhost:21000] > select cast(10 as decimal(1,1)); +[localhost:21000] > select cast(10 as decimal(1,1)); +--------------------------+ | cast(10 as decimal(1,1)) | +--------------------------+ | 10.0 | +--------------------------+ -[localhost:21000] > select cast(100 as decimal(1,1)); +[localhost:21000] > select cast(100 as decimal(1,1)); +---------------------------+ | cast(100 as decimal(1,1)) | +---------------------------+ | 100.0 | +---------------------------+ -[localhost:21000] > select cast(1000 as decimal(1,1)); +[localhost:21000] > select cast(1000 as decimal(1,1)); +----------------------------+ | cast(1000 as decimal(1,1)) | +----------------------------+ | 1000.0 | +----------------------------+ +]]> </codeblock> </li> @@ -539,10 +517,10 @@ Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x' When a <codeph>DECIMAL</codeph> value is converted to any of the integer types, any fractional part is truncated (that is, rounded towards zero): </p> -<codeblock>[localhost:21000] > create table num_dec_days (x decimal(4,1)); -[localhost:21000] > insert into num_dec_days values (1), (2), (cast(4.5 as decimal(4,1))); -[localhost:21000] > insert into num_dec_days values (cast(0.1 as decimal(4,1))), (cast(.9 as decimal(4,1))), (cast(9.1 as decimal(4,1))), (cast(9.9 as decimal(4,1))); -[localhost:21000] > select cast(x as int) from num_dec_days; +<codeblock><![CDATA[[localhost:21000] > create table num_dec_days (x decimal(4,1)); +[localhost:21000] > insert into num_dec_days values (1), (2), (cast(4.5 as decimal(4,1))); +[localhost:21000] > insert into num_dec_days values (cast(0.1 as decimal(4,1))), (cast(.9 as decimal(4,1))), (cast(9.1 as decimal(4,1))), (cast(9.9 as decimal(4,1))); +[localhost:21000] > select cast(x as int) from num_dec_days; +----------------+ | cast(x as int) | +----------------+ @@ -554,6 +532,7 @@ Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x' | 9 | | 9 | +----------------+ +]]> </codeblock> </li> @@ -564,17 +543,17 @@ Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x' representation using a two-step process, by converting it to an integer value and then using that result in a call to a date and time function such as <codeph>from_unixtime()</codeph>. 
</p> -<codeblock>[localhost:21000] > select from_unixtime(cast(cast(1000.0 as decimal) as bigint)); +<codeblock><![CDATA[[localhost:21000] > select from_unixtime(cast(cast(1000.0 as decimal) as bigint)); +-------------------------------------------------------------+ | from_unixtime(cast(cast(1000.0 as decimal(9,0)) as bigint)) | +-------------------------------------------------------------+ | 1970-01-01 00:16:40 | +-------------------------------------------------------------+ -[localhost:21000] > select now() + interval cast(x as int) days from num_dec_days; -- x is a DECIMAL column. +[localhost:21000] > select now() + interval cast(x as int) days from num_dec_days; -- x is a DECIMAL column. -[localhost:21000] > create table num_dec_days (x decimal(4,1)); -[localhost:21000] > insert into num_dec_days values (1), (2), (cast(4.5 as decimal(4,1))); -[localhost:21000] > select now() + interval cast(x as int) days from num_dec_days; -- The 4.5 value is truncated to 4 and becomes '4 days'. +[localhost:21000] > create table num_dec_days (x decimal(4,1)); +[localhost:21000] > insert into num_dec_days values (1), (2), (cast(4.5 as decimal(4,1))); +[localhost:21000] > select now() + interval cast(x as int) days from num_dec_days; -- The 4.5 value is truncated to 4 and becomes '4 days'. +--------------------------------------+ | now() + interval cast(x as int) days | +--------------------------------------+ @@ -582,6 +561,7 @@ Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x' | 2014-05-14 23:11:55.163284000 | | 2014-05-16 23:11:55.163284000 | +--------------------------------------+ +]]> </codeblock> </li> @@ -748,9 +728,9 @@ Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x' scale, they are returned correctly by a query. Any values that do not fit within the new precision and scale are returned as <codeph>NULL</codeph>, and Impala reports the conversion error. Leading zeros do not count against the precision value, but trailing zeros after the decimal point do. 
-<codeblock>[localhost:21000] > create table text_decimals (x string); -[localhost:21000] > insert into text_decimals values ("1"), ("2"), ("99.99"), ("1.234"), ("000001"), ("1.000000000"); -[localhost:21000] > select * from text_decimals; +<codeblock><![CDATA[[localhost:21000] > create table text_decimals (x string); +[localhost:21000] > insert into text_decimals values ("1"), ("2"), ("99.99"), ("1.234"), ("000001"), ("1.000000000"); +[localhost:21000] > select * from text_decimals; +-------------+ | x | +-------------+ @@ -761,8 +741,8 @@ Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x' | 000001 | | 1.000000000 | +-------------+ -[localhost:21000] > alter table text_decimals replace columns (x decimal(4,2)); -[localhost:21000] > select * from text_decimals; +[localhost:21000] > alter table text_decimals replace columns (x decimal(4,2)); +[localhost:21000] > select * from text_decimals; +-------+ | x | +-------+ @@ -782,6 +762,7 @@ Error converting column: 0 TO DECIMAL(4, 2) (Data is: 1.000000000) file: hdfs://127.0.0.1:8020/user/hive/warehouse/decimal_testing.db/text_decimals/cd40dc68e20 c565a-cc4bd86c724c96ba_311873428_data.0 record: 1.000000000 +]]> </codeblock> </li> @@ -807,7 +788,7 @@ SELECT CAST(1000.5 AS DECIMAL); <p conref="../shared/impala_common.xml#common/decimal_no_stats"/> -<!-- <p conref="/Content/impala_common_xi44078.xml#common/partitioning_good"/> --> +<!-- <p conref="../shared/impala_common.xml#common/partitioning_good"/> --> <p conref="../shared/impala_common.xml#common/hbase_ok"/> @@ -815,11 +796,11 @@ SELECT CAST(1000.5 AS DECIMAL); <p conref="../shared/impala_common.xml#common/text_bulky"/> -<!-- <p conref="/Content/impala_common_xi44078.xml#common/compatibility_blurb"/> --> +<!-- <p conref="../shared/impala_common.xml#common/compatibility_blurb"/> --> -<!-- <p conref="/Content/impala_common_xi44078.xml#common/internals_blurb"/> --> +<!-- <p conref="../shared/impala_common.xml#common/internals_blurb"/> --> -<!-- <p conref="/Content/impala_common_xi44078.xml#common/added_in_20"/> --> +<!-- <p conref="../shared/impala_common.xml#common/added_in_20"/> --> <p conref="../shared/impala_common.xml#common/column_stats_constant"/>
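
To summarize the DECIMAL conversion rules documented in the impala_decimal.xml changes above, here is a minimal illustrative impala-shell sketch (not part of the patch; every behavior shown is taken from the examples in that topic):

-- Casting DECIMAL to an integer type truncates the fraction (rounds toward zero).
SELECT CAST(CAST(9.9 AS DECIMAL(4,1)) AS INT);    -- returns 9
-- A value that exceeds the target precision becomes NULL, with a runtime warning.
SELECT CAST(1234 AS DECIMAL(3,0));                -- returns NULL ("Expression overflowed")
-- A STRING cast must fit within both the precision and the scale.
SELECT CAST('98.6' AS DECIMAL(3,1));              -- returns 98.6
SELECT CAST('98.60000' AS DECIMAL(15,1));         -- returns NULL: trailing zeros exceed scale 1
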
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_default_order_by_limit.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_default_order_by_limit.xml b/docs/topics/impala_default_order_by_limit.xml index def0335..94f6899 100644 --- a/docs/topics/impala_default_order_by_limit.xml +++ b/docs/topics/impala_default_order_by_limit.xml @@ -3,10 +3,13 @@ <concept rev="obwl" id="default_order_by_limit"> <title>DEFAULT_ORDER_BY_LIMIT Query Option</title> + <titlealts audience="PDF"><navtitle>DEFAULT_ORDER_BY_LIMIT</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_delete.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_delete.xml b/docs/topics/impala_delete.xml index fcac5e4..997bd49 100644 --- a/docs/topics/impala_delete.xml +++ b/docs/topics/impala_delete.xml @@ -2,8 +2,8 @@ <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> <concept id="delete"> - <title>DELETE Statement (CDH 5.5 and higher only)</title> - <titlealts><navtitle>DELETE</navtitle></titlealts> + <title>DELETE Statement (CDH 5.10 or higher only)</title> + <titlealts audience="PDF"><navtitle>DELETE</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -12,6 +12,7 @@ <data name="Category" value="ETL"/> <data name="Category" value="Ingest"/> <data name="Category" value="DML"/> + <data name="Category" value="Developers"/> <data name="Category" value="Data Analysts"/> </metadata> </prolog> @@ -31,7 +32,7 @@ <codeblock> </codeblock> - <p rev="kudu" audience="impala_next"> + <p rev="kudu"> Normally, a <codeph>DELETE</codeph> operation for a Kudu table fails if some partition key columns are not found, due to their being deleted or changed by a concurrent <codeph>UPDATE</codeph> or <codeph>DELETE</codeph> operation. http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_describe.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_describe.xml b/docs/topics/impala_describe.xml index ffdb505..adff870 100644 --- a/docs/topics/impala_describe.xml +++ b/docs/topics/impala_describe.xml @@ -3,12 +3,14 @@ <concept id="describe"> <title id="desc">DESCRIBE Statement</title> - <titlealts><navtitle>DESCRIBE</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>DESCRIBE</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> <data name="Category" value="Impala Data Types"/> <data name="Category" value="SQL"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> <data name="Category" value="Tables"/> <data name="Category" value="Reports"/> <data name="Category" value="Schemas"/> @@ -20,10 +22,21 @@ <p> <indexterm audience="Cloudera">DESCRIBE statement</indexterm> The <codeph>DESCRIBE</codeph> statement displays metadata about a table, such as the column names and their - data types. Its syntax is: + data types. + <ph rev="2.3.0">In CDH 5.5 / Impala 2.3 and higher, you can specify the name of a complex type column, which takes + the form of a dotted path. 
The path might include multiple components in the case of a nested type definition.</ph> + <ph rev="2.5.0">In CDH 5.7 / Impala 2.5 and higher, the <codeph>DESCRIBE DATABASE</codeph> form can display + information about a database.</ph> </p> -<codeblock rev="2.3.0">DESCRIBE [FORMATTED] [<varname>db_name</varname>.]<varname>table_name</varname>[.<varname>complex_col_name</varname> ...]</codeblock> + <p conref="../shared/impala_common.xml#common/syntax_blurb"/> + +<codeblock rev="2.5.0">DESCRIBE [DATABASE] [FORMATTED|EXTENDED] <varname>object_name</varname> + +object_name ::= + [<varname>db_name</varname>.]<varname>table_name</varname>[.<varname>complex_col_name</varname> ...] + | <varname>db_name</varname> +</codeblock> <p> You can use the abbreviation <codeph>DESC</codeph> for the <codeph>DESCRIBE</codeph> statement. @@ -42,9 +55,128 @@ session that loads data and are not stored persistently with the table metadata. </note> +<p rev="2.5.0 IMPALA-2196"> + <b>Describing databases:</b> +</p> + +<p rev="2.5.0"> + By default, the <codeph>DESCRIBE</codeph> output for a database includes the location + and the comment, which can be set by the <codeph>LOCATION</codeph> and <codeph>COMMENT</codeph> + clauses on the <codeph>CREATE DATABASE</codeph> statement. +</p> + +<p rev="2.5.0"> + The additional information displayed by the <codeph>FORMATTED</codeph> or <codeph>EXTENDED</codeph> + keyword includes the HDFS user ID that is considered the owner of the database, and any + optional database properties. The properties could be specified by the <codeph>WITH DBPROPERTIES</codeph> + clause if the database is created using a Hive <codeph>CREATE DATABASE</codeph> statement. + Impala currently does not set or do any special processing based on those properties. +</p> + +<p rev="2.5.0"> +The following examples show the variations in syntax and output for +describing databases. This feature is available in CDH 5.7 / Impala 2.5 +and higher. +</p> + +<codeblock rev="2.5.0"> +describe database default; ++---------+----------------------+-----------------------+ +| name | location | comment | ++---------+----------------------+-----------------------+ +| default | /user/hive/warehouse | Default Hive database | ++---------+----------------------+-----------------------+ + +describe database formatted default; ++---------+----------------------+-----------------------+ +| name | location | comment | ++---------+----------------------+-----------------------+ +| default | /user/hive/warehouse | Default Hive database | +| Owner: | | | +| | public | ROLE | ++---------+----------------------+-----------------------+ + +describe database extended default; ++---------+----------------------+-----------------------+ +| name | location | comment | ++---------+----------------------+-----------------------+ +| default | /user/hive/warehouse | Default Hive database | +| Owner: | | | +| | public | ROLE | ++---------+----------------------+-----------------------+ +</codeblock> + +<p> + <b>Describing tables:</b> +</p> + +<p> + If the <codeph>DATABASE</codeph> keyword is omitted, the default + for the <codeph>DESCRIBE</codeph> statement is to refer to a table. +</p> + +<codeblock> +-- By default, the table is assumed to be in the current database. +describe my_table; ++------+--------+---------+ +| name | type | comment | ++------+--------+---------+ +| x | int | | +| s | string | | ++------+--------+---------+ + +-- Use a fully qualified table name to specify a table in any database. 
+describe my_database.my_table; ++------+--------+---------+ +| name | type | comment | ++------+--------+---------+ +| x | int | | +| s | string | | ++------+--------+---------+ + +-- The formatted or extended output includes additional useful information. +-- The LOCATION field is especially useful to know for DDL statements and HDFS commands +-- during ETL jobs. (The LOCATION includes a full hdfs:// URL, omitted here for readability.) +describe formatted my_table; ++------------------------------+----------------------------------------------+----------------------+ +| name | type | comment | ++------------------------------+----------------------------------------------+----------------------+ +| # col_name | data_type | comment | +| | NULL | NULL | +| x | int | NULL | +| s | string | NULL | +| | NULL | NULL | +| # Detailed Table Information | NULL | NULL | +| Database: | my_database | NULL | +| Owner: | jrussell | NULL | +| CreateTime: | Fri Mar 18 15:58:00 PDT 2016 | NULL | +| LastAccessTime: | UNKNOWN | NULL | +| Protect Mode: | None | NULL | +| Retention: | 0 | NULL | +| Location: | /user/hive/warehouse/my_database.db/my_table | NULL | +| Table Type: | MANAGED_TABLE | NULL | +| Table Parameters: | NULL | NULL | +| | transient_lastDdlTime | 1458341880 | +| | NULL | NULL | +| # Storage Information | NULL | NULL | +| SerDe Library: | org. ... .LazySimpleSerDe | NULL | +| InputFormat: | org.apache.hadoop.mapred.TextInputFormat | NULL | +| OutputFormat: | org. ... .HiveIgnoreKeyTextOutputFormat | NULL | +| Compressed: | No | NULL | +| Num Buckets: | 0 | NULL | +| Bucket Columns: | [] | NULL | +| Sort Columns: | [] | NULL | ++------------------------------+----------------------------------------------+----------------------+ +</codeblock> + <p conref="../shared/impala_common.xml#common/complex_types_blurb"/> <p rev="2.3.0"> + Because the column definitions for complex types can become long, particularly when such types are nested, + the <codeph>DESCRIBE</codeph> statement uses special formatting for complex type columns to make the output readable. + </p> + + <p rev="2.3.0"> For the <codeph>ARRAY</codeph>, <codeph>STRUCT</codeph>, and <codeph>MAP</codeph> types available in CDH 5.5 / Impala 2.3 and higher, the <codeph>DESCRIBE</codeph> output is formatted to avoid excessively long lines for multiple fields within a <codeph>STRUCT</codeph>, or a nested sequence of @@ -115,7 +247,7 @@ describe t1; </p> </li> </ul> - + <codeblock rev="2.3.0"><![CDATA[ -- #1: The overall layout of the entire table. describe region; @@ -341,7 +473,7 @@ describe customer.c_orders.o_lineitems.item; <p> When you are dealing with data files stored in HDFS, sometimes it is important to know details such as the - path of the data files for an Impala table, and the host name for the namenode. You can get this information + path of the data files for an Impala table, and the hostname for the namenode. You can get this information from the <codeph>DESCRIBE FORMATTED</codeph> output. You specify HDFS URIs or path specifications with statements such as <codeph>LOAD DATA</codeph> and the <codeph>LOCATION</codeph> clause of <codeph>CREATE TABLE</codeph> or <codeph>ALTER TABLE</codeph>. You might also use HDFS URIs or paths with Linux commands @@ -362,13 +494,6 @@ run <codeph>COMPUTE STATS</codeph>. See <xref href="impala_show.xml#show"/> for details. 
</p> -<p conref="../shared/impala_common.xml#common/complex_types_blurb"/> - -<p rev="2.3.0"> - Because the column definitions for complex types can become long, particularly when such types are nested, - the <codeph>DESCRIBE</codeph> statement uses special formatting for complex type columns to make the output readable. -</p> - <note conref="../shared/impala_common.xml#common/compute_stats_next"/> <p conref="../shared/impala_common.xml#common/example_blurb"/> @@ -378,34 +503,34 @@ run <codeph>COMPUTE STATS</codeph>. FORMATTED</codeph> for different kinds of schema objects: </p> -<ul> - <li> - <codeph>DESCRIBE</codeph> for a table or a view returns the name, type, and comment for each of the - columns. For a view, if the column value is computed by an expression, the column name is automatically - generated as <codeph>_c0</codeph>, <codeph>_c1</codeph>, and so on depending on the ordinal number of the - column. - </li> - - <li> - A table created with no special format or storage clauses is designated as a <codeph>MANAGED_TABLE</codeph> - (an <q>internal table</q> in Impala terminology). Its data files are stored in an HDFS directory under the - default Hive data directory. By default, it uses Text data format. - </li> - - <li> - A view is designated as <codeph>VIRTUAL_VIEW</codeph> in <codeph>DESCRIBE FORMATTED</codeph> output. Some - of its properties are <codeph>NULL</codeph> or blank because they are inherited from the base table. The - text of the query that defines the view is part of the <codeph>DESCRIBE FORMATTED</codeph> output. - </li> - - <li> - A table with additional clauses in the <codeph>CREATE TABLE</codeph> statement has differences in - <codeph>DESCRIBE FORMATTED</codeph> output. The output for <codeph>T2</codeph> includes the - <codeph>EXTERNAL_TABLE</codeph> keyword because of the <codeph>CREATE EXTERNAL TABLE</codeph> syntax, and - different <codeph>InputFormat</codeph> and <codeph>OutputFormat</codeph> fields to reflect the Parquet file - format. - </li> - </ul> + <ul> + <li> + <codeph>DESCRIBE</codeph> for a table or a view returns the name, type, and comment for each of the + columns. For a view, if the column value is computed by an expression, the column name is automatically + generated as <codeph>_c0</codeph>, <codeph>_c1</codeph>, and so on depending on the ordinal number of the + column. + </li> + + <li> + A table created with no special format or storage clauses is designated as a <codeph>MANAGED_TABLE</codeph> + (an <q>internal table</q> in Impala terminology). Its data files are stored in an HDFS directory under the + default Hive data directory. By default, it uses Text data format. + </li> + + <li> + A view is designated as <codeph>VIRTUAL_VIEW</codeph> in <codeph>DESCRIBE FORMATTED</codeph> output. Some + of its properties are <codeph>NULL</codeph> or blank because they are inherited from the base table. The + text of the query that defines the view is part of the <codeph>DESCRIBE FORMATTED</codeph> output. + </li> + + <li> + A table with additional clauses in the <codeph>CREATE TABLE</codeph> statement has differences in + <codeph>DESCRIBE FORMATTED</codeph> output. The output for <codeph>T2</codeph> includes the + <codeph>EXTERNAL_TABLE</codeph> keyword because of the <codeph>CREATE EXTERNAL TABLE</codeph> syntax, and + different <codeph>InputFormat</codeph> and <codeph>OutputFormat</codeph> fields to reflect the Parquet file + format. 
+ </li> + </ul> <codeblock>[localhost:21000] > create table t1 (x int, y int, s string); Query: create table t1 (x int, y int, s string) @@ -423,36 +548,39 @@ Returned 3 row(s) in 0.13s [localhost:21000] > describe formatted t1; Query: describe formatted t1 Query finished, fetching results ... -+------------------------------+--------------------------------------------------------------------+----------------------+ -| name | type | comment | -+------------------------------+--------------------------------------------------------------------+----------------------+ -| # col_name | data_type | comment | -| | NULL | NULL | -| x | int | None | -| y | int | None | -| s | string | None | -| | NULL | NULL | -| # Detailed Table Information | NULL | NULL | -| Database: | describe_formatted | NULL | -| Owner: | cloudera | NULL | -| CreateTime: | Mon Jul 22 17:03:16 EDT 2013 | NULL | -| LastAccessTime: | UNKNOWN | NULL | -| Protect Mode: | None | NULL | -| Retention: | 0 | NULL | -| Location: | hdfs://127.0.0.1:8020/user/hive/warehouse/describe_formatted.db/t1 | NULL | -| Table Type: | MANAGED_TABLE | NULL | -| Table Parameters: | NULL | NULL | -| | transient_lastDdlTime | 1374526996 | -| | NULL | NULL | -| # Storage Information | NULL | NULL | -| SerDe Library: | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL | -| InputFormat: | org.apache.hadoop.mapred.TextInputFormat | NULL | -| OutputFormat: | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL | -| Compressed: | No | NULL | -| Num Buckets: | 0 | NULL | -| Bucket Columns: | [] | NULL | -| Sort Columns: | [] | NULL | -+------------------------------+--------------------------------------------------------------------+----------------------+ ++------------------------------+--------------------------------------------+------------+ +| name | type | comment | ++------------------------------+--------------------------------------------+------------+ +| # col_name | data_type | comment | +| | NULL | NULL | +| x | int | None | +| y | int | None | +| s | string | None | +| | NULL | NULL | +| # Detailed Table Information | NULL | NULL | +| Database: | describe_formatted | NULL | +| Owner: | cloudera | NULL | +| CreateTime: | Mon Jul 22 17:03:16 EDT 2013 | NULL | +| LastAccessTime: | UNKNOWN | NULL | +| Protect Mode: | None | NULL | +| Retention: | 0 | NULL | +| Location: | hdfs://127.0.0.1:8020/user/hive/warehouse/ | | +| | describe_formatted.db/t1 | NULL | +| Table Type: | MANAGED_TABLE | NULL | +| Table Parameters: | NULL | NULL | +| | transient_lastDdlTime | 1374526996 | +| | NULL | NULL | +| # Storage Information | NULL | NULL | +| SerDe Library: | org.apache.hadoop.hive.serde2.lazy. | | +| | LazySimpleSerDe | NULL | +| InputFormat: | org.apache.hadoop.mapred.TextInputFormat | NULL | +| OutputFormat: | org.apache.hadoop.hive.ql.io. | | +| | HiveIgnoreKeyTextOutputFormat | NULL | +| Compressed: | No | NULL | +| Num Buckets: | 0 | NULL | +| Bucket Columns: | [] | NULL | +| Sort Columns: | [] | NULL | ++------------------------------+--------------------------------------------+------------+ Returned 26 row(s) in 0.03s [localhost:21000] > create view v1 as select x, upper(s) from t1; Query: create view v1 as select x, upper(s) from t1 @@ -506,37 +634,37 @@ Returned 28 row(s) in 0.03s [localhost:21000] > describe formatted t2; Query: describe formatted t2 Query finished, fetching results ... 
-+------------------------------+----------------------------------------------------+----------------------+ -| name | type | comment | -+------------------------------+----------------------------------------------------+----------------------+ -| # col_name | data_type | comment | -| | NULL | NULL | -| x | int | None | -| y | int | None | -| s | string | None | -| | NULL | NULL | -| # Detailed Table Information | NULL | NULL | -| Database: | describe_formatted | NULL | -| Owner: | cloudera | NULL | -| CreateTime: | Mon Jul 22 17:01:47 EDT 2013 | NULL | -| LastAccessTime: | UNKNOWN | NULL | -| Protect Mode: | None | NULL | -| Retention: | 0 | NULL | -| Location: | hdfs://127.0.0.1:8020/user/cloudera/sample_data | NULL | -| Table Type: | EXTERNAL_TABLE | NULL | -| Table Parameters: | NULL | NULL | -| | EXTERNAL | TRUE | -| | transient_lastDdlTime | 1374526907 | -| | NULL | NULL | -| # Storage Information | NULL | NULL | -| SerDe Library: | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL | -| InputFormat: | com.cloudera.impala.hive.serde.ParquetInputFormat | NULL | -| OutputFormat: | com.cloudera.impala.hive.serde.ParquetOutputFormat | NULL | -| Compressed: | No | NULL | -| Num Buckets: | 0 | NULL | -| Bucket Columns: | [] | NULL | -| Sort Columns: | [] | NULL | -+------------------------------+----------------------------------------------------+----------------------+ ++------------------------------+----------------------------------------------------+------------+ +| name | type | comment | ++------------------------------+----------------------------------------------------+------------+ +| # col_name | data_type | comment | +| | NULL | NULL | +| x | int | None | +| y | int | None | +| s | string | None | +| | NULL | NULL | +| # Detailed Table Information | NULL | NULL | +| Database: | describe_formatted | NULL | +| Owner: | cloudera | NULL | +| CreateTime: | Mon Jul 22 17:01:47 EDT 2013 | NULL | +| LastAccessTime: | UNKNOWN | NULL | +| Protect Mode: | None | NULL | +| Retention: | 0 | NULL | +| Location: | hdfs://127.0.0.1:8020/user/cloudera/sample_data | NULL | +| Table Type: | EXTERNAL_TABLE | NULL | +| Table Parameters: | NULL | NULL | +| | EXTERNAL | TRUE | +| | transient_lastDdlTime | 1374526907 | +| | NULL | NULL | +| # Storage Information | NULL | NULL | +| SerDe Library: | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL | +| InputFormat: | com.cloudera.impala.hive.serde.ParquetInputFormat | NULL | +| OutputFormat: | com.cloudera.impala.hive.serde.ParquetOutputFormat | NULL | +| Compressed: | No | NULL | +| Num Buckets: | 0 | NULL | +| Bucket Columns: | [] | NULL | +| Sort Columns: | [] | NULL | ++------------------------------+----------------------------------------------------+------------+ Returned 27 row(s) in 0.17s</codeblock> <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_disable_codegen.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_disable_codegen.xml b/docs/topics/impala_disable_codegen.xml index 844d49d..bcc5787 100644 --- a/docs/topics/impala_disable_codegen.xml +++ b/docs/topics/impala_disable_codegen.xml @@ -3,11 +3,13 @@ <concept id="disable_codegen"> <title>DISABLE_CODEGEN Query Option</title> + <titlealts audience="PDF"><navtitle>DISABLE_CODEGEN</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> <data name="Category" value="Impala Query 
Options"/> <data name="Category" value="Troubleshooting"/> + <data name="Category" value="Performance"/> </metadata> </prolog> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_disable_unsafe_spills.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_disable_unsafe_spills.xml b/docs/topics/impala_disable_unsafe_spills.xml index f251d65..17ad2e1 100644 --- a/docs/topics/impala_disable_unsafe_spills.xml +++ b/docs/topics/impala_disable_unsafe_spills.xml @@ -2,19 +2,23 @@ <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> <concept rev="2.0.0" id="disable_unsafe_spills"> - <title>DISABLE_UNSAFE_SPILLS Query Option</title> + <title>DISABLE_UNSAFE_SPILLS Query Option (CDH 5.2 or higher only)</title> + <titlealts audience="PDF"><navtitle>DISABLE_UNSAFE_SPILLS</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Performance"/> <data name="Category" value="Scalability"/> <data name="Category" value="Memory"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> <conbody> - <p> + <p rev="2.0.0"> <indexterm audience="Cloudera">DISABLE_UNSAFE_SPILLS query option</indexterm> Enable this option if you prefer to have queries fail when they exceed the Impala memory limit, rather than write temporary data to disk. @@ -44,5 +48,6 @@ <p conref="../shared/impala_common.xml#common/type_boolean"/> <p conref="../shared/impala_common.xml#common/default_false_0"/> + <p conref="../shared/impala_common.xml#common/added_in_20"/> </conbody> </concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_disk_space.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_disk_space.xml b/docs/topics/impala_disk_space.xml index 8bc3ca8..b6daaeb 100644 --- a/docs/topics/impala_disk_space.xml +++ b/docs/topics/impala_disk_space.xml @@ -4,7 +4,17 @@ <title>Managing Disk Space for Impala Data</title> <titlealts audience="PDF"><navtitle>Managing Disk Space</navtitle></titlealts> - + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Disk Storage"/> + <data name="Category" value="Administrators"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + <data name="Category" value="Tables"/> + <data name="Category" value="Compression"/> + </metadata> + </prolog> <conbody> @@ -14,6 +24,106 @@ to minimize space consumption and file duplication. </p> - + <ul> + <li> + <p> + Use compact binary file formats where practical. Numeric and time-based data in particular can be stored + in more compact form in binary data files. Depending on the file format, various compression and encoding + features can reduce file size even further. You can specify the <codeph>STORED AS</codeph> clause as part + of the <codeph>CREATE TABLE</codeph> statement, or <codeph>ALTER TABLE</codeph> with the <codeph>SET + FILEFORMAT</codeph> clause for an existing table or partition within a partitioned table. See + <xref href="impala_file_formats.xml#file_formats"/> for details about file formats, especially + <xref href="impala_parquet.xml#parquet"/>. See <xref href="impala_create_table.xml#create_table"/> and + <xref href="impala_alter_table.xml#alter_table"/> for syntax details. 
+ </p> + </li> + + <li> + <p> + You manage underlying data files differently depending on whether the corresponding Impala table is + defined as an <xref href="impala_tables.xml#internal_tables">internal</xref> or + <xref href="impala_tables.xml#external_tables">external</xref> table: + </p> + <ul> + <li> + Use the <codeph>DESCRIBE FORMATTED</codeph> statement to check if a particular table is internal + (managed by Impala) or external, and to see the physical location of the data files in HDFS. See + <xref href="impala_describe.xml#describe"/> for details. + </li> + + <li> + For Impala-managed (<q>internal</q>) tables, use <codeph>DROP TABLE</codeph> statements to remove + data files. See <xref href="impala_drop_table.xml#drop_table"/> for details. + </li> + + <li> + For tables not managed by Impala (<q>external</q> tables), use appropriate HDFS-related commands such + as <codeph>hadoop fs</codeph>, <codeph>hdfs dfs</codeph>, or <codeph>distcp</codeph>, to create, move, + copy, or delete files within HDFS directories that are accessible by the <codeph>impala</codeph> user. + Issue a <codeph>REFRESH <varname>table_name</varname></codeph> statement after adding or removing any + files from the data directory of an external table. See <xref href="impala_refresh.xml#refresh"/> for + details. + </li> + + <li> + Use external tables to reference HDFS data files in their original location. With this technique, you + avoid copying the files, and you can map more than one Impala table to the same set of data files. When + you drop the Impala table, the data files are left undisturbed. See + <xref href="impala_tables.xml#external_tables"/> for details. + </li> + + <li> + Use the <codeph>LOAD DATA</codeph> statement to move HDFS files into the data directory for an Impala + table from inside Impala, without the need to specify the HDFS path of the destination directory. This + technique works for both internal and external tables. See + <xref href="impala_load_data.xml#load_data"/> for details. + </li> + </ul> + </li> + + <li> + <p> + Make sure that the HDFS trashcan is configured correctly. When you remove files from HDFS, the space + might not be reclaimed for use by other files until sometime later, when the trashcan is emptied. See + <xref href="impala_drop_table.xml#drop_table"/> and the FAQ entry + <xref href="impala_faq.xml#faq_sql/faq_drop_table_space"/> for details. See + <xref href="impala_prereqs.xml#prereqs_account"/> for permissions needed for the HDFS trashcan to operate + correctly. + </p> + </li> + + <li> + <p> + Drop all tables in a database before dropping the database itself. See + <xref href="impala_drop_database.xml#drop_database"/> for details. + </p> + </li> + + <li> + <p> + Clean up temporary files after failed <codeph>INSERT</codeph> statements. If an <codeph>INSERT</codeph> + statement encounters an error, and you see a directory named <filepath>.impala_insert_staging</filepath> + or <filepath>_impala_insert_staging</filepath> left behind in the data directory for the table, it might + contain temporary data files taking up space in HDFS. You might be able to salvage these data files, for + example if they are complete but could not be moved into place due to a permission error. Or, you might + delete those files through commands such as <codeph>hadoop fs</codeph> or <codeph>hdfs dfs</codeph>, to + reclaim space before re-trying the <codeph>INSERT</codeph>. 
Issue <codeph>DESCRIBE FORMATTED + <varname>table_name</varname></codeph> to see the HDFS path where you can check for temporary files. + </p> + </li> + + <li rev="1.4.0"> + <p rev="obwl" conref="../shared/impala_common.xml#common/order_by_scratch_dir"/> + </li> + + <li rev="2.2.0"> + <p> + If you use the Amazon Simple Storage Service (S3) as a place to offload + data to reduce the volume of local storage, Impala 2.2.0 and higher + can query the data directly from S3. + See <xref href="impala_s3.xml#s3"/> for details. + </p> + </li> + </ul> </conbody> </concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_distinct.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_distinct.xml b/docs/topics/impala_distinct.xml index d49e400..8661831 100644 --- a/docs/topics/impala_distinct.xml +++ b/docs/topics/impala_distinct.xml @@ -9,6 +9,8 @@ <data name="Category" value="SQL"/> <data name="Category" value="Querying"/> <data name="Category" value="Aggregate Functions"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_dml.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_dml.xml b/docs/topics/impala_dml.xml index 66d4022..ecce473 100644 --- a/docs/topics/impala_dml.xml +++ b/docs/topics/impala_dml.xml @@ -25,7 +25,7 @@ </p> <ul> - <li audience="impala_next"> + <li audience="Cloudera"> <xref href="impala_delete.xml#delete"/>; works for Kudu tables only </li> @@ -37,7 +37,7 @@ <xref href="impala_load_data.xml#load_data"/> </li> - <li audience="impala_next"> + <li audience="Cloudera"> <xref href="impala_update.xml#update"/>; works for Kudu tables only </li> </ul> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_double.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_double.xml b/docs/topics/impala_double.xml index f1d1756..c69eae2 100644 --- a/docs/topics/impala_double.xml +++ b/docs/topics/impala_double.xml @@ -3,7 +3,7 @@ <concept id="double"> <title>DOUBLE Data Type</title> - <titlealts><navtitle>DOUBLE</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>DOUBLE</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -74,11 +74,11 @@ SELECT CAST(1000.5 AS DOUBLE); <p conref="../shared/impala_common.xml#common/text_bulky"/> -<!-- <p conref="/Content/impala_common_xi44078.xml#common/compatibility_blurb"/> --> +<!-- <p conref="../shared/impala_common.xml#common/compatibility_blurb"/> --> <p conref="../shared/impala_common.xml#common/internals_8_bytes"/> -<!-- <p conref="/Content/impala_common_xi44078.xml#common/added_in_20"/> --> +<!-- <p conref="../shared/impala_common.xml#common/added_in_20"/> --> <p conref="../shared/impala_common.xml#common/column_stats_constant"/> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_drop_database.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_drop_database.xml b/docs/topics/impala_drop_database.xml index c6a1b64..fca7a60 100644 --- a/docs/topics/impala_drop_database.xml +++ b/docs/topics/impala_drop_database.xml @@ -3,7 +3,7 @@ <concept id="drop_database"> <title>DROP DATABASE Statement</title> - <titlealts><navtitle>DROP 
DATABASE</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>DROP DATABASE</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -11,6 +11,8 @@ <data name="Category" value="Databases"/> <data name="Category" value="DDL"/> <data name="Category" value="Schemas"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> @@ -70,7 +72,7 @@ </li> <li> <p> - To keep tables or views contained by a database while removing the database itself, use + To keep tables or views contained by a database while removing the database itself, use <codeph>ALTER TABLE</codeph> and <codeph>ALTER VIEW</codeph> to move the relevant objects to a different database before dropping the original database. </p> @@ -101,6 +103,10 @@ DATABASE</codeph>, <codeph>USE</codeph>, and <codeph>DROP DATABASE</codeph>. </p> + <p conref="../shared/impala_common.xml#common/s3_blurb"/> + + <p conref="../shared/impala_common.xml#common/s3_ddl"/> + <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/> <p conref="../shared/impala_common.xml#common/permissions_blurb"/> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_drop_function.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_drop_function.xml b/docs/topics/impala_drop_function.xml index 51a4d90..0f6c33b 100644 --- a/docs/topics/impala_drop_function.xml +++ b/docs/topics/impala_drop_function.xml @@ -3,7 +3,7 @@ <concept rev="1.2" id="drop_function"> <title>DROP FUNCTION Statement</title> - <titlealts><navtitle>DROP FUNCTION</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>DROP FUNCTION</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -12,6 +12,8 @@ <data name="Category" value="Impala Functions"/> <data name="Category" value="UDFs"/> <data name="Category" value="Schemas"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> @@ -25,8 +27,36 @@ <p conref="../shared/impala_common.xml#common/syntax_blurb"/> + <p> + To drop C++ UDFs and UDAs: + </p> + <codeblock>DROP [AGGREGATE] FUNCTION [IF EXISTS] [<varname>db_name</varname>.]<varname>function_name</varname>(<varname>type</varname>[, <varname>type</varname>...])</codeblock> + <note rev="2.5.0 IMPALA-2843 CDH-39148"> + <p rev="2.5.0 IMPALA-2843 CDH-39148"> + The preceding syntax, which includes the function signature, also applies to Java UDFs that were created + using the corresponding <codeph>CREATE FUNCTION</codeph> syntax that includes the argument and return types. + After upgrading to CDH 5.7 / Impala 2.5 or higher, consider re-creating all Java UDFs with the + <codeph>CREATE FUNCTION</codeph> syntax that does not include the function signature. Java UDFs created this + way are now persisted in the metastore database and do not need to be re-created after an Impala restart. 
+ </p> + </note> + + <p rev="2.5.0 IMPALA-2843 CDH-39148"> + To drop Java UDFs (created using the <codeph>CREATE FUNCTION</codeph> syntax with no function signature): + </p> + +<codeblock rev="2.5.0">DROP FUNCTION [IF EXISTS] [<varname>db_name</varname>.]<varname>function_name</varname></codeblock> + +<!-- +Examples: +CREATE FUNCTION IF NOT EXISTS foo location '/path/to/jar' SYMBOL='TestUdf'; +CREATE FUNCTION bar location '/path/to/jar' SYMBOL='TestUdf2'; +DROP FUNCTION foo; +DROP FUNCTION IF EXISTS bar; +--> + <p conref="../shared/impala_common.xml#common/ddl_blurb"/> <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> @@ -51,6 +81,43 @@ not HDFS files and directories. </p> + <p conref="../shared/impala_common.xml#common/example_blurb"/> + <p rev="2.5.0 IMPALA-2843 CDH-39148"> + The following example shows how to drop Java functions created with the signatureless + <codeph>CREATE FUNCTION</codeph> syntax in CDH 5.7 / Impala 2.5 and higher. + Issuing <codeph>DROP FUNCTION <varname>function_name</varname></codeph> removes all the + overloaded functions under that name. + (See <xref href="impala_create_function.xml#create_function"/> for a longer example + showing how to set up such functions in the first place.) + </p> +<codeblock rev="2.5.0 IMPALA-2843 CDH-39148"> +create function my_func location '/user/impala/udfs/udf-examples-cdh570.jar' + symbol='com.cloudera.impala.TestUdf'; + +show functions; ++-------------+---------------------------------------+-------------+---------------+ +| return type | signature | binary type | is persistent | ++-------------+---------------------------------------+-------------+---------------+ +| BIGINT | my_func(BIGINT) | JAVA | true | +| BOOLEAN | my_func(BOOLEAN) | JAVA | true | +| BOOLEAN | my_func(BOOLEAN, BOOLEAN) | JAVA | true | +... +| BIGINT | testudf(BIGINT) | JAVA | true | +| BOOLEAN | testudf(BOOLEAN) | JAVA | true | +| BOOLEAN | testudf(BOOLEAN, BOOLEAN) | JAVA | true | +... + +drop function my_func; +show functions; ++-------------+---------------------------------------+-------------+---------------+ +| return type | signature | binary type | is persistent | ++-------------+---------------------------------------+-------------+---------------+ +| BIGINT | testudf(BIGINT) | JAVA | true | +| BOOLEAN | testudf(BOOLEAN) | JAVA | true | +| BOOLEAN | testudf(BOOLEAN, BOOLEAN) | JAVA | true | +... 
+</codeblock> + <p conref="../shared/impala_common.xml#common/related_info"/> <p> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_drop_role.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_drop_role.xml b/docs/topics/impala_drop_role.xml index 35d2157..b60f465 100644 --- a/docs/topics/impala_drop_role.xml +++ b/docs/topics/impala_drop_role.xml @@ -3,14 +3,18 @@ <concept rev="1.4.0" id="drop_role"> <title>DROP ROLE Statement (CDH 5.2 or higher only)</title> - <titlealts><navtitle>DROP ROLE (CDH 5.2 or higher only)</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>DROP ROLE</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> <data name="Category" value="DDL"/> <data name="Category" value="SQL"/> <data name="Category" value="Sentry"/> + <data name="Category" value="Security"/> <data name="Category" value="Roles"/> + <data name="Category" value="Administrators"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> <!-- Consider whether to go deeper into categories like Security for the Sentry-related statements. --> </metadata> </prolog> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_drop_stats.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_drop_stats.xml b/docs/topics/impala_drop_stats.xml index 56697f4..df82d6b 100644 --- a/docs/topics/impala_drop_stats.xml +++ b/docs/topics/impala_drop_stats.xml @@ -1,23 +1,27 @@ <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> -<concept rev="2.1.0" id="drop_stats" xml:lang="en-US"> +<concept rev="2.1.0" id="drop_stats"> <title>DROP STATS Statement</title> - <titlealts><navtitle>DROP STATS</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>DROP STATS</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> <data name="Category" value="SQL"/> <data name="Category" value="DDL"/> + <data name="Category" value="ETL"/> + <data name="Category" value="Ingest"/> <data name="Category" value="Tables"/> <data name="Category" value="Performance"/> <data name="Category" value="Scalability"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> <conbody> - <p> + <p rev="2.1.0"> <indexterm audience="Cloudera">DROP STATS statement</indexterm> Removes the specified statistics from a table or partition. The statistics were originally created by the <codeph>COMPUTE STATS</codeph> or <codeph>COMPUTE INCREMENTAL STATS</codeph> statement. @@ -107,9 +111,9 @@ DROP INCREMENTAL STATS [<varname>database_name</varname>.]<varname>table_name</v Applies to any subsequent examples with output from SHOW ... STATS too. 
--> <codeblock>show table stats item_partitioned; -+-------------+-------+--------+----------+--------------+---------+------------------ ++-------------+-------+--------+----------+--------------+---------+----------------- | i_category | #Rows | #Files | Size | Bytes Cached | Format | Incremental stats -+-------------+-------+--------+----------+--------------+---------+------------------ ++-------------+-------+--------+----------+--------------+---------+----------------- | Books | 1733 | 1 | 223.74KB | NOT CACHED | PARQUET | true | Children | 1786 | 1 | 230.05KB | NOT CACHED | PARQUET | true | Electronics | 1812 | 1 | 232.67KB | NOT CACHED | PARQUET | true @@ -121,34 +125,34 @@ DROP INCREMENTAL STATS [<varname>database_name</varname>.]<varname>table_name</v | Sports | 1783 | 1 | 227.97KB | NOT CACHED | PARQUET | true | Women | 1790 | 1 | 226.27KB | NOT CACHED | PARQUET | true | Total | 17957 | 10 | 2.25MB | 0B | | -+-------------+-------+--------+----------+--------------+---------+------------------ ++-------------+-------+--------+----------+--------------+---------+----------------- show column stats item_partitioned; -+------------------+-----------+------------------+--------+----------+--------------- ++------------------+-----------+------------------+--------+----------+-------------- | Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size -+------------------+-----------+------------------+--------+----------+--------------- ++------------------+-----------+------------------+--------+----------+-------------- | i_item_sk | INT | 19443 | -1 | 4 | 4 | i_item_id | STRING | 9025 | -1 | 16 | 16 | i_rec_start_date | TIMESTAMP | 4 | -1 | 16 | 16 | i_rec_end_date | TIMESTAMP | 3 | -1 | 16 | 16 -| i_item_desc | STRING | 13330 | -1 | 200 | 100.3028030395 +| i_item_desc | STRING | 13330 | -1 | 200 | 100.302803039 | i_current_price | FLOAT | 2807 | -1 | 4 | 4 | i_wholesale_cost | FLOAT | 2105 | -1 | 4 | 4 | i_brand_id | INT | 965 | -1 | 4 | 4 -| i_brand | STRING | 725 | -1 | 22 | 16.17760086059 +| i_brand | STRING | 725 | -1 | 22 | 16.1776008605 | i_class_id | INT | 16 | -1 | 4 | 4 -| i_class | STRING | 101 | -1 | 15 | 7.767499923706 +| i_class | STRING | 101 | -1 | 15 | 7.76749992370 | i_category_id | INT | 10 | -1 | 4 | 4 | i_manufact_id | INT | 1857 | -1 | 4 | 4 -| i_manufact | STRING | 1028 | -1 | 15 | 11.32950019836 -| i_size | STRING | 8 | -1 | 11 | 4.334599971771 -| i_formulation | STRING | 12884 | -1 | 20 | 19.97999954223 -| i_color | STRING | 92 | -1 | 10 | 5.380899906158 -| i_units | STRING | 22 | -1 | 7 | 4.186900138854 -| i_container | STRING | 2 | -1 | 7 | 6.992599964141 +| i_manufact | STRING | 1028 | -1 | 15 | 11.3295001983 +| i_size | STRING | 8 | -1 | 11 | 4.33459997177 +| i_formulation | STRING | 12884 | -1 | 20 | 19.9799995422 +| i_color | STRING | 92 | -1 | 10 | 5.38089990615 +| i_units | STRING | 22 | -1 | 7 | 4.18690013885 +| i_container | STRING | 2 | -1 | 7 | 6.99259996414 | i_manager_id | INT | 105 | -1 | 4 | 4 -| i_product_name | STRING | 19094 | -1 | 25 | 18.02330017089 +| i_product_name | STRING | 19094 | -1 | 25 | 18.0233001708 | i_category | STRING | 10 | 0 | -1 | -1 -+------------------+-----------+------------------+--------+----------+--------------- ++------------------+-----------+------------------+--------+----------+-------------- </codeblock> <p> @@ -170,7 +174,7 @@ drop incremental stats item_partitioned partition (i_category='Electronics'); show table stats item_partitioned 
+-------------+-------+--------+----------+--------------+---------+------------------ | i_category | #Rows | #Files | Size | Bytes Cached | Format | Incremental stats -+-------------+-------+--------+----------+--------------+---------+------------------ ++-------------+-------+--------+----------+--------------+---------+----------------- | Books | 1733 | 1 | 223.74KB | NOT CACHED | PARQUET | true | Children | 1786 | 1 | 230.05KB | NOT CACHED | PARQUET | true | Electronics | -1 | 1 | 232.67KB | NOT CACHED | PARQUET | false @@ -182,34 +186,34 @@ show table stats item_partitioned | Sports | -1 | 1 | 227.97KB | NOT CACHED | PARQUET | false | Women | 1790 | 1 | 226.27KB | NOT CACHED | PARQUET | true | Total | 17957 | 10 | 2.25MB | 0B | | -+-------------+-------+--------+----------+--------------+---------+------------------ ++-------------+-------+--------+----------+--------------+---------+----------------- show column stats item_partitioned -+------------------+-----------+------------------+--------+----------+--------------- ++------------------+-----------+------------------+--------+----------+-------------- | Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size -+------------------+-----------+------------------+--------+----------+--------------- ++------------------+-----------+------------------+--------+----------+-------------- | i_item_sk | INT | 19443 | -1 | 4 | 4 | i_item_id | STRING | 9025 | -1 | 16 | 16 | i_rec_start_date | TIMESTAMP | 4 | -1 | 16 | 16 | i_rec_end_date | TIMESTAMP | 3 | -1 | 16 | 16 -| i_item_desc | STRING | 13330 | -1 | 200 | 100.3028030395 +| i_item_desc | STRING | 13330 | -1 | 200 | 100.302803039 | i_current_price | FLOAT | 2807 | -1 | 4 | 4 | i_wholesale_cost | FLOAT | 2105 | -1 | 4 | 4 | i_brand_id | INT | 965 | -1 | 4 | 4 -| i_brand | STRING | 725 | -1 | 22 | 16.17760086059 +| i_brand | STRING | 725 | -1 | 22 | 16.1776008605 | i_class_id | INT | 16 | -1 | 4 | 4 -| i_class | STRING | 101 | -1 | 15 | 7.767499923706 +| i_class | STRING | 101 | -1 | 15 | 7.76749992370 | i_category_id | INT | 10 | -1 | 4 | 4 | i_manufact_id | INT | 1857 | -1 | 4 | 4 -| i_manufact | STRING | 1028 | -1 | 15 | 11.32950019836 -| i_size | STRING | 8 | -1 | 11 | 4.334599971771 -| i_formulation | STRING | 12884 | -1 | 20 | 19.97999954223 -| i_color | STRING | 92 | -1 | 10 | 5.380899906158 -| i_units | STRING | 22 | -1 | 7 | 4.186900138854 -| i_container | STRING | 2 | -1 | 7 | 6.992599964141 +| i_manufact | STRING | 1028 | -1 | 15 | 11.3295001983 +| i_size | STRING | 8 | -1 | 11 | 4.33459997177 +| i_formulation | STRING | 12884 | -1 | 20 | 19.9799995422 +| i_color | STRING | 92 | -1 | 10 | 5.38089990615 +| i_units | STRING | 22 | -1 | 7 | 4.18690013885 +| i_container | STRING | 2 | -1 | 7 | 6.99259996414 | i_manager_id | INT | 105 | -1 | 4 | 4 -| i_product_name | STRING | 19094 | -1 | 25 | 18.02330017089 +| i_product_name | STRING | 19094 | -1 | 25 | 18.0233001708 | i_category | STRING | 10 | 0 | -1 | -1 -+------------------+-----------+------------------+--------+----------+--------------- ++------------------+-----------+------------------+--------+----------+-------------- </codeblock> <p> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_drop_table.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_drop_table.xml b/docs/topics/impala_drop_table.xml index 33cb726..81ce8c6 100644 --- a/docs/topics/impala_drop_table.xml +++ b/docs/topics/impala_drop_table.xml @@ -3,7 +3,7 @@ 
<concept id="drop_table"> <title>DROP TABLE Statement</title> - <titlealts><navtitle>DROP TABLE</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>DROP TABLE</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -11,6 +11,9 @@ <data name="Category" value="DDL"/> <data name="Category" value="Tables"/> <data name="Category" value="Schemas"/> + <data name="Category" value="S3"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> @@ -44,14 +47,13 @@ <b>PURGE clause:</b> </p> - <p rev="2.3.0"> - The optional <codeph>PURGE</codeph> keyword, available in CDH 5.5 / Impala 2.3 and higher, - causes Impala to remove the associated HDFS data files - immediately, rather than going through the HDFS trashcan mechanism. Use this keyword when dropping - a table if it is crucial to remove the data as quickly as possible to free up space, or if there is - a problem with the trashcan, such as the trashcan not being configured or being in a different - HDFS encryption zone than the data files. - </p> + <p rev="2.3.0"> The optional <codeph>PURGE</codeph> keyword, available in + CDH 5.5 / Impala 2.3 and higher, causes Impala to remove the associated + HDFS data files immediately, rather than going through the HDFS trashcan + mechanism. Use this keyword when dropping a table if it is crucial to + remove the data as quickly as possible to free up space, or if there is a + problem with the trashcan, such as the trash cannot being configured or + being in a different HDFS encryption zone than the data files. </p> <p conref="../shared/impala_common.xml#common/ddl_blurb"/> @@ -108,13 +110,19 @@ drop table temporary.trivial;</codeblock> <p conref="../shared/impala_common.xml#common/disk_space_blurb"/> <p conref="../shared/impala_common.xml#common/s3_blurb"/> - <p rev="2.2.0"> - Although Impala cannot write new data to a table stored in the Amazon - S3 filesystem, the <codeph>DROP TABLE</codeph> statement can remove data files from S3 + <p rev="2.6.0 CDH-39913 IMPALA-1878"> + The <codeph>DROP TABLE</codeph> statement can remove data files from S3 if the associated S3 table is an internal table. + In CDH 5.8 / Impala 2.6 and higher, as part of improved support for writing + to S3, Impala also removes the associated folder when dropping an internal table + that resides on S3. See <xref href="impala_s3.xml#s3"/> for details about working with S3 tables. 
</p> + <p conref="../shared/impala_common.xml#common/s3_drop_table_purge"/> + + <p conref="../shared/impala_common.xml#common/s3_ddl"/> + <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/> <p conref="../shared/impala_common.xml#common/permissions_blurb"/> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_drop_view.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_drop_view.xml b/docs/topics/impala_drop_view.xml index edcab58..627fd47 100644 --- a/docs/topics/impala_drop_view.xml +++ b/docs/topics/impala_drop_view.xml @@ -3,7 +3,7 @@ <concept rev="1.1" id="drop_view"> <title>DROP VIEW Statement</title> - <titlealts><navtitle>DROP VIEW</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>DROP VIEW</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -11,6 +11,7 @@ <data name="Category" value="DDL"/> <data name="Category" value="Schemas"/> <data name="Category" value="Tables"/> + <data name="Category" value="Views"/> </metadata> </prolog> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_exec_single_node_rows_threshold.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_exec_single_node_rows_threshold.xml b/docs/topics/impala_exec_single_node_rows_threshold.xml index fa3007d..c677a64 100644 --- a/docs/topics/impala_exec_single_node_rows_threshold.xml +++ b/docs/topics/impala_exec_single_node_rows_threshold.xml @@ -1,20 +1,23 @@ <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> -<concept rev="2.0.0" id="exec_single_node_rows_threshold" xml:lang="en-US"> +<concept rev="2.0.0" id="exec_single_node_rows_threshold"> - <title>EXEC_SINGLE_NODE_ROWS_THRESHOLD Query Option</title> + <title>EXEC_SINGLE_NODE_ROWS_THRESHOLD Query Option (CDH 5.3 or higher only)</title> + <titlealts audience="PDF"><navtitle>EXEC_SINGLE_NODE_ROWS_THRESHOLD</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> <data name="Category" value="Impala Query Options"/> <data name="Category" value="Scalability"/> <data name="Category" value="Performance"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> <conbody> - <p> + <p rev="2.0.0"> <indexterm audience="Cloudera">EXEC_SINGLE_NODE_ROWS_THRESHOLD query option</indexterm> This setting controls the cutoff point (in terms of number of rows scanned) below which Impala treats a query as a <q>small</q> query, turning off optimizations such as parallel execution and native code generation. The @@ -66,6 +69,8 @@ generating native code are expected to outweigh any overhead from the remote reads. </p> + <p conref="../shared/impala_common.xml#common/added_in_210"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> <p> @@ -81,7 +86,7 @@ SELECT * FROM enormous_table LIMIT 300; <!-- Don't have any other places that tie into this particular optimization technique yet. Potentially: conceptual topics about code generation, distributed queries -<p conref="/Content/impala_common_xi44078.xml#common/related_info"/> +<p conref="../shared/impala_common.xml#common/related_info"/> <p> </p> -->
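The EXEC_SINGLE_NODE_ROWS_THRESHOLD change above documents the small-query cutoff, but the diff context shows only a fragment of the topic's example. A minimal sketch of typical usage from impala-shell follows, reusing the enormous_table name from the topic's own example; the threshold values shown are illustrative, not recommendations:

<codeblock>-- Raise the cutoff so that queries scanning up to 10,000 rows run on a
-- single node, skipping parallel execution and native code generation.
SET EXEC_SINGLE_NODE_ROWS_THRESHOLD=10000;
SELECT * FROM enormous_table LIMIT 300;

-- A threshold of 0 disables the small-query optimization entirely.
SET EXEC_SINGLE_NODE_ROWS_THRESHOLD=0;
</codeblock>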

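As a supplement to the PURGE discussion in the impala_drop_table.xml change above, here is a minimal sketch of the clause in use; the table names are hypothetical:

<codeblock>-- Default behavior: the data files of an internal table are moved to the
-- HDFS trashcan, so the space is not reclaimed immediately.
DROP TABLE staging_data;

-- PURGE (CDH 5.5 / Impala 2.3 and higher) deletes the data files immediately,
-- bypassing the trashcan; useful when the trashcan is not configured or is in
-- a different HDFS encryption zone than the data files.
DROP TABLE staging_data_archive PURGE;
</codeblock>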