This is an automated email from the ASF dual-hosted git repository.
zabetak pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/hive-site.git
The following commit(s) were added to refs/heads/main by this push:
new e2dc2452 Fix some "Raw HTML omitted" warnings and formatting issues
(part 5) (#107)
e2dc2452 is described below
commit e2dc24520edf53a956589003a38b7d1655453ffd
Author: Thomas Rebele <[email protected]>
AuthorDate: Tue Jun 9 09:44:13 2026 +0200
Fix some "Raw HTML omitted" warnings and formatting issues (part 5) (#107)
---
content/docs/latest/language/languagemanual-udf.md | 44 +++++++++++-----------
content/docs/latest/user/hive-transactions.md | 6 +--
content/docs/latest/user/tutorial.md | 24 ++++++------
content/docs/latest/webhcat/webhcat-configure.md | 8 ++--
.../docs/latest/webhcat/webhcat-installwebhcat.md | 2 +-
content/general/PrivacyPolicy.md | 4 +-
6 files changed, 44 insertions(+), 44 deletions(-)
diff --git a/content/docs/latest/language/languagemanual-udf.md b/content/docs/latest/language/languagemanual-udf.md
index 92341edf..b6247878 100644
--- a/content/docs/latest/language/languagemanual-udf.md
+++ b/content/docs/latest/language/languagemanual-udf.md
@@ -188,12 +188,12 @@ The following built-in collection functions are supported
in Hive:
| **Return Type** | **Name(Signature)** | **Description** |
| --- | --- | --- |
-| int | size(Map<K.V>) | Returns the number of elements in the map type. |
-| int | size(Array<T>) | Returns the number of elements in the array type. |
-| array<K> | map_keys(Map<K.V>) | Returns an unordered array containing the
keys of the input map. |
-| array<V> | map_values(Map<K.V>) | Returns an unordered array containing the
values of the input map. |
-| boolean | array_contains(Array<T>, value) | Returns TRUE if the array
contains value. |
-| array<t> | sort_array(Array<T>) | Sorts the input array in ascending order
according to the natural ordering of the array elements and returns it (as of
version [0.9.0](https://issues.apache.org/jira/browse/HIVE-2279)). |
+| int | size(Map\<K.V\>) | Returns the number of elements in the map type. |
+| int | size(Array\<T\>) | Returns the number of elements in the array type. |
+| array\<K\> | map_keys(Map\<K.V\>) | Returns an unordered array containing
the keys of the input map. |
+| array\<V\> | map_values(Map\<K.V\>) | Returns an unordered array containing
the values of the input map. |
+| boolean | array_contains(Array\<T\>, value) | Returns TRUE if the array
contains value. |
+| array\<t\> | sort_array(Array\<T\>) | Sorts the input array in ascending
order according to the natural ordering of the array elements and returns it
(as of version [0.9.0](https://issues.apache.org/jira/browse/HIVE-2279)). |
### Type Conversion Functions
@@ -202,7 +202,7 @@ The following type conversion functions are supported in
Hive:
| Return Type | Name(Signature) | Description |
| --- | --- | --- |
| binary | binary(string|binary) | Casts the parameter into a binary. |
-| **Expected "=" to follow "type"** | cast(expr as <type>) | Converts the
results of the expression expr to <type>. For example, cast('1' as BIGINT) will
convert the string '1' to its integral representation. A null is returned if
the conversion does not succeed. If cast(expr as boolean) Hive returns true for
a non-empty string. |
+| **Expected "=" to follow "type"** | cast(expr as \<type\>) | Converts the
results of the expression expr to \<type\>. For example, cast('1' as BIGINT)
will convert the string '1' to its integral representation. A null is returned
if the conversion does not succeed. If cast(expr as boolean) Hive returns true
for a non-empty string. |
### Date Functions
@@ -272,9 +272,9 @@ The following built-in String functions are supported in
Hive:
| int | character_length(string str) | Returns the number of UTF-8 characters
contained in str (as of Hive
[2.2.0](https://issues.apache.org/jira/browse/HIVE-15979)). The function
char_length is shorthand for this function. |
| string | chr(bigint|double A) | Returns the ASCII character having the
binary equivalent to A (as of Hive [1.3.0 and
2.1.0](https://issues.apache.org/jira/browse/HIVE-13063)). If A is larger than
256 the result is equivalent to chr(A % 256). Example: select chr(88); returns
"X". |
| string | concat(string|binary A, string|binary B...) | Returns the string or
bytes resulting from concatenating the strings or bytes passed in as parameters
in order. For example, concat('foo', 'bar') results in 'foobar'. Note that this
function can take any number of input strings. |
-| array<struct<string,double>> | context_ngrams(array<array<string>>,
array<string>, int K, int pf) | Returns the top-k contextual N-grams from a set
of tokenized sentences, given a string of "context". See
[StatisticsAndDataMining]({{< ref "statisticsanddatamining" >}}) for more
information. |
+| array\<struct\<string,double\>\> | context_ngrams(array\<array\<string\>\>,
array\<string\>, int K, int pf) | Returns the top-k contextual N-grams from a
set of tokenized sentences, given a string of "context". See
[StatisticsAndDataMining]({{< ref "statisticsanddatamining" >}}) for more
information. |
| string | concat_ws(string SEP, string A, string B...) | Like concat() above,
but with custom separator SEP. |
-| string | concat_ws(string SEP, array<string>) | Like concat_ws() above, but
taking an array of strings. (as of Hive
[0.9.0](https://issues.apache.org/jira/browse/HIVE-2203)) |
+| string | concat_ws(string SEP, array\<string\>) | Like concat_ws() above,
but taking an array of strings. (as of Hive
[0.9.0](https://issues.apache.org/jira/browse/HIVE-2203)) |
| string | decode(binary bin, string charset) | Decodes the first argument
into a String using the provided character set (one of 'US-ASCII',
'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'). If either argument is
null, the result will also be null. (As of Hive
[0.12.0](https://issues.apache.org/jira/browse/HIVE-2482).) |
| string | elt(N int,str1 string,str2 string,str3 string,...) | Return string
at index number. For example elt(2,'hello','world') returns 'world'. Returns
NULL if N is less than 1 or greater than the number of arguments.(see
<https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_elt>) |
| binary | encode(string src, string charset) | Encodes the first argument
into a BINARY using the provided character set (one of 'US-ASCII',
'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16'). If either argument is
null, the result will also be null. (As of Hive
[0.12.0](https://issues.apache.org/jira/browse/HIVE-2482).) |
@@ -289,7 +289,7 @@ The following built-in String functions are supported in
Hive:
| string | lower(string A) lcase(string A) | Returns the string resulting from
converting all characters of B to lower case. For example, lower('fOoBaR')
results in 'foobar'. |
| string | lpad(string str, int len, string pad) | Returns str, left-padded
with pad to a length of len. If str is longer than len, the return value is
shortened to len characters. In case of empty pad string, the return value is
null. |
| string | ltrim(string A) | Returns the string resulting from trimming spaces
from the beginning(left hand side) of A. For example, ltrim(' foobar ') results
in 'foobar '. |
-| array<struct<string,double>> | ngrams(array<array<string>>, int N, int K,
int pf) | Returns the top-k N-grams from a set of tokenized sentences, such as
those returned by the sentences() UDAF. See [StatisticsAndDataMining]({{< ref
"statisticsanddatamining" >}}) for more information. |
+| array\<struct\<string,double\>\> | ngrams(array\<array\<string\>\>, int N,
int K, int pf) | Returns the top-k N-grams from a set of tokenized sentences,
such as those returned by the sentences() UDAF. See
[StatisticsAndDataMining]({{< ref "statisticsanddatamining" >}}) for more
information. |
| int | octet_length(string str) | Returns the number of octets required to
hold the string str in UTF-8 encoding (since Hive
[2.2.0](https://issues.apache.org/jira/browse/HIVE-15979)). Note that
octet_length(str) can be larger than character_length(str). |
| string | parse_url(string urlString, string partToExtract [, string
keyToExtract]) | Returns the specified part from the URL. Valid values for
partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and
USERINFO. For example,
parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST') returns
'facebook.com'. Also a value of a particular key in QUERY can be extracted by
providing the key as the third argument, for example,
parse_url('http://facebook.com/path1/ [...]
| string | printf(String format, Obj... args) | Returns the input formatted
according do printf-style format strings (as of Hive
[0.9.0](https://issues.apache.org/jira/browse/HIVE-2695)). |
@@ -311,10 +311,10 @@ The following built-in String functions are supported in
Hive:
| string | reverse(string A) | Returns the reversed string. |
| string | rpad(string str, int len, string pad) | Returns str, right-padded
with pad to a length of len. If str is longer than len, the return value is
shortened to len characters. In case of empty pad string, the return value is
null. |
| string | rtrim(string A) | Returns the string resulting from trimming spaces
from the end(right hand side) of A. For example, rtrim(' foobar ') results in '
foobar'. |
-| array<array<string>> | sentences(string str, string lang, string locale) |
Tokenizes a string of natural language text into words and sentences, where
each sentence is broken at the appropriate sentence boundary and returned as an
array of words. The 'lang' and 'locale' are optional arguments. For example,
sentences('Hello there! How are you?') returns ( ("Hello", "there"), ("How",
"are", "you") ). |
+| array\<array\<string\>\> | sentences(string str, string lang, string locale)
| Tokenizes a string of natural language text into words and sentences, where
each sentence is broken at the appropriate sentence boundary and returned as an
array of words. The 'lang' and 'locale' are optional arguments. For example,
sentences('Hello there! How are you?') returns ( ("Hello", "there"), ("How",
"are", "you") ). |
| string | space(int n) | Returns a string of n spaces. |
| array | split(string str, string pat) | Splits str around pat (pat is a
regular expression). |
-| map<string,string> | str_to_map(text[, delimiter1, delimiter2]) | Splits
text into key-value pairs using two delimiters. Delimiter1 separates text into
K-V pairs, and Delimiter2 splits each K-V pair. Default delimiters are ',' for
delimiter1 and ':' for delimiter2. |
+| map\<string,string\> | str_to_map(text[, delimiter1, delimiter2]) | Splits
text into key-value pairs using two delimiters. Delimiter1 separates text into
K-V pairs, and Delimiter2 splits each K-V pair. Default delimiters are ',' for
delimiter1 and ':' for delimiter2. |
| string | substr(string|binary A, int start) substring(string|binary A, int
start) | Returns the substring or slice of the byte array of A starting from
start position till the end of string A. For example, substr('foobar', 4)
results in 'bar' (see
[<http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_substr>]).
|
| string | substr(string|binary A, int start, int len) substring(string|binary
A, int start, int len) | Returns the substring or slice of the byte array of A
starting from start position with length len. For example, substr('foobar', 4,
1) results in 'b' (see
[<http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_substr>]).
|
| string | substring_index(string A, string delim, int count) | Returns the
substring from string A before count occurrences of the delimiter delim (as of
Hive [1.3.0](https://issues.apache.org/jira/browse/HIVE-686)). If count is
positive, everything to the left of the final delimiter (counting from the
left) is returned. If count is negative, everything to the right of the final
delimiter (counting from the right) is returned. Substring_index performs a
case-sensitive match when searchi [...]
@@ -433,9 +433,9 @@ The following built-in aggregate functions are supported in
Hive:
| DOUBLE | covar_samp(col1, col2) | Returns the sample covariance of a pair of
a numeric columns in the group. |
| DOUBLE | corr(col1, col2) | Returns the Pearson coefficient of correlation
of a pair of a numeric columns in the group. |
| DOUBLE | percentile(BIGINT col, p) | Returns the exact pth percentile of a
column in the group (does not work with floating point types). p must be
between 0 and 1. NOTE: A true percentile can only be computed for integer
values. Use PERCENTILE_APPROX if your input is non-integral. |
-| array<double> | percentile(BIGINT col, array(p1 [, p2]...)) | Returns the
exact percentiles p1, p2, ... of a column in the group (does not work with
floating point types). pi must be between 0 and 1. NOTE: A true percentile can
only be computed for integer values. Use PERCENTILE_APPROX if your input is
non-integral. |
+| array\<double\> | percentile(BIGINT col, array(p1 [, p2]...)) | Returns the
exact percentiles p1, p2, ... of a column in the group (does not work with
floating point types). pi must be between 0 and 1. NOTE: A true percentile can
only be computed for integer values. Use PERCENTILE_APPROX if your input is
non-integral. |
| DOUBLE | percentile_approx(DOUBLE col, p [, B]) | Returns an approximate pth
percentile of a numeric column (including floating point types) in the group.
The B parameter controls approximation accuracy at the cost of memory. Higher
values yield better approximations, and the default is 10,000. When the number
of distinct values in col is smaller than B, this gives an exact percentile
value. |
-| array<double> | percentile_approx(DOUBLE col, array(p1 [, p2]...) [, B]) |
Same as above, but accepts and returns an array of percentile values instead of
a single one. |
+| array\<double\> | percentile_approx(DOUBLE col, array(p1 [, p2]...) [, B]) |
Same as above, but accepts and returns an array of percentile values instead of
a single one. |
| double | regr_avgx(independent, dependent) | Equivalent to avg(dependent).
As of [Hive 2.2.0](https://issues.apache.org/jira/browse/HIVE-15978). |
| double | regr_avgy(independent, dependent) | Equivalent to avg(independent).
As of [Hive 2.2.0](https://issues.apache.org/jira/browse/HIVE-15978). |
| double | regr_count(independent, dependent) | Returns the number of non-null
pairs used to fit the linear regression line. As of [Hive
2.2.0](https://issues.apache.org/jira/browse/HIVE-15978). |
@@ -445,7 +445,7 @@ The following built-in aggregate functions are supported in
Hive:
| double | regr_sxx(independent, dependent) | Equivalent to
regr_count(independent, dependent) * var_pop(dependent). As of [Hive
2.2.0](https://issues.apache.org/jira/browse/HIVE-15978). |
| double | regr_sxy(independent, dependent) | Equivalent to
regr_count(independent, dependent) * covar_pop(independent, dependent). As of
[Hive 2.2.0](https://issues.apache.org/jira/browse/HIVE-15978). |
| double | regr_syy(independent, dependent) | Equivalent to
regr_count(independent, dependent) * var_pop(independent). As of [Hive
2.2.0](https://issues.apache.org/jira/browse/HIVE-15978). |
-| array<struct {`'x','y'`}> | histogram_numeric(col, b) | Computes a histogram
of a numeric column in the group using b non-uniformly spaced bins. The output
is an array of size b of double-valued (x,y) coordinates that represent the bin
centers and heights |
+| array\<struct {`'x','y'`}\> | histogram_numeric(col, b) | Computes a
histogram of a numeric column in the group using b non-uniformly spaced bins.
The output is an array of size b of double-valued (x,y) coordinates that
represent the bin centers and heights |
| array | collect_set(col) | Returns a set of objects with duplicate elements
eliminated. |
| array | collect_list(col) | Returns a list of objects with duplicates. (As
of Hive [0.13.0](https://issues.apache.org/jira/browse/HIVE-5294).) |
| INTEGER | ntile(INTEGER x) | Divides an ordered partition into `x` groups
called buckets and assigns a bucket number to each row in the partition. This
allows easy calculation of tertiles, quartiles, deciles, percentiles and other
common summary statistics. (As of Hive
[0.11.0](https://issues.apache.org/jira/browse/HIVE-896).) |
@@ -456,14 +456,14 @@ Normal user-defined functions, such as concat(), take in
a single input row and
| **Row-set columns types** | **Name(Signature)** | **Description** |
| --- | --- | --- |
-| T | explode(ARRAY<T> a) | Explodes an array to multiple rows. Returns a
row-set with a single column (*col*), one row for each element from the array. |
-| Tkey,Tvalue | explode(MAP<Tkey,Tvalue> m) | Explodes a map to multiple rows.
Returns a row-set with a two columns (*key,value)* , one row for each
key-value pair from the input map. (As of Hive
[0.8.0](https://issues.apache.org/jira/browse/HIVE-1735).). |
-| int,T | posexplode(ARRAY<T> a) | Explodes an array to multiple rows with
additional positional column of *int* type (position of items in the original
array, starting with 0). Returns a row-set with two columns (*pos,val*), one
row for each element from the array. |
-| T1,...,Tn | inline(ARRAY<STRUCT<f1:T1,...,fn:Tn>> a) | Explodes an array of
structs to multiple rows. Returns a row-set with N columns (N = number of top
level elements in the struct), one row per struct from the array. (As of Hive
[0.10](https://issues.apache.org/jira/browse/HIVE-3238).) |
+| T | explode(ARRAY\<T\> a) | Explodes an array to multiple rows. Returns a
row-set with a single column (*col*), one row for each element from the array. |
+| Tkey,Tvalue | explode(MAP\<Tkey,Tvalue\> m) | Explodes a map to multiple
rows. Returns a row-set with a two columns (*key,value)* , one row for each
key-value pair from the input map. (As of Hive
[0.8.0](https://issues.apache.org/jira/browse/HIVE-1735).). |
+| int,T | posexplode(ARRAY\<T\> a) | Explodes an array to multiple rows with
additional positional column of *int* type (position of items in the original
array, starting with 0). Returns a row-set with two columns (*pos,val*), one
row for each element from the array. |
+| T1,...,Tn | inline(ARRAY\<STRUCT\<f1:T1,...,fn:Tn\>\> a) | Explodes an array
of structs to multiple rows. Returns a row-set with N columns (N = number of
top level elements in the struct), one row per struct from the array. (As of
Hive [0.10](https://issues.apache.org/jira/browse/HIVE-3238).) |
| T1,...,Tn/r | stack(int r,T1 V1,...,Tn/r Vn) | Breaks up *n* values
V1,...,Vn into *r* rows. Each row will have *n/r* columns. *r* must be
constant. |
| | | |
| string1,...,stringn | json_tuple(string jsonStr,string k1,...,string kn) |
Takes JSON string and a set of *n* keys, and returns a tuple of *n* values.
This is a more efficient version of the `get_json_object` UDF because it can
get multiple keys with just one call. |
-| string 1,...,stringn | parse_url_tuple(string urlStr,string p1,...,string
pn) | Takes URL string and a set of *n* URL parts, and returns a tuple of *n*
values. This is similar to the `parse_url()` UDF but can extract multiple parts
at once out of a URL. Valid part names are: HOST, PATH, QUERY, REF, PROTOCOL,
AUTHORITY, FILE, USERINFO, QUERY:<KEY>. |
+| string 1,...,stringn | parse_url_tuple(string urlStr,string p1,...,string
pn) | Takes URL string and a set of *n* URL parts, and returns a tuple of *n*
values. This is similar to the `parse_url()` UDF but can extract multiple parts
at once out of a URL. Valid part names are: HOST, PATH, QUERY, REF, PROTOCOL,
AUTHORITY, FILE, USERINFO, QUERY:\<KEY\>. |
@@ -576,7 +576,7 @@ Also see [Writing UDTFs]({{< ref "developerguide-udtf" >}})
if you want to creat
As an example of using `explode()` in the SELECT expression list, consider a
table named myTable that has a single column (myCol) and two rows:
-| Array<int> myCol |
+| Array\<int\> myCol |
| --- |
| [100,200,300] |
| [400,500,600] |
@@ -615,7 +615,7 @@ Available as of Hive 0.13.0. See
[HIVE-4943](https://issues.apache.org/jira/brow
As an example of using `posexplode()` in the SELECT expression list, consider
a table named myTable that has a single column (myCol) and two rows:
-| Array<int> myCol |
+| Array\<int\> myCol |
| --- |
| [100,200,300] |
| [400,500,600] |
diff --git a/content/docs/latest/user/hive-transactions.md b/content/docs/latest/user/hive-transactions.md
index 30258c3e..2a368ab1 100644
--- a/content/docs/latest/user/hive-transactions.md
+++ b/content/docs/latest/user/hive-transactions.md
@@ -101,7 +101,7 @@ This module is responsible for discovering which tables or
partitions are due fo
#### Worker
-Each Worker handles a single compaction task. A compaction is a MapReduce job
with name in the following form: <hostname>-compactor-<db>.<table>.<partition>.
Each worker submits the job to the cluster (via [hive.compactor.job.queue]({{<
ref "#hive-compactor-job-queue" >}}) if defined) and waits for the job to
finish. [hive.compactor.worker.threads]({{< ref
"#hive-compactor-worker-threads" >}}) determines the number of Workers in each
Metastore. The total number of Workers in the Hive [...]
+Each Worker handles a single compaction task. A compaction is a MapReduce job
with name in the following form:
\<hostname\>-compactor-\<db\>.\<table\>.\<partition\>. Each worker submits the
job to the cluster (via [hive.compactor.job.queue]({{< ref
"#hive-compactor-job-queue" >}}) if defined) and waits for the job to finish.
[hive.compactor.worker.threads]({{< ref "#hive-compactor-worker-threads" >}})
determines the number of Workers in each Metastore. The total number of
Workers in [...]
#### Cleaner
@@ -178,7 +178,7 @@ A number of new configuration parameters have been added to
the system to suppor
| metastore.compactor.long.running.initiator.threshold.error | *Default:* 12h
| Metastore | Initiator cycle duration after which an error will be logged.
Default time unit is: hours |
| hive.compactor.worker.sleep.time | *Default:*10800ms | HiveServer2 | Time in
milliseconds for which a worker threads goes into sleep before starting another
iteration in case of no launched job or error |
| hive.compactor.worker.max.sleep.time | *Default:* 320000ms | HiveServer2 |
Max time in milliseconds for which a worker threads goes into sleep before
starting another iteration used for backoff in case of no launched job or error
|
-| [hive.compactor.worker.threads]({{< ref "#hive-compactor-worker-threads"
>}}) deprecated. Use metastore.compactor.worker.threads instead. | *Default:*
0*Value required for transactions:* > 0 on at least one instance of the Thrift
metastore service | Metastore | How many compactor worker threads to run on
this metastore instance.2 |
+| [hive.compactor.worker.threads]({{< ref "#hive-compactor-worker-threads"
>}}) deprecated. Use metastore.compactor.worker.threads instead. | *Default:*
0*Value required for transactions:* \> 0 on at least one instance of the Thrift
metastore service | Metastore | How many compactor worker threads to run on
this metastore instance.2 |
| [hive.compactor.worker.timeout]({{< ref "#hive-compactor-worker-timeout"
>}}) | *Default:* 86400s | Metastore | Time in seconds after which a compaction
job will be declared failed and the compaction re-queued. |
| [hive.compactor.cleaner.run.interval]({{< ref
"#hive-compactor-cleaner-run-interval" >}}) | *Default*: 5000ms | Metastore |
Time in milliseconds between runs of the cleaner thread. ([Hive
0.14.0](https://issues.apache.org/jira/browse/HIVE-8258) and later.) |
| [hive.compactor.check.interval]({{< ref "#hive-compactor-check-interval"
>}}) | *Default:* 300s | Metastore | Time in seconds between checks to see if
any tables or partitions need to be compacted.3 |
@@ -244,7 +244,7 @@ If a table owner does not wish the system to automatically
determine when to com
Table properties are set with the TBLPROPERTIES clause when a table is created
or altered, as described in the [Create Table]({{< ref "#create-table" >}}) and
[Alter Table Properties]({{< ref "#alter-table-properties" >}}) sections of
Hive Data Definition Language. The "`transactional`" and "`NO_AUTO_COMPACTION`"
table properties are case-sensitive in Hive releases 0.x and 1.0, but they are
case-insensitive starting with release 1.1.0
([HIVE-8308](https://issues.apache.org/jira/browse/HI [...]
-More compaction related options can be set via TBLPROPERTIES as of [Hive 1.3.0
and 2.1.0](https://issues.apache.org/jira/browse/HIVE-13354). They can be set
at both table-level via [CREATE
TABLE](/docs/latest/language/languagemanual-ddl#createdroptruncate-table), and
on request-level via [ALTER TABLE/PARTITION
COMPACT](/docs/latest/language/languagemanual-ddl#alter-tablepartition-compact).
These are used to override the Warehouse/table wide settings. For example,
to override an MR prop [...]
+More compaction related options can be set via TBLPROPERTIES as of [Hive 1.3.0
and 2.1.0](https://issues.apache.org/jira/browse/HIVE-13354). They can be set
at both table-level via [CREATE
TABLE](/docs/latest/language/languagemanual-ddl#createdroptruncate-table), and
on request-level via [ALTER TABLE/PARTITION
COMPACT](/docs/latest/language/languagemanual-ddl#alter-tablepartition-compact).
These are used to override the Warehouse/table wide settings. For example,
to override an MR prop [...]
**Example: Set compaction options in TBLPROPERTIES at table level**
diff --git a/content/docs/latest/user/tutorial.md b/content/docs/latest/user/tutorial.md
index f2906226..e03d3488 100644
--- a/content/docs/latest/user/tutorial.md
+++ b/content/docs/latest/user/tutorial.md
@@ -108,7 +108,7 @@ Explicit type conversion can be done using the cast
operator as shown in the [#B
Complex Types can be built up from primitive types and other composite types
using:
* Structs: the elements within the type can be accessed using the DOT (.)
notation. For example, for a column c of type STRUCT {a INT; b INT}, the a
field is accessed by the expression c.a
-* Maps (key-value tuples): The elements are accessed using ['element name']
notation. For example in a map M comprising of a mapping from 'group' -> gid
the gid value can be accessed using M['group']
+* Maps (key-value tuples): The elements are accessed using ['element name']
notation. For example in a map M comprising of a mapping from 'group' -\> gid
the gid value can be accessed using M['group']
* Arrays (indexable lists): The elements in the array have to be in the same
type. Elements can be accessed using the [n] notation where n is an index
(zero-based) into the array. For example, for an array A having the elements
['a', 'b', 'c'], A[1] retruns 'b'.
Using the primitive types and the constructs for creating complex types, types
with arbitrary levels of nesting can be created. For example, a type User may
comprise of the following fields:
@@ -143,7 +143,7 @@ Java's "Instant" timestamps define a point in time that
remains constant regardl
#### Comparisons with other tools
-| | SQL 2003 | Oracle | Sybase | Postgres | MySQL | Microsoft SQL | IBM DB2 |
Presto | Snowflake | Hive >= 3.1 | Iceberg | Spark |
+| | SQL 2003 | Oracle | Sybase | Postgres | MySQL | Microsoft SQL | IBM DB2 |
Presto | Snowflake | Hive \>= 3.1 | Iceberg | Spark |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| timestamp | Local | Local | Local | Local | Instant | Other | Local | Local
| Local | Local | Local | Instant |
| timestamp with local time zone | | Instant | | | | | | | Instant |
Instant | | |
@@ -177,10 +177,10 @@ All Hive keywords are case-insensitive, including the
names of Hive operators an
| --- | --- | --- |
| A = B | all primitive types | TRUE if expression A is equivalent to
expression B; otherwise FALSE |
| A != B | all primitive types | TRUE if expression A is *not* equivalent to
expression B; otherwise FALSE |
-| A < B | all primitive types | TRUE if expression A is less than expression
B; otherwise FALSE |
-| A <= B | all primitive types | TRUE if expression A is less than or equal to
expression B; otherwise FALSE |
-| A > B | all primitive types | TRUE if expression A is greater than
expression B] otherwise FALSE |
-| A >= B | all primitive types | TRUE if expression A is greater than or equal
to expression B otherwise FALSE |
+| A \< B | all primitive types | TRUE if expression A is less than expression
B; otherwise FALSE |
+| A \<= B | all primitive types | TRUE if expression A is less than or equal
to expression B; otherwise FALSE |
+| A \> B | all primitive types | TRUE if expression A is greater than
expression B] otherwise FALSE |
+| A \>= B | all primitive types | TRUE if expression A is greater than or
equal to expression B otherwise FALSE |
| A IS NULL | all types | TRUE if expression A evaluates to NULL otherwise
FALSE |
| A IS NOT NULL | all types | FALSE if expression A evaluates to NULL
otherwise TRUE |
| A LIKE B | strings | TRUE if string A matches the SQL simple regular
expression B, otherwise FALSE. The comparison is done character by character.
The _ character in B matches any character in A (similar to **.** in posix
regular expressions), and the % character in B matches an arbitrary number of
characters in A (similar to **.*** in posix regular expressions). For example,
`'foobar' LIKE 'foo'` evaluates to FALSE where as `'foobar' LIKE 'foo___'`
evaluates to TRUE and so does `'foob [...]
@@ -217,7 +217,7 @@ All Hive keywords are case-insensitive, including the names
of Hive operators an
| **Operator** | **Operand types** | **Description** |
| --- | --- | --- |
| A[n] | A is an Array and n is an int | returns the nth element in the array
A. The first element has index 0, for example, if A is an array comprising of
['foo', 'bar'] then A[0] returns 'foo' and A[1] returns 'bar' |
-| M[key] | M is a Map<K, V> and key has type K | returns the value
corresponding to the key in the map for example, if M is a map comprising of
{'f' -> 'foo', 'b' -> 'bar', 'all' -> 'foobar'} then M['all'] returns 'foobar' |
+| M[key] | M is a Map\<K, V\> and key has type K | returns the value
corresponding to the key in the map for example, if M is a map comprising of
{'f' -\> 'foo', 'b' -\> 'bar', 'all' -\> 'foobar'} then M['all'] returns
'foobar' |
| S.x | S is a struct | returns the x field of S, for example, for struct
foobar {int foo, int bar} foobar.foo returns the integer stored in the foo
field of the struct. |
### Built In Functions
@@ -242,9 +242,9 @@ All Hive keywords are case-insensitive, including the names
of Hive operators an
| string | ltrim(string A) | returns the string resulting from trimming spaces
from the beginning(left hand side) of A. For example, ltrim(' foobar ') results
in 'foobar ' |
| string | rtrim(string A) | returns the string resulting from trimming spaces
from the end(right hand side) of A. For example, rtrim(' foobar ') results in '
foobar' |
| string | regexp_replace(string A, string B, string C) | returns the string
resulting from replacing all substrings in B that match the Java regular
expression syntax(See [Java regular expressions
syntax](http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html))
with C. For example, regexp_replace('foobar', 'oo|ar', ) returns 'fb' |
-| int | size(Map<K.V>) | returns the number of elements in the map type |
-| int | size(Array<T>) | returns the number of elements in the array type |
-| *value of <type>* | cast(*<expr>* as *<type>*) | converts the results of the
expression expr to <type>, for example, cast('1' as BIGINT) will convert the
string '1' to it integral representation. A null is returned if the conversion
does not succeed. |
+| int | size(Map\<K.V\>) | returns the number of elements in the map type |
+| int | size(Array\<T\>) | returns the number of elements in the array type |
+| *value of \<type\>* | cast(*\<expr\>* as *\<type\>*) | converts the results
of the expression expr to \<type\>, for example, cast('1' as BIGINT) will
convert the string '1' to it integral representation. A null is returned if the
conversion does not succeed. |
| string | from_unixtime(int unixtime) | convert the number of seconds from
the UNIX epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp
of that moment in the current system time zone in the format of "1970-01-01
00:00:00" |
| string | to_date(string timestamp) | Return the date part of a timestamp
string: to_date("1970-01-01 00:00:00") = "1970-01-01" |
| int | year(string date) | Return the year part of a date or a timestamp
string: year("1970-01-01 00:00:00") = 1970, year("1970-01-01") = 1970 |
@@ -827,7 +827,7 @@ Array columns in tables can be as follows:
CREATE TABLE array_table (int_array_column ARRAY<INT>);
```
-Assuming that pv.friends is of the type ARRAY<INT> (i.e. it is an array of
integers), the user can get a specific element in the array by its index as
shown in the following command:
+Assuming that pv.friends is of the type ARRAY\<INT\> (i.e. it is an array of
integers), the user can get a specific element in the array by its index as
shown in the following command:
```
SELECT pv.friends[2]
@@ -847,7 +847,7 @@ The user can also get the length of the array using the
size function as shown b
### Map (Associative Arrays) Operations
-Maps provide collections similar to associative arrays. Such structures can
only be created programmatically currently. We will be extending this soon. For
the purpose of the current example assume that pv.properties is of the type
map<String, String> i.e. it is an associative array from strings to string.
Accordingly, the following query:
+Maps provide collections similar to associative arrays. Such structures can
only be created programmatically currently. We will be extending this soon. For
the purpose of the current example assume that pv.properties is of the type
map\<String, String\> i.e. it is an associative array from strings to string.
Accordingly, the following query:
```
INSERT OVERWRITE page_views_map
diff --git a/content/docs/latest/webhcat/webhcat-configure.md
b/content/docs/latest/webhcat/webhcat-configure.md
index 31d18537..831b68e8 100644
--- a/content/docs/latest/webhcat/webhcat-configure.md
+++ b/content/docs/latest/webhcat/webhcat-configure.md
@@ -52,7 +52,7 @@ The webhcat-log4j.properties file sets the location of the
log files created by
| **templeton.hcat** | The path to the HCatalog executable. |
| **templeton.hive.archive** | The path to the Hive archive. |
| **templeton.hive.path** | The path to the Hive executable. |
-| **templeton.hive.properties** | Properties to set when running Hive (during
job submission). This is expected to be a comma-separated prop=value list. If
some value is itself a comma-separated list, the escape character is '\'
</description> (from [Hive
0.13.1](https://issues.apache.org/jira/browse/HIVE-4576) onward).To use it in a
cluster with Kerberos security enabled, set `hive.metastore.sasl.enabled=false`
and add `hive.metastore.execute.setugi=true`. Using localhost in metastore
[...]
+| **templeton.hive.properties** | Properties to set when running Hive (during
job submission). This is expected to be a comma-separated prop=value list. If
some value is itself a comma-separated list, the escape character is '\\' (from
[Hive 0.13.1](https://issues.apache.org/jira/browse/HIVE-4576) onward).To use
it in a cluster with Kerberos security enabled, set
`hive.metastore.sasl.enabled=false` and add
`hive.metastore.execute.setugi=true`. Using localhost in metastore URI does not
w [...]
| **templeton.exec.encoding** | The encoding of the stdout and stderr data. |
| **templeton.exec.timeout** | How long in milliseconds a program is allowed
to run on the WebHCat box. |
| **templeton.exec.max-procs** | The maximum number of processes allowed to
run at once. |
@@ -74,15 +74,15 @@ The webhcat-log4j.properties file sets the location of the
log files created by
| **templeton.kerberos.keytab** | The keytab file containing the credentials
for the Kerberos principal. |
| **templeton.hadoop.queue.name** | MapReduce queue name where WebHCat
map-only jobs will be submitted to. Can be used to avoid a deadlock where all
map slots in the cluster are taken over by Templeton launcher tasks. Versions:
[Hive 0.12.0](https://issues.apache.org/jira/browse/HIVE-4679) and later. |
| **templeton.mapper.memory.mb** | WebHCat controller job's Launch mapper's
memory limit in megabytes. When submitting a controller job, WebHCat will
overwrite `mapreduce.map.memory.mb` with this value. If empty, WebHCat will not
set `mapreduce.map.memory.mb` when submitting the controller job, therefore the
configuration in mapred-site.xml will be used. Versions: [Hive
0.14.0](https://issues.apache.org/jira/browse/HIVE-7155) and later. |
-| **templeton.frame.options.filter** | Adds web server protection from
clickjacking using X-Frame-Options header. The possible values are DENY,
SAMEORIGIN, ALLOW-FROM <uri>.Versions: [Hive
3.0.0](https://issues.apache.org/jira/browse/HIVE-17679) and later. |
+| **templeton.frame.options.filter** | Adds web server protection from
clickjacking using X-Frame-Options header. The possible values are DENY,
SAMEORIGIN, ALLOW-FROM \<uri\>.Versions: [Hive
3.0.0](https://issues.apache.org/jira/browse/HIVE-17679) and later. |
#### Default Values
Some of the default values for WebHCat configuration variables depend on the
release number. For the default values in the Hive release you are using, see
the webhcat-default.xml file. It can be found in the SVN repository at:
-*
http://svn.apache.org/repos/asf/hive/branches/branch-*<release_number>*/hcatalog/webhcat/svr/src/main/config/webhcat-default.xml
+*
http://svn.apache.org/repos/asf/hive/branches/branch-*\<release_number\>*/hcatalog/webhcat/svr/src/main/config/webhcat-default.xml
-where *<release_number>* is 0.11, 0.12, and so on. Prior to Hive 0.11, WebHCat
was in the Apache incubator.
+where *\<release_number\>* is 0.11, 0.12, and so on. Prior to Hive 0.11,
WebHCat was in the Apache incubator.
For example:
diff --git a/content/docs/latest/webhcat/webhcat-installwebhcat.md
b/content/docs/latest/webhcat/webhcat-installwebhcat.md
index 4273b81a..3053ddca 100644
--- a/content/docs/latest/webhcat/webhcat-installwebhcat.md
+++ b/content/docs/latest/webhcat/webhcat-installwebhcat.md
@@ -81,7 +81,7 @@ hadoop fs -put <hadoop streaming jar> \
```
-where *<templeton.streaming.jar>* is a property value defined in
`webhcat-default.xml` which can be overridden in the `webhcat-site.xml` file,
and *<hadoop streaming jar>* is the Hadoop streaming jar in your Hadoop version:
+where *\<templeton.streaming.jar\>* is a property value defined in
`webhcat-default.xml` which can be overridden in the `webhcat-site.xml` file,
and *\<hadoop streaming jar\>* is the Hadoop streaming jar in your Hadoop
version:
+ `hadoop-1.*/contrib/streaming/hadoop-streaming-*.jar` in the Hadoop
1.x tar
+ `hadoop-2.*/share/hadoop/tools/lib/hadoop-streaming-*.jar` in the
Hadoop 2.x tar
diff --git a/content/general/PrivacyPolicy.md b/content/general/PrivacyPolicy.md
index 99f64980..71ce8211 100644
--- a/content/general/PrivacyPolicy.md
+++ b/content/general/PrivacyPolicy.md
@@ -36,9 +36,9 @@ the following:
5. The addresses of pages from where you followed a link to our site.
Part of this information is gathered using a tracking cookie set by the
-<a href="http://www.google.com/analytics/">Google Analytics</a>
+[Google Analytics](http://www.google.com/analytics/)
service and handled by Google as
-described in their <a href="http://www.google.com/privacy.html">privacy
policy</a>.
+described in their [privacy policy](http://www.google.com/privacy.html).
See your browser documentation for instructions on how to disable the
cookie if you prefer not to share this data with Google.