[impala] branch master updated: IMPALA-9029: [DOCS] Impala 3.4 Release Notes

joemcdonnell Tue, 17 Mar 2020 13:25:50 -0700

This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git



The following commit(s) were added to refs/heads/master by this push:
     new 955868f  IMPALA-9029: [DOCS] Impala 3.4 Release Notes
955868f is described below

commit 955868f88a0593b2c75168e304bd4ff65b06ff98
Author: Alex Rodoni <[email protected]>
AuthorDate: Fri Dec 6 11:51:17 2019 -0800

    IMPALA-9029: [DOCS] Impala 3.4 Release Notes
    
    -Added broadcast_bytes_limit query option
    
    Change-Id: I4385749de35f8379ecf6566fe515ed500b42d6cc
    Reviewed-on: http://gerrit.cloudera.org:8080/14863
    Tested-by: Impala Public Jenkins <[email protected]>
    Reviewed-by: Joe McDonnell <[email protected]>
---
 docs/shared/impala_common.xml               |   3 +-
 docs/topics/impala_incompatible_changes.xml |  20 +++
 docs/topics/impala_known_issues.xml         | 236 +++-------------------------
 docs/topics/impala_new_features.xml         | 185 +++++++++++++++++++++-
 docs/topics/impala_txtfile.xml              |   3 +-
 5 files changed, 223 insertions(+), 224 deletions(-)

diff --git a/docs/shared/impala_common.xml b/docs/shared/impala_common.xml
index e6a04a0..cdf25d8 100644
--- a/docs/shared/impala_common.xml
+++ b/docs/shared/impala_common.xml
@@ -1519,8 +1519,7 @@ alter table partitioned_data set tblproperties ('numRows'='1030000', 'STATS_GENE
         or <codeph>-f</codeph> options are used. 
       </p>
 
-      <p id="live_progress_live_summary_asciinema">
-        To see how the <codeph>LIVE_PROGRESS</codeph> and <codeph>LIVE_SUMMARY</codeph> query
+      <p id="live_progress_live_summary_asciinema">To see how the <codeph>LIVE_PROGRESS</codeph> and <codeph>LIVE_SUMMARY</codeph> query
         options work in real time, see
         <xref href="https://asciinema.org/a/1rv7qippo0fe7h5k1b6k4nexk" scope="external" format="html">this
           animated demo</xref>.
diff --git a/docs/topics/impala_incompatible_changes.xml b/docs/topics/impala_incompatible_changes.xml
index 43af944..15de3ac 100644
--- a/docs/topics/impala_incompatible_changes.xml
+++ b/docs/topics/impala_incompatible_changes.xml
@@ -50,6 +50,26 @@ under the License.
     </p>
     <p outputclass="toc inpage"/>
   </conbody>
+  <concept id="incompatible_changes_340x">
+    <title>Incompatible Changes Introduced in Impala 3.4.x</title>
+    <conbody>
+      <p> For the full list of issues closed in this release, including any that
+        introduce behavior changes or incompatibilities, see the <xref
+          keyref="changelog_34">changelog for <keyword keyref="impala34"
+          /></xref>. <ul>
+          <li>To optimize query performance, Impala planner uses the value of
+            the <codeph>fs.s3a.block.size</codeph> startup flag when calculating
+            the split size on non-block based stores, e.g. S3, ADLS, etc.
+            Starting in this release, Impala planner uses the
+              <codeph>PARQUET_OBJECT_STORE_SPLIT_SIZE</codeph> query option to
+            get the Parquet file format specific split size.<p>For Parquet
+              files, the <codeph>fs.s3a.block.size</codeph> startup flag is no
+              longer used.</p><p>The default value of the
+                <codeph>PARQUET_OBJECT_STORE_SPLIT_SIZE</codeph> query option is
+              256 MB.</p></li>
+        </ul></p>
+    </conbody>
+  </concept>
   <concept id="incompatible_changes_330x">
     <title>Incompatible Changes Introduced in Impala 3.3.x</title>
     <conbody>
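
To make the effect of the `PARQUET_OBJECT_STORE_SPLIT_SIZE` default described in the hunk above concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a file is cut into consecutive byte ranges of at most the split size and that "256 MB" means 256 * 1024 * 1024 bytes; the function name `split_ranges` is hypothetical and this is not the Impala planner's actual code.

```python
MB = 1024 * 1024
# Default stated in the release notes; interpreting MB as 2**20 bytes is an assumption.
DEFAULT_SPLIT_SIZE = 256 * MB


def split_ranges(file_size, split_size=DEFAULT_SPLIT_SIZE):
    """Return (offset, length) byte ranges that cover a file of file_size bytes.

    Hypothetical helper: each range is at most split_size bytes long.
    """
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    return [
        (offset, min(split_size, file_size - offset))
        for offset in range(0, file_size, split_size)
    ]


if __name__ == "__main__":
    # A 600 MB Parquet file yields two full splits and one shorter tail split.
    for offset, length in split_ranges(600 * MB):
        print(offset // MB, length // MB)
```

Under these assumptions a smaller split size produces more, shorter ranges for the same file, which is the trade-off the query option exposes.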
diff --git a/docs/topics/impala_known_issues.xml b/docs/topics/impala_known_issues.xml
index 7ad7604..edb549c 100644
--- a/docs/topics/impala_known_issues.xml
+++ b/docs/topics/impala_known_issues.xml
@@ -259,162 +259,6 @@ under the License.
 
     </concept>
 
-<!--IMPALA-7585 and IMPALA-7298 are fixed. Should be removed from here?-->
-
-    <concept id="IMPALA-7585" audience="hidden">
-
-      <title>Impala user not added to /etc/passwd when LDAP is enabled</title>
-
-      <conbody>
-
-        <p>
-          When using Impala with LDAP enabled, a user may hit the following:
-        </p>
-
-<pre>Not authorized: Client connection negotiation failed: client connection to 127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username.</pre>
-
-        <p>
-          The following sequence can lead to the <codeph>impala</codeph> user not being created
-          in <codeph>/etc/passwd</codeph> on some machines on the cluster.
-          <ul>
-            <li>
-              Time 1: The <codeph>impala</codeph> user is not in LDAP. Impala was installed on
-              machine 1, and the user <codeph>impala</codeph> is created in
-              <codeph>/etc/passwd</codeph>.
-            </li>
-
-            <li>
-              Time 2: The <codeph>impala</codeph> user is added to LDAP.
-            </li>
-
-            <li>
-              Time 3: A new machine is added to the cluster. When adding Impala service to this
-              new machine, adding the <codeph>impala</codeph> user will fail as it already
-              exists in LDAP.
-            </li>
-          </ul>
-        </p>
-
-        <p>
-          The consequence is that the <codeph>impala</codeph> user doesn't exist in
-          <codeph>/etc/passwd</codeph> on the new machine, leading to the error above.
-        </p>
-
-        <p>
-          <b>Workaround</b>: Manually edit <codeph>/etc/passwd</codeph> to add the
-          <codeph>impala</codeph> user
-        </p>
-
-        <p>
-          <b>Apache Issue:</b> <xref keyref="IMPALA-7585">IMPALA-7585</xref>
-        </p>
-
-        <p>
-          <b>Affected Versions:</b> Impala 2.12, Impala 3.0
-        </p>
-
-        <p>
-          <b>Fixed Version:</b> Impala 3.1
-        </p>
-
-      </conbody>
-
-    </concept>
-
-    <concept id="IMPALA-7298" audience="hidden">
-
-      <title>Kerberos authentication fails with the reverse DNS lookup disabled</title>
-
-      <conbody>
-
-        <p>
-          Kerberos authentication does not function correctly if <codeph>rdns = false</codeph>
-          is configured in <codeph>krb5.conf</codeph>. If the flag <codeph>rdns =
-          false</codeph>, when Impala tries to match principals, it will fail because Kerberos
-          receives a SPN (Service Principal Name) with an IP address in it, but Impala expects a
-          principal with a FQDN in it.
-        </p>
-
-        <p>
-          You may hit the following error:
-        </p>
-
-<pre>WARNINGS: TransmitData() to X.X.X.X:27000 failed: Not authorized: Client connection negotiation failed: client connection to X.X.X.X:27000: Server impala/[email protected] not found in Kerberos database
-</pre>
-
-        <p>
-          <b>Apache Issue:</b> <xref keyref="IMPALA-7298">IMPALA-7298</xref>
-        </p>
-
-        <p>
-          <b>Affected Versions:</b> Impala 2.12.0 and 3.0
-        </p>
-
-        <p>
-          <b>Workaround:</b> Set the following flags in <codeph>krb5.conf</codeph>:
-          <ul>
-            <li>
-              <codeph>dns_canonicalize_hostname = true</codeph>
-            </li>
-
-            <li>
-              <codeph>rdns = true</codeph>
-            </li>
-          </ul>
-        </p>
-
-        <p>
-          <b>Fixed Versions:</b> Impala 3.1
-        </p>
-
-      </conbody>
-
-    </concept>
-
-<!--kudu2198 is fixed-->
-
-    <concept id="KUDU-2198" audience="hidden">
-
-      <title>System-wide auth-to-local mapping not applied correctly to Kudu service account</title>
-
-      <conbody>
-
-        <p>
-          Due to system the <codeph>auth_to_local</codeph> mapping, the principal may be mapped
-          to some local name.
-        </p>
-
-        <p>
-          When running with Kerberos enabled, you may hit the following error message where
-          <varname>&lt;random-string></varname> is some random string which doesn't match the
-          primary in the Kerberos principal.
-        </p>
-
-<pre>WARNINGS: TransmitData() to X.X.X.X:27000 failed: Remote error: Not authorized: {username='&lt;random-string>', principal='impala/redacted'} is not allowed to access DataStreamService
-</pre>
-
-        <p>
-          <b>Workaround</b>: Start Impala with the
-          <codeph>--use_system_auth_to_local=false</codeph> flag to ignore the system-wide
-          <codeph>auth_to_local</codeph> mappings configured in <codeph>/etc/krb5.conf</codeph>.
-        </p>
-
-        <p>
-          <b>Apache Issue:</b> <xref keyref="IMPALA-8154">IMPALA-8154</xref>
-        </p>
-
-        <p>
-          <b>Affected Versions:</b> Impala 2.12, Impala 3.0 / Kudu 1.6
-        </p>
-
-        <p>
-          <b>Fixed Versions:</b> Impala 3.2
-        </p>
-
-      </conbody>
-
-    </concept>
-
   </concept>
 
   <concept id="known_issues_resources">
@@ -722,25 +566,6 @@ ALTER TABLE table_name SET TBLPROPERTIES('EXTERNAL'='TRUE');
 
     </concept>
 
-    <concept id="IMP-175" audience="hidden">
-
-      <title>Deviation from Hive behavior: Out of range values float/double values are returned as maximum allowed value of type (Hive returns NULL)</title>
-
-      <conbody>
-
-        <p>
-          Impala behavior differs from Hive with respect to out of range float/double values.
-          Out of range values are returned as maximum allowed value of type (Hive returns NULL).
-        </p>
-
-        <p>
-          <b>Workaround:</b> None
-        </p>
-
-      </conbody>
-
-    </concept>
-
     <concept id="flume_writeformat_text">
 
       <title>Configuration needed for Flume to be compatible with Impala</title>
@@ -837,6 +662,24 @@ ALTER TABLE table_name SET TBLPROPERTIES('EXTERNAL'='TRUE');
       </conbody>
 
     </concept>
+    <concept id="IMPALA-8953">
+      <title>Tables and databases sharing same name can cause query
+        failures</title>
+      <conbody>
+        <p>A table and a database that share the same name can cause a query
+          failure if the table is not readable by Impala, for example, if the table
+          was created in Hive in the Open CSV Serde format. The following
+          exception is returned:</p>
+        <codeblock>CAUSED BY: TableLoadingException: Unrecognized table type for table</codeblock>
+        <p>
+          <b>Apache Issue:</b>
+          <xref keyref="IMPALA-8953">IMPALA-8953</xref>
+        </p>
+        <p>
+          <b>Workaround:</b> Do not create databases and tables with the same
+          names.</p>
+      </conbody>
+    </concept>
 
   </concept>
 
@@ -852,22 +695,6 @@ ALTER TABLE table_name SET TBLPROPERTIES('EXTERNAL'='TRUE');
       </p>
 
     </conbody>
-    <!--IMPALA8376 fixed the issue below.-->
-    <concept id="IMPALA-8829" audience="hidden">
-      <title>Unable to Correctly Parse the Terabyte Unit</title>
-      <conbody>
-        <p>Impala does not support parsing strings that contain "TB" when used
-          as a unit for terabytes. The flags related to memory limits may be
-          affected, such as the flags for scratch space and data cache.</p>
-        <p><b>Workaround:</b> Use other supported units to specify values, e.g.
-          GB or MB.</p>
-        <p><b>Affected Versions:</b> All versions</p>
-        <p>
-          <b>Apache Issue:</b>
-          <xref keyref="IMPALA-8829">IMPALA-8829</xref>
-        </p>
-      </conbody>
-    </concept>
 
     <concept id="IMPALA-4551">
 
@@ -989,33 +816,6 @@ ALTER TABLE table_name SET TBLPROPERTIES('EXTERNAL'='TRUE');
 
     </concept>
 
-<!--Fixed in 3.2-->
-
-    <concept id="IMPALA-941" rev="IMPALA-941" audience="hidden">
-
-      <title>Impala Parser issue when using fully qualified table names that start with a number</title>
-
-      <conbody>
-
-        <p>
-          A fully qualified table name starting with a number could cause a parsing error. In a
-          name such as <codeph>db.571_market</codeph>, the decimal point followed by digits is
-          interpreted as a floating-point number.
-        </p>
-
-        <p>
-          <b>Apache Issue:</b> <xref keyref="IMPALA-941">IMPALA-941</xref>
-        </p>
-
-        <p>
-          <b>Workaround:</b> Surround each part of the fully qualified name with backticks
-          (<codeph>``</codeph>).
-        </p>
-
-      </conbody>
-
-    </concept>
-
     <concept id="IMPALA-532" rev="IMPALA-532">
 
       <title>Impala should tolerate bad locale settings</title>
diff --git a/docs/topics/impala_new_features.xml b/docs/topics/impala_new_features.xml
index 65f543f..7b558f4 100644
--- a/docs/topics/impala_new_features.xml
+++ b/docs/topics/impala_new_features.xml
@@ -45,6 +45,185 @@ under the License.
     <p outputclass="toc inpage"/>
 
   </conbody>
+  <concept rev="3.2.0" id="new_features_34">
+    <title>New Features in <keyword keyref="impala34"/></title>
+    <conbody>
+      <p> The following sections describe the noteworthy improvements made in
+          <keyword keyref="impala34"/>. </p>
+      <p> For the full list of issues closed in this release, see the <xref
+          keyref="changelog_34">changelog for <keyword keyref="impala34"
+          /></xref>. </p>
+      <section id="section_cw4_nmw_pjb">
+        <title>Support for Hive Insert-Only Transactional Tables</title>
+        <p>Impala adds support for truncating insert-only transactional
+          tables. </p>
+        <p>By default, Impala creates an insert-only transactional table when
+          you issue the <codeph>CREATE TABLE</codeph> statement.</p>
+        <p>Use Hive compaction to compact small files, which improves the
+          performance and scalability of metadata in transactional tables.</p>
+        <p>See <xref href="impala_transactions.xml#transactions"/> for more
+          information.</p>
+      </section>
+      <section id="impala-8656">
+        <title>Server-side Spooling of Query Results</title>
+        <p>You can use the <codeph>SPOOL_QUERY_RESULTS</codeph> query option to
+          control how query results are returned to the client.</p>
+        <p>By default, when a client fetches a set of query results, the next
+          set of results is fetched in batches until all the result rows are
+          produced. If a client issues a query without fetching all the results,
+          the query fragments continue to hold on to the resources until the
+          query is canceled and unregistered, potentially tying up resources and
+          causing other queries to wait in admission control.</p>
+        <p>When the query result spooling feature is enabled, the result sets of
+          queries are eagerly fetched and buffered until they are read by the
+          client, and resources are freed up for other queries.</p>
+        <p>See <xref href="impala_query_results_spooling.xml#data_sink"/> for
+          the new feature and the query options.</p>
+      </section>
+      <section id="impala-8584">
+        <title>Cookie-based Authentication</title>
+        <p>Starting in this version, Impala supports cookies for authentication
+          when clients connect via HiveServer2 over HTTP. </p>
+        <p>You can use the <codeph>--max_cookie_lifetime_s</codeph> startup flag
+          to:</p>
+        <ul>
+          <li>Disable the use of cookies</li>
+          <li>Control how long generated cookies are valid for</li>
+        </ul>
+        <p>See <xref href="impala_client.xml#intro_client"/> for more
+          information.</p>
+      </section>
+      <section id="section_hw4_nmw_pjb">
+        <title>Object Ownership Support</title>
+        <p>Object ownership for tables, views, and databases is enabled by
+          default in Impala. When you create a database, a table, or a view, as
+          the owner of that object, you implicitly have the privileges on the
+          object. The privileges that owners have are specified in Ranger on the
+          special user, <codeph>{OWNER}</codeph>. </p>
+        <p>The <codeph>{OWNER}</codeph> user must be defined in Ranger for the
+          object ownership privileges to work in Impala.</p>
+        <p>See <xref href="impala_authorization.xml#authorization"/> for
+          details.</p>
+      </section>
+      <section id="impala-8752">
+        <title>New Built-in Functions for Fuzzy Matching of Strings</title>
+        <p>Use the new Jaro or Jaro-Winkler functions to perform fuzzy matches
+          on relatively short strings, e.g. to scrub user inputs of names
+          against the records in the database.</p>
+        <ul>
+          <li><codeph>JARO_DISTANCE</codeph>, <codeph>JARO_DST</codeph></li>
+          <li><codeph>JARO_SIMILARITY</codeph>, <codeph>JARO_SIM</codeph></li>
+          <li><codeph>JARO_WINKLER_DISTANCE</codeph>,
+            <codeph>JW_DST</codeph></li>
+          <li><codeph>JARO_WINKLER_SIMILARITY</codeph>,
+            <codeph>JW_SIM</codeph></li>
+        </ul>
+        <p>See <xref href="impala_string_functions.xml#string_functions"/> for
+          details.</p>
+      </section>
+      <section id="impala-8376">
+        <title>Capacity Quota for Scratch Disks</title>
+        <p>When configuring scratch space for intermediate files used in large
+          sorts, joins, aggregations, or analytic function operations, use the
+            <codeph>‑‑scratch_dirs</codeph> startup flag to optionally specify a
+          capacity quota per scratch directory, e.g.,
+            <codeph>‑‑scratch_dirs=/dir1:5MB,/dir2</codeph>.</p>
+        <p>See <xref href="impala_file_formats.xml#file_formats"/> for
+          details.</p>
+      </section>
+      <section id="impala-8913">
+        <title>Query Option for Disabling HBase Row Estimation</title>
+        <p>During query plan generation, Impala samples underlying HBase tables
+          to estimate row count and row size, but the sampling process can
+          negatively impact the planning time. To alleviate the issue, when the
+          HBase table stats do not change much in a short time, disable the
+          sampling with the <codeph>DISABLE_HBASE_NUM_ROWS_ESTIMATE</codeph>
+          query option so that the Impala planner falls back to using Hive
+          Metastore (HMS) table stats instead. </p>
+        <p>See <xref
+            href="impala_disable_hbase_num_rows_estimate.xml#disable_hbase_num_rows_estimate"
+          />.</p>
+      </section>
+      <section id="impala-8942">
+        <title>Query Option for Controlling Size of Parquet Splits on Non-block
+          Stores</title>
+        <p>To optimize query performance, Impala planner uses the value of the
+            <codeph>fs.s3a.block.size</codeph> startup flag when calculating the
+          split size on non-block based stores, e.g. S3, ADLS, etc. Starting in
+          this release, Impala planner uses the
+            <codeph>PARQUET_OBJECT_STORE_SPLIT_SIZE</codeph> query option to get
+          the Parquet file format specific split size. </p>
+        <p>For Parquet files, the <codeph>fs.s3a.block.size</codeph> startup
+          flag is no longer used.</p>
+        <p>The default value of the
+            <codeph>PARQUET_OBJECT_STORE_SPLIT_SIZE</codeph> query option is 256
+          MB.</p>
+        <p>See <xref href="impala_s3.xml#s3"/> for tuning Impala query
+          performance for S3.</p>
+      </section>
+      <section id="impala-5149">
+        <title>Query Profile Exported to JSON</title>
+        <p>On the Query Details page of Impala Daemon Web UI, you have a new
+          option, in addition to the existing Thrift and Text formats, to export
+          the query profile output in the JSON format.</p>
+        <p>See <xref href="impala_webui.xml#webui"/> for generating JSON query
+          profile outputs in Web UI.</p>
+      </section>
+      <section id="section_rnb_ny4_yjb">
+        <title>DATE Data Type Supported in Avro Tables</title>
+        <p>You can now use the <codeph>DATE</codeph> data type to query date
+          values from Avro tables.</p>
+        <p>See <xref href="impala_avro.xml#avro"/> for details.</p>
+      </section>
+      <section>
+        <title>Primary Key and Foreign Key Constraints</title>
+        <p>This release adds support for primary and foreign key constraints,
+          but in this release the constraints are advisory and intended for
+          estimating cardinality during query planning in a future release.
+          There is no attempt to enforce constraints. See <xref 
+            href="impala_create_table.xml"/> for details. </p>
+      </section>
+      <section>
+        <title>Enhanced External Kudu Table</title>
+        <p>By default, HMS implicitly translates internal Kudu tables to external
+          Kudu tables with the 'external.table.purge' property set to true. These
+          tables behave similarly to internal tables. You can explicitly create such
+          external Kudu tables. See <xref href="impala_create_table.xml"/>
+          for details.</p>
+      </section>
+      <section>
+        <title>Ranger Column Masking</title>
+        <p>This release supports Ranger column masking, which hides sensitive columnar
+          data in Impala query output. For example, you can define a policy that reveals
+          only the first or last four characters of column data. Column masking is enabled
+          by default. See <xref href="impala_authorization.xml#sec_ranger_col_masking"/>
+          for details.</p>
+      </section>
+      <section>
+        <title>BROADCAST_BYTES_LIMIT query option</title>
+        <p>You can set the default limit for the size of the broadcast input. Such a limit
+          can prevent possible performance problems.</p>
+        <!--Add link to details after file is published.-->
+      </section>
+      <section>
+        <title>Experimental Support for Apache Hudi</title>
+        <p>In this release, you can use Read Optimized Queries on Hudi tables. See
+          <xref href="impala_hudi.xml"/> for details. </p>
+      </section>
+      <section>
+        <title>ORC Reads Enabled by Default</title>
+        <p>Impala stability and performance have been improved. Consequently, ORC reads are now
+          enabled in Impala by default. To disable, set <codeph>--enable_orc_scanner</codeph> to
+          <codeph>false</codeph> when starting the cluster. See <xref href="impala_orc.xml"/> for
+          details.</p>
+      </section>
+      <section>
+        <title>Support for ZSTD and DEFLATE</title>
+        <p>This release supports ZSTD and DEFLATE compression codecs for text files. See
+          <xref href="impala_txtfile.xml#gzip"/> for details.</p>
+      </section>
+    </conbody>
+  </concept>
   <concept rev="3.2.0" id="new_features_33">
     <title>New Features in <keyword keyref="impala33"/></title>
     <conbody>
@@ -231,9 +410,9 @@ under the License.
         <title>Default File Format Changed to Parquet</title>
         <p>When you create a table, the default format for that table data is
           now Parquet.</p>
-        <p>For backward compatibility, you can use the DEFAULT_FILE_FORMAT query
-          option to set the default file format to the previous default, text,
-          or other formats.</p>
+        <p>For backward compatibility, you can use the
+            <codeph>DEFAULT_FILE_FORMAT</codeph> query option to set the default
+          file format to the previous default, text, or other formats.</p>
       </section>
       <section id="section_m1h_mnf_t3b">
         <title>Built-in Function to Process JSON Objects</title>
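
The fuzzy-matching section in the diff above names Jaro and Jaro-Winkler functions without showing what they compute. The following Python sketch implements the textbook definitions of the two similarity measures so the numbers can be reasoned about. It is an illustration only, not Impala's implementation: the function names, the `scaling` default of 0.1, and the four-character prefix cap are the conventional choices and are assumptions here.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(s1: str, s2: str, scaling: float = 0.1,
                            max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    jaro = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return jaro + prefix * scaling * (1.0 - jaro)


if __name__ == "__main__":
    print(round(jaro_similarity("MARTHA", "MARHTA"), 4))
    print(round(jaro_winkler_similarity("MARTHA", "MARHTA"), 4))
```

In the usual formulation the corresponding distance is one minus the similarity; whether the Impala built-ins follow exactly these conventions should be checked against the string functions reference.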
diff --git a/docs/topics/impala_txtfile.xml b/docs/topics/impala_txtfile.xml
index 078f957..9d3d04f 100644
--- a/docs/topics/impala_txtfile.xml
+++ b/docs/topics/impala_txtfile.xml
@@ -120,7 +120,8 @@ under the License.
         details.
       </p>
 
-      <p rev="2.0.0">You can also use text data compressed in the bzip2, deflate, gzip, Snappy, or
+      <p rev="2.0.0">
+        You can also use text data compressed in the bzip2, deflate, gzip, Snappy, or
         zstd formats. Because these compressed formats are not <q>splittable</q> in the way that LZO
         is, there is less opportunity for Impala to parallelize queries on them. Therefore, use
         these types of compressed data only for convenience if that is the format in which you
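
The scratch-disk note in impala_new_features.xml gives `/dir1:5MB,/dir2` as an example of a per-directory capacity quota. As a rough illustration of how such a value decomposes, the Python sketch below splits it into (path, quota in bytes) pairs. The parser, its name `parse_scratch_dirs`, the accepted unit suffixes, and the 1024-based multipliers are all assumptions made for this example; Impala's own flag parsing may differ.

```python
import re

# Hypothetical unit table; the release notes only show "5MB" as an example.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}
_QUOTA_RE = re.compile(r"(\d+)\s*([KMGT]?B)", re.IGNORECASE)


def parse_scratch_dirs(flag_value):
    """Split a comma-separated scratch_dirs value into (path, quota_bytes) pairs.

    quota_bytes is None for entries that carry no ":<quota>" suffix.
    """
    result = []
    for entry in flag_value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        path, sep, quota = entry.partition(":")
        if not sep:
            result.append((path, None))
            continue
        match = _QUOTA_RE.fullmatch(quota.strip())
        if match is None:
            raise ValueError(f"unrecognized quota {quota!r} in {entry!r}")
        number, unit = match.groups()
        result.append((path, int(number) * _UNITS[unit.upper()]))
    return result


if __name__ == "__main__":
    print(parse_scratch_dirs("/dir1:5MB,/dir2"))
```

Directories without a suffix are reported with no quota, mirroring the "optionally specify" wording in the note.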
