This is an automated email from the ASF dual-hosted git repository. tarmstrong pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/impala.git
commit acddf76bcb3fe8b5f8cc1c60e8d3f15c6e164d5b
Author: Alex Rodoni <[email protected]>
AuthorDate: Wed Dec 11 11:31:16 2019 -0800

    [DOCS] Impala is not optimized for the IN operator when accessing HBASE

    Change-Id: I37337a18c7add3c64795b3b2e49670493a9a8e44
    Reviewed-on: http://gerrit.cloudera.org:8080/14891
    Reviewed-by: Lars Volker <[email protected]>
    Tested-by: Impala Public Jenkins <[email protected]>
---
 docs/topics/impala_hbase.xml | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/docs/topics/impala_hbase.xml b/docs/topics/impala_hbase.xml
index 63f14af..aaf451c 100644
--- a/docs/topics/impala_hbase.xml
+++ b/docs/topics/impala_hbase.xml
@@ -110,12 +110,12 @@ under the License.
           the new table.)
         </li>
 
-        <li>
-          You issue queries against the Impala tables. For efficient queries, use <codeph>WHERE</codeph> clauses to
-          find a single key value or a range of key values wherever practical, by testing the Impala column
-          corresponding to the HBase row key. Avoid queries that do full-table scans, which are efficient for
-          regular Impala tables but inefficient in HBase.
-        </li>
+        <li> You issue queries against the Impala tables. For efficient queries,
+          use the <codeph>WHERE</codeph> clause to find a single key value or a
+          range of key values wherever practical, by testing the Impala column
+          corresponding to the HBase row key. Avoid queries that do full-table
+          scans, which are efficient for regular Impala tables but inefficient
+          in HBase. </li>
       </ul>
 
       <p>
@@ -180,15 +180,16 @@ under the License.
         key or value fields. All the type enforcement is done on the Impala side.
       </p>
 
-      <p>
-        For best performance of Impala queries against HBase tables, most queries will perform comparisons in the
-        <codeph>WHERE</codeph> against the column that corresponds to the HBase row key. When creating the table
-        through the Hive shell, use the <codeph>STRING</codeph> data type for the column that corresponds to the
-        HBase row key. Impala can translate conditional tests (through operators such as <codeph>=</codeph>,
-        <codeph>&lt;</codeph>, <codeph>BETWEEN</codeph>, and <codeph>IN</codeph>) against this column into fast
-        lookups in HBase, but this optimization (<q>predicate pushdown</q>) only works when that column is
-        defined as <codeph>STRING</codeph>.
-      </p>
+      <p> For best performance of Impala queries against HBase tables, most
+        queries will perform comparisons in the <codeph>WHERE</codeph> clause
+        against the column that corresponds to the HBase row key. When creating
+        the table through the Hive shell, use the <codeph>STRING</codeph> data
+        type for the column that corresponds to the HBase row key. Impala can
+        translate predicates (through operators such as <codeph>=</codeph>,
+        <codeph>&lt;</codeph>, and <codeph>BETWEEN</codeph>) against this
+        column into fast lookups in HBase, but this optimization (<q>predicate
+        pushdown</q>) only works when that column is defined as
+        <codeph>STRING</codeph>. </p>
 
       <p>
         Starting in Impala 1.1, Impala also supports reading and writing to columns that are defined in the Hive
