impala git commit: IMPALA-7039: Ignore the port in HBase planner tests

2018-05-22 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 23e11dc72 -> fd7e7c93c IMPALA-7039: Ignore the port in HBase planner tests Before this patch, we used to check the HBase port in the HBase planner tests. This caused a failure when HBase was running on a different port than expected. We fi

[2/3] impala git commit: IMPALA-6816: minimise calls to GetMinSubscriberTopicVersion()

2018-06-26 Thread tarasbob
IMPALA-6816: minimise calls to GetMinSubscriberTopicVersion() min_subscriber_topic_version is expensive to compute (requires iterating over all subscribers to compute) but is only used by one subscriber/topic pair: Impalads receiving catalog topic updates. This patch implements a simple fix - onl

[1/3] impala git commit: IMPALA-7210: global debug actions should be case insensitive

2018-06-26 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master bbb399ddf -> e6abf8e86 IMPALA-7210: global debug actions should be case insensitive The ExecNode debug actions don't care about case so better to be consistent. Testing: verify that this works: set debug_action=coord_before_exec_rpc:sle

[3/3] impala git commit: IMPALA-7149: Disable some tests in the EC build

2018-06-26 Thread tarasbob
IMPALA-7149: Disable some tests in the EC build We temporarily disable the resource limits tests in the EC build to make it pass. We also disable the tests marked with "tuned_for_minicluster" in the EC build. Cherry-picks: not for 2.x. Change-Id: I0975b1a28b318625f853b612bdfea3a8adcd776e Reviewe

[2/3] impala git commit: IMPALA-6802 (part 6): Clean up authorization tests

2018-06-29 Thread tarasbob
IMPALA-6802 (part 6): Clean up authorization tests This is the last part of the authorization test clean up. This patch rewrites the following tests: - alter database - explain - comment on - function - alter table/view This patch also adds the following authorization tests: - update - upsert -

[1/3] impala git commit: IMPALA-6802 (part 6): Clean up authorization tests

2018-06-29 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 4ce2af9ff -> 8060f4d50 http://git-wip-us.apache.org/repos/asf/impala/blob/5c880e52/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java -- diff --git a/fe/src

[3/3] impala git commit: IMPALA-7102 (Part 1): Disable reading of erasure coding by default

2018-06-29 Thread tarasbob
IMPALA-7102 (Part 1): Disable reading of erasure coding by default In this patch we add a query option ALLOW_ERASURE_CODED_FILES, that allows us to enable or disable the support of erasure coded files. Even though Impala should be able to handle HDFS erasure coded files already, this feature hasn'

impala git commit: IMPALA-4848: Add WIDTH_BUCKET() function

2018-06-30 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 8060f4d50 -> dde930830 IMPALA-4848: Add WIDTH_BUCKET() function Syntax: width_bucket(expr decimal, min_val decimal, max_val decimal, num_buckets int) This function creates equiwidth histograms, where the histogram range is divided int
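The equiwidth bucketing described in this commit message can be sketched in Python. This is only an illustration, not Impala's implementation: the handling of values outside [min_val, max_val) (bucket 0 below the range, bucket num_buckets + 1 at or above it) follows the usual SQL convention for width_bucket and is an assumption here, since the message above is cut off.

```python
from decimal import Decimal

def width_bucket(expr, min_val, max_val, num_buckets):
    """Return the 1-based equiwidth bucket of expr in [min_val, max_val).

    Assumed convention (not confirmed by the commit message):
    0 for values below min_val, num_buckets + 1 for values >= max_val.
    Arithmetic uses the default decimal context (28 significant digits).
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    if min_val >= max_val:
        raise ValueError("min_val must be smaller than max_val")
    if expr < min_val:
        return 0
    if expr >= max_val:
        return num_buckets + 1
    # Multiply before dividing so no rounded bucket width is needed;
    # // on non-negative Decimals is a floor division.
    return int(((expr - min_val) * num_buckets) // (max_val - min_val)) + 1

# Five buckets of width 2 over [0, 10): 5.3 falls into [4, 6), the third.
print(width_bucket(Decimal("5.3"), Decimal("0"), Decimal("10"), 5))
```

Multiplying by num_buckets first avoids computing a bucket width such as 10/3 that cannot be represented exactly.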

[4/6] impala git commit: IMPALA-7190: Remove unsupported format writer support

2018-07-03 Thread tarasbob
IMPALA-7190: Remove unsupported format writer support This patch removes write support for unsupported formats like Sequence, Avro and compressed text. Also, the related query options ALLOW_UNSUPPORTED_FORMATS and SEQ_COMPRESSION_MODE have been migrated to the REMOVED query options type. Testing:

[2/6] impala git commit: IMPALA-7237: handle hex digits in ParseSmaps()

2018-07-03 Thread tarasbob
IMPALA-7237: handle hex digits in ParseSmaps() Testing: Manual. Added some temporary logging to print out which branch it took with each line and confirmed it took the right branch for a line starting with 'f'. Change-Id: I3dad846dafb25b414bee1858eb63f3eda31d59ac Reviewed-on: http://gerrit.cloude
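To illustrate why a line starting with 'f' matters: in /proc/<pid>/smaps, each mapping begins with a header line whose address range is written in lowercase hex, so a header can start with a-f as well as 0-9. The sketch below is a hypothetical Python classifier, not the actual ParseSmaps() code (which is C++); the function name and return values are made up for illustration.

```python
import re

# A mapping header looks like:
#   7f5c8e000000-7f5c8e021000 rw-p 00000000 00:00 0
# The address range is hex, so the first character may be 0-9 or a-f.
_HEADER_RE = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")

def classify_smaps_line(line):
    """Classify one smaps line as 'mapping', 'field' or 'other'."""
    # Check for a header first: header lines also contain ':' (in the
    # device column), so the order of the two checks matters.
    if _HEADER_RE.match(line):
        return "mapping"
    if ":" in line:
        return "field"  # e.g. "Rss:                   4 kB"
    return "other"
```

A check that only accepted decimal digits as the first character would misclassify a header such as "ffffffffff600000-ffffffffff601000 ...".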

[6/6] impala git commit: IMPALA-7236: Fix the parsing of ALLOW_ERASURE_CODED_FILES

2018-07-03 Thread tarasbob
IMPALA-7236: Fix the parsing of ALLOW_ERASURE_CODED_FILES This patch adds a missing "break" statement in a switch statement changed by IMPALA-7102. Also fixes a non-deterministic test case. Change-Id: Ife1e791541e3f4fed6bec00945390c7d7681e824 Reviewed-on: http://gerrit.cloudera.org:8080/10857 Re

[5/6] impala git commit: IMPALA-6883: [DOCS] Refactor impala_authorization doc

2018-07-03 Thread tarasbob
IMPALA-6883: [DOCS] Refactor impala_authorization doc Change-Id: I3df72adb25dcdcbc286934b048645f47d876b33d Reviewed-on: http://gerrit.cloudera.org:8080/10786 Reviewed-by: Alex Rodoni Tested-by: Impala Public Jenkins Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit: http://git

[3/6] impala git commit: [DOCS] Clarification on admission control and DDL statements

2018-07-03 Thread tarasbob
[DOCS] Clarification on admission control and DDL statements Removed the confusing example and paragraphs. Change-Id: I2e3e82bd34e88e7a13de1864aeb97f01023bc715 Reviewed-on: http://gerrit.cloudera.org:8080/10829 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins Project: http://git-wi

[1/6] impala git commit: IMPALA-5981: [DOCS] Documented SET=""

2018-07-03 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 2b6d71fee -> 61e6a4777 IMPALA-5981: [DOCS] Documented SET="" Also, refactored the Impala SET doc and moved the command SET to the Impala Shell Commands doc. Change-Id: I7211405d5cc0a548c05ea5218798591873c14417 Reviewed-on: http://gerrit.c

[3/3] impala git commit: IMPALA-6642 (Part 2): clean up start-impala-cluster.py

2018-07-06 Thread tarasbob
IMPALA-6642 (Part 2): clean up start-impala-cluster.py We clean up start-impala-cluster.py in general in this patch by using logging instead of "print" and formatting strings using the format() function. We make sure to include a timestamp in each log message in order to make it easier to debug fa
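The pattern described here (logging instead of "print", str.format() for message strings, a timestamp on every message) might look roughly like the following. The logger name, format string and message are assumptions for illustration and are not taken from start-impala-cluster.py.

```python
import io
import logging

def make_logger(stream):
    """Build a logger whose records carry a timestamp and level."""
    handler = logging.StreamHandler(stream)
    # %(asctime)s prefixes each message with the time it was logged.
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s: %(message)s"))
    logger = logging.getLogger("cluster_start_sketch")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False
    return logger

buf = io.StringIO()
LOG = make_logger(buf)
# str.format() replaces both print statements and %-style interpolation.
LOG.info("Starting {0} with {1} node(s)".format("impalad", 3))
print(buf.getvalue(), end="")
```

Writing to an in-memory stream keeps the sketch self-contained; a real script would more likely log to stderr or a file.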

[1/3] impala git commit: IMPALA-6625: Skip computing parquet conjuncts for non-Parquet scans

2018-07-06 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master fd0ba0fd2 -> 30d196fd5 IMPALA-6625: Skip computing parquet conjuncts for non-Parquet scans This change ensures that the planner computes parquet conjuncts only for scans containing parquet files. Additionally, it also handles PARQUET_

[2/3] impala git commit: Bump toolchain version, include libunwind

2018-07-06 Thread tarasbob
Bump toolchain version, include libunwind Change-Id: I0b26f6a342dd7ba282c3f6c4de93745aff2dd095 Reviewed-on: http://gerrit.cloudera.org:8080/10755 Reviewed-by: Lars Volker Tested-by: Impala Public Jenkins Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit: http://git-wip-us.apac

[3/4] impala git commit: IMPALA-7216: Fix SQL generated by CreateViewStmt & AlterViewStmt

2018-07-10 Thread tarasbob
IMPALA-7216: Fix SQL generated by CreateViewStmt & AlterViewStmt The toSql functions in CreateViewStmt and AlterViewStmt generated invalid SQL by appending types to column definitions. This change appends just the column names to fix it. Testing: Added tests to ToSqlTest to verify it. Change-Id:

[2/4] impala git commit: IMPALA-7254: Inconsistent decimal behavior for IN/BETWEEN predicate

2018-07-10 Thread tarasbob
IMPALA-7254: Inconsistent decimal behavior for IN/BETWEEN predicate In decimal v2, performing a cast that can result in a loss of precision is considered as an error. In the prior code when finding a compatible type for performing a cast between expressions that have decimal and floating types can

[1/4] impala git commit: Fix zsh issue in set-pythonpath.sh

2018-07-10 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master c01efd096 -> 36ce54bd5 Fix zsh issue in set-pythonpath.sh When sourcing set-pythonpath.sh on a partially checked out or partially built source tree (e.g. when running perf tests or after build errors), an empty shell glob pattern would ret

[4/4] impala git commit: IMPALA-7260: Fix decimal binary predicates

2018-07-10 Thread tarasbob
IMPALA-7260: Fix decimal binary predicates When casting the inputs to a function call, we would try to cast non decimal numbers to a specific decimal type even though this is not necessary. The specific decimal type would be calculated by looking at all the inputs. It was possible for this calcula

impala git commit: IMPALA-7211: Fix the between predicate for decimals

2018-07-11 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master e8a669bf9 -> 80c009631 IMPALA-7211: Fix the between predicate for decimals Before this patch, some queries would fail where the inputs to the between predicate were decimal types that are not compatible with each other. We would needlessly

[1/2] impala git commit: IMPALA-3562: support column restriction for compute stats

2018-02-01 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master d49f629c4 -> 4bd7cc8db IMPALA-3562: support column restriction for compute stats The 'compute stats' statement currently computes column-level statistics for all columns of a table. This adds potentially unneeded work for columns whose sta

[2/2] impala git commit: IMPALA-6429: Fix decimal division

2018-02-01 Thread tarasbob
IMPALA-6429: Fix decimal division Before this patch, it was possible for an overflow to not be detected when doing a decimal division. When scaling up the dividend before doing the division, we do not check for overflow. This is ok if we are scaling up by 10^38 or less because the result is gu

[20/21] impala git commit: IMPALA-6697: Downgrade setuptools to be compatible with Python 2.6

2018-03-20 Thread tarasbob
IMPALA-6697: Downgrade setuptools to be compatible with Python 2.6 Change-Id: I0d4727b7a5911269b82287ed9ce759f1e211f386 Reviewed-on: http://gerrit.cloudera.org:8080/9713 Reviewed-by: Philip Zeyliger Tested-by: Lars Volker Reviewed-on: http://gerrit.cloudera.org:8080/9714 Reviewed-by: Lars Volker

[05/21] impala git commit: IMPALA-5270: Pass resolved exprs into analytic SortInfo.

2018-03-20 Thread tarasbob
IMPALA-5270: Pass resolved exprs into analytic SortInfo. The bug was that the SortInfo of analytics was given ordering exprs that were not fully resolved against their input (e.g. inline views were not resolved). As a result, the SortInfo logic did not materialize exprs like rand() coming from inl

[07/21] impala git commit: IMPALA-6695: Fix PyPi regex, update setuptools version

2018-03-20 Thread tarasbob
IMPALA-6695: Fix PyPi regex, update setuptools version pytest-runner, which is required by kudu-python, requires a more recent version of setuptools. Adding an explicit dependency required an update to the regular expression to parse PyPi URLs. Change-Id: Ia67189f81a31a9a5a0ed80cd4d6661762ef427b

[19/21] impala git commit: Consistently use Java 1.7 compiler.

2018-03-20 Thread tarasbob
Consistently use Java 1.7 compiler. We use Java 1.7 in fe/pom.xml, where most of our Java code is. For consistency, this updates the rest of our Maven configurations to use the same version of Java. A change I'm working with uses try-with-resources in HBase splitting, which is how I ran into this.

[04/21] impala git commit: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL

2018-03-20 Thread tarasbob
IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL Before Kudu supported DECIMAL columns the TPCDS and TPCH columns were adjusted to use DOUBLE in place of DECIMAL. This patch undoes that change now that Kudu supports DECIMAL. Testing: - Updated concurrent_select.py - Updated test_tpch_q

[01/21] impala git commit: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL

2018-03-20 Thread tarasbob
Repository: impala Updated Branches: refs/heads/2.x 7336839db -> b99a5d97b http://git-wip-us.apache.org/repos/asf/impala/blob/f6ad4e6b/testdata/workloads/tpch/queries/tpch-kudu-q17.test -- diff --git a/testdata/workloads/tpch/q

[11/21] impala git commit: [DOCS] Changed to --use_local_tz_for_unix_timestamp_conversions

2018-03-20 Thread tarasbob
[DOCS] Changed to --use_local_tz_for_unix_timestamp_conversions Change-Id: Id75ec73031b97aa8a3c61ccdbaea39db008b4093 Reviewed-on: http://gerrit.cloudera.org:8080/9620 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit:

[02/21] impala git commit: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL

2018-03-20 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/f6ad4e6b/testdata/workloads/tpch/queries/tpch-kudu-q16.test -- diff --git a/testdata/workloads/tpch/queries/tpch-kudu-q16.test b/testdata/workloads/tpch/queries/tpch-kudu-q16.test

[06/21] impala git commit: IMPALA-6690: Fix pip_download.py on python 2.6

2018-03-20 Thread tarasbob
IMPALA-6690: Fix pip_download.py on python 2.6 IMPALA-6682 used set literal syntax in pip_download.py, which was introduced in Python 2.7. This patch changes it to the set constructor. It's tested on Python 2.6.9. Change-Id: I82b4116ee056f605c8aadf39a8b92b78313cb8bf Reviewed-on: http://gerrit.clouder
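For readers unfamiliar with the difference, the two spellings are shown below. The variable name and set contents are placeholders, not the actual values used in pip_download.py.

```python
# Set literal syntax needs Python 2.7 or later; on 2.6 it is a SyntaxError:
#   algorithms = {"md5", "sha256"}
# The constructor form builds the same set and also parses on 2.6:
algorithms = set(["md5", "sha256"])
print(sorted(algorithms))
```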

[17/21] impala git commit: IMPALA-6655: Add owner information on database creation

2018-03-20 Thread tarasbob
IMPALA-6655: Add owner information on database creation Add owner information on database creation. > create database foo; > describe database extended foo; +-+--+-+ | name| location | comment | +-+--+-+ | foo | | | | Owner:

[03/21] impala git commit: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL

2018-03-20 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/f6ad4e6b/testdata/workloads/tpch/queries/tpch-kudu-q11.test -- diff --git a/testdata/workloads/tpch/queries/tpch-kudu-q11.test b/testdata/workloads/tpch/queries/tpch-kudu-q11.test

[08/21] impala git commit: IMPALA-6682: Remove MD5 assumption from pypi download script

2018-03-20 Thread tarasbob
IMPALA-6682: Remove MD5 assumption from pypi download script pip_download.py assumes the python repository to use md5 as the hash algorithm, which is not required by PEP-503 and not always true in reality. This patch removes this assumption and enables support of all hash algorithms in python hash
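A minimal sketch of algorithm-agnostic verification follows, assuming the PEP 503 convention that a file URL carries its hash as a fragment of the form <hashname>=<hashvalue>. The function name is hypothetical and this is not the code from pip_download.py.

```python
import hashlib

def verify_fragment(data, fragment):
    """Check data (bytes) against a '<hashname>=<hashvalue>' URL fragment."""
    algo, sep, expected = fragment.partition("=")
    if not sep:
        raise ValueError("fragment is not of the form name=value")
    # Accept whatever hashlib supports instead of hard-coding md5.
    if algo not in hashlib.algorithms_available:
        raise ValueError("unsupported hash algorithm: {0}".format(algo))
    return hashlib.new(algo, data).hexdigest() == expected.lower()

payload = b"example package contents"
fragment = "sha256=" + hashlib.sha256(payload).hexdigest()
print(verify_fragment(payload, fragment))
```

Dispatching through hashlib.new() lets the index choose the algorithm; the script only has to reject names that hashlib does not know.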

[09/21] impala git commit: IMPALA-6589: remove invalid DCHECK in parquet reader

2018-03-20 Thread tarasbob
IMPALA-6589: remove invalid DCHECK in parquet reader The DCHECK was only valid if the Parquet file metadata is internally consistent, with the number of values reported by the metadata matching the number of encoded levels. The DCHECK was intended to directly detect misuse of the RleBatchDecoder

[15/21] impala git commit: IMPALA-6488: removes use-after-free bug in lib_cache

2018-03-20 Thread tarasbob
IMPALA-6488: removes use-after-free bug in lib_cache Several recent runs have resulted in a boost mutex invalid argument exception. The mutex in question is the one that guards individual lib_cache entries (LibCacheEntry::lock). The exception is thrown due to the entry being deleted by another th

[03/11] impala git commit: IMPALA-6675: Default to --compact_catalog_topic=true.

2018-03-20 Thread tarasbob
IMPALA-6675: Default to --compact_catalog_topic=true. Testing: - Ran a few queries locally - Ran test_compact_catalog_updates.py locally Mostafa's perf evaluation: - 130 node cluster - Load metadata after invalidate for 4 tables, each with 100K partitions and 1 million files Results compaction

[02/11] impala git commit: [DOCS] Changed to --use_local_tz_for_unix_timestamp_conversions

2018-03-20 Thread tarasbob
[DOCS] Changed to --use_local_tz_for_unix_timestamp_conversions Change-Id: Id75ec73031b97aa8a3c61ccdbaea39db008b4093 Reviewed-on: http://gerrit.cloudera.org:8080/9620 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit:

[01/11] impala git commit: IMPALA-6683: Fix infinite loop after restarting the catalog

2018-03-20 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 8dde41e80 -> 1d38c584a IMPALA-6683: Fix infinite loop after restarting the catalog Currently the catalog service ID topic item includes the ID string. It causes the coexistence of multiple catalog service ID topic items after the catalogd

[04/11] impala git commit: IMPALA-6655: Add owner information on database creation

2018-03-20 Thread tarasbob
IMPALA-6655: Add owner information on database creation Add owner information on database creation. > create database foo; > describe database extended foo; +-+--+-+ | name| location | comment | +-+--+-+ | foo | | | | Owner:

[09/11] impala git commit: IMPALA-6697: Downgrade setuptools to be compatible with Python 2.6

2018-03-20 Thread tarasbob
IMPALA-6697: Downgrade setuptools to be compatible with Python 2.6 Change-Id: I0d4727b7a5911269b82287ed9ce759f1e211f386 Reviewed-on: http://gerrit.cloudera.org:8080/9713 Reviewed-by: Philip Zeyliger Tested-by: Lars Volker Project: http://git-wip-us.apache.org/repos/asf/impala/repo Commit: http

[08/11] impala git commit: IMPALA-6695: Fix PyPi regex, update setuptools version

2018-03-20 Thread tarasbob
IMPALA-6695: Fix PyPi regex, update setuptools version pytest-runner, which is required by kudu-python, requires a more recent version of setuptools. Adding an explicit dependency required an update to the regular expression to parse PyPi URLs. Change-Id: Ia67189f81a31a9a5a0ed80cd4d6661762ef427b

[11/11] impala git commit: IMPALA-6643: Add REFRESH fine-grained privilege

2018-03-20 Thread tarasbob
IMPALA-6643: Add REFRESH fine-grained privilege Before this patch, ALL privilege was required to execute INVALIDATE METADATA and having any privilege allowed executing REFRESH and INVALIDATE METADATA . With this patch, REFRESH privilege is now required to execute INVALIDATE METADATA or REFRESH st

[05/11] impala git commit: IMPALA-6690: Fix pip_download.py on python 2.6

2018-03-20 Thread tarasbob
IMPALA-6690: Fix pip_download.py on python 2.6 IMPALA-6682 used set literal syntax in pip_download.py, which was introduced in Python 2.7. This patch changes it to the set constructor. It's tested on Python 2.6.9. Change-Id: I82b4116ee056f605c8aadf39a8b92b78313cb8bf Reviewed-on: http://gerrit.clouder

[06/11] impala git commit: IMPALA-6589: remove invalid DCHECK in parquet reader

2018-03-20 Thread tarasbob
IMPALA-6589: remove invalid DCHECK in parquet reader The DCHECK was only valid if the Parquet file metadata is internally consistent, with the number of values reported by the metadata matching the number of encoded levels. The DCHECK was intended to directly detect misuse of the RleBatchDecoder

[10/11] impala git commit: IMPALA-6610: Improve LDAP auth fail warning message in impala-shell

2018-03-20 Thread tarasbob
IMPALA-6610: Improve LDAP auth fail warning message in impala-shell When the LDAP password value in Impala shell contains an extra line break, authentication fails, but the user can't detect the cause of the failure. I fixed the issue by adding inspection to the password for common pitfalls an
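The kind of inspection described might be sketched as below. The function name and warning text are hypothetical, and only the line-break pitfall named in the message is checked; whatever other pitfalls the real patch covers are not reproduced here.

```python
def password_warnings(password):
    """Return human-readable warnings about a password read from a file or command."""
    warnings = []
    # A trailing newline is easy to introduce, for example by writing the
    # password file with a plain 'echo'.
    if password.endswith("\n") or password.endswith("\r"):
        warnings.append("password ends with a line break")
    elif "\n" in password or "\r" in password:
        warnings.append("password contains a line break")
    return warnings

for message in password_warnings("secret\n"):
    print("Warning: {0}".format(message))
```

Reporting the problem rather than silently stripping the character keeps the shell from changing a password that legitimately contains whitespace.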

[07/11] impala git commit: Consistently use Java 1.7 compiler.

2018-03-20 Thread tarasbob
Consistently use Java 1.7 compiler. We use Java 1.7 in fe/pom.xml, where most of our Java code is. For consistency, this updates the rest of our Maven configurations to use the same version of Java. A change I'm working with uses try-with-resources in HBase splitting, which is how I ran into this.

[18/21] impala git commit: IMPALA-6654: [DOCS] Updated the Kudu/Sentry/Impala limitations

2018-03-20 Thread tarasbob
IMPALA-6654: [DOCS] Updated the Kudu/Sentry/Impala limitations Change-Id: I8991d85e77c7f5075525734145291457d50a7633 Reviewed-on: http://gerrit.cloudera.org:8080/9618 Reviewed-by: Thomas Tauber-Marshall Tested-by: Impala Public Jenkins Project: http://git-wip-us.apache.org/repos/asf/impala/repo

[12/21] impala git commit: Removing (broken) retries from split-hbase.sh.

2018-03-20 Thread tarasbob
Removing (broken) retries from split-hbase.sh. The retries in split-hbase.sh don't work in the common case, because $MINIKDC_PRINC_HIVE is not set in non-kerberized (common) environments. The regular data load scripts (create-load-data.sh) have code to manage that, but split-hbase.sh blindly forge

[10/21] impala git commit: IMPALA-6675: Default to --compact_catalog_topic=true.

2018-03-20 Thread tarasbob
IMPALA-6675: Default to --compact_catalog_topic=true. Testing: - Ran a few queries locally - Ran test_compact_catalog_updates.py locally Mostafa's perf evaluation: - 130 node cluster - Load metadata after invalidate for 4 tables, each with 100K partitions and 1 million files Results compaction

[13/21] impala git commit: IMPALA-6498: test_query_profile_thrift_timestamps causes following tests to fail.

2018-03-20 Thread tarasbob
IMPALA-6498: test_query_profile_thrift_timestamps causes following tests to fail. test_query_profile_thrift_timestamps uses ImpalaTestSuite.client.close() to force cancellation/unregistration of the query, so that 'EndTime' of the query shows up in the profile. Since other test cases also need a va

[16/21] impala git commit: IMPALA-6652: Rename label of MemTracker for early RPCs

2018-03-20 Thread tarasbob
IMPALA-6652: Rename label of MemTracker for early RPCs This change renames the label of the MemTracker in KrpcDataStreamMgr for tracking payloads of early RPCs to "Data Stream Manager Early RPCs". This is to distinguish these RPCs from the deferred RPCs in a receiver. The early RPCs refer to those

[14/21] impala git commit: IMPALA-6683: Fix infinite loop after restarting the catalog

2018-03-20 Thread tarasbob
IMPALA-6683: Fix infinite loop after restarting the catalog Currently the catalog service ID topic item includes the ID string. It causes the coexistence of multiple catalog service ID topic items after the catalogd restarts. Impalad therefore keeps detecting the change of catalog service ID and r

[21/21] impala git commit: IMPALA-6610: Improve LDAP auth fail warning message in impala-shell

2018-03-20 Thread tarasbob
IMPALA-6610: Improve LDAP auth fail warning message in impala-shell When the LDAP password value in Impala shell contains an extra line break, authentication fails, but the user can't detect the cause of the failure. I fixed the issue by adding inspection to the password for common pitfalls an

[2/6] impala git commit: Ignore commit which has already been picked

2018-03-23 Thread tarasbob
Ignore commit which has already been picked 588e1d46 has already been picked, but with a different Change-Id than in master. Change-Id: Icd6cd9ee99bbc113246f5d01548c6c3239a936c5 Reviewed-on: http://gerrit.cloudera.org:8080/9774 Reviewed-by: Joe McDonnell Tested-by: Lars Volker Project: http:/

[6/6] impala git commit: IMPALA-6716: Store LDAP options as shell member variables

2018-03-23 Thread tarasbob
IMPALA-6716: Store LDAP options as shell member variables When passing command line options to a new instance of the ImpalaShell, we usually transfer the options to member variables of that new instance. We weren't doing that with all of the LDAP-related options, even though we wanted to access t

[2/4] impala git commit: IMPALA-4277: Support multiple versions of Hadoop ecosystem

2018-03-23 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/783de170/testdata/cluster/node_templates/cdh5/etc/init.d/kudu-common -- diff --git a/testdata/cluster/node_templates/cdh5/etc/init.d/kudu-common b/testdata/cluster/node_templates/

[5/6] impala git commit: KUDU-2305: Limit sidecars to INT_MAX and fortify socket code

2018-03-23 Thread tarasbob
KUDU-2305: Limit sidecars to INT_MAX and fortify socket code Inspection of the code revealed some other local variables that could overflow with large messages. This patch takes two approaches to eliminate the issues. First, it limits the total size of the messages by limiting the total size of t

[4/4] impala git commit: IMPALA-4277: Support multiple versions of Hadoop ecosystem

2018-03-23 Thread tarasbob
IMPALA-4277: Support multiple versions of Hadoop ecosystem Adds support for building against two sets of Hadoop ecosystem components. The control variable is IMPALA_MINICLUSTER_PROFILE_OVERRIDE, which can either be set to 2 (for Hadoop 2, Hive 1, and so on) or 3 (for Hadoop 3, Hive 2, and so on).

[3/6] impala git commit: IMPALA-6622: Backport parts of IMPALA-4924 to 2.x

2018-03-23 Thread tarasbob
IMPALA-6622: Backport parts of IMPALA-4924 to 2.x We enabled Decimal V2 by default on master (but not on the 2.x branch) in IMPALA-4924. There were some other code changes that are not specific to enabling Decimal V2 that are causing merge conflicts. In this patch, we backport those changes to re

[3/4] impala git commit: IMPALA-4277: Support multiple versions of Hadoop ecosystem

2018-03-23 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/783de170/fe/src/compat-minicluster-profile-3/java/org/apache/impala/analysis/ParquetHelper.java -- diff --git a/fe/src/compat-minicluster-profile-3/java/org/apache/impala/analysis

[1/6] impala git commit: IMPALA-6324: Support reading RLE-encoded boolean values in Parquet scanner

2018-03-23 Thread tarasbob
Repository: impala Updated Branches: refs/heads/2.x 689ee533c -> bca3d459c IMPALA-6324: Support reading RLE-encoded boolean values in Parquet scanner Impala already supported RLE encoding for levels and dictionary pages, so the only task was to integrate it into BoolColumnReader. A new benchm

[1/4] impala git commit: IMPALA-6722: include fs prefix for udf test

2018-03-23 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 08b60a15c -> 783de170c IMPALA-6722: include fs prefix for udf test test_native_functions_race failed tests since it did not include a path prefix. The fix uses get_fs_path to include the fs prefix. Change-Id: I314d8c32e4bc3857aefd244b524f

[4/6] impala git commit: IMPALA-6704: Skip config validations in session-expiry-test.

2018-03-23 Thread tarasbob
IMPALA-6704: Skip config validations in session-expiry-test. Just like expr-test, we can skip config checking when creating the InProcessImpalaServer in session-expiry-test. This fixes an issue where the test would fail when there is no minicluster. (The test itself would actually race and only fa

impala git commit: Updating ignored_commits.

2018-04-10 Thread tarasbob
Repository: impala Updated Branches: refs/heads/2.x b90a67265 -> eaab248ee Updating ignored_commits. Cherrypicking was stuck due to two changes needing to have been marked as not for 2.x. Change-Id: Ic7744b31ff1e1435e8d91f5d0bb7986bd7e5f7a8 Reviewed-on: http://gerrit.cloudera.org:8080/9971 Te

[4/4] impala git commit: IMPALA-6805: Show current database in Impala shell prompt

2018-04-10 Thread tarasbob
IMPALA-6805: Show current database in Impala shell prompt Prompt format: [host:port] db_name> Testing: - Added new shell tests - Ran end-to-end shell tests Change-Id: Ifb0ae58507321e426e5f0f16518671420974a3fc Reviewed-on: http://gerrit.cloudera.org:8080/9927 Reviewed-by: Fredy Wijaya Reviewed-b

[1/4] impala git commit: Remove Yarn from minicluster by default.

2018-04-10 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 4c1538ab1 -> 830e3346f Remove Yarn from minicluster by default. Turns out that we start Yarn as part of the minicluster, but we never use it. (HiveServer2 is configured to run MR jobs "locally" in process.) Likely, this Yarn integration is

[3/4] impala git commit: Remove Yarn from minicluster by default. (2nd try)

2018-04-10 Thread tarasbob
Remove Yarn from minicluster by default. (2nd try) Remove Yarn from minicluster by default. Turns out that we start Yarn as part of the minicluster, but we never use it. (HiveServer2 is configured to run MR jobs "locally" in process.) Likely, this Yarn integration is a vestige of Yarn/Llama integ

[2/4] impala git commit: Revert "Remove Yarn from minicluster by default."

2018-04-10 Thread tarasbob
Revert "Remove Yarn from minicluster by default." This reverts commit c05df104570fa2cb7067599bbe3b87740ca9f09e. Change-Id: I00151795581d22a9852cceaca1d21325d68dbe59 Reviewed-on: http://gerrit.cloudera.org:8080/9969 Reviewed-by: Philip Zeyliger Tested-by: Philip Zeyliger Project: http://git-wi

[1/3] incubator-impala git commit: IMPALA-6054: Parquet dictionary pages should be freed on dictionary construction

2017-11-21 Thread tarasbob
Repository: incubator-impala Updated Branches: refs/heads/master 3632ed4b9 -> bc12a9eb3 IMPALA-6054: Parquet dictionary pages should be freed on dictionary construction During dictionary construction, most types are copied from the parquet dictionary page, but StringValues keep pointers to it.

[3/3] incubator-impala git commit: IMPALA-5019: Decimal V2 addition

2017-11-21 Thread tarasbob
IMPALA-5019: Decimal V2 addition In this patch, we implement the new decimal return type rules for addition expressions. These rules become active when the query option DECIMAL_V2 is enabled. The algorithm for determining the type of the result is described in the JIRA. DECIMAL V1: +-

[2/3] incubator-impala git commit: IMPALA-5624: Replace "ls -l" with opendir() in ProcessStateInfo

2017-11-21 Thread tarasbob
IMPALA-5624: Replace "ls -l" with opendir() in ProcessStateInfo Running shell commands from impalad can be problematic, because using popen() leads to forking which causes a spike in virtual memory. To avoid this, "ls" is replaced with POSIX API calls. FileDescriptorMap fd_desc_ was only used t

[5/5] impala git commit: IMPALA-5014: Part 1: Round when casting string to decimal

2017-12-22 Thread tarasbob
IMPALA-5014: Part 1: Round when casting string to decimal In this patch we implement rounding when casting string to decimal if DECIMAL_V2 is enabled. The backend method that parses strings and converts them to decimals is refactored to make it easier to understand. Testing: - Added some BE tests

[4/5] impala git commit: Move symlinked auxiliary tests/* to tests/functional/*

2017-12-22 Thread tarasbob
Move symlinked auxiliary tests/* to tests/functional/* The layout of the Impala-auxiliary-tests/tests directory is changing to allow for different kinds of tests to be saved there. But just in case the new functional sub-directory does not exist, preserve backwards compatibility with the older lay

[1/5] impala git commit: Remove unused deps, centralize some pom versions, upgrade SLF4J and commons-io.

2017-12-22 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 2fb11fb73 -> a16fe803c Remove unused deps, centralize some pom versions, upgrade SLF4J and commons-io. As a follow-on to centralizing into one parent pom, we can now manage thirdparty dependency versions in Java a little bit more clearly.

[3/5] impala git commit: IMPALA-6225: Part 2: Query profile date-time strings should have ns precision.

2017-12-22 Thread tarasbob
IMPALA-6225: Part 2: Query profile date-time strings should have ns precision. This commit follows 16d8dd58. This patch adds a test case that inspects the thrift profile of a completed query, and verifies that the "Start Time" and "End Time" of the query have nanosecond precision. We chose to wor

[2/5] impala git commit: KUDU-2198. Allow disregarding system-wide auth-to-local mapping

2017-12-22 Thread tarasbob
KUDU-2198. Allow disregarding system-wide auth-to-local mapping This adds a workaround for an issue reported on the user mailing list. Some systems are configured such that the auth_to_local mapping provided by the krb5 library doesn't work properly for service accounts. This patch adds a new con

[1/2] impala git commit: IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT

2018-01-09 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 38461c524 -> f810458ca IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT Allows specifying Oracle-style hints on INSERT/UPSERT statements. For example, - insert /* +noshuffle */ into table functional.alltypes partition(year, mo

[2/2] impala git commit: IMPALA-6231: Implement decimal_v2 fuzz test

2018-01-09 Thread tarasbob
IMPALA-6231: Implement decimal_v2 fuzz test Implement a test that generates random decimal numbers in the pytest framework, performs a random mathematical operation in Impala and verifies that the result is correct by doing the same operation using the Python decimal module. We try to generate not
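The reference side of such a fuzz test can be sketched roughly as follows. This is an illustration only: the helper names are hypothetical, the operator set is limited to +, - and *, and the step that sends the expression to Impala and compares results is left out.

```python
import random
from decimal import Decimal, localcontext

MAX_PRECISION = 38  # maximum DECIMAL precision in Impala


def random_decimal(rng, max_precision=MAX_PRECISION):
    """Build a random decimal with a random precision and scale (hypothetical helper)."""
    precision = rng.randint(1, max_precision)
    scale = rng.randint(0, precision)
    unscaled = rng.randint(0, 10 ** precision - 1)
    sign = rng.choice((0, 1))  # 0 = positive, 1 = negative
    digits = tuple(int(c) for c in str(unscaled))
    # Constructing from a tuple is exact and ignores the context precision.
    return Decimal((sign, digits, -scale))


def expected_result(a, op, b):
    """Compute the reference result with enough precision to stay exact."""
    with localcontext() as ctx:
        ctx.prec = 80  # two 38-digit operands multiply to at most 76 digits
        if op == "+":
            return a + b
        if op == "-":
            return a - b
        if op == "*":
            return a * b
        raise ValueError("unsupported operator: %s" % op)


def random_case(rng):
    """Return the operands, the operator and the reference result for one case."""
    a, b = random_decimal(rng), random_decimal(rng)
    op = rng.choice("+-*")
    return a, op, b, expected_result(a, op, b)
```

Seeding the generator, for example `random.Random(0)`, makes a failing case reproducible.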

[1/2] impala git commit: IMPALA-4323: "SET ROW FORMAT" option added to "ALTER TABLE" command

2018-01-16 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 3fc42ded0 -> f8b406222 IMPALA-4323: "SET ROW FORMAT" option added to "ALTER TABLE" command Examples of new command: ALTER TABLE t1 SET ROW FORMAT DELIMITED FIELDS TERMINATED BY '\002'; ALTER TABLE t1 SET ROW FORMAT DELIMITED LINES TERMINAT

[2/2] impala git commit: IMPALA-6388: Fix the Union node number of hosts estimation

2018-01-16 Thread tarasbob
IMPALA-6388: Fix the Union node number of hosts estimation Before this patch, we would estimate the number of hosts for the union node by looking only at the first union operand. This is obviously incorrect and led us to underestimate the value. We fix the problem by setting the estimate to be t

[03/12] impala git commit: IMPALA-6386: Invalidate metadata at table level for dataload

2018-01-17 Thread tarasbob
IMPALA-6386: Invalidate metadata at table level for dataload Dataload currently executes bin/load-data.py for TPC-H, TPC-DS, and functional-query concurrently. One of the final steps for bin/load-data.py is to run a global "invalidate metadata". Global "invalidate metadata" commands are known to c

[08/12] impala git commit: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

2018-01-17 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/35a3e186/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q65.test -- diff --git a/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q65.test b/testdata/workloads/tpcds/queries/t

[12/12] impala git commit: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

2018-01-17 Thread tarasbob
IMPALA-5478: Run TPCDS queries with decimal_v2 enabled We add new TPCDS .test files that are expected to be run with decimal_v2 enabled. The new expected results were generated using Impala and I inspected them manually. Change-Id: Ib867c51a521ec4a087bc127d99aee4b95ba97733 Reviewed-on: http://ger

[07/12] impala git commit: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

2018-01-17 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/35a3e186/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q71.test -- diff --git a/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q71.test b/testdata/workloads/tpcds/queries/t

[01/12] impala git commit: IMPALA-6399: Increase timeout in test_observability to reduce flakiness

2018-01-17 Thread tarasbob
Repository: impala Updated Branches: refs/heads/master 6cc76d720 -> 35a3e186d IMPALA-6399: Increase timeout in test_observability to reduce flakiness Change-Id: I58f7e7b367e73675be42e85f55fd7698d51f92af Reviewed-on: http://gerrit.cloudera.org:8080/9034 Reviewed-by: Sailesh Mukil Tested-by: La

[10/12] impala git commit: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

2018-01-17 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/35a3e186/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q20.test -- diff --git a/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q20.test b/testdata/workloads/tpcds/queries/t

[02/12] impala git commit: IMPALA-4315: Allow USE and SHOW TABLES if the user has only column privileges

2018-01-17 Thread tarasbob
IMPALA-4315: Allow USE and SHOW TABLES if the user has only column privileges USE and SHOW TABLES should be allowed if there is at least one table in a database where the user has table or column privileges. Impala incorrectly checked only for table privileges. To test this issue in Authorization
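The rule stated above can be sketched as a small predicate. The data model here (a `TablePrivileges` record per table) is a hypothetical stand-in for illustration and does not reflect Impala's actual authorization classes.

```python
from dataclasses import dataclass, field


@dataclass
class TablePrivileges:
    """Hypothetical record of what a user holds on one table."""
    table_level: bool = False                   # any table-level privilege
    columns: set = field(default_factory=set)   # columns with column-level privileges


def can_use_database(tables):
    """Allow USE / SHOW TABLES if any table has table or column privileges.

    `tables` maps table name -> TablePrivileges for a single database.
    """
    return any(t.table_level or t.columns for t in tables.values())
```

Checking only `table_level` here would reproduce the behavior the commit describes as incorrect.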

[09/12] impala git commit: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

2018-01-17 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/35a3e186/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q4.test -- diff --git a/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q4.test b/testdata/workloads/tpcds/queries/tpc

[11/12] impala git commit: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

2018-01-17 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/35a3e186/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q2.test -- diff --git a/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q2.test b/testdata/workloads/tpcds/queries/tpc

[04/12] impala git commit: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

2018-01-17 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/35a3e186/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q99.test -- diff --git a/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q99.test b/testdata/workloads/tpcds/queries/t

[06/12] impala git commit: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

2018-01-17 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/35a3e186/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q78.test -- diff --git a/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q78.test b/testdata/workloads/tpcds/queries/t

[05/12] impala git commit: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

2018-01-17 Thread tarasbob
http://git-wip-us.apache.org/repos/asf/impala/blob/35a3e186/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q98.test -- diff --git a/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q98.test b/testdata/workloads/tpcds/queries/t