[Impala-ASF-CR] IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-4100,4112: Qgen: Replace EXTRACT UDF + IS [NOT] DISTINCT FROM in HiveSqlWriter .. Patch Set 3: > > Have we considered added some basic sanity tests for the query > generator? > > I've got a few directed unit tests for a single area and plan to > add more as I add features. Thanks for the quick code review! I start working on adding some unit tests also. -- To view, visit http://gerrit.cloudera.org:8080/4357 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3922ca61af59ecd2899c911b1a03e11ab5c26e11 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has uploaded a new patch set (#3). Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. IMPALA-3980: qgen: re-enable Hive as a target database Changes: * Added hive cli options back in (removed in commit "Stress test: Various changes") * Modifications so that if --use-hive is specified, a Hive connection is actually created * A few minor bug fixes so that the RQG can be run locally * Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a hard-coded file under IMPALA_HOME Testing: * Hive integration tested locally by invoking the data generator via the command: ./data-generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar and the discrepancy checker via the command: ./discrepancy-checker.py \ --db-name=functional \ --use-hive \ --use-postgresql \ --postgresql-user stakiar \ --test-db-type HIVE \ --timeout 300 \ --query-count 50 \ --profile hive * The output of the above two commands is essentially the same as the Impala output, however, about 20% of the queries will fail when the discrepancy checker is run * Regression testing done by running Leopard in a local VM running Ubuntu 14.04, and by running the discrepancy checker against Impala while inside an Impala Docker container Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 --- M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/data_generator.py 3 files changed, 61 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/3 -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/4011/2//COMMIT_MSG Commit Message: PS2, Line 7: IMPALA-3980: qgen: re-enable Hive as a target database > I prefer commit messages that, after the bug, speak of the code section cha Done PS2, Line 19: * Hive integration tested locally by invoking the data generator via the command: : : ./data-generator.py \ : --db-name=functional \ : --use-hive \ : --min-row-count=50 \ > While it makes the commit message have more lines, it is more readable to w Done http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/cli_options.py File tests/comparison/cli_options.py: PS2, Line 150: group.add_argument('--hive-host', default=DEFAULT_HIVE_HOST, : help="The name of the host running the HS2") : group.add_argument("--hive-port", default=DEFAULT_HIVE_PORT, type=int, : help="The port of HiveServer2") > Why not derive these defaults from symbols imported from cluster.py? (The c Done http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/cluster.py File tests/comparison/cluster.py: PS2, Line 166: class MiniCluster(Cluster): > This changes the MiniCluster constructor interface for all callers. It also Done PS2, Line 178: def _init_local_hadoop_conf_dir(self): > Thanks for doing the cleanups like this one. No problem! It makes the code a lot easier to run locally. http://gerrit.cloudera.org:8080/#/c/4011/2/tests/comparison/data_generator.py File tests/comparison/data_generator.py: PS2, Line 101: def populate_db(self, table_count, postgresql_conn=None, is_hive=False) > We don't use variable names of the format "isHive" typically, so can you ch Done -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has uploaded a new patch set (#4). Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. IMPALA-3980: qgen: re-enable Hive as a target database Changes: * Added hive cli options back in (removed in commit "Stress test: Various changes") * Modifications so that if --use-hive is specified, a Hive connection is actually created * A few minor bug fixes so that the RQG can be run locally * Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a hard-coded file under IMPALA_HOME Testing: * Hive integration tested locally by invoking the data generator via the command: ./data-generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar and the discrepancy checker via the command: ./discrepancy-checker.py \ --db-name=functional \ --use-hive \ --use-postgresql \ --postgresql-user stakiar \ --test-db-type HIVE \ --timeout 300 \ --query-count 50 \ --profile hive * The output of the above two commands is essentially the same as the Impala output, however, about 20% of the queries will fail when the discrepancy checker is run * Regression testing done by running Leopard in a local VM running Ubuntu 14.04, and by running the discrepancy checker against Impala while inside an Impala Docker container Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 --- M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/data_generator.py 3 files changed, 61 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/4 -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/4011/3/tests/comparison/cluster.py File tests/comparison/cluster.py: PS3, Line 56: DEFAULT_HIVE_PORT = 11050 > This should be reverted back to 11050 to match what's already in use in the Done -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 4: > I'd like you to run the small suite of cluster tests, but you can't > yet until you rebase on top of https://gerrit.cloudera.org/#/c/4404/ > I realize there aren't that many, and they may not find anything, > but it's a habit we need to start. Thanks. Will do, looks like your patch just got merged. I will rebase my patch and then run the tests. Will update this review once I'm done. -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 4: > > I'd like you to run the small suite of cluster tests, but you > can't > > yet until you rebase on top of https://gerrit.cloudera.org/#/c/4404/ > > I realize there aren't that many, and they may not find anything, > > but it's a habit we need to start. Thanks. > > Will do, looks like your patch just got merged. I will rebase my > patch and then run the tests. Will update this review once I'm > done. @Michael, rebased this patch and ran all the units tests (test_cluster.py and test_query_objects.py). All tests passed. -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4028: Trim sentry config file path spaces while impala start.
davy...@163.com has posted comments on this change. Change subject: IMPALA-4028: Trim sentry config file path spaces while impala start. .. Patch Set 3: We have developed a configuration management system through which command line parameters can be set. Thank you for your advice. I will use quotes to show the file name path in the error message. -- To view, visit http://gerrit.cloudera.org:8080/4309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: davy...@163.com Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: davy...@163.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4028: Trim sentry config file path spaces while impala start.
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4309 to look at the new patch set (#4). Change subject: IMPALA-4028: Trim sentry config file path spaces while impala start. .. IMPALA-4028: Trim sentry config file path spaces while impala start. When sentry config file path is wrong input which end contains spaces, impala start up failed. Trimming sentry config file path spaces and checking file again while the original file has been checked failed will solve this problem. Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634 --- M fe/src/main/java/com/cloudera/impala/authorization/SentryConfig.java 1 file changed, 3 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/4309/4 -- To view, visit http://gerrit.cloudera.org:8080/4309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: davy...@163.com Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: davy...@163.com
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/4419 Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. IMPALA-4101: qgen: Hive join predicates should only contains equality functions Background: Hive only supports equi-joins in its JOIN clause, while Postgres and Impala support more complex functions such as <, <=, >, >=, etc. This change modifies the QueryGenerator._create_relational_join_condition and QueryGenerator._create_boolean_func_tree methods to only construct equality join conditions under certain conditions. The _create_boolean_func_tree method is invoked via QueryGenerator -> create_query -> _create_from_clause -> _create_join_clause -> _create_relational_join_condition -> _create_boolean_func_tree. This method is invoked when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree of functions that would typically be found in any of these clauses. Changes: The parameter "signatures" is added to the method _create_boolean_func_tree, and it lists out all the allowed signatures the function is allowed to use. Previously, this list of signatures was populated by calling _funcs_to_allowed_signatures(FUNCS), and if "signatures" is not specified, then the code defaults back to the results of that method. A new method in the DefaultProfile called get_allowed_join_signatures is introduced and returns a list of function signatures that are allowed within a JOIN clause. The DefaultProfile allows all given signatures, while the HiveProfile only allows for the Equals and And functions, as well as any function that operates over only one column. The reason for these restrictions is that Hive only allows equality joins, does not allow OR operators in the join clause, and does not allow any operator that operates over multiple columns. Note that the _create_boolean_func_tree still allows for OR operators due to some logic around it's "and_or_fill_ratio" variable. The plan is to fix this in a future patch that specifically focuses on removing OR operators from Hive JOIN clauses. Testing: * Added a new unit test that ensures the HiveProfile only returns equality joins * Tested against Hive locally * Tested against Impala via Leopard * Tested against Impala via the Discrepancy Checker inside an Impala Docker container Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 --- M tests/comparison/discrepancy_searcher.py M tests/comparison/query_generator.py M tests/comparison/query_profile.py A tests/comparison/tests/hive/test_create_join_clause.py 4 files changed, 110 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4419/1 -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 5: > Uploaded patch set 5: Patch Set 4 was rebased. @Michael, not sure what happened there, but I just pushed the rebased version. -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has uploaded a new patch set (#2). Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. IMPALA-4101: qgen: Hive join predicates should only contains equality functions Background: Hive only supports equi-joins in its JOIN clause, while Postgres and Impala support more complex functions such as <, <=, >, >=, etc. This change modifies the QueryGenerator._create_relational_join_condition and QueryGenerator._create_boolean_func_tree methods to only construct equality join conditions under certain conditions. The _create_boolean_func_tree method is invoked via QueryGenerator -> create_query -> _create_from_clause -> _create_join_clause -> _create_relational_join_condition -> _create_boolean_func_tree. This method is invoked when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree of functions that would typically be found in any of these clauses. Changes: The parameter "signatures" is added to the method _create_boolean_func_tree, and it lists out all the allowed signatures the function is allowed to use. Previously, this list of signatures was populated by calling _funcs_to_allowed_signatures(FUNCS), and if "signatures" is not specified, then the code defaults back to the results of that method. A new method in the DefaultProfile called get_allowed_join_signatures is introduced and returns a list of function signatures that are allowed within a JOIN clause. The DefaultProfile allows all given signatures, while the HiveProfile only allows for the Equals and And functions, as well as any function that operates over only one column. The reason for these restrictions is that Hive only allows equality joins, does not allow OR operators in the join clause, and does not allow any operator that operates over multiple columns. Note that the _create_boolean_func_tree still allows for OR operators due to some logic around it's "and_or_fill_ratio" variable. The plan is to fix this in a future patch that specifically focuses on removing OR operators from Hive JOIN clauses. Minor change to discrepancy_searcher so that the logs print out "Hive" instead of "Impala" when running against Hive. Testing: * Added a new unit test that ensures the HiveProfile only returns equality joins * Tested against Hive locally * Tested against Impala via Leopard * Tested against Impala via the Discrepancy Checker inside an Impala Docker container Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 --- M tests/comparison/discrepancy_searcher.py M tests/comparison/query_generator.py M tests/comparison/query_profile.py A tests/comparison/tests/hive/test_create_join_clause.py 4 files changed, 110 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4419/2 -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. Patch Set 2: > Uploaded patch set 2: Commit message was updated. @Michael, full description of why this patch is necessary and what it changes is in the commit message. -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has uploaded a new patch set (#6). Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. IMPALA-3980: qgen: re-enable Hive as a target database Changes: * Added hive cli options back in (removed in commit "Stress test: Various changes") * Modifications so that if --use-hive is specified, a Hive connection is actually created * A few minor bug fixes so that the RQG can be run locally * Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a hard-coded file under IMPALA_HOME Testing: * Hive integration tested locally by invoking the data generator via the command: ./data-generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar and the discrepancy checker via the command: ./discrepancy-checker.py \ --db-name=functional \ --use-hive \ --use-postgresql \ --postgresql-user stakiar \ --test-db-type HIVE \ --timeout 300 \ --query-count 50 \ --profile hive * The output of the above two commands is essentially the same as the Impala output, however, about 20% of the queries will fail when the discrepancy checker is run * Regression testing done by running Leopard in a local VM running Ubuntu 14.04, and by running the discrepancy checker against Impala while inside an Impala Docker container Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 --- M fe/src/test/resources/hive-default.xml M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/data_generator.py 4 files changed, 66 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/6 -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 6: > Uploaded patch set 6. Hey Michael, Apologies, yes you are correct, the tests were failing with this patch. It seems my dev setup wasn't fully setup properly. Originally, I was trying to avoid building Impala fully, so I made some modifications to get the Python code to run locally. The tests were passing locally, but I realize this probably isn't a great approach if I want to reliably run these unit tests. I've set up a proper Impala dev environment on a remote machine. Once I did that I was able to re-produce the error. The hive-default.xml file under fe/src/test/resources/ is not a valid XML file. It was missing some element terminators. Once I fixed those the tests starting working. I have updated the patch. -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has uploaded a new patch set (#7). Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. IMPALA-3980: qgen: re-enable Hive as a target database Changes: * Added hive cli options back in (removed in commit "Stress test: Various changes") * Modifications so that if --use-hive is specified, a Hive connection is actually created * A few minor bug fixes so that the RQG can be run locally * Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a hard-coded file under IMPALA_HOME * Fixed fe/src/test/resources/hive-default.xml so that it is a valid XML file, it was missing a few element terminators that cause an exception in the cluster.py file Testing: * Hive integration tested locally by invoking the data generator via the command: ./data-generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar and the discrepancy checker via the command: ./discrepancy-checker.py \ --db-name=functional \ --use-hive \ --use-postgresql \ --postgresql-user stakiar \ --test-db-type HIVE \ --timeout 300 \ --query-count 50 \ --profile hive * The output of the above two commands is essentially the same as the Impala output, however, about 20% of the queries will fail when the discrepancy checker is run * Regression testing done by running Leopard in a local VM running Ubuntu 14.04, and by running the discrepancy checker against Impala while inside an Impala Docker container Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 --- M fe/src/test/resources/hive-default.xml M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/data_generator.py 4 files changed, 66 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/7 -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 7: > > The hive-default.xml file under > > fe/src/test/resources/ is not a valid XML file. It was missing > some > > element terminators. Once I fixed those the tests starting > working. > > I have updated the patch. > > Thanks, can you update the commit message as well? Updated -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port ImpalaInternalService to KRPC
Anonymous Coward #168 has posted comments on this change. Change subject: IMPALA-4856: Port ImpalaInternalService to KRPC .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5888/1/be/src/service/impala-internal-service.cc File be/src/service/impala-internal-service.cc: Line 66: sync callback ? AddData func may block some time that will cause many work thread block here so bad . I think should move this task to Business processing thread, and it decided when to call callback. Like fbthrift ThriftServer side task processing mechanism?Execute the callback function at the appropriate time -- To view, visit http://gerrit.cloudera.org:8080/5888 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I95229290566a8ccffd80ed2d74c1c57cf1479238 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Anonymous Coward #168 Gerrit-HasComments: Yes
[Impala-ASF-CR](asf-site) Impala and Hive have slightly different SQL they accept.
Anonymous Coward #290 has posted comments on this change. Change subject: Impala and Hive have slightly different SQL they accept. .. Patch Set 1: Do not use the term 'compatible' or imply it. https://wiki.apache.org/hadoop/Defining%20Hadoop#Compatibility Some products have been released that have been described as "compatible" with Hadoop, even though parts of the Hadoop codebase have either been changed or replaced. The Apache™ Hadoop® developer team are not a standards body: they do not qualify such (derivative) works as compatible. Nor do they feel constrained by the requirements of external entities when changing the behavior of Apache Hadoop software or related Apache software. Nothing is compatible with Apache Hive other than Apache Hive. -- To view, visit http://gerrit.cloudera.org:8080/6567 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4e38dc15c31dbb3e297e1d3f9e8251a1fa22d1a7 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-Owner: Jim Apple Gerrit-Reviewer: Anonymous Coward #290 Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR](asf-site) Impala and Hive have slightly different SQL they accept.
Anonymous Coward #290 has posted comments on this change. Change subject: Impala and Hive have slightly different SQL they accept. .. Patch Set 1: The issue we (HivePMC) are trying dealing with is there are several entities with copies of Hive code, linking to Hive, or both and at the same time claiming 'Apache Hive Compatibility'. As I explained 'Apache Hive Compatibility' is not a thing they can assert, 1. Assuming you are using an unaltered Apache Hive metastore you could say. "Impala stores it's data in Hive Metastore." 2. Assuming you are using an unaltered apache hive server 2 you could say. " Impala's public facing interfaces mirror that of Hive X.Y" 3. You could say, "Impala's SQL dialect is very similar to that of Hive SQL." or "Many Apache Hive queries will run on Impala unaltered", First, use Defining Hadoop document as a guideline. Second, as a guideline, imagine Hive was some large commercial database 'DB2orgres 10Gee', you would be hard pressed to find any other entity software claiming compatibility with that thing. To summarize: I appreciate your rapid response. I do realize it is a tag line so you want to make it short and to the point (while of course making Impala sound as awesome as possible). In any case, I do not want to tell you what to say, use your best judgement. I hope that helps. The changes are appreciated. -- To view, visit http://gerrit.cloudera.org:8080/6567 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4e38dc15c31dbb3e297e1d3f9e8251a1fa22d1a7 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-Owner: Jim Apple Gerrit-Reviewer: Anonymous Coward #290 Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
sakinape...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/7110 Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate .. IMPALA-5280: Coalesce chains of OR conditions to an IN predicate This change introduces a new rule to merge disjunct equality predicates into an IN predicate. As with every rule being applied bottom up, the rule merges the leaf OR predicates into an in predicate and subsequently merges the OR predicate to the existing IN predicate Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Expr.java A fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java 4 files changed, 156 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/7110/1 -- To view, visit http://gerrit.cloudera.org:8080/7110 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: sakinape...@cloudera.com
[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
sakinape...@cloudera.com has uploaded a new patch set (#2). Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate .. IMPALA-5280: Coalesce chains of OR conditions to an IN predicate This change introduces a new rule to merge disjunct equality predicates into an IN predicate. As with every rule being applied bottom up, the rule merges the leaf OR predicates into an in predicate and subsequently merges the OR predicate to the existing IN predicate Patch also include changes in response to alex's review comments Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Expr.java A fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java 4 files changed, 144 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/7110/2 -- To view, visit http://gerrit.cloudera.org:8080/7110 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: sakinape...@cloudera.com Gerrit-Reviewer: Alex Behm
[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
sakinape...@cloudera.com has posted comments on this change. Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate .. Patch Set 2: (21 comments) Responding to comments... http://gerrit.cloudera.org:8080/#/c/7110/1/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: Line 179: > Not really sure what this predicate does. Seems very specific, so probably changed it to expr with equality condition and literal on the right. Generic enough to keep it here. Line 183: return arg instanceof BinaryPredicate > What's special about these literals? Why not all literals, i.e. isLiteral() Done. bool and null has only few values. however if there are predicates then we dont want to restrict rewriting. hence chaged to isLiteral http://gerrit.cloudera.org:8080/#/c/7110/1/fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java File fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java: Line 15 > several typos Done Line 16 > please follow the format of other rules to list examples Done Line 19 > CoalesceEqDisjunctsToInRule (a little shorter) Done Line 21 > remove, no logging please, too expensive Done Line 27 > very expensive, remove Done Line 28 > single line if Done Line 34 > single line Done Line 39 > single line Done Line 46 > Try to be consistent with rest of code base: Done Line 48 > whitespace (same below) Done Line 60 > else if seems clearer Done Line 64 > single line if Done Line 67 > Don't we need otherPred to be a BinaryPredicate with an EQ condition? Added the condition in Expr.java Line 69 > Predicates might not be normalized, i.e. you could have Agree. do we capture this effort in a separate jira? Line 73 > What is this check trying to do? It doesn't seem necessary. Redundant. Trying to check if the elements in the IN clause and the candidate predicate literal is of same type or not. Removed. Line 82 > We might want to extend NormalizeBinaryPredicates to make dealing with thes Yes. added a combo testcase with normalized predicate rule for simple case. Line 105 > What is this trying to do? Redundant. removed. Line 109 > Why not leftChild1.isLiteral()? Done http://gerrit.cloudera.org:8080/#/c/7110/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java: Line 80: public void TestBetweenToCompoundRule() throws AnalysisException { > nit: move to bottom Done -- To view, visit http://gerrit.cloudera.org:8080/7110 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: sakinape...@cloudera.com Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: sakinape...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
sakinape...@cloudera.com has uploaded a new patch set (#3). Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate .. IMPALA-5280: Coalesce chains of OR conditions to an IN predicate This change introduces a new rule to merge disjunct equality predicates into an IN predicate. As with every rule being applied bottom up, the rule merges the leaf OR predicates into an in predicate and subsequently merges the OR predicate to the existing IN predicate Patch also include changes in response to alex's review comments to normalize the binary predicates. Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Expr.java A fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java M fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java 5 files changed, 172 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/7110/3 -- To view, visit http://gerrit.cloudera.org:8080/7110 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: sakinape...@cloudera.com Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: sakinape...@cloudera.com
[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
sakinape...@cloudera.com has posted comments on this change. Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate .. Patch Set 3: (17 comments) additional tests and changes to normalize binary predicate rule http://gerrit.cloudera.org:8080/#/c/7110/2/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: Line 180: public final static com.google.common.base.Predicate IS_EXPR_EQ_LITERAL_PREDICATE = > long line Done http://gerrit.cloudera.org:8080/#/c/7110/1/fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java File fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java: Line 69 > I think this particular case is easy enough to fix in this JIRA. It's a sim Adding code in the rule class. http://gerrit.cloudera.org:8080/#/c/7110/2/fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java File fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java: Line 12: > Suggest modifications for brevity and grammar: Done Line 16: * Examples: > add an example where an equality predicate is merged into an IN predicate Done Line 32: Expr inAndOtherExpr = rewriteInAndOtherExpr(child0, child1); > What about IN OR IN? That can be produced by (a = 1 or a = 2) or (a = 3 or Done. Added code to merge two IN predicates. Line 38: return expr; > This comment should describe the inputs and outputs more clearly, in partic Done Line 52: } > extra space before "return" Done Line 54: inPred = (InPredicate) child1; > can get rid of these empty lines imo Done Line 55: otherPred = child0; > Our rewriting framework requires that we return a new InPredicate here. See Added code to instantiate new class Line 61: // other predicate can be OR predicate or IN predicate > Same here. Comment needs polish to describe inputs and outputs. Done Line 64: if (Expr.IS_EXPR_EQ_LITERAL_PREDICATE.apply(otherPred)) > single lines? Done Line 72: } > Seems more concise to inline these function calls, and get rid of the extra Done Line 78: private Expr rewriteEqEqPredicate(Expr child0, Expr child1) { > To keep things simple, I think we should also require that rightChild1 is a removed this as it is covered by the IS_EXPR_EQ_LITERAL_PREDICATE function http://gerrit.cloudera.org:8080/#/c/7110/2/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java: Line 392: RewritesOk("int_col * 3 = 6 or int_col * 3 = 9 or int_col * 3 = 12", > add parenthesis to make the AND/OR precedence explicit Done Line 394: > remove blank lines, these 3 test cases belong together logically, so it's g Done Line 403: edToInrule, "int_col * 3 IN (6, 9) OR int_col * 3 <= 12"); > add negative tests to show the non-obvious cases in which the rewrite is no Done Line 411: RewritesOk("int_col in (1,2) or int_col = 3", edToInrule, > int_col in (1,2) or int_col in (3,4) should also work Done -- To view, visit http://gerrit.cloudera.org:8080/7110 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: sakinape...@cloudera.com Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: sakinape...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4028: Improve message for improper Sentry config file path to make extra spaces visible.
Hello Lars Volker, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4309 to look at the new patch set (#5). Change subject: IMPALA-4028: Improve message for improper Sentry config file path to make extra spaces visible. .. IMPALA-4028: Improve message for improper Sentry config file path to make extra spaces visible. When Sentry config file is incorrectly entered,the end contains spaces, impala starts up failed. Use quotes to embrace the file path in the error message that will help to identify the redundant spaces in the file path. Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634 --- M fe/src/main/java/com/cloudera/impala/authorization/SentryConfig.java 1 file changed, 3 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/4309/5 -- To view, visit http://gerrit.cloudera.org:8080/4309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3a76b9e4236caa3f2088fba8a9cf0236fced2634 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: davy...@163.com Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: davy...@163.com
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 7: > This needs a committer's look. @Michael, do we know of a committer that has time to look at this patch? -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has uploaded a new patch set (#3). Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. IMPALA-4101: qgen: Hive join predicates should only contains equality functions Background: Hive only supports equi-joins in its JOIN clause, while Postgres and Impala support more complex functions such as <, <=, >, >=, etc. This change modifies the QueryGenerator._create_relational_join_condition and QueryGenerator._create_boolean_func_tree methods to only construct equality join conditions under certain conditions. The _create_boolean_func_tree method is invoked via QueryGenerator -> create_query -> _create_from_clause -> _create_join_clause -> _create_relational_join_condition -> _create_boolean_func_tree. This method is invoked when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree of functions that would typically be found in any of these clauses. Changes: The parameter "signatures" is added to the method _create_boolean_func_tree, and it lists out all the allowed signatures the function is allowed to use. Previously, this list of signatures was populated by calling _funcs_to_allowed_signatures(FUNCS), and if "signatures" is not specified, then the code defaults back to the results of that method. A new method in the DefaultProfile called get_allowed_join_signatures is introduced and returns a list of function signatures that are allowed within a JOIN clause. The DefaultProfile allows all given signatures, while the HiveProfile only allows for the Equals and And functions, as well as any function that operates over only one column. The reason for these restrictions is that Hive only allows equality joins, does not allow OR operators in the join clause, and has some restrictions on functions that operate over multiple different tables. This last restriction is somewhat subtle, if one side of the equals operator contains a function that operates over two different tables, the other side of the operator cannot contain either of those tables. While it is possible to have functions that take in multiple input parameters, the inputs must be taken from specific tables to prevent Hive from throwing a compile time exception. Adding support for this in qgen code will require significant effort and modification to some core methods (_create_relational_join_condition and _populate_func_with_vals), so its best to disable these for Hive altogether. Note that the _create_boolean_func_tree still allows for OR operators due to some logic around it's "and_or_fill_ratio" variable. The plan is to fix this in a future patch that specifically focuses on removing OR operators from Hive JOIN clauses. Minor change to discrepancy_searcher so that the logs print out "Hive" instead of "Impala" when running against Hive. Testing: * Added a new unit test that ensures the HiveProfile only returns equality joins * Unit tests pass * Tested against Hive locally * Tested against Impala via Leopard * Tested against Impala via the Discrepancy Checker Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 --- M tests/comparison/discrepancy_searcher.py M tests/comparison/query_generator.py M tests/comparison/query_profile.py A tests/comparison/tests/hive/test_create_join_clause.py 4 files changed, 110 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4419/3 -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has uploaded a new patch set (#4). Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. IMPALA-4101: qgen: Hive join predicates should only contains equality functions Background: Hive only supports equi-joins in its JOIN clause, while Postgres and Impala support more complex functions such as <, <=, >, >=, etc. This change modifies the QueryGenerator._create_relational_join_condition and QueryGenerator._create_boolean_func_tree methods to only construct equality join conditions under certain conditions. The _create_boolean_func_tree method is invoked via QueryGenerator -> create_query -> _create_from_clause -> _create_join_clause -> _create_relational_join_condition -> _create_boolean_func_tree. This method is invoked when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree of functions that would typically be found in any of these clauses. Changes: The parameter "signatures" is added to the method _create_boolean_func_tree, and it lists out all the allowed signatures the function is allowed to use. Previously, this list of signatures was populated by calling _funcs_to_allowed_signatures(FUNCS), and if "signatures" is not specified, then the code defaults back to the results of that method. A new method in the DefaultProfile called get_allowed_join_signatures is introduced and returns a list of function signatures that are allowed within a JOIN clause. The DefaultProfile allows all given signatures, while the HiveProfile only allows for the Equals and And functions, as well as any function that operates over only one column. The reason for these restrictions is that Hive only allows equality joins, does not allow OR operators in the join clause, and has some restrictions on functions that operate over multiple different tables. This last restriction is somewhat subtle; if one side of the equals operator contains a function that operates over two different tables, the other side of the operator cannot contain either of those tables. While it is possible to have functions that take in multiple input parameters, the inputs must be taken from specific tables to prevent Hive from throwing a compile time exception. Adding support for this in qgen code will require significant effort and modification to some core methods (_create_relational_join_condition and _populate_func_with_vals), so it's best to disable these for Hive altogether. Note that the _create_boolean_func_tree still allows for OR operators due to some logic around its "and_or_fill_ratio" variable. The plan is to fix this in a future patch that specifically focuses on removing OR operators from Hive JOIN clauses. Minor change to discrepancy_searcher so that the logs print out "Hive" instead of "Impala" when running against Hive. Testing: * Added a new unit test that ensures the HiveProfile only returns equality joins * Unit tests pass * Tested against Hive locally * Tested against Impala via Leopard * Tested against Impala via the Discrepancy Checker Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 --- M tests/comparison/discrepancy_searcher.py M tests/comparison/query_generator.py M tests/comparison/query_profile.py A tests/comparison/tests/hive/test_create_join_clause.py 4 files changed, 117 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4419/4 -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. Patch Set 4: (6 comments) @Michael, comments addressed. http://gerrit.cloudera.org:8080/#/c/4419/3//COMMIT_MSG Commit Message: PS3, Line 35: subtle; > Nit: this is a comma splice. You can change the comma to something like a c Done PS3, Line 41: it's > Nit: "it's" Done PS3, Line 45: its > Nit: "its" :) Done http://gerrit.cloudera.org:8080/#/c/4419/3/tests/comparison/query_generator.py File tests/comparison/query_generator.py: Line 1364: relational_col_types: Can be used to restrict the signature of the chosen > Add documentation about the signatures parameter. Done http://gerrit.cloudera.org:8080/#/c/4419/3/tests/comparison/query_profile.py File tests/comparison/query_profile.py: PS3, Line 661: : PROFILES = [var for var in locals().values() : if isinstance(var, type) > Maybe use an or expression for these two conditions and remove the elif? Re-wrote this as a single list comprehension http://gerrit.cloudera.org:8080/#/c/4419/3/tests/comparison/tests/hive/test_create_join_clause.py File tests/comparison/tests/hive/test_create_join_clause.py: PS3, Line 31: class FakeQueryProfile(HiveProfile): > 1. I prefer "Fake" instead of "Mock". I prefer the use of "mock" only when Sounds reasonable, updated -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has uploaded a new patch set (#5). Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. IMPALA-4101: qgen: Hive join predicates should only contains equality functions Background: Hive only supports equi-joins in its JOIN clause, while Postgres and Impala support more complex functions such as <, <=, >, >=, etc. This change modifies the QueryGenerator._create_relational_join_condition and QueryGenerator._create_boolean_func_tree methods to only construct equality join conditions under certain conditions. The _create_boolean_func_tree method is invoked via QueryGenerator -> create_query -> _create_from_clause -> _create_join_clause -> _create_relational_join_condition -> _create_boolean_func_tree. This method is invoked when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree of functions that would typically be found in any of these clauses. Changes: The parameter "signatures" is added to the method _create_boolean_func_tree, and it lists out all the allowed signatures the function is allowed to use. Previously, this list of signatures was populated by calling _funcs_to_allowed_signatures(FUNCS), and if "signatures" is not specified, then the code defaults back to the results of that method. A new method in the DefaultProfile called get_allowed_join_signatures is introduced and returns a list of function signatures that are allowed within a JOIN clause. The DefaultProfile allows all given signatures, while the HiveProfile only allows for the Equals and And functions, as well as any function that operates over only one column. The reason for these restrictions is that Hive only allows equality joins, does not allow OR operators in the join clause, and has some restrictions on functions that operate over multiple different tables. This last restriction is somewhat subtle; if one side of the equals operator contains a function that operates over two different tables, the other side of the operator cannot contain either of those tables. While it is possible to have functions that take in multiple input parameters, the inputs must be taken from specific tables to prevent Hive from throwing a compile time exception. Adding support for this in qgen code will require significant effort and modification to some core methods (_create_relational_join_condition and _populate_func_with_vals), so it's best to disable these for Hive altogether. Note that the _create_boolean_func_tree still allows for OR operators due to some logic around its "and_or_fill_ratio" variable. The plan is to fix this in a future patch that specifically focuses on removing OR operators from Hive JOIN clauses. Minor change to discrepancy_searcher so that the logs print out "Hive" instead of "Impala" when running against Hive. Testing: * Added a new unit test that ensures the HiveProfile only returns equality joins * Unit tests pass * Tested against Hive locally * Tested against Impala via Leopard * Tested against Impala via the Discrepancy Checker Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 --- M tests/comparison/discrepancy_searcher.py M tests/comparison/query_generator.py M tests/comparison/query_profile.py A tests/comparison/tests/hive/test_hive_create_relational_join_condition.py 4 files changed, 117 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4419/5 -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/4419/4/tests/comparison/query_generator.py File tests/comparison/query_generator.py: PS4, Line 1369: signature > s/functions/signatures/ Done http://gerrit.cloudera.org:8080/#/c/4419/4/tests/comparison/tests/hive/test_create_join_clause.py File tests/comparison/tests/hive/test_create_join_clause.py: PS4, Line 25: > Nit: this test is hive-specific, so hive should be in the test name. Done PS4, Line 31: > FakeHiveQueryProfile Done PS4, Line 55: > Nit: This isn't a mock object; it's a real QueryGenerator. I would just cal Done PS4, Line 60: > Any thoughts on maybe parametrizing this test to test with different types? I think its worth doing, but would rather do it in a separate test that isn't Hive specific. This test is solely responsible for checking if equality joins are made for Hive or not. I think testing if the join clause is correctly made with different column types would go in a separate test. I'm working on a few unit tests for create_relational_join_condition and _create_boolean_func_tree that are not Hive specific and will incorporate your suggestions there (will be in a separate Gerrit review). -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/4419/5/tests/comparison/query_generator.py File tests/comparison/query_generator.py: Line 1436: if self.profile.only_use_equality_join_predicates(): > I don't think this whole patch is necessary. We just need only_use_equality That was my original approach, but this parameter does not fully work. If and_or_count and relational_count are drained to 0, then the code hits line 1453 and any function could be chosen (including say a < or > operator), which could cause Hive to break. I tried to add this same check to line 1453 but that didn't work because it only allows Equals.signatures, which eliminates the possibility of having any function in the JOIN clause (e.g ABS, FLOOR, CEIL, etc.) -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4011 to look at the new patch set (#8). Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. IMPALA-3980: qgen: re-enable Hive as a target database Changes: * Added hive cli options back in (removed in commit "Stress test: Various changes") * Modifications so that if --use-hive is specified, a Hive connection is actually created * A few minor bug fixes so that the RQG can be run locally * Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a hard-coded file under IMPALA_HOME * Fixed fe/src/test/resources/hive-default.xml so that it is a valid XML file, it was missing a few element terminators that cause an exception in the cluster.py file Testing: * Hive integration tested locally by invoking the data generator via the command: ./data-generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar and the discrepancy checker via the command: ./discrepancy-checker.py \ --db-name=functional \ --use-hive \ --use-postgresql \ --postgresql-user stakiar \ --test-db-type HIVE \ --timeout 300 \ --query-count 50 \ --profile hive * The output of the above two commands is essentially the same as the Impala output, however, about 20% of the queries will fail when the discrepancy checker is run * Regression testing done by running Leopard in a local VM running Ubuntu 14.04, and by running the discrepancy checker against Impala while inside an Impala Docker container Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 --- M fe/src/test/resources/hive-default.xml M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/data_generator.py M tests/comparison/db_connection.py 5 files changed, 91 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/8 -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 8: (4 comments) http://gerrit.cloudera.org:8080/#/c/4011/7/tests/comparison/cluster.py File tests/comparison/cluster.py: Line 100: if result is None: > I think it's a little confusing to check if default is None here. Done Line 185: for file_name in ["hive-site.xml"]: > I'm not sure that modifying the minicluster here is the right thing to do. I run the Hive tests against a local instance of Hive, so setting this to HIVE_CONF_DIR makes development for me much. I think it will also be necessary to run the RQG against Hive in general as Hive-on-RQG should be able to accept a custom hive-default.xml file. I've tested this change against Impala (ran the data_generator, discpreancy_checker, and Leopard) and it works fine. There were a few failures because now the the code uses fe/src/test/resources/hive-default.xml which had some invalid xml clauses. If you think this will cause problems for Impala, I can revert it. However, in that case I would like to create a new Cluster object that can accept generic hive-default.xml files. http://gerrit.cloudera.org:8080/#/c/4011/7/tests/comparison/data_generator.py File tests/comparison/data_generator.py: Line 92: self.cluster = None > How about adding a variable here called, for example, db_engine that can be Done Line 153: with self.cluster.impala.cursor(db_name=self.db_name) as cursor: > We don't need to compute stats if we are using hive? Thats a good point, updated the patch to do this. -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4011 to look at the new patch set (#9). Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. IMPALA-3980: qgen: re-enable Hive as a target database Changes: * Added hive cli options back in (removed in commit "Stress test: Various changes") * Modifications so that if --use-hive is specified, a Hive connection is actually created * A few minor bug fixes so that the RQG can be run locally * Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a hard-coded file under IMPALA_HOME * Fixed fe/src/test/resources/hive-default.xml so that it is a valid XML file, it was missing a few element terminators that cause an exception in the cluster.py file Testing: * Hive integration tested locally by invoking the data generator via the command: ./data-generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar and the discrepancy checker via the command: ./discrepancy-checker.py \ --db-name=functional \ --use-hive \ --use-postgresql \ --postgresql-user stakiar \ --test-db-type HIVE \ --timeout 300 \ --query-count 50 \ --profile hive * The output of the above two commands is essentially the same as the Impala output, however, about 20% of the queries will fail when the discrepancy checker is run * Regression testing done by running Leopard in a local VM running Ubuntu 14.04, and by running the discrepancy checker against Impala while inside an Impala Docker container Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 --- M fe/src/test/resources/hive-default.xml M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/data_generator.py M tests/comparison/db_connection.py 5 files changed, 92 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/9 -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4011 to look at the new patch set (#10). Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. IMPALA-3980: qgen: re-enable Hive as a target database Changes: * Added hive cli options back in (removed in commit "Stress test: Various changes") * Modifications so that if --use-hive is specified, a Hive connection is actually created * A few minor bug fixes so that the RQG can be run locally * Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a hard-coded file under IMPALA_HOME * Fixed fe/src/test/resources/hive-default.xml so that it is a valid XML file, it was missing a few element terminators that cause an exception in the cluster.py file Testing: * Hive integration tested locally by invoking the data generator via the command: ./data-generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar and the discrepancy checker via the command: ./discrepancy-checker.py \ --db-name=functional \ --use-hive \ --use-postgresql \ --postgresql-user stakiar \ --test-db-type HIVE \ --timeout 300 \ --query-count 50 \ --profile hive * The output of the above two commands is essentially the same as the Impala output, however, about 20% of the queries will fail when the discrepancy checker is run * Regression testing done by running Leopard in a local VM running Ubuntu 14.04, and by running the discrepancy checker against Impala while inside an Impala Docker container Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 --- M fe/src/test/resources/hive-default.xml M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/data_generator.py M tests/comparison/db_connection.py 5 files changed, 92 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/10 -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. Patch Set 8: @Taras, comments addressed, patch updated and re-tested. -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4419 to look at the new patch set (#6). Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. IMPALA-4101: qgen: Hive join predicates should only contains equality functions Background: Hive only supports equi-joins in its JOIN clause, while Postgres and Impala support more complex functions such as <, <=, >, >=, etc. This change modifies the QueryGenerator._create_relational_join_condition and QueryGenerator._create_boolean_func_tree methods to only construct equality join conditions under certain conditions. The _create_boolean_func_tree method is invoked via QueryGenerator -> create_query -> _create_from_clause -> _create_join_clause -> _create_relational_join_condition -> _create_boolean_func_tree. This method is invoked when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree of functions that would typically be found in any of these clauses. Changes: The parameter "signatures" is added to the method _create_boolean_func_tree, and it lists out all the allowed signatures the function is allowed to use. Previously, this list of signatures was populated by calling _funcs_to_allowed_signatures(FUNCS), and if "signatures" is not specified, then the code defaults back to the results of that method. A new method in the DefaultProfile called get_allowed_join_signatures is introduced and returns a list of function signatures that are allowed within a JOIN clause. The DefaultProfile allows all given signatures, while the HiveProfile only allows for the Equals and And functions, as well as any function that operates over only one column. The reason for these restrictions is that Hive only allows equality joins, does not allow OR operators in the join clause, and has some restrictions on functions that operate over multiple different tables. This last restriction is somewhat subtle; if one side of the equals operator contains a function that operates over two different tables, the other side of the operator cannot contain either of those tables. While it is possible to have functions that take in multiple input parameters, the inputs must be taken from specific tables to prevent Hive from throwing a compile time exception. Adding support for this in qgen code will require significant effort and modification to some core methods (_create_relational_join_condition and _populate_func_with_vals), so it's best to disable these for Hive altogether. Note that the _create_boolean_func_tree still allows for OR operators due to some logic around its "and_or_fill_ratio" variable. The plan is to fix this in a future patch that specifically focuses on removing OR operators from Hive JOIN clauses. Minor change to discrepancy_searcher so that the logs print out "Hive" instead of "Impala" when running against Hive. Testing: * Added a new unit test that ensures the HiveProfile only returns equality joins * Unit tests pass * Tested against Hive locally * Tested against Impala via Leopard * Tested against Impala via the Discrepancy Checker Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 --- M tests/comparison/discrepancy_searcher.py M tests/comparison/query_generator.py M tests/comparison/query_profile.py A tests/comparison/tests/hive/test_hive_create_relational_join_condition.py 4 files changed, 137 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4419/6 -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. Patch Set 6: (1 comment) @Taras, comments addressed, new changes have been tested. http://gerrit.cloudera.org:8080/#/c/4419/5/tests/comparison/query_generator.py File tests/comparison/query_generator.py: Line 1436: signature = self.profile.choose_func_signature(func_class.signatures()) > Ok, I see. This means that currency only_use_equality_join_predicates is no I made the fix for only_use_equality_join_predicates, but I moved the check outside the _create_boolean_func_tree function. I think its simpler if _create_boolean_func_tree just operates on a given list of signatures, and the caller decides what signatures those should be. I defined a list of non-equality join predicates as (GreaterThan, GreaterThanOrEquals, In, IsNotDistinctFrom, IsNotDistinctFromOp, LessThan, LessThanOrEquals, NotEquals, NotIn). So basically, the check will exclude any of these functions from the func tree. I left the get_allowed_join_signatures method because Hive has additional restrictions in addition to only allowing equi-joins (see commit message for a detailed explanation). -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4419 to look at the new patch set (#7). Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. IMPALA-4101: qgen: Hive join predicates should only contains equality functions Background: Hive only supports equi-joins in its JOIN clause, while Postgres and Impala support more complex functions such as <, <=, >, >=, etc. This change modifies the QueryGenerator._create_relational_join_condition and QueryGenerator._create_boolean_func_tree methods to only construct equality join conditions under certain conditions. The _create_boolean_func_tree method is invoked via QueryGenerator -> create_query -> _create_from_clause -> _create_join_clause -> _create_relational_join_condition -> _create_boolean_func_tree. This method is invoked when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree of functions that would typically be found in any of these clauses. Changes: The parameter "signatures" is added to the method _create_boolean_func_tree, and it lists out all the allowed signatures the function is allowed to use. Previously, this list of signatures was populated by calling _funcs_to_allowed_signatures(FUNCS), and if "signatures" is not specified, then the code defaults back to the results of that method. A new method in the DefaultProfile called get_allowed_join_signatures is introduced and returns a list of function signatures that are allowed within a JOIN clause. The DefaultProfile allows all given signatures, while the HiveProfile only allows for the Equals and And functions, as well as any function that operates over only one column. The reason for these restrictions is that Hive only allows equality joins, does not allow OR operators in the join clause, and has some restrictions on functions that operate over multiple different tables. This last restriction is somewhat subtle; if one side of the equals operator contains a function that operates over two different tables, the other side of the operator cannot contain either of those tables. While it is possible to have functions that take in multiple input parameters, the inputs must be taken from specific tables to prevent Hive from throwing a compile time exception. Adding support for this in qgen code will require significant effort and modification to some core methods (_create_relational_join_condition and _populate_func_with_vals), so it's best to disable these for Hive altogether. Note that the _create_boolean_func_tree still allows for OR operators due to some logic around its "and_or_fill_ratio" variable. The plan is to fix this in a future patch that specifically focuses on removing OR operators from Hive JOIN clauses. Minor change to discrepancy_searcher so that the logs print out "Hive" instead of "Impala" when running against Hive. Testing: * Added a new unit test that ensures the HiveProfile only returns equality joins * Unit tests pass * Tested against Hive locally * Tested against Impala via Leopard * Tested against Impala via the Discrepancy Checker Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 --- M tests/comparison/discrepancy_searcher.py M tests/comparison/query_generator.py M tests/comparison/query_profile.py A tests/comparison/tests/hive/test_hive_create_relational_join_condition.py 4 files changed, 137 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4419/7 -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-3980: qgen: re-enable Hive as a target database
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4011 to look at the new patch set (#11). Change subject: IMPALA-3980: qgen: re-enable Hive as a target database .. IMPALA-3980: qgen: re-enable Hive as a target database Changes: * Added hive cli options back in (removed in commit "Stress test: Various changes") * Modifications so that if --use-hive is specified, a Hive connection is actually created * A few minor bug fixes so that the RQG can be run locally * Modified MiniCluster to use HADOOP_CONF_DIR and HIVE_CONF_DIR rather than a hard-coded file under IMPALA_HOME * Fixed fe/src/test/resources/hive-default.xml so that it is a valid XML file, it was missing a few element terminators that cause an exception in the cluster.py file Testing: * Hive integration tested locally by invoking the data generator via the command: ./data-generator.py \ --db-name=functional \ --use-hive \ --min-row-count=50 \ --max-row-count=100 \ --storage-file-formats textfile \ --use-postgresql \ --postgresql-user stakiar and the discrepancy checker via the command: ./discrepancy-checker.py \ --db-name=functional \ --use-hive \ --use-postgresql \ --postgresql-user stakiar \ --test-db-type HIVE \ --timeout 300 \ --query-count 50 \ --profile hive * The output of the above two commands is essentially the same as the Impala output, however, about 20% of the queries will fail when the discrepancy checker is run * Regression testing done by running Leopard in a local VM running Ubuntu 14.04, and by running the discrepancy checker against Impala while inside an Impala Docker container Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 --- M fe/src/test/resources/hive-default.xml M tests/comparison/cli_options.py M tests/comparison/cluster.py M tests/comparison/data_generator.py M tests/comparison/db_connection.py 5 files changed, 126 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4011/11 -- To view, visit http://gerrit.cloudera.org:8080/4011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb1199b50a5b65c21de7876fb70cc03bda1a9b46 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4419 to look at the new patch set (#8). Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. IMPALA-4101: qgen: Hive join predicates should only contains equality functions Background: Hive only supports equi-joins in its JOIN clause, while Postgres and Impala support more complex functions such as <, <=, >, >=, etc. This change modifies the QueryGenerator._create_relational_join_condition and QueryGenerator._create_boolean_func_tree methods to only construct equality join conditions under certain conditions. The _create_boolean_func_tree method is invoked via QueryGenerator -> create_query -> _create_from_clause -> _create_join_clause -> _create_relational_join_condition -> _create_boolean_func_tree. This method is invoked when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree of functions that would typically be found in any of these clauses. Changes: The parameter "signatures" is added to the method _create_boolean_func_tree, and it lists out all the allowed signatures the function is allowed to use. Previously, this list of signatures was populated by calling _funcs_to_allowed_signatures(FUNCS), and if "signatures" is not specified, then the code defaults back to the results of that method. A new method in the DefaultProfile called get_allowed_join_signatures is introduced and returns a list of function signatures that are allowed within a JOIN clause. The DefaultProfile allows all given signatures, while the HiveProfile only allows for the Equals and And functions, as well as any function that operates over only one column. The reason for these restrictions is that Hive only allows equality joins, does not allow OR operators in the join clause, and has some restrictions on functions that operate over multiple different tables. This last restriction is somewhat subtle; if one side of the equals operator contains a function that operates over two different tables, the other side of the operator cannot contain either of those tables. While it is possible to have functions that take in multiple input parameters, the inputs must be taken from specific tables to prevent Hive from throwing a compile time exception. Adding support for this in qgen code will require significant effort and modification to some core methods (_create_relational_join_condition and _populate_func_with_vals), so it's best to disable these for Hive altogether. Note that the _create_boolean_func_tree still allows for OR operators due to some logic around its "and_or_fill_ratio" variable. The plan is to fix this in a future patch that specifically focuses on removing OR operators from Hive JOIN clauses. Minor change to discrepancy_searcher so that the logs print out "Hive" instead of "Impala" when running against Hive. Testing: * Added a new unit test that ensures the HiveProfile only returns equality joins * Unit tests pass * Tested against Hive locally * Tested against Impala via Leopard * Tested against Impala via the Discrepancy Checker Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 --- M tests/comparison/discrepancy_searcher.py M tests/comparison/query_generator.py M tests/comparison/query_profile.py A tests/comparison/tests/hive/test_hive_create_relational_join_condition.py 4 files changed, 140 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4419/8 -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
stak...@cloudera.com has posted comments on this change. Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. Patch Set 8: (4 comments) http://gerrit.cloudera.org:8080/#/c/4419/7/tests/comparison/query_generator.py File tests/comparison/query_generator.py: Line 1214: self.profile.is_non_equality_join_predicate(sig.func)] > This if statement should be above line 1211 and line 1211 should go into th Done PS7, Line 1216: join > there should a 4 space indentation here Done PS7, Line 1356: allowed_si > I think it would be clearer if this was renamed to allowed_signatures Done Line 1442: signature = self.profile.choose_relational_func_signature(relational_signatures) > I like this. This means that we are not calling the only_use_equality_join_ Yeah, only_use_equality_join_predicates() is never called inside _create_boolean_func_tree() which I think makes more sense since _create_boolean_func_tree() should be agnostic to whether or not it is being used to create a join predicate. -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4142: qgen: Hive does not support CTEs inside sub-query blocks
stak...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/4534 Change subject: IMPALA-4142: qgen: Hive does not support CTEs inside sub-query blocks .. IMPALA-4142: qgen: Hive does not support CTEs inside sub-query blocks Changes: Hive does not support WITH clauses inside sub-query blocks, the DefaultProfile actually already has an option to disabled this called "use_nested_with" and the HiveProfile returns False for this method. However, there is a bug in the usage of "use_nested_with"; it is not checked everytime a sub-query is created. Specifically, it is not checked when an in-line view is created, or when the query within a WITH clause is created (so it's possible to create WITH ... AS (WITH ...)). This patch fixes both of these issues since Hive throws a SQL compile exception for either case. Testing: * Added new unit tests * All unit tests pass * Tested locally against Hive * Tested against Impala via Leopard * Tested against Impala via the discrepancy checker Change-Id: Ie585bbbd88a32f502457a4ca9f2f54c7e1ade354 --- M tests/comparison/query_generator.py A tests/comparison/tests/test_use_nested_with.py 2 files changed, 72 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/4534/1 -- To view, visit http://gerrit.cloudera.org:8080/4534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie585bbbd88a32f502457a4ca9f2f54c7e1ade354 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-4101: qgen: Hive join predicates should only contains equality functions
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4419 to look at the new patch set (#9). Change subject: IMPALA-4101: qgen: Hive join predicates should only contains equality functions .. IMPALA-4101: qgen: Hive join predicates should only contains equality functions Background: Hive only supports equi-joins in its JOIN clause, while Postgres and Impala support more complex functions such as <, <=, >, >=, etc. This change modifies the QueryGenerator._create_relational_join_condition and QueryGenerator._create_boolean_func_tree methods to only construct equality join conditions under certain conditions. The _create_boolean_func_tree method is invoked via QueryGenerator -> create_query -> _create_from_clause -> _create_join_clause -> _create_relational_join_condition -> _create_boolean_func_tree. This method is invoked when constructing the JOIN, WHERE, and HAVING clauses. It creates a tree of functions that would typically be found in any of these clauses. Changes: The parameter "signatures" is added to the method _create_boolean_func_tree, and it lists out all the allowed signatures the function is allowed to use. Previously, this list of signatures was populated by calling _funcs_to_allowed_signatures(FUNCS), and if "signatures" is not specified, then the code defaults back to the results of that method. A new method in the DefaultProfile called get_allowed_join_signatures is introduced and returns a list of function signatures that are allowed within a JOIN clause. The DefaultProfile allows all given signatures, while the HiveProfile only allows for the Equals and And functions, as well as any function that operates over only one column. The reason for these restrictions is that Hive only allows equality joins, does not allow OR operators in the join clause, and has some restrictions on functions that operate over multiple different tables. This last restriction is somewhat subtle; if one side of the equals operator contains a function that operates over two different tables, the other side of the operator cannot contain either of those tables. While it is possible to have functions that take in multiple input parameters, the inputs must be taken from specific tables to prevent Hive from throwing a compile time exception. Adding support for this in qgen code will require significant effort and modification to some core methods (_create_relational_join_condition and _populate_func_with_vals), so it's best to disable these for Hive altogether. Note that the _create_boolean_func_tree still allows for OR operators due to some logic around its "and_or_fill_ratio" variable. The plan is to fix this in a future patch that specifically focuses on removing OR operators from Hive JOIN clauses. Minor change to discrepancy_searcher so that the logs print out "Hive" instead of "Impala" when running against Hive. Testing: * Added a new unit test that ensures the HiveProfile only returns equality joins * Unit tests pass * Tested against Hive locally * Tested against Impala via Leopard * Tested against Impala via the Discrepancy Checker Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 --- M tests/comparison/discrepancy_searcher.py M tests/comparison/query_generator.py M tests/comparison/query_profile.py A tests/comparison/tests/hive/test_hive_create_relational_join_condition.py 4 files changed, 139 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4419/9 -- To view, visit http://gerrit.cloudera.org:8080/4419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe8832a03cfa0d7ecc293ec6db6db2bcb34ab459 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: stak...@cloudera.com
[Impala-ASF-CR] IMPALA-4142: qgen: Hive does not support CTEs inside sub-query blocks
stak...@cloudera.com has uploaded a new patch set (#2). Change subject: IMPALA-4142: qgen: Hive does not support CTEs inside sub-query blocks .. IMPALA-4142: qgen: Hive does not support CTEs inside sub-query blocks Changes: Hive does not support WITH clauses inside sub-query blocks, the DefaultProfile actually already has an option to disable this called "use_nested_with" and the HiveProfile returns False for this method. However, there is a bug in the usage of "use_nested_with"; it is not checked every time a sub-query is created. Specifically, it is not checked when an in-line view is created, or when the query within a WITH clause is created (so it's possible to create WITH ... AS (WITH ...)). This patch fixes both of these issues since Hive throws a SQL compile exception for either case. Testing: * Added new unit tests * All unit tests pass * Tested locally against Hive * Tested against Impala via Leopard * Tested against Impala via the discrepancy checker Change-Id: Ie585bbbd88a32f502457a4ca9f2f54c7e1ade354 --- M tests/comparison/query_generator.py A tests/comparison/tests/test_use_nested_with.py 2 files changed, 72 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/4534/2 -- To view, visit http://gerrit.cloudera.org:8080/4534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie585bbbd88a32f502457a4ca9f2f54c7e1ade354 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: stak...@cloudera.com Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4042: count(distinct NULL) fails on a view
Anonymous Coward #216 has posted comments on this change. Change subject: IMPALA-4042: count(distinct NULL) fails on a view .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4600/2/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: Line 734: if (smap == null || this instanceof NullLiteral) return result; > This invalidates the comment above that when smap is null, this function is I thought that NullLiteral would not require substitution and, hence, simply clone/return for NullLiteral. Could you explain when NullLiteral requires substitution, for me to have better understanding on the code? In the meantime, let me follow source code to figure out where I can leverage preserveRootType. -- To view, visit http://gerrit.cloudera.org:8080/4600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Anonymous Coward #216 Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4042: count(distinct NULL) fails on a view
Anonymous Coward #216 has posted comments on this change. Change subject: IMPALA-4042: count(distinct NULL) fails on a view .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4600/2/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: Line 734: if (smap == null || this instanceof NullLiteral) return result; > I don't think it's particularly relevant whether we need to substitute a Nu Agreed w/ the design for Expr. I am updating the fix to use preserveRootType. Thank you for the clarification. :) -- To view, visit http://gerrit.cloudera.org:8080/4600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Anonymous Coward #216 Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4431: Add audit event log control mechanism to prevent disk overflow.
davy...@163.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/4971 Change subject: IMPALA-4431: Add audit event log control mechanism to prevent disk overflow. .. IMPALA-4431: Add audit event log control mechanism to prevent disk overflow. There is no limit to the number of audit event log files. When audit event log is enabled, the growing number of log files will spill the disk space. This patch adds checking and rotation mechanism on audit event log files to prevent file number out of control which causes disk overflow. Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368 --- M be/src/common/init.cc M be/src/common/logging.cc M be/src/common/logging.h 3 files changed, 27 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/4971/1 -- To view, visit http://gerrit.cloudera.org:8080/4971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: davy...@163.com
[Impala-ASF-CR] IMPALA-4431: Add audit event log control mechanism to prevent disk overflow.
davy...@163.com has uploaded a new patch set (#2). Change subject: IMPALA-4431: Add audit event log control mechanism to prevent disk overflow. .. IMPALA-4431: Add audit event log control mechanism to prevent disk overflow. There is no limit to the number of audit event log files. When audit event log is enabled,the growing number of log files will fill up the disk space. This patch adds checking and rotation mechanism on audit event log files to prevent file number out of control which causes disk overflow. Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368 --- M be/src/common/init.cc M be/src/common/logging.cc M be/src/common/logging.h 3 files changed, 28 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/4971/2 -- To view, visit http://gerrit.cloudera.org:8080/4971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: davy...@163.com Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join.
Anonymous Coward #27 has posted comments on this change. Change subject: IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join. .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/4960/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: PS1, Line 1426: hasOjTest Where is this function used? http://gerrit.cloudera.org:8080/#/c/4960/1/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: PS1, Line 851: public static void removeDuplicates(List l) Suggest to refactor this function a bit like below (if possible): public static List removeDuplicates(List l) { if (l == null) { return l; } else { return Lists.newArrayList(Sets.newHashSet(l)); } } This makes this function thread-safe. The passed in list won't be modified, and thus can be safely shared among threads. PS1, Line 853: List origList = Lists.newArrayList(l); : l.clear(); : for (C expr: origList) if (!l.contains(expr)) l.add(expr); Would it be better to use a Set instead of List? Set s = Sets.newHashSet(l); l.clear(); l.addAll(s); -- To view, visit http://gerrit.cloudera.org:8080/4960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I774d13a13ad1e8fe82512df98dc29983bdd232eb Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Anonymous Coward #27 Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join.
Anonymous Coward #27 has posted comments on this change. Change subject: IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4960/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: PS1, Line 1558: evalByJoin This variable name is a bit confusing to me. Is "evalAfterJoin" better? -- To view, visit http://gerrit.cloudera.org:8080/4960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I774d13a13ad1e8fe82512df98dc29983bdd232eb Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Anonymous Coward #27 Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3398: Add docs to main Impala branch.
Anonymous Coward #250 has posted comments on this change. Change subject: IMPALA-3398: Add docs to main Impala branch. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5014/1/docs/generatingImpalaDoc.md File docs/generatingImpalaDoc.md: PS1, Line 52: : #Setting JAVA\_HOME : : : Set your JAVA\_HOME environment variable to tell your computer where to find the Java executable file. For example, to set your JAVA\_HOME environment on Mac OS X when you the the 1.8.0\_101 version of the Java Development Kit (JDK) installed and you are using the Bash version 3.2 shell, perform the following steps: : : 1. Edit your /Users/\/.bash\_profile file and add the following lines to the end of the file: : : #Set JAVA_HOME : JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home : export JAVA_HOME; : :Where jdk1.8.0\_101.jdk is the version of JDK that you have installed. For example, if you have installed jdk1.8.0\_102.jdk, you would use that value instead. : : 2. Test to make sure you have set your JAVA\_HOME correctly: : * Open a terminal window and type: $JAVA\_HOME/bin/java -version : * Press return. If you see something like the following: : java version "1.5.0_16" : Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_16-b06-284) : Java HotSpot (TM) Client VM (build 1.5.0\_16-133, mixed mode, sharing) : : Then you've successfully set your JAVA\_HOME environment variable to the binary stored in /Library/Java/JavaVirtualMachines/jdk1.8.0\_101.jdk/Contents/Home. This section must be reviewed for accuracy of markdown tagging. -- To view, visit http://gerrit.cloudera.org:8080/5014 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8861e99adc446f659a04463ca78c79200669484f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Anonymous Coward #250 Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3973: optional 2nd and 3rd arguments for instr().
Anonymous Coward #268 has posted comments on this change. Change subject: IMPALA-3973: optional 2nd and 3rd arguments for instr(). .. Patch Set 2: Code-Review+1 Thanks for the doc update! -- To view, visit http://gerrit.cloudera.org:8080/5589 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I17268bdb480230938f94559fe1eabe34ac2448b7 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Anonymous Coward #268 Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] DOCS-1757
ambreen.k...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/5594 Change subject: DOCS-1757 .. DOCS-1757 Change-Id: Ibf5f387fa07cf988a99b9ede4066bfd2b27afe16 --- M docs/topics/impala_authorization.xml 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/5594/1 -- To view, visit http://gerrit.cloudera.org:8080/5594 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ibf5f387fa07cf988a99b9ede4066bfd2b27afe16 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: ambreen.k...@cloudera.com
[Impala-ASF-CR] Remove audience="Cloudera" from DITAVal, replace with audience="hidden".
Anonymous Coward #250 has posted comments on this change. Change subject: Remove audience="Cloudera" from DITAVal, replace with audience="hidden". .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0771ccf912d8112194ad52a7fa76b092ea6cff72 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Anonymous Coward #250 Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4036: show create table outputs invalid SQL for partitioned tables with comments
joemcdonn...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/5648 Change subject: IMPALA-4036: show create table outputs invalid SQL for partitioned tables with comments .. IMPALA-4036: show create table outputs invalid SQL for partitioned tables with comments For a table that has both a table comment and a partition specified, "show create table" incorrectly outputs the comment before the partition. This is not the correct order, and it results in an invalid SQL. This transaction fixes the ordering (partition comes before comment) and adds tests for this case. Change-Id: Iccf9488e8b1d28fcaf3a40f935157c9dd792812e Update code style of ToSqlTest.java Change-Id: I29a33cfd142b473997fdc3acfe3f0966bc7ed784 --- M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 3 files changed, 42 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5648/1 -- To view, visit http://gerrit.cloudera.org:8080/5648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I29a33cfd142b473997fdc3acfe3f0966bc7ed784 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: joemcdonn...@cloudera.com
[Impala-ASF-CR] IMPALA-4036: show create table outputs invalid SQL for partitioned table with comment
joemcdonn...@cloudera.com has uploaded a new patch set (#2). Change subject: IMPALA-4036: show create table outputs invalid SQL for partitioned table with comment .. IMPALA-4036: show create table outputs invalid SQL for partitioned table with comment For a table that has both a table comment and a partition specified, "show create table" incorrectly outputs the comment before the partition. This is not the correct order, and it results in an invalid SQL. This transaction fixes the ordering (partition comes before comment) and adds tests for this case. Change-Id: I29a33cfd142b473997fdc3acfe3f0966bc7ed784 --- M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 3 files changed, 42 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5648/2 -- To view, visit http://gerrit.cloudera.org:8080/5648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I29a33cfd142b473997fdc3acfe3f0966bc7ed784 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: joemcdonn...@cloudera.com Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-4739: ExprRewriter fails on HAVING clauses
zams...@cloudera.com has posted comments on this change. Change subject: IMPALA-4739: ExprRewriter fails on HAVING clauses .. Patch Set 6: (2 comments) I realize this is already merged - simply learning how to use gerrit and I actually did have two comments. http://gerrit.cloudera.org:8080/#/c/5662/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: PS6, Line 1065: ; Nit: spurious semi-colon http://gerrit.cloudera.org:8080/#/c/5662/6/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test: PS6, Line 67: sum(2 + id) > 1, sum(2 + id) >= 5 Looks like further opportunities for redundant subexpression elimination remain, although this one is a bit more difficult. Is there already a JIRA to track this? -- To view, visit http://gerrit.cloudera.org:8080/5662 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ife74c61f549f620c42f74928f6474e8a5a7b7f00 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: zams...@cloudera.com Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3977: TransmitData() should not block
Anonymous Coward #168 has posted comments on this change. Change subject: IMPALA-3977: TransmitData() should not block .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/5491/4/be/src/runtime/data-stream-mgr.cc File be/src/runtime/data-stream-mgr.cc: Line 113: return Status(TErrorCode::DATASTREAM_RECEIVER_EAGAIN, PrintId(fragment_instance_id)); did you consider this the case: ReachLimit() then Recvr deregister . then some sender first rpc coming . -- To view, visit http://gerrit.cloudera.org:8080/5491 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I47f9d6c1de95a72ab73471dcd2a3118ef13e7db0 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Anonymous Coward #168 Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5511: Add process start time to debug web page
gaborkas...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/7363 Change subject: IMPALA-5511: Add process start time to debug web page .. IMPALA-5511: Add process start time to debug web page Read the start date and time of the impalad, catalogd and statestored processes for the Debug Web UI. Uses the stat command on the /proc/ directory and format the date with the date command to local time format. Change-Id: I05ae2f80835b1b0e4bc3b38731778ba0871338a4 --- M be/src/util/proc-info-test.cc M be/src/util/process-state-info.cc M be/src/util/process-state-info.h 3 files changed, 27 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/7363/1 -- To view, visit http://gerrit.cloudera.org:8080/7363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I05ae2f80835b1b0e4bc3b38731778ba0871338a4 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: gaborkas...@cloudera.com
[Impala-ASF-CR] IMPALA-2782: Allow impala-shell to connect directly to impalad when configured with load balancer and kerberos.
a...@phdata.io has posted comments on this change. Change subject: IMPALA-2782: Allow impala-shell to connect directly to impalad when configured with load balancer and kerberos. .. Patch Set 2: I think we should use impalad[1] for the port still since then you'll still get the default 21000 if you specify nothing and its less you would have to enter for the -b flag. I also question whether this should be called something like kerberos_host_name so its more inline with kerberos_service_name change. host, port = self.lb.encode('ascii', 'ignore'), int(self.impalad[1]) -- To view, visit http://gerrit.cloudera.org:8080/7241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4726226a7a3817421b133f74dd4f4cf8c52135f9 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Vincent Tran Gerrit-Reviewer: Vincent Tran Gerrit-Reviewer: a...@phdata.io Gerrit-HasComments: No
[Impala-ASF-CR] CDH-56987: Update sentry version to 2.0.0-cdh6.x-SNAPSHOT in impala
kkal...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/7507 Change subject: CDH-56987: Update sentry version to 2.0.0-cdh6.x-SNAPSHOT in impala .. CDH-56987: Update sentry version to 2.0.0-cdh6.x-SNAPSHOT in impala Conflicts: bin/impala-config-branch.sh Change-Id: I666809a37cdaca28dc754c9b2ff7830b3ff475b7 --- M bin/impala-config-branch.sh 1 file changed, 2 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/7507/2 -- To view, visit http://gerrit.cloudera.org:8080/7507 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I666809a37cdaca28dc754c9b2ff7830b3ff475b7 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: kkal...@cloudera.com
[Impala-ASF-CR] CDH-56988: Update the sentry dependencies to reflect the new version of sentry
kkal...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/7508 Change subject: CDH-56988: Update the sentry dependencies to reflect the new version of sentry .. CDH-56988: Update the sentry dependencies to reflect the new version of sentry Change-Id: Ib201b5cb0aed10fd0449c2075b8022a752525cad --- M fe/pom.xml 1 file changed, 2 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/7508/1 -- To view, visit http://gerrit.cloudera.org:8080/7508 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib201b5cb0aed10fd0449c2075b8022a752525cad Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: kkal...@cloudera.com
[Impala-ASF-CR] CDH-56990: Fix impala code as sentry-policy-db package is removed in sentry 2.0.0-cdh6.x-SNAPSHOT
kkal...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/7509 Change subject: CDH-56990: Fix impala code as sentry-policy-db package is removed in sentry 2.0.0-cdh6.x-SNAPSHOT .. CDH-56990: Fix impala code as sentry-policy-db package is removed in sentry 2.0.0-cdh6.x-SNAPSHOT Change-Id: If0807832e410731b16b63d17beffbc56f0c6584d --- M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java 2 files changed, 5 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/7509/1 -- To view, visit http://gerrit.cloudera.org:8080/7509 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If0807832e410731b16b63d17beffbc56f0c6584d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: kkal...@cloudera.com
[Impala-ASF-CR] CDH-56991: Implement new API's in AuthorizationPolicy
kkal...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/7510 Change subject: CDH-56991: Implement new API's in AuthorizationPolicy .. CDH-56991: Implement new API's in AuthorizationPolicy Change-Id: Idce534216efc48ca0b9374ff358de95ff8e173d6 --- M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java 1 file changed, 22 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/7510/1 -- To view, visit http://gerrit.cloudera.org:8080/7510 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Idce534216efc48ca0b9374ff358de95ff8e173d6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: kkal...@cloudera.com
[Impala-ASF-CR] Problem: Improve error message when subquery is used in the ON clause
psi...@cloudera.com has uploaded a new change for review. http://gerrit.cloudera.org:8080/7588 Change subject: Problem: Improve error message when subquery is used in the ON clause .. Problem: Improve error message when subquery is used in the ON clause Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 Fix: Print the error stating that "Suquery not allowed in the ON clause" --- M fe/src/main/java/org/apache/impala/analysis/TableRef.java 1 file changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7588/1 -- To view, visit http://gerrit.cloudera.org:8080/7588 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: psi...@cloudera.com
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Anonymous Coward #345 has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 ) Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. Patch Set 6: (1 comment) link error while not include this inline head file. http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc File be/src/util/min-max-filter.cc: http://gerrit.cloudera.org:8080/#/c/7793/6/be/src/util/min-max-filter.cc@24 PS6, Line 24: " #include "runtime/timestamp-value.inline.h" -- To view, visit http://gerrit.cloudera.org:8080/7793 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c Gerrit-Change-Number: 7793 Gerrit-PatchSet: 6 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Anonymous Coward #345 Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 16 Oct 2017 07:37:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5362 : Preserve case-sensitivity in field titles
ydjainopensou...@gmail.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8544 Change subject: IMPALA-5362 : Preserve case-sensitivity in field titles .. IMPALA-5362 : Preserve case-sensitivity in field titles To preserve case sensitivity in column labels , this patch modifies the getColumnLabel() method of the SelectListitem class to return original string. However since internally all identifiers are represented in lowercase a few minor changes have been made to SelectStmt.java.lowercase "null" was causing union test to fail hence replaced it with "NULL" Change-Id: Ia574f0b7d5bdbebace270ce4079632bf29b3f00e --- M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M testdata/workloads/functional-planner/queries/PlannerTest/union.test 3 files changed, 16 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/8544/1 -- To view, visit http://gerrit.cloudera.org:8080/8544 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia574f0b7d5bdbebace270ce4079632bf29b3f00e Gerrit-Change-Number: 8544 Gerrit-PatchSet: 1 Gerrit-Owner: ydjainopensou...@gmail.com
[Impala-ASF-CR] IMPALA-5362 : Preserve case-sensitivity in field titles
ydjainopensou...@gmail.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8555 Change subject: IMPALA-5362 : Preserve case-sensitivity in field titles .. IMPALA-5362 : Preserve case-sensitivity in field titles Fixed labeling. Change-Id: I146b8d72e7a8f4b6d181dbd0bfe450c7451e36e4 --- M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java M tests/shell/test_shell_interactive.py 2 files changed, 10 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/8555/1 -- To view, visit http://gerrit.cloudera.org:8080/8555 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I146b8d72e7a8f4b6d181dbd0bfe450c7451e36e4 Gerrit-Change-Number: 8555 Gerrit-PatchSet: 1 Gerrit-Owner: ydjainopensou...@gmail.com