[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Alex Behm has posted comments on this change. Change subject: IMPALA-4467: Add support for DML statements in stress test .. Patch Set 4: (13 comments) Looks pretty good! http://gerrit.cloudera.org:8080/#/c/5093/4//COMMIT_MSG Commit Message: Line 24: following flag: --dml-mod-values 11 13 17. For each mod value 4 DML Seems ok for now, but from a perspective we should consider changing these mod values to %rows of the original table. I think ultimately a user wants to control the "size" of the DML operations and that is more naturally expressed via percent of rows imo. The mod values only seem meaningful if you also know the table size, and applying the same mod values to different tables seems strange. http://gerrit.cloudera.org:8080/#/c/5093/4/tests/stress/concurrent_select.py File tests/stress/concurrent_select.py: Line 368: self._select_probability = select_probability check that this is indeed a float between 0 and 1 Line 747: self.modifies_table = False This flag is pretty confusing. What do we use it for? Even in your example, the table is "modified" in the sense that the upsert changes the state of the table, but not the user-visible contents. Also, after we've deleted some rows, an insert/upsert may even change the user-visible contents. If we need this flag we should come up with a more accurate name. Line 1297: return add comment why we need to skip compute stats Line 1428: if table.name + "_original" in set(table.name for table in tables): What does it mean if this is not true? Maybe add a LOG.debug message to explain Line 1429: cursor.execute("SHOW CREATE TABLE " + table.name) Add a note or TODO that this will not work for certain types of Kudu tables, e.g., if they have range partitions. I assume that the Kudu tables used in our testing have simple hash partitioning based on the PK? Sounds like a potential coverage gap. On the other hand, for more complex partition schemes, this SHOW CREATE TABLE trick will not work. Line 1440: For each table in the database that cursor is connected to, create several queries, Mention the limitations of this function: 1. Only generates DML statements against Kudu tables, and ignores non-Kudu tables. 2. Requires that the type of the first column of the primary key is an integer type 3. ... Line 1443: exists a table with a '_original' suffix that has unmodified, for example, tpch data. that has unmodified -> that is never modified Line 1450: continue LOG.debug what is happening here Line 1452: if primary_key.exact_type not in (TinyInt, SmallInt, BigInt): what about Int? Line 1456: continue LOG.debug what is happening here because some of these limitations are not really obvious Line 1481: "UPDATE a SET {update_list} FROM {table_name} a JOIN {table_name}_original b " This might give you strange results for tables with multiple primary keys. Maybe we should explicitly limit this functionality to tables with a single primary-key column? Seems like a fine limitation for now, but better to be explicit about what works as expected, and what probably does not. Line 1559: cursor.execute("SHOW CREATE TABLE " + table_name) same deal with SHOW CREATE TABLE as above, add a note or TODO that this process will not work for some Kudu tables -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4639: Add pytest option and skip markers for tests that only run locally.
Alex Behm has posted comments on this change. Change subject: IMPALA-4639: Add pytest option and skip markers for tests that only run locally. .. Patch Set 2: (4 comments) Is the motivation of this change to get Jenkins jobs run against a remote cluster green? Please briefly summarize the motivation of this change. http://gerrit.cloudera.org:8080/#/c/5446/2/tests/common/skip.py File tests/common/skip.py: Line 122: mini_cluster_only = pytest.mark.skipif(pytest.config.option.testing_remote_cluster, Why skip and not xfail? With xfail we have the option of still running the tests, e.g., to detect crashes. http://gerrit.cloudera.org:8080/#/c/5446/2/tests/conftest.py File tests/conftest.py: Line 108: parser.addoption("--testing_remote_cluster", action="store_true", default=False, I'm ok with adding this flag, but I think we should enforce its semantics. Currently, we can set this flag regardless of whether we are testing on a remote cluster or not which seems potentially error-prone. Line 109:help="Indicates that tests are being run against a remote cluster.") Mention what effect this flag has. It will cause tests to be skipped. http://gerrit.cloudera.org:8080/#/c/5446/2/tests/query_test/test_mt_dop.py File tests/query_test/test_mt_dop.py: Line 95: @SkipIfRemoteTesting.mini_cluster_only Does test_nested_types.py or test_tpch_nested_queries.py succeed on a remote cluster? I'm surprised that only this test would be skipped of loading tpch_nested is not working. -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4638: Run queries with MT_DOP through admission control. .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4638: Run queries with MT_DOP through admission control. .. IMPALA-4638: Run queries with MT_DOP through admission control. The bug was a simple oversight: A TODO that was not addressed. Testing: I examined query profiles before and after this change with and without MT_DOP set. After this patch all queries (with and without MT_DOP) go through admission control. Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a Reviewed-on: http://gerrit.cloudera.org:8080/5447 Reviewed-by: Alex Behm Tested-by: Internal Jenkins --- M be/src/scheduling/simple-scheduler.cc 1 file changed, 1 insertion(+), 3 deletions(-) Approvals: Internal Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4611: Checking perms on S3 files is a very expensive no-op
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4611: Checking perms on S3 files is a very expensive no-op .. Patch Set 1: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/5449/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: Line 797:* from that. Expand the comment to mention that for S3 it always returns READ/WRITE access. PS1, Line 802: getPermissions nit: when referencing functions use parentheses (getPermissions()). PS1, Line 806: TODO: Revisit if the S3A connector is updated to be able to manage S3 object : // permissions. If there is a JIRA for this, can you add it here? -- To view, visit http://gerrit.cloudera.org:8080/5449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If9d1072c185a6162727019cdf1cb34d7f3f1c75c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4639: Add pytest option and skip markers for tests that only run locally.
Lars Volker has posted comments on this change. Change subject: IMPALA-4639: Add pytest option and skip markers for tests that only run locally. .. Patch Set 2: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/5446/2/tests/common/skip.py File tests/common/skip.py: PS2, Line 123: Setup for this t Maybe better "This test is currently..."? -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4639: Add pytest option and skip markers for tests that only run locally.
Michael Brown has posted comments on this change. Change subject: IMPALA-4639: Add pytest option and skip markers for tests that only run locally. .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4639: Add pytest option and skip markers for tests that only run locally.
David Knupp has uploaded a new patch set (#2). Change subject: IMPALA-4639: Add pytest option and skip markers for tests that only run locally. .. IMPALA-4639: Add pytest option and skip markers for tests that only run locally. Add new skip marker to test_mt_dop.TestMtDopParquet.test_parquet_nested Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f --- M tests/common/skip.py M tests/conftest.py M tests/metadata/test_compute_stats.py M tests/query_test/test_mt_dop.py 4 files changed, 13 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/5446/2 -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4639: Add pytest option and skip markers for tests that only run locally.
David Knupp has posted comments on this change. Change subject: IMPALA-4639: Add pytest option and skip markers for tests that only run locally. .. Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/5446/1//COMMIT_MSG Commit Message: PS1, Line 7: Add pytest option and skip markers for tests that only run locally. > Please add Jira ID. Done http://gerrit.cloudera.org:8080/#/c/5446/1/tests/common/skip.py File tests/common/skip.py: PS1, Line 120: class SkipIfRemoteTesting: : """Skip marker for those tests that may not work on remote clusters.""" : local_cluster_only = pytest.mark.skipif(pytest.config.option.skip_local_only_tests, : reason="--skip_local_only_tests specified") > I'm fine with the class name. I'm less enthused on the command line name. W Done PS1, Line 123: --skip_local_only_tests > Could this ever be re-used by some other dev without that command line opti No, if you don't specify the command line, then this won't have any effect. PS1, Line 124: hbase_split > Is this used anywhere? Thanks, forgot to add that file where it gets used. PS1, Line 125: HBase is not split on remote cluster the same as on the local cluster. > This doesn't tell me much, probably because I don't know how splitting work Hbase splitting is usually non-deterministic, but we enforce a particular split on our local tests via a setup shell script, and then conform our tests to expect that split. We don't have a way of doing this on the remote cluster, and it would require a big overhaul to our existing tests to work without it. http://gerrit.cloudera.org:8080/#/c/5446/1/tests/query_test/test_mt_dop.py File tests/query_test/test_mt_dop.py: PS1, Line 95: @SkipIfRemoteTesting.local_cluster_only > From just looking at this it is not clear to me whether this is skipped or This will only be skipped if --testing_remote_cluster is supplied on the command line. PS1, Line 95: @SkipIfRemoteTesting.local_cluster_only > I think also it's confusing when we have the distinction between a local HD Done Line 99: self.run_test_case('QueryTest/mt-dop-parquet-nested', vector) > Why isn't this test expected to succeed on a cluster? It seems like it shou It's a setup problem. We're still working though kinks on getting our entire data load working across the board on remote clusters. Even though the process completes without any errors, here are as yet silent failures. I can add a JIRA and reference it here. -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4638: Run queries with MT_DOP through admission control. .. Patch Set 2: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/640/ -- To view, visit http://gerrit.cloudera.org:8080/5447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4640: Fix number of rows displayed by parquet-reader tool
Lars Volker has uploaded a new change for review. http://gerrit.cloudera.org:8080/5453 Change subject: IMPALA-4640: Fix number of rows displayed by parquet-reader tool .. IMPALA-4640: Fix number of rows displayed by parquet-reader tool The variable just never got updated in the code. This change also adds verification that all columns contain the same number of rows. Change-Id: I281a784a0aa2df4ed1852dfb864587a0c1aa4d9a --- M be/src/util/parquet-reader.cc 1 file changed, 11 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/5453/1 -- To view, visit http://gerrit.cloudera.org:8080/5453 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I281a784a0aa2df4ed1852dfb864587a0c1aa4d9a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] Add pytest option and skip markers for tests that only run locally.
Tim Armstrong has posted comments on this change. Change subject: Add pytest option and skip markers for tests that only run locally. .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/5446/1/tests/query_test/test_mt_dop.py File tests/query_test/test_mt_dop.py: PS1, Line 95: @SkipIfRemoteTesting.local_cluster_only > From just looking at this it is not clear to me whether this is skipped or I think also it's confusing when we have the distinction between a local HDFS cluster and local filesystem. Maybe this could be minicluster_only? Line 99: self.run_test_case('QueryTest/mt-dop-parquet-nested', vector) Why isn't this test expected to succeed on a cluster? It seems like it should work unless there's a product bug. -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4611: Checking perms on S3 files is a very expensive no-op
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-4611: Checking perms on S3 files is a very expensive no-op .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If9d1072c185a6162727019cdf1cb34d7f3f1c75c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Bharath Vissapragada Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables .. Patch Set 11: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idf5b2fdd02938a42fa59ec98884e4ac915dd1f65 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables .. IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables Change-Id: Idf5b2fdd02938a42fa59ec98884e4ac915dd1f65 Reviewed-on: http://gerrit.cloudera.org:8080/5390 Reviewed-by: Lars Volker Reviewed-by: Matthew Jacobs Tested-by: Internal Jenkins --- M common/thrift/Frontend.thrift M fe/src/main/cup/sql-parser.cup D fe/src/main/java/org/apache/impala/analysis/ShowPartitionsStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test M tests/query_test/test_kudu.py 12 files changed, 238 insertions(+), 72 deletions(-) Approvals: Lars Volker: Looks good to me, but someone else must approve Matthew Jacobs: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/5390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Idf5b2fdd02938a42fa59ec98884e4ac915dd1f65 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.
Alex Behm has posted comments on this change. Change subject: IMPALA-4638: Run queries with MT_DOP through admission control. .. Patch Set 2: Code-Review+2 rebase, carry +1, I talked to Dan and he said I should go ahead and GVO -- To view, visit http://gerrit.cloudera.org:8080/5447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4638: Run queries with MT_DOP through admission control. .. Patch Set 1: Code-Review+1 I don't think I know the code well enough to fully understand the consequences, but it makes sense. -- To view, visit http://gerrit.cloudera.org:8080/5447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4566: Kudu client glog contention can cause timeouts
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4566: Kudu client glog contention can cause timeouts .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5334 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie535d89ec2525232d4f6a29dd44f51cd6e18a0d2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4566: Kudu client glog contention can cause timeouts
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4566: Kudu client glog contention can cause timeouts .. IMPALA-4566: Kudu client glog contention can cause timeouts Under stressful workloads, there appears to be significant contention in glog resulting from Kudu logging, causing timeouts and failed queries. An easy solution for now is to downgrade Kudu WARNINGs to INFOs as WARNINGs grab a lock to flush the log. This is appropriate as Kudu logs WARNINGs much more frequently than Impala and for things that Impala would normally consider INFO-level. Testing: Manually verified that the patch redirects Kudu WARNINGs to the INFO log. Not tested under stress to verify if this actually solves the contention problem. Change-Id: Ie535d89ec2525232d4f6a29dd44f51cd6e18a0d2 Reviewed-on: http://gerrit.cloudera.org:8080/5334 Reviewed-by: Matthew Jacobs Tested-by: Internal Jenkins --- M be/src/exec/kudu-util.cc 1 file changed, 4 insertions(+), 2 deletions(-) Approvals: Matthew Jacobs: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/5334 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ie535d89ec2525232d4f6a29dd44f51cd6e18a0d2 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4638: Run queries with MT_DOP through admission control. .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Michael Brown has posted comments on this change. Change subject: IMPALA-4467: Add support for DML statements in stress test .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Taras Bobrovytsky has uploaded a new patch set (#4). Change subject: IMPALA-4467: Add support for DML statements in stress test .. IMPALA-4467: Add support for DML statements in stress test - Add support for insert, upsert, update and and delete statements. - Add support for compute stats with mt_dop query options. - Update impyla version in order to be able to have access to query error text for DML queries. - Made flake8 fixes. flake8 on this file is clean. For every Kudu table in the databases, we make a copy and add a '_original' suffix to the table name. The DML queries will only make modifications to the non original table, the original table will never be modified. The orignal tables could be used to bring the non-original table to the inital state. Two flags were added for doing this: --reset-databases-before-binary-search and --reset-databases-after-binary-search. The DML queries are generated based on the mod values passed in with the following flag: --dml-mod-values 11 13 17. For each mod value 4 DML queries are generated. The DML operations will touch table rows where primary_key % mod_value = 0. So, the larger the mod value, the more rows would be affected. The DML queries are generated in such a way that the data for the insert, upsert, and update queries is taken from the table with the _original suffix. The stress test generates DML queries for only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu --tpch-db=tpch_100 --generate-dml-queries would only generate queries for the tpch_100_kudu database. Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 --- M infra/python/deps/requirements.txt M tests/stress/concurrent_select.py M tests/util/parse_util.py 3 files changed, 652 insertions(+), 262 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/4 -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4467: Add support for DML statements in stress test .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/5093/2/tests/stress/concurrent_select.py File tests/stress/concurrent_select.py: PS2, Line 1464: insert_query.modifies_table = False > It's possible updatale_column_names is a bad name. In any case, it currentl I think it would be more clear to leave it as is. http://gerrit.cloudera.org:8080/#/c/5093/3/tests/stress/concurrent_select.py File tests/stress/concurrent_select.py: PS3, Line 1461: this will still be still > Some extra words here. Done PS3, Line 1657: quereis > spelling: queries Done -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5093 to look at the new patch set (#4). Change subject: IMPALA-4467: Add support for DML statements in stress test .. IMPALA-4467: Add support for DML statements in stress test - Add support for insert, upsert, update and and delete statements. - Add support for compute stats with mt_dop query options. - Update impyla version in order to be able to have access to query error text for DML queries. - Made flake8 fixes. flake8 on this file is clean. For every Kudu table in the databases, we make a copy and add a '_original' suffix to the table name. The DML queries will only make modifications to the non original table, the original table will never be modified. The orignal tables could be used to bring the non-original table to the inital state. Two flags were added for doing this: --reset-databases-before-binary-search and --reset-databases-after-binary-search. The DML queries are generated based on the mod values passed in with the following flag: --dml-mod-values 11 13 17. For each mod value 4 DML queries are generated. The DML operations will touch table rows where primary_key % mod_value = 0. So, the larger the mod value, the more rows would be affected. The DML queries are generated in such a way that the data for the insert, upsert, and update queries is taken from the table with the _original suffix. The stress test generates DML queries for only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu --tpch-db=tpch_100 --generate-dml-queries would only generate queries for the tpch_100_kudu database. Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 --- M infra/python/deps/requirements.txt M tests/stress/concurrent_select.py M tests/util/parse_util.py 3 files changed, 652 insertions(+), 262 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/4 -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4611: Checking perms on S3 files is a very expensive no-op
Sailesh Mukil has uploaded a new change for review. http://gerrit.cloudera.org:8080/5449 Change subject: IMPALA-4611: Checking perms on S3 files is a very expensive no-op .. IMPALA-4611: Checking perms on S3 files is a very expensive no-op We call getPermissions() on partition directories to find out if Impala has access to those files. On S3, this currently is a no-op as the S3A connector does not try to set/get the permissions for S3 objects. So, it always returns the default set of permissions -> 777. However, it still makes a roundtrip to S3 causing a slow down in the Catalog. We can return the READ_WRITE permission immediately if we know we are accessing an S3 file, thereby avoiding the round trip to S3 for every partition. This will greatly speedup metadata operations for S3 tables and partitions, which is already known to be a big bottleneck. If and when the S3A connector is able to manage permissions in the future, we need to revisit this code. However, as permissions on S3 are unsupported by Impala right now, we might as well gain on perf. Change-Id: If9d1072c185a6162727019cdf1cb34d7f3f1c75c --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java 1 file changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/5449/1 -- To view, visit http://gerrit.cloudera.org:8080/5449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If9d1072c185a6162727019cdf1cb34d7f3f1c75c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil
[Impala-ASF-CR] Add pytest option and skip markers for tests that only run locally.
Lars Volker has posted comments on this change. Change subject: Add pytest option and skip markers for tests that only run locally. .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/5446/1/tests/common/skip.py File tests/common/skip.py: PS1, Line 123: --skip_local_only_tests Could this ever be re-used by some other dev without that command line option being present? Then the error might be misleading. PS1, Line 124: hbase_split Is this used anywhere? PS1, Line 125: HBase is not split on remote cluster the same as on the local cluster. This doesn't tell me much, probably because I don't know how splitting works local and/or remote. Maybe you can rephrase to explain what exactly is different. http://gerrit.cloudera.org:8080/#/c/5446/1/tests/query_test/test_mt_dop.py File tests/query_test/test_mt_dop.py: PS1, Line 95: @SkipIfRemoteTesting.local_cluster_only >From just looking at this it is not clear to me whether this is skipped or >only runs locally / remotely. -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2605: Omit the sort and mini stress tests
Alex Behm has posted comments on this change. Change subject: IMPALA-2605: Omit the sort and mini stress tests .. Patch Set 2: Code-Review+1 I'm not going to stand in the way of progress, but for the record: - We have no root cause for the hang - We are removing test coverage - We have no plan of restoring the test coverage The fact that this test has not caught many failures is not necessarily an indication that it's useless. By that definition most tests would be useless. Think about what would happen if this became an accepted procedure for getting rid of "inconvenient" test problems. I concede this case might be an exception, just wanted to explain my general reluctance given the facts. I'm much more in favor of deleting code we don't need instead of leaving it and disabling it without understanding what is happening. -- To view, visit http://gerrit.cloudera.org:8080/5401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibd30abf8215415e0f2830b725e43b005daa2bb2d Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Add pytest option and skip markers for tests that only run locally.
Michael Brown has posted comments on this change. Change subject: Add pytest option and skip markers for tests that only run locally. .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/5446/1//COMMIT_MSG Commit Message: PS1, Line 7: Add pytest option and skip markers for tests that only run locally. Please add Jira ID. http://gerrit.cloudera.org:8080/#/c/5446/1/tests/common/skip.py File tests/common/skip.py: PS1, Line 120: class SkipIfRemoteTesting: : """Skip marker for those tests that may not work on remote clusters.""" : local_cluster_only = pytest.mark.skipif(pytest.config.option.skip_local_only_tests, : reason="--skip_local_only_tests specified") I'm fine with the class name. I'm less enthused on the command line name. We already have one notion of "local", which is the local filesystem as opposed to HDFS et al. Now we are introducing another notion of "local". Is there a philosophical argument I'm missing against something like "--remote-cluster-tests" ? That is more precise and clear as to the nature of the testing. -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2057: Better error message for incorrect avro decimal column declaration
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-2057: Better error message for incorrect avro decimal column declaration .. IMPALA-2057: Better error message for incorrect avro decimal column declaration Adding a better error message when logical type is specified at a wrong level or is not not specified in an avro decimal column declaration. Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a Reviewed-on: http://gerrit.cloudera.org:8080/5255 Reviewed-by: Bharath Vissapragada Tested-by: Internal Jenkins --- M fe/src/main/java/org/apache/impala/util/AvroSchemaParser.java 1 file changed, 24 insertions(+), 18 deletions(-) Approvals: Bharath Vissapragada: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/5255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-2057: Better error message for incorrect avro decimal column declaration
Internal Jenkins has posted comments on this change. Change subject: IMPALA-2057: Better error message for incorrect avro decimal column declaration .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] Add pytest option and skip markers for tests that only run locally.
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5446 Change subject: Add pytest option and skip markers for tests that only run locally. .. Add pytest option and skip markers for tests that only run locally. Add new skip marker to test_mt_dop.TestMtDopParquet.test_parquet_nested Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f --- M tests/common/skip.py M tests/conftest.py M tests/query_test/test_mt_dop.py 3 files changed, 11 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/5446/1 -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.
Alex Behm has uploaded a new change for review. http://gerrit.cloudera.org:8080/5447 Change subject: IMPALA-4638: Run queries with MT_DOP through admission control. .. IMPALA-4638: Run queries with MT_DOP through admission control. The bug was a simple oversight: A TODO that was not addressed. Testing: I examined query profiles before and after this change with and without MT_DOP set. After this patch all queries (with and without MT_DOP) go through admission control. Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a --- M be/src/scheduling/simple-scheduler.cc 1 file changed, 1 insertion(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/5447/1 -- To view, visit http://gerrit.cloudera.org:8080/5447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm
[Impala-ASF-CR] IMPALA-3524: Don't process spilled partitions with 0 probe rows
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3524: Don't process spilled partitions with 0 probe rows .. Patch Set 2: (11 comments) http://gerrit.cloudera.org:8080/#/c/5389/2/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: Line 378: RETURN_IF_ERROR(input_partition_->build_partition()->build_rows()->PrepareForRead( We don't even need to do this for many join modes. E.g. for INNER JOIN if the probe partition is empty, we can just discard the partition. I think you actually implemented the optimisation for the harder join types (right outer join, etc) instead of the easier types (inner join, etc). Line 387: runtime_profile_->AddInfoString( The log message might be a bit confusing, since it's Maybe it's not worth exposing in the profile? Or exposing it as a simple counter. Line 649: if (output_unmatched_pos_ != -1) { Can you factor out the two cases into different functions? PS2, Line 651: DCHECK DCHECK_EQ Line 656: if (build_rows->num_rows() == build_rows->rows_returned()) { Can't we skip this check and just look at eos below? BTS::GetNext() has the exact same check to return eos. It's kind of weird that we're doing this and ignoring eos. Line 666: for (RowBatch::Iterator batch_iter(output_unmatched_batch_.get(), output_unmatched_pos_); Can we just save the batch iterator instead of output_unmatched_pos_? Line 671: if (join_op_ == TJoinOp::RIGHT_ANTI_JOIN) { This output row creation is kind of subtle and duplicated - factor out into a helper function? Line 691: DCHECK(output_build_partitions_.empty()); Why is this true? http://gerrit.cloudera.org:8080/#/c/5389/2/be/src/exec/partitioned-hash-join-node.h File be/src/exec/partitioned-hash-join-node.h: PS2, Line 460: output_unmatched_batch_ Normally we single-quote variable names in comments so they stand out a bit more. Many of the surrounding comments don't but we should do it for new ones. Line 464: std::unique_ptr output_unmatched_batch_; We should mention that it's a batch from output_build_partitions_.front() http://gerrit.cloudera.org:8080/#/c/5389/2/testdata/workloads/functional-query/queries/QueryTest/spilling.test File testdata/workloads/functional-query/queries/QueryTest/spilling.test: Line 631: QUERY Can you comment which join modes these are testing? -- To view, visit http://gerrit.cloudera.org:8080/5389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I175b32dd9031e51218b38c37693ac3e31dfab47b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Add a build flag for the undefined behavior sanitizer, aka "ubsan".
Tim Armstrong has posted comments on this change. Change subject: Add a build flag for the undefined behavior sanitizer, aka "ubsan". .. Patch Set 2: If they're relying on twos-complement we could just cast to unsigned for the bit-manipulation, right? -- To view, visit http://gerrit.cloudera.org:8080/5082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I88c7234bd7c5eb7404490a0913d90470c10835e7 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4631: don't use floating point operations for time unit conversions .. Patch Set 5: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/639/ -- To view, visit http://gerrit.cloudera.org:8080/5434 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2605: Omit the sort and mini stress tests
Tim Armstrong has posted comments on this change. Change subject: IMPALA-2605: Omit the sort and mini stress tests .. Patch Set 2: I don't think concurrent_select.py is that well suited without some adaption. You can run something like the below command on the mini-cluster but it takes quite a while to warm up. It would be suitable to run maybe in a post-commit job where we can let it run longer. ./tests/stress/concurrent_select.py --tpch-db tpch_parquet --max-queries=100 -- To view, visit http://gerrit.cloudera.org:8080/5401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibd30abf8215415e0f2830b725e43b005daa2bb2d Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3989: Display skew warning for poorly formatted Parquet files
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-3989: Display skew warning for poorly formatted Parquet files .. Patch Set 4: (9 comments) http://gerrit.cloudera.org:8080/#/c/5400/4/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: Line 75: bool poorly_formatted_file_warned; Presumably you did this to prevent excessive logging, but I think its unnecessary. Removing it won't result in excess logging because error messages are grouped by their error code (eg. in the shell you'll just see one instance of the message followed by '(1 of X similar)', and its better not to clutter up this struct. If you're still worried about excessive logging, you could move the call to LogPoorlyFormattedParquetFileWarning to the end of NextRowGroup outside of the 'while(true)'. In fact, if you remove this, then LogPoorlyFormattedParquetFileWarning isn't really doing much and you could eliminate it and just do the logging inline in NextRowGroup. http://gerrit.cloudera.org:8080/#/c/5400/4/common/thrift/generate_error_codes.py File common/thrift/generate_error_codes.py: PS4, Line 315: Remove spaces, here and below. Sorry, I know we're inconsistent about our Python style, but we try to mostly follow PEP8: https://www.python.org/dev/peps/pep-0008/#indentation http://gerrit.cloudera.org:8080/#/c/5400/4/tests/query_test/test_scanners.py File tests/query_test/test_scanners.py: PS4, Line 319: 'NumScannersWithMisalignedRowGroups' NumScannersWithNoReads PS4, Line 321: """ Move """ to next line. Another place where we're inconsistent with our Python style: https://www.python.org/dev/peps/pep-0008/#documentation-strings PS4, Line 347: num_scanners_misaligned_row_groups num_scanners_with_no_reads PS4, Line 349: excuting executing PS4, Line 351: 'num_scanners_misaligned_row_groups' num_scanners_with_no_reads PS4, Line 352: """ Move to next line. PS4, Line 362: 'NumScannersWithMisalignedRowGroups: NumScannersWithNoReads -- To view, visit http://gerrit.cloudera.org:8080/5400 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4355: random query generator: modify statement execution flow to support DML
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4355: random query generator: modify statement execution flow to support DML .. Patch Set 1: (9 comments) http://gerrit.cloudera.org:8080/#/c/5387/1//COMMIT_MSG Commit Message: PS1, Line 11: which wording: randomly choose a table on which to execute PS1, Line 12: copied to run the DML statement I'm not really following the logic. We randomly choose a table because we need that table copied? Line 36: modes. The first is DML_SETUP: this is a DML statement that needs to be long line. By the way, did you know that there is an automatic way to wrap text in vim? set textwidth=72 for example, http://stackoverflow.com/questions/823754/how-can-i-wrap-text-to-some-length-in-vim http://gerrit.cloudera.org:8080/#/c/5387/1/tests/comparison/discrepancy_searcher.py File tests/comparison/discrepancy_searcher.py: Line 635: # 1. Don't have to spend more time parsing SHOW CREATE TABLE output just to get it This point is not very clear. Why do we need to parse anything, do we already have a table object passed in to this function? Line 637: # 2. Ensure we copy the table-under-test exactly. This includes primary keys, It's not really clear to me either. Isn't the table object that's passed in already exact? Line 695: result = query_result_comparator.compare_query_results(table_copy_statement) So we copy the entire table for every query that we want to execute? Won't this be slow, or are we planning to use small tables for testing? http://gerrit.cloudera.org:8080/#/c/5387/1/tests/comparison/query.py File tests/comparison/query.py: Line 111: self.execution = StatementExecutionMode.SELECT_STATEMENT Maybe better to set it to None by default, like other variables? This way, you won't forget to set the mode when creating the object. http://gerrit.cloudera.org:8080/#/c/5387/1/tests/comparison/statement_generator.py File tests/comparison/statement_generator.py: Line 32: class InsertStatementGenerator(object): Is the plan to also support all DML statements for this release? PS1, Line 53: (_, since you're using the _ variable, you should name it something else. -- To view, visit http://gerrit.cloudera.org:8080/5387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia4c63a2223185d0e056cc5713796772e5d1b8414 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions
Dan Hecht has posted comments on this change. Change subject: IMPALA-4631: don't use floating point operations for time unit conversions .. Patch Set 5: Code-Review+2 Rebase -- To view, visit http://gerrit.cloudera.org:8080/5434 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4014: Introduce query-wide execution state. .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Tim Armstrong has uploaded a new patch set (#13). Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. IMPALA-3200: Implement suballocator for splitting buffers This is useful for situations like hash tables, where we want to make multiple non-spillable allocations of variable size from buffer pool memory and not incur the overhead of interacting with the global buffer pool. The allocator subdivides buffers to service allocations and uses a buddy allocation algorithm to merge freed allocations into larger chunks. This helps avoid external fragmentation and is quite effective at reusing memory given the typical doubling allocation patterns of hash tables in partitioned aggs and joins. Testing: The allocator has fairly robust internal consistency checks via assertions and unique_ptrs. Includes a unit test that exercises various allocation patterns. I tested porting hash tables over to allocate memory using an earlier version of the suballocator, which worked well (was able to run a wide range of queries successfully with good performance). Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 --- M be/src/common/names.h M be/src/exec/hash-table.h M be/src/runtime/bufferpool/CMakeLists.txt A be/src/runtime/bufferpool/suballocator-test.cc A be/src/runtime/bufferpool/suballocator.cc A be/src/runtime/bufferpool/suballocator.h 6 files changed, 933 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/13 -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 13: Rebased onto the change that moves bufferpool/ under runtime/ -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 4: (3 comments) Sorry for the slow turnaround, got caught up in things. http://gerrit.cloudera.org:8080/#/c/4715/4//COMMIT_MSG Commit Message: Line 14: larger chunks. This helps avoid fragmentation and is quite effective > I think there are a few ways to get both what I am concerned about (buddy a Gotcha. I added the static_assert - that is a good idea. I think the Atom idea is nice but I think we end up with the same problem with large buffers, since the buffer pool only supports powers-of-two. http://gerrit.cloudera.org:8080/#/c/4715/11/be/src/bufferpool/suballocator-test.cc File be/src/bufferpool/suballocator-test.cc: Line 107: // Make allocations smaller than the buffer size. > What happened to the const? The const was incorrect - I pushed a patchset without the fix. The problem is that GetUsedReservation() is a non-const method since it acquires the mutex. http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/bufferpool/suballocator.h File be/src/bufferpool/suballocator.h: Line 171: Suballocation* prev_free_; > Just so I'm clear, in the current system, a Suballocation may have several Yep, that's true. I realise that it's really not obvious why the lifetime of 'buddy_' and 'prev_free_' is shorter than the unique_ptr. I documented that and then described the ownership cases in the class header. -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Tim Armstrong has uploaded a new patch set (#12). Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. IMPALA-3200: Implement suballocator for splitting buffers This is useful for situations like hash tables, where we want to make multiple non-spillable allocations of variable size from buffer pool memory and not incur the overhead of interacting with the global buffer pool. The allocator subdivides buffers to service allocations and uses a buddy allocation algorithm to merge freed allocations into larger chunks. This helps avoid external fragmentation and is quite effective at reusing memory given the typical doubling allocation patterns of hash tables in partitioned aggs and joins. Testing: The allocator has fairly robust internal consistency checks via assertions and unique_ptrs. Includes a unit test that exercises various allocation patterns. I tested porting hash tables over to allocate memory using an earlier version of the suballocator, which worked well (was able to run a wide range of queries successfully with good performance). Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 --- M be/src/bufferpool/CMakeLists.txt A be/src/bufferpool/suballocator-test.cc A be/src/bufferpool/suballocator.cc A be/src/bufferpool/suballocator.h M be/src/common/names.h M be/src/exec/hash-table.h 6 files changed, 933 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/12 -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idf5b2fdd02938a42fa59ec98884e4ac915dd1f65 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables
Lars Volker has posted comments on this change. Change subject: IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables .. Patch Set 11: Code-Review+1 Rebased, ran ParserTest, AnalyzeDDLTest, kudu_alter.test and kudu_partition_ddl.test locally. -- To view, visit http://gerrit.cloudera.org:8080/5390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idf5b2fdd02938a42fa59ec98884e4ac915dd1f65 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4631: don't use floating point operations for time unit conversions .. Patch Set 4: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/636/ -- To view, visit http://gerrit.cloudera.org:8080/5434 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] Bump Kudu server version to latest master (a70c905006)
Internal Jenkins has submitted this change and it was merged. Change subject: Bump Kudu server version to latest master (a70c905006) .. Bump Kudu server version to latest master (a70c905006) This also re-enabled kudu_alter.test, which was disabled in IMPALA-4628. Change-Id: Ie5acdeffea7ed9a68ce0f48d1f68c6c922044704 Reviewed-on: http://gerrit.cloudera.org:8080/5427 Reviewed-by: Matthew Jacobs Tested-by: Internal Jenkins --- M bin/impala-config.sh M tests/query_test/test_kudu.py 2 files changed, 2 insertions(+), 3 deletions(-) Approvals: Matthew Jacobs: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/5427 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ie5acdeffea7ed9a68ce0f48d1f68c6c922044704 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] Bump Kudu server version to latest master (a70c905006)
Internal Jenkins has posted comments on this change. Change subject: Bump Kudu server version to latest master (a70c905006) .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5427 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie5acdeffea7ed9a68ce0f48d1f68c6c922044704 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Michael Brown has posted comments on this change. Change subject: IMPALA-4467: Add support for DML statements in stress test .. Patch Set 3: Code-Review+1 (5 comments) http://gerrit.cloudera.org:8080/#/c/5093/2/tests/stress/concurrent_select.py File tests/stress/concurrent_select.py: PS2, Line 1417: > For performance, it's faster to check if an element exists in a set than a Done PS2, Line 1464: insert_query.modifies_table = False > Not sure, is it always going to be the case that a column is updatable if a It's possible updatale_column_names is a bad name. In any case, it currently returns all the columns that are not primary keys, very similar to your expression above. PS2, Line 1466: insert_query.name = "insert_{0}".format(table.name) : insert_query.db_name = cursor.db_name : insert_query.sql = ( > Done That makes a notable difference. Thanks. http://gerrit.cloudera.org:8080/#/c/5093/3/tests/stress/concurrent_select.py File tests/stress/concurrent_select.py: PS3, Line 1461: this will still be still Some extra words here. PS3, Line 1657: quereis spelling: queries -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions
Michael Ho has posted comments on this change. Change subject: IMPALA-4631: don't use floating point operations for time unit conversions .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5434 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4630: remove debug webpage easter egg
Tim Armstrong has submitted this change and it was merged. Change subject: IMPALA-4630: remove debug webpage easter egg .. IMPALA-4630: remove debug webpage easter egg Change-Id: Ic2b1eb876dcec71a56bf76ea5f045818c6cd9a78 Reviewed-on: http://gerrit.cloudera.org:8080/5429 Tested-by: Impala Public Jenkins Reviewed-by: Tim Armstrong --- M www/common-footer.tmpl 1 file changed, 3 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ic2b1eb876dcec71a56bf76ea5f045818c6cd9a78 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Martin Grund Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4630: remove debug webpage easter egg
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4630: remove debug webpage easter egg .. Patch Set 3: Code-Review+2 carry +2 from dan -- To view, visit http://gerrit.cloudera.org:8080/5429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic2b1eb876dcec71a56bf76ea5f045818c6cd9a78 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Martin Grund Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2057: Better error message for incorrect avro decimal column declaration
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-2057: Better error message for incorrect avro decimal column declaration .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5255 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3989: Display skew warning for poorly formatted Parquet files
Attila Jeges has posted comments on this change. Change subject: IMPALA-3989: Display skew warning for poorly formatted Parquet files .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/5400/3/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 175: ADD_COUNTER(scan_node_->runtime_profile(), "NumScannersWithNoReads", TUnit::UNIT); > I think the name here is confusing - it seems like its really counting the Done http://gerrit.cloudera.org:8080/#/c/5400/3/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: Line 868: void HdfsScanNodeBase::LogPoorlyFormattedParquetFileWarning(const std::string& filename) { > 'string&' Done http://gerrit.cloudera.org:8080/#/c/5400/3/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: Line 279: void LogPoorlyFormattedParquetFileWarning(const std::string& filename); > 'string&' instead of '&filename' Done -- To view, visit http://gerrit.cloudera.org:8080/5400 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3989: Display skew warning for poorly formatted Parquet files
Attila Jeges has uploaded a new patch set (#4). Change subject: IMPALA-3989: Display skew warning for poorly formatted Parquet files .. IMPALA-3989: Display skew warning for poorly formatted Parquet files Parquet files are scanned in the granularity of row groups. Each row group belongs to one or more splits and each split is scanned by a separate parquet scanner. If some row groups span multiple splits, then we will most likely end up seeing some scanners having remote reads and some scanners not performing scans at all. This will attribute to skew across the cluster where distribution of scans is uneven. This change adds a counter (NumScannersWithNoReads) to the scan node's runtime profile to track the number of parquet scanners that end up doing no reads becuse of poor formatting. It also displays a warning message when a misaligned row group is found. Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M common/thrift/generate_error_codes.py M tests/query_test/test_scanners.py 6 files changed, 139 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/5400/4 -- To view, visit http://gerrit.cloudera.org:8080/5400 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-3989: Display skew warning for poorly formatted Parquet files
Attila Jeges has uploaded a new patch set (#4). Change subject: IMPALA-3989: Display skew warning for poorly formatted Parquet files .. IMPALA-3989: Display skew warning for poorly formatted Parquet files Parquet files are scanned in the granularity of row groups. Each row group belongs to one or more splits and each split is scanned by a separate parquet scanner. If some row groups span multiple splits, then we will most likely end up seeing some scanners having remote reads and some scanners not performing scans at all. This will attribute to skew across the cluster where distribution of scans is uneven. This change adds a counter (NumScannersWithNoReads) to the scan node's runtime profile to track the number of parquet scanners that end up doing no reads becuse of poor formatting. It also displays a warning message when a misaligned row group is found. Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M common/thrift/generate_error_codes.py M tests/query_test/test_scanners.py 6 files changed, 139 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/5400/4 -- To view, visit http://gerrit.cloudera.org:8080/5400 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4014: Introduce query-wide execution state. .. Patch Set 12: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/632/ -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4014: Introduce query-wide execution state. .. Patch Set 12: Code-Review+2 fixed merge accidents -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Hello Impala Public Jenkins, Internal Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4418 to look at the new patch set (#12). Change subject: IMPALA-4014: Introduce query-wide execution state. .. IMPALA-4014: Introduce query-wide execution state. This introduces a global structure to coordinate execution of fragment instances on a backend for a single query. New classes: - QueryExecMgr: subsumes FragmentMgr - QueryState - FragmentInstanceState: replaces FragmentExecState Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b --- M be/src/benchmarks/expr-benchmark.cc M be/src/exec/catalog-op-executor.cc M be/src/exec/data-source-scan-node.cc M be/src/exec/exchange-node.cc M be/src/exec/hash-table-test.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/union-node.cc M be/src/exprs/expr-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h R be/src/runtime/fragment-instance-state.cc A be/src/runtime/fragment-instance-state.h M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h A be/src/runtime/query-exec-mgr.cc A be/src/runtime/query-exec-mgr.h A be/src/runtime/query-state.cc A be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/thread-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/request-pool-service.cc M be/src/scheduling/simple-scheduler.cc M be/src/service/CMakeLists.txt M be/src/service/fe-support.cc D be/src/service/fragment-exec-state.h D be/src/service/fragment-mgr.cc D be/src/service/fragment-mgr.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc A be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h M be/src/testutil/desc-tbl-builder.h M be/src/udf/udf.cc M be/src/util/container-util.h M be/src/util/thread.h M be/src/util/uid-util.h M common/thrift/ImpalaInternalService.thrift M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M fe/src/test/java/org/apache/impala/testutil/TestUtils.java 67 files changed, 1,178 insertions(+), 782 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/12 -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil