[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8593 ) Change subject: IMPALA-6092: avoid drop/create function interactions in e2e tests .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8593/2/tests/query_test/test_udfs.py File tests/query_test/test_udfs.py: http://gerrit.cloudera.org:8080/#/c/8593/2/tests/query_test/test_udfs.py@422 PS2, Line 422: check_call(["hadoop", "fs", "-put", "-f", src_udf_path, tgt_udf_path]) > it does get left around (as with the other tests). perhaps a better pattern changed the paths so that these tmp jar get cleaned up. a new dir is created for each test's db. I put the jar in that db dir (as discussed) and verified that the dirs + files are removed (useful for dev env where we re-use the state). -- To view, visit http://gerrit.cloudera.org:8080/8593 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 Gerrit-Change-Number: 8593 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 22 Nov 2017 23:18:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1422: support a constant on LHS of IN predicates.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8322 ) Change subject: IMPALA-1422: support a constant on LHS of IN predicates. .. Patch Set 8: (24 comments) http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java: http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@326 PS8, Line 326:*condition using the LHS constant. The rewrite handles group by, > Not clear what "handles" here means. It doesn't really do anything special thanks for the suggestion.. re-worded. http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@327 PS8, Line 327:*analytic functions, limit, and uncorrelated subqueries. Correlated subqueries > Move last sentence into a TODO like this: Done http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@330 PS8, Line 330:*TODO: handle correlated NOT IN case > remove (see above comment) Done http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@340 PS8, Line 340:*So, if C is equal to any x_i, the expression is false. Similarly, if any > Nice description. I'm wondering if we directly translate this insight into yup, simplified. http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@355 PS8, Line 355: Preconditions.checkArgument(rhs.contains(Subquery.class)); > not needed because L356-357 performs the same check Done http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@398 PS8, Line 398: newSubquery.analyze(rhsQuery.getAnalyzer()); > This seems wrong. It might happen to work but pretty sure this will eventua Done http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@714 PS8, Line 714: if (isCorrelatedPredicate(subqueryStmt.getWhereClause(), subqueryTupleIds)) { > containsCorrelatedPredicate()? yes, good catch. further simplified. http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java: http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java@277 PS8, Line 277: // NOT IN subquery with a correlated non-equi predicate is ok if the subquery only > [NOT] IN subquery Done http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java@292 PS8, Line 292: AnalyzesOk(String.format("select 1 from functional.allcomplextypes t " + > What about queries without complex types and non-constant LHS in the IN pre there's several above, for example L220 has a non-constant LHS and the types are not complex. L92 loops over a bunch of scalar types (and non-constant LHS). http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java@299 PS8, Line 299: AnalysisError(String.format("select * from functional.alltypestiny t where id %s " + > These tests are pretty rough to follow, perhaps we should spent some time b makes sense. the cut in this patch is just to get the tests to exercise the various branches more consistently where applicable. http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java@437 PS8, Line 437: // Tests for constant on the left hand side of [NOT] IN. > Since we are doing a double rewrite here, it's interesting to test nested s separated these out. I looked at making constant LHS just another dimension of the existing tests but I think a more comprehensive clean up here is better as a separate change. for now, I've added several more tests. http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java@476 PS8, Line 476: > You can omit this part if you like. Error message matching is prefix based. Done http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java@482 PS8, Line 482: "SELECT int_col FROM functional.alltypestiny b WHERE b.id = a.id ORDER BY int_col ASC LIMIT 5"); > long line, you can omit this line in the expected error message Done http://gerrit.cloudera.org:8080/#/c/8322/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java@520 PS8, Line 520: + "(select * from functional.tinyinttable x where t.id = x.int_col)"); > please move "+" to previous line (easier to read and more consistent with e Done
[Impala-ASF-CR] IMPALA-1422: support a constant on LHS of IN predicates.
Hello Dimitris Tsirogiannis, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8322 to look at the new patch set (#9). Change subject: IMPALA-1422: support a constant on LHS of IN predicates. .. IMPALA-1422: support a constant on LHS of IN predicates. Currently, constant expressions for the LHS of the IN predicate are not supported. This patch adds this support as a rewrite in StmtRewriter (where subqueries are rewritten to joins). Since there is a nested-loop variant of left semijoin, support for IN is handled by not erring out. NOT IN is handled by a rewrite to corresponding NOT EXISTS predicate. Support for NOT IN with a correlated subquery is not included in this change. Re-organized the frontend subquery analysis tests to expand coverage. Testing: - added frontend subquery analysis tests - added e2e tests Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb --- M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test A testdata/workloads/functional-query/queries/QueryTest/subquery-in-constant-lhs.test M tests/query_test/test_queries.py 7 files changed, 903 insertions(+), 203 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/8322/9 -- To view, visit http://gerrit.cloudera.org:8080/8322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb Gerrit-Change-Number: 8322 Gerrit-PatchSet: 9 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-1422: support a constant on LHS of IN predicates.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8322 ) Change subject: IMPALA-1422: support a constant on LHS of IN predicates. .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/8322/9/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java: http://gerrit.cloudera.org:8080/#/c/8322/9/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@368 PS9, Line 368: String wrapperAlias = rhsQuery.getTableAliasGenerator().getNextAlias(); > Please address this alias generation issue as discussed. done. I create generators externally now since I can't create the outer-block before I've created this view. Then I pass them to the outer-block after its created. http://gerrit.cloudera.org:8080/#/c/8322/9/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@401 PS9, Line 401: > remove, no point in having a blank line here Done -- To view, visit http://gerrit.cloudera.org:8080/8322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb Gerrit-Change-Number: 8322 Gerrit-PatchSet: 9 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 01 Dec 2017 06:50:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1422: support a constant on LHS of IN predicates.
Hello Dimitris Tsirogiannis, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8322 to look at the new patch set (#10). Change subject: IMPALA-1422: support a constant on LHS of IN predicates. .. IMPALA-1422: support a constant on LHS of IN predicates. Currently, constant expressions for the LHS of the IN predicate are not supported. This patch adds this support as a rewrite in StmtRewriter (where subqueries are rewritten to joins). Since there is a nested-loop variant of left semijoin, support for IN is handled by not erring out. NOT IN is handled by a rewrite to corresponding NOT EXISTS predicate. Support for NOT IN with a correlated subquery is not included in this change. Re-organized the frontend subquery analysis tests to expand coverage. Testing: - added frontend subquery analysis tests - added e2e tests Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb --- M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test A testdata/workloads/functional-query/queries/QueryTest/subquery-in-constant-lhs.test M tests/query_test/test_queries.py 8 files changed, 921 insertions(+), 203 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/8322/10 -- To view, visit http://gerrit.cloudera.org:8080/8322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb Gerrit-Change-Number: 8322 Gerrit-PatchSet: 10 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-4168: Adopt Oracle-style hint placement for INSERT/UPSERT
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8676 ) Change subject: IMPALA-4168: Adopt Oracle-style hint placement for INSERT/UPSERT .. Patch Set 2: (4 comments) thanks for the updates! couple of comments to round out the testing and improve the test refactor. http://gerrit.cloudera.org:8080/#/c/8676/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8676/2//COMMIT_MSG@7 PS2, Line 7: Adopt The jira says "adopt", but we're really not making this style the default. This change adds an additional hinting style to coexist with the current placement. I'd just say: Adds Oracle-style hint placement for INSERT/UPSERT I'd also add a simple example to illustrate the new placement and update the testing section as needed pending the other comments. http://gerrit.cloudera.org:8080/#/c/8676/2/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: http://gerrit.cloudera.org:8080/#/c/8676/2/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@44 PS2, Line 44: These tests confirm that the oracle style hints are indeed added as expected for inserts/select and upsert. Looking at https://gerrit.sjc.cloudera.com/#/c/4235/, which added additional hinting styles, we should also add the following tests: - to-sql: test/.../analysis/ToSqlTest [this patch will print the hint always in the default location. Look at L916 of main/.../analysis/InsertStmt. It should be an isolated change to modify toSql to print in the user specified location... will need a member to record the location style, an additional constructor arg, and modified copy constructor)] - end-to-end test: testdata/workloads/functional-planner/queries/PlannerTest/insert.test (look at L450) http://gerrit.cloudera.org:8080/#/c/8676/2/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@410 PS2, Line 410: "shuffle" This difference is preventing you from having one loop and factoring lines such as L381-383. After looking at this, I'd go with the other suggestion I had to adjust TestInsertHints to include the loop over hint placement. You can adjust the signature of TestInsertHints to be: TestInsertHints(String pattern, String hint, String... expectedHints) The pattern input can just be the string you'd pass in to String.format (so do the String.format in TestInsertHint). The hint can be what you currently have where the prefix/suffix are varied. http://gerrit.cloudera.org:8080/#/c/8676/2/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@426 PS2, Line 426: ParsesOk pls add a TestUpsertHints method that is similar to TestInsertHints so that it gets equivalent test coverage as TestInsertHints (note that ParsesOk does not verify that hints are actually attached to the statement). -- To view, visit http://gerrit.cloudera.org:8080/8676 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ied7629d70197a0270cdc0853e00cc021fdb4dc20 Gerrit-Change-Number: 8676 Gerrit-PatchSet: 2 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 12 Dec 2017 18:10:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls (wip)
Vuk Ercegovac has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls (wip) .. IMPALA-5931: Generates scan ranges in planner for s3/adls (wip) Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range datastructures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. The "wip" status is currently to add/run more tests. Testing: - local filesystem tests exercise this code path - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: s3 and adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/scheduler.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 5 files changed, 282 insertions(+), 165 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/2 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8775 to look at the new patch set (#6). Change subject: IMPALA-4993: extend dictionary filtering to collections .. IMPALA-4993: extend dictionary filtering to collections Currently, top-level scalar columns in parquet files can be used at runtime to prune row-groups by evaluating certain conjuncts over the column's dictionary (if available). This change extends such pruning to scalar values that are stored in collection type columns. Currently, dictionary pruning works by finding eligible conjuncts for top-level slots. Since only top-level slots are supported, the slots are implicitly part of the scan node's tuple descriptor. With this change, we track eligible conjuncts by slot as well as the tuple that contains the slot (either top-level or nested collection). Since collection conjuncts are already managed by a map that associates tuple descriptors to a list of their conjuncts, this extension follows the existing representation. The frontend builds the mapping ofto conjuncts that are dictionary filterable. The backend is adjusted to use the same representation. In addition, collection readers are decomposed into scalar filterable columns and other, non-dictionary filterable readers. When filtering a row group using a conjunct associated to a (possibly) nested collection type, an additional tuple buffer is allocated per tuple descriptor. Testing: - e2e test extended to illustrate row-groups that are pruned by nested collection dictionary filters. Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/runtime/collection-value-builder.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test 13 files changed, 652 insertions(+), 201 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/8775/6 -- To view, visit http://gerrit.cloudera.org:8080/8775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd Gerrit-Change-Number: 8775 Gerrit-PatchSet: 6 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8775 ) Change subject: IMPALA-4993: extend dictionary filtering to collections .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/8775/5/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8775/5/be/src/exec/hdfs-parquet-scanner.cc@799 PS5, Line 799: back_inserter(dict_filterable_readers_)); > nit: We usually omit the std:: in cc files. Also these (and other places be done. found some other uses so made this file consistent. -- To view, visit http://gerrit.cloudera.org:8080/8775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd Gerrit-Change-Number: 8775 Gerrit-PatchSet: 5 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 14 Dec 2017 01:06:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8676 ) Change subject: IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT .. Patch Set 7: Code-Review+1 lgtm, thanks for the changes! alex, please take a look. -- To view, visit http://gerrit.cloudera.org:8080/8676 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ied7629d70197a0270cdc0853e00cc021fdb4dc20 Gerrit-Change-Number: 8676 Gerrit-PatchSet: 7 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 14 Dec 2017 01:54:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8676 ) Change subject: IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT .. Patch Set 6: (3 comments) several more follow-up comments and we should be good. thx! http://gerrit.cloudera.org:8080/#/c/8676/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8676/5//COMMIT_MSG@12 PS5, Line 12: rebuild omit "query rebuild" http://gerrit.cloudera.org:8080/#/c/8676/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: http://gerrit.cloudera.org:8080/#/c/8676/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@56 PS5, Line 56: AtStatement argh... I was inconsistent with my suggestion and see it here now. I have Start/End vs. AtStatement/AtQuery. I have a slight preference for the latter since its more precise. I'll let you decide that one. But it should be consistent. http://gerrit.cloudera.org:8080/#/c/8676/5/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: http://gerrit.cloudera.org:8080/#/c/8676/5/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@310 PS5, Line 310: actualPlanHints, String... expectedHints) { : List actualHints = Lists.newArrayList(); : for (PlanHint hint: actualPlanHints) actualHints there's some shadowing here (actualHints is a parameter and local variable)-- does it do what you intend? perhaps to make it clearer, call the local one "stringHints". -- To view, visit http://gerrit.cloudera.org:8080/8676 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ied7629d70197a0270cdc0853e00cc021fdb4dc20 Gerrit-Change-Number: 8676 Gerrit-PatchSet: 6 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 14 Dec 2017 01:33:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8775 ) Change subject: IMPALA-4993: extend dictionary filtering to collections .. Patch Set 6: Code-Review+1 Carry +1 from Lars -- To view, visit http://gerrit.cloudera.org:8080/8775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd Gerrit-Change-Number: 8775 Gerrit-PatchSet: 6 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 14 Dec 2017 01:08:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8676 ) Change subject: IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT .. Patch Set 4: (20 comments) thanks for the changes! the refactor for the tests looks great. most follow-up comments are oriented around naming. http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/main/cup/sql-parser.cup@717 PS4, Line 717: Please beware of that the both of Oracle and default hint styles are supported when : // extending INSERT/UPSERT syntax. Replace with: "Note: when extending INSERT/UPSERT syntax, hinting is supported at the beginning of the statement and before the query." http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/main/cup/sql-parser.cup@807 PS4, Line 807: Please beware of that the both of Oracle and default hint styles are supported when : // extending INSERT/UPSERT syntax. make this consistent with the proposal on L717. http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java: http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@86 PS4, Line 86: InsertStmt.HintLocation.DefaultLoc more intuitive to pass in null here (what does DefaultLoc for no hints mean?) http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@56 PS4, Line 56: // A flag to determine the location of a hint This is a good place to describe the alternatives, including examples. Also, I'd use more descriptive names ("oracle" and "default" don't convey much). How about the following: // Determines the location of optional hints. The AtStatment option // is motivated by Oracle's hint placement at the start of the // statement and the AtQuery option places the hint right before // the query (if specified). // // Examples: // Start: INSERT /* my hint */ ... SELECT ... // End: INSERT /* my hint */ SELECT ... public enum HintLocation { Start, End }; http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@210 PS4, Line 210: hintLoc_ = hintLoc; can a caller pass in null? how about we pick the default (current placement) if null? http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@908 PS4, Line 908: hintLoc_ == HintLocation.OracleLoc && !planHints_.isEmpty() I would flip these tests around (test for whether we have hints, then test the hint location). http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@303 PS4, Line 303: BuildQueryStmt more specific: rename to InjectInsertHint http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@310 PS4, Line 310: hints rename to actualHints http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@310 PS4, Line 310: TestHints rename to VerifyHints http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@311 PS4, Line 311: actualHints rename to actualStringHints http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@318 PS4, Line 318: Parses stmt and checks that the insert hints stmt are the expected hints. How about: Injects hints into pattern and checks that the injected hints match the expected hints. http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@320 PS4, Line 320: TestInsertHints rename this to: TestInsertStmtHints... our InsertStmt covers both inserts and upserts. Update the comment to state that its used for both inserts and updates. http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@327 PS4, Line 327: /** :* Parses stmt and checks that the upsert hints stmt are the expected hints. :*/ : private void TestUpsertHints(String pattern, String hint, String... expectedHints) { : TestInsertHints(pattern, hint, expectedHints); : } remove-- not needed. http://gerrit.cloudera.org:8080/#/c/8676/4/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@396 PS4, Line 396:
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8775 ) Change subject: IMPALA-4993: extend dictionary filtering to collections .. Patch Set 3: updating two more tests that core-tests found where additional dictionary predicates are used in plans. -- To view, visit http://gerrit.cloudera.org:8080/8775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd Gerrit-Change-Number: 8775 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 13 Dec 2017 00:19:52 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert "IMPALA-3887: Wait for HDFS replication in data loading"
Vuk Ercegovac has abandoned this change. ( http://gerrit.cloudera.org:8080/8879 ) Change subject: Revert "IMPALA-3887: Wait for HDFS replication in data loading" .. Abandoned going w/ dave's patch -- To view, visit http://gerrit.cloudera.org:8080/8879 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I4318b4fcb94aa67ac55692ae5b961915e038 Gerrit-Change-Number: 8879 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] Revert "IMPALA-3887: Wait for HDFS replication in data loading"
Vuk Ercegovac has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8879 Change subject: Revert "IMPALA-3887: Wait for HDFS replication in data loading" .. Revert "IMPALA-3887: Wait for HDFS replication in data loading" Using fsck on non-hdfs file systems causes errors. This reverts commit 5a7c10ec3d500c1bb582e985cb299e7dbcfe5520. Change-Id: I4318b4fcb94aa67ac55692ae5b961915e038 --- M testdata/bin/create-load-data.sh 1 file changed, 0 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/8879/1 -- To view, visit http://gerrit.cloudera.org:8080/8879 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4318b4fcb94aa67ac55692ae5b961915e038 Gerrit-Change-Number: 8879 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8775 to look at the new patch set (#5). Change subject: IMPALA-4993: extend dictionary filtering to collections .. IMPALA-4993: extend dictionary filtering to collections Currently, top-level scalar columns in parquet files can be used at runtime to prune row-groups by evaluating certain conjuncts over the column's dictionary (if available). This change extends such pruning to scalar values that are stored in collection type columns. Currently, dictionary pruning works by finding eligible conjuncts for top-level slots. Since only top-level slots are supported, the slots are implicitly part of the scan node's tuple descriptor. With this change, we track eligible conjuncts by slot as well as the tuple that contains the slot (either top-level or nested collection). Since collection conjuncts are already managed by a map that associates tuple descriptors to a list of their conjuncts, this extension follows the existing representation. The frontend builds the mapping ofto conjuncts that are dictionary filterable. The backend is adjusted to use the same representation. In addition, collection readers are decomposed into scalar filterable columns and other, non-dictionary filterable readers. When filtering a row group using a conjunct associated to a (possibly) nested collection type, an additional tuple buffer is allocated per tuple descriptor. Testing: - e2e test extended to illustrate row-groups that are pruned by nested collection dictionary filters. Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/runtime/collection-value-builder.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test 13 files changed, 650 insertions(+), 198 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/8775/5 -- To view, visit http://gerrit.cloudera.org:8080/8775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd Gerrit-Change-Number: 8775 Gerrit-PatchSet: 5 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8775 ) Change subject: IMPALA-4993: extend dictionary filtering to collections .. Patch Set 4: (8 comments) http://gerrit.cloudera.org:8080/#/c/8775/4/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/8775/4/be/src/exec/hdfs-scan-node-base.h@90 PS4, Line 90: /// sequence-based file > This looks like a rebase artifact. I often look at diffs between patchsets ok, I think I ran into a conflict so wanted to merge it. http://gerrit.cloudera.org:8080/#/c/8775/4/be/src/exec/hdfs-scanner.cc File be/src/exec/hdfs-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8775/4/be/src/exec/hdfs-scanner.cc@90 PS4, Line 90: std:: > We usually drop the std:: prefix in cc files. Done http://gerrit.cloudera.org:8080/#/c/8775/4/be/src/exec/hdfs-scanner.cc@96 PS4, Line 96: std:: > same here. Done http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@656 PS1, Line 656: // Check to see if the conjunct evaluates to true when the slot is NULL > nit: conjuncts still lacks the _ :) whoops, done. http://gerrit.cloudera.org:8080/#/c/8775/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/8775/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@111 PS4, Line 111: the > Duplicate word Done http://gerrit.cloudera.org:8080/#/c/8775/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1099 PS4, Line 1099: output.append(detailPrefix + "parquet statistics predicates: " + > This one compiles the string differently than the others, probably I did th sure, and found one more place. http://gerrit.cloudera.org:8080/#/c/8775/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1116 PS4, Line 1116: List totalIdxList = Lists.newArrayList(); > This is more of a guess, but can't you pass entry.getValue() to newArrayLis indeed. done. http://gerrit.cloudera.org:8080/#/c/8775/1/testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test File testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test: http://gerrit.cloudera.org:8080/#/c/8775/1/testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test@275 PS1, Line 275: 1 > That makes sense. Should we add a quick comment for the next person to come yes, good point. done. -- To view, visit http://gerrit.cloudera.org:8080/8775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd Gerrit-Change-Number: 8775 Gerrit-PatchSet: 4 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 13 Dec 2017 21:28:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8593 ) Change subject: IMPALA-6092: avoid drop/create function interactions in e2e tests .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/8593/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8593/1//COMMIT_MSG@13 PS1, Line 13: test_udf_invalid_symbol drops a function from that jar, which causes the use > nit: line overflow in multiple places. Done http://gerrit.cloudera.org:8080/#/c/8593/1/tests/query_test/test_udfs.py File tests/query_test/test_udfs.py: http://gerrit.cloudera.org:8080/#/c/8593/1/tests/query_test/test_udfs.py@422 PS1, Line 422: check_call(["hadoop", "fs", "-put", "-f", src_udf_path, tgt_udf_path]) > Just to be sure, this works ok on local fs right? Basically we "put" from l Couple of points. This always copies from a local source and it renames to a unique tgt name in local fs. Its also the way that the create/drop tests are also setup (they're careful not to use the same shared jar across tests). I assume the other tests work, but just in case-- how do I run just the local fs version of the test to verify this? -- To view, visit http://gerrit.cloudera.org:8080/8593 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 Gerrit-Change-Number: 8593 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 20 Nov 2017 20:22:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Hello Lars Volker, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8480 to look at the new patch set (#5). Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning .. IMPALA-4985: use parquet stats of nested types for dynamic pruning Currently, parquet row-groups can be pruned at run-time using min/max stats when predicates (in, binary) are specified for column scalar types. This patch extends pruning to nested types for the same class of predicates. A nested value is an instance of a nested type (struct, array, map). A nested value consists of other nested and scalar values (as declared by its type). Predicates that can be used for row-group pruning must be applied to nested scalar values. In addition, the parent of the nested scalar must also be required, that is, not empty. The latter requirement is conservative: some filters that could be used for pruning are not used for correctness reasons. Testing: - extended nested-types-parquet-stats e2e test cases. Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe --- M be/src/exec/hdfs-parquet-scanner.h M fe/src/main/java/org/apache/impala/analysis/CollectionStructType.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test M tests/query_test/test_nested_types.py 7 files changed, 508 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/8480/5 -- To view, visit http://gerrit.cloudera.org:8080/8480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Gerrit-Change-Number: 8480 Gerrit-PatchSet: 5 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests
Vuk Ercegovac has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8593 Change subject: IMPALA-6092: avoid drop/create function interactions in e2e tests .. IMPALA-6092: avoid drop/create function interactions in e2e tests The e2e unit tests for udfs can interact via the backend lib_cache, causing test flakes. IMPALA-6215 explains a race between the lib_cache and UdfExecutor in the frontend which is the likely the root cause. Two e2e tests use the same jar (test_java_udfs and test_udf_invalid_symbol), test_udf_invalid_symbol drops a function from that jar, which causes the use of that jar to fail in the test_java_udfs test. Since the state of lib_cache is per process, its state causes these interactions across unit tests. This change avoids the interactions by using separate jars for the separate tests. Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 --- M be/src/runtime/lib-cache.h M fe/src/main/java/org/apache/impala/hive/executor/UdfExecutor.java M tests/query_test/test_udfs.py 3 files changed, 12 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/8593/1 -- To view, visit http://gerrit.cloudera.org:8080/8593 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 Gerrit-Change-Number: 8593 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8593 ) Change subject: IMPALA-6092: avoid drop/create function interactions in e2e tests .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8593/1/tests/query_test/test_udfs.py File tests/query_test/test_udfs.py: http://gerrit.cloudera.org:8080/#/c/8593/1/tests/query_test/test_udfs.py@422 PS1, Line 422: check_call(["hadoop", "fs", "-put", "-f", src_udf_path, tgt_udf_path]) > What I meant was that "hadoop fs -put/copyFromLocal" from what I understand thanks for clarifying-- verified that the test_udfs.py tests pass when run against the local fs. -- To view, visit http://gerrit.cloudera.org:8080/8593 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 Gerrit-Change-Number: 8593 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 21 Nov 2017 19:00:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6092: avoid drop/create function interactions in e2e tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8593 ) Change subject: IMPALA-6092: avoid drop/create function interactions in e2e tests .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8593/2/tests/query_test/test_udfs.py File tests/query_test/test_udfs.py: http://gerrit.cloudera.org:8080/#/c/8593/2/tests/query_test/test_udfs.py@422 PS2, Line 422: check_call(["hadoop", "fs", "-put", "-f", src_udf_path, tgt_udf_path]) > if the test fails partway through, will this file get deleted somehow or wi it does get left around (as with the other tests). perhaps a better pattern here is to put these in a dir and use the tester's teardown method to cleanup. not sure how many other places follow the current pattern. -- To view, visit http://gerrit.cloudera.org:8080/8593 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica3538788b1d2ab5e361261e2ade62780b838e65 Gerrit-Change-Number: 8593 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 21 Nov 2017 19:15:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Hello Lars Volker, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8480 to look at the new patch set (#7). Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning .. IMPALA-4985: use parquet stats of nested types for dynamic pruning Currently, parquet row-groups can be pruned at run-time using min/max stats when predicates (in, binary) are specified for column scalar types. This patch extends pruning to nested types for the same class of predicates. A nested value is an instance of a nested type (struct, array, map). A nested value consists of other nested and scalar values (as declared by its type). Predicates that can be used for row-group pruning must be applied to nested scalar values. In addition, the parent of the nested scalar must also be required, that is, not empty. The latter requirement is conservative: some filters that could be used for pruning are not used for correctness reasons. Testing: - extended nested-types-parquet-stats e2e test cases. Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe --- M be/src/exec/hdfs-parquet-scanner.h M fe/src/main/java/org/apache/impala/analysis/CollectionStructType.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test M tests/query_test/test_nested_types.py 8 files changed, 644 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/8480/7 -- To view, visit http://gerrit.cloudera.org:8080/8480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Gerrit-Change-Number: 8480 Gerrit-PatchSet: 7 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Hello Lars Volker, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8480 to look at the new patch set (#6). Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning .. IMPALA-4985: use parquet stats of nested types for dynamic pruning Currently, parquet row-groups can be pruned at run-time using min/max stats when predicates (in, binary) are specified for column scalar types. This patch extends pruning to nested types for the same class of predicates. A nested value is an instance of a nested type (struct, array, map). A nested value consists of other nested and scalar values (as declared by its type). Predicates that can be used for row-group pruning must be applied to nested scalar values. In addition, the parent of the nested scalar must also be required, that is, not empty. The latter requirement is conservative: some filters that could be used for pruning are not used for correctness reasons. Testing: - extended nested-types-parquet-stats e2e test cases. Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe --- M be/src/exec/hdfs-parquet-scanner.h M fe/src/main/java/org/apache/impala/analysis/CollectionStructType.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test M tests/query_test/test_nested_types.py 8 files changed, 644 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/8480/6 -- To view, visit http://gerrit.cloudera.org:8080/8480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Gerrit-Change-Number: 8480 Gerrit-PatchSet: 6 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8480 ) Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/8480/6/fe/src/main/java/org/apache/impala/analysis/SlotRef.java File fe/src/main/java/org/apache/impala/analysis/SlotRef.java: http://gerrit.cloudera.org:8080/#/c/8480/6/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@154 PS6, Line 154:* Checks if slotRef refers to an array "pos" pseudo-column. > Checks if this SlotRef Done http://gerrit.cloudera.org:8080/#/c/8480/6/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@160 PS6, Line 160: public boolean isArrayPosReference() { > isArrayPosRef() Done http://gerrit.cloudera.org:8080/#/c/8480/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/8480/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@541 PS6, Line 541: // It is assumed that analysis adds these guards such that they are correct, but > guards -> filters done. checked that this not used in the rest of this file. http://gerrit.cloudera.org:8080/#/c/8480/5/testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test File testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test: http://gerrit.cloudera.org:8080/#/c/8480/5/testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test@96 PS5, Line 96: PLAN > Got it. Might want to change that, but definitely not in this patch. right, a naive change could cause too many changes/noise. for the point about conflicts, perhaps a plan visitor to extract and test would be useful for checking for simple, local properties like these. that would move all of these tests from this text file to a plain java test. -- To view, visit http://gerrit.cloudera.org:8080/8480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe Gerrit-Change-Number: 8480 Gerrit-PatchSet: 6 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 22 Nov 2017 00:05:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6198: marks a test as debug-only
Vuk Ercegovac has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8573 Change subject: IMPALA-6198: marks a test as debug-only .. IMPALA-6198: marks a test as debug-only The test_catalog_wait test uses flags that are only compiled for debug binaries. This change marks the test as debug-only so that it does not break release tests. Change-Id: I92640b8192545cccea0411c04cc5fcf59fbefed0 --- M tests/custom_cluster/test_catalog_wait.py 1 file changed, 2 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/8573/1 -- To view, visit http://gerrit.cloudera.org:8080/8573 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I92640b8192545cccea0411c04cc5fcf59fbefed0 Gerrit-Change-Number: 8573 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6273: fixes subquery tests for functional hbase
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8765 ) Change subject: IMPALA-6273: fixes subquery tests for functional_hbase .. Patch Set 3: reduced the number of tests that explicitly reference functional so that test coverage is reduced less. there is one case that remains for which I've filed a separate jira that describes the issue I ran into with nulls + hbase. -- To view, visit http://gerrit.cloudera.org:8080/8765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibaa3a3df7362ac6d3ed07aff133dc4b3520bb4e0 Gerrit-Change-Number: 8765 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 05 Dec 2017 19:48:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6273: fixes subquery tests for functional hbase
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8765 ) Change subject: IMPALA-6273: fixes subquery tests for functional_hbase .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8765/1/testdata/workloads/functional-query/queries/QueryTest/subquery-in-constant-lhs.test File testdata/workloads/functional-query/queries/QueryTest/subquery-in-constant-lhs.test: http://gerrit.cloudera.org:8080/#/c/8765/1/testdata/workloads/functional-query/queries/QueryTest/subquery-in-constant-lhs.test@116 PS1, Line 116: WHERE NULL IN (SELECT d FROM functional.nulltable); > Consider leaving a comment here about why we're specifying the db explicitl Done -- To view, visit http://gerrit.cloudera.org:8080/8765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibaa3a3df7362ac6d3ed07aff133dc4b3520bb4e0 Gerrit-Change-Number: 8765 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 05 Dec 2017 18:41:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6273: fixes subquery tests for functional hbase
Hello Philip Zeyliger, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8765 to look at the new patch set (#2). Change subject: IMPALA-6273: fixes subquery tests for functional_hbase .. IMPALA-6273: fixes subquery tests for functional_hbase IMPALA-1422 introduced tests that do not work with the testing setup for hbase. Namely, tinyinttable is not defined in the functional_hbase database, but is defined in the functional database. Exhaustive tests uncovered the issue. This change makes two changes so that tests work with functional_hbase: 1) use a table that is present in both functional and functional_hbase. the tests needed a subquery result with a single int column. tinyinttable is replaced with an inline view that provides this single int column in a portable manner. 2) nulls are handled differently with hbase (see IMPALA-728) so the nulltable used in the tests is set to functional.nulltable to avoid inconsistent results across input formats. Testing: - ran e2e tests with exhaustive exploration strategy for the broken test. Change-Id: Ibaa3a3df7362ac6d3ed07aff133dc4b3520bb4e0 --- M testdata/workloads/functional-query/queries/QueryTest/subquery-in-constant-lhs.test 1 file changed, 17 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/8765/2 -- To view, visit http://gerrit.cloudera.org:8080/8765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibaa3a3df7362ac6d3ed07aff133dc4b3520bb4e0 Gerrit-Change-Number: 8765 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6286: Remove invalid runtime filter targets.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8783 ) Change subject: IMPALA-6286: Remove invalid runtime filter targets. .. Patch Set 6: (4 comments) I have some questions on the different handling of that exception and a suggestion to make the callsites more explicit. http://gerrit.cloudera.org:8080/#/c/8783/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/8783/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1533 PS6, Line 1533: isTrueWithNullSlots prior, this would have tripped an assertion on backend error. why is it ok to ignore now? http://gerrit.cloudera.org:8080/#/c/8783/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1909 PS6, Line 1909: boolean might be clearer to have an enum here: true, false, unknown. instead, you are eating the error-- which presumably is rare and we may want to know about it-- and folding the unknown case into every call-site's catch. http://gerrit.cloudera.org:8080/#/c/8783/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1923 PS6, Line 1923: return FeSupport.EvalPredicate(nullTuplePred, getQueryCtx()); just noting that prior, we explicitly asserted the exception case should not happen whereas now that case is silently eaten (L1535). is this what was tripped by the test's too low limit? if so, shouldn't the tests be modified? that test includes an is null predicate for the nested collection-- unclear if that's needed for the test. http://gerrit.cloudera.org:8080/#/c/8783/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/8783/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@640 PS6, Line 640: continue would it be useful to know that pruning could not be done due to a backend error? -- To view, visit http://gerrit.cloudera.org:8080/8783 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88153eea9f4b5117df60366fad2bd91776b95298 Gerrit-Change-Number: 8783 Gerrit-PatchSet: 6 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Dec 2017 07:27:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6286: Remove invalid runtime filter targets.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8783 ) Change subject: IMPALA-6286: Remove invalid runtime filter targets. .. Patch Set 7: Code-Review+1 (3 comments) thanks for the explanation... looks fine to me. http://gerrit.cloudera.org:8080/#/c/8783/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/8783/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1491 PS6, Line 1491: d nit: predicates http://gerrit.cloudera.org:8080/#/c/8783/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1533 PS6, Line 1533: isTrueWithNullSlots > The predicates generated in this function are for optimization purposes and ok, that's what I was looking for. for runtime filter and dictionary pruning, they're clearly for optimization purposes so skipping is easy to reason about. its unclear for this method (getBoundPredicates)-- that a caller can expect some predicates to not be bound due to env/runtime issues. In particular, the previous behavior would have aborted this query, right? In which case, I'd expect additional queries to run with this change. http://gerrit.cloudera.org:8080/#/c/8783/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1923 PS6, Line 1923: } > Btw, I plan to add new targeted tests for these expr evaluation failure cas makes sense-- we should then see the BE failure for TupleIsNullPredicate instead of the assertion failure... didn't see the TupleIsNullPredicate example, since this change did not require changes at that callsite. for, getBoundPredicate-- we'd expect to run some queries that previously would fail the assertion. -- To view, visit http://gerrit.cloudera.org:8080/8783 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88153eea9f4b5117df60366fad2bd91776b95298 Gerrit-Change-Number: 8783 Gerrit-PatchSet: 7 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Dec 2017 17:56:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Vuk Ercegovac has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8775 Change subject: IMPALA-4993: extend dictionary filtering to collections .. IMPALA-4993: extend dictionary filtering to collections Currently, top-level scalar columns in parquet files can be used at runtime to prune row-groups by evaluating certain conjuncts over the column's dictionary (if available). This change extends such pruning to scalar values that are stored in collection type columns. Currently, dictionary pruning works by finding eligible conjuncts for top-level slots. Since only top-level slots are supported, the slots are implicitly part of the scan node's tuple descriptor. With this change, we track eligible conjuncts by slot as well as the tuple that contains the slot (either top-level or nested collection). Since collection conjuncts are already managed by a map that associates tuple descriptors to a list of their conjuncts, this extension follows the existing representation. The frontend builds the mapping ofto conjuncts that are dictionary filterable. The backend is adjusted to use the same representation. In addition, collection readers are decomposed into scalar filterable columns and other, non-dictionary filterable readers. When filtering a row group using a conjunct associated to a (possibly) nested collection type, an additional tuple buffer is allocated per tuple descriptor. Testing: - e2e test extended to illustrate row-groups that are pruned by nested collection dictionary filters. Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/runtime/collection-value-builder.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test 10 files changed, 518 insertions(+), 180 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/8775/1 -- To view, visit http://gerrit.cloudera.org:8080/8775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd Gerrit-Change-Number: 8775 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8775 ) Change subject: IMPALA-4993: extend dictionary filtering to collections .. Patch Set 1: (19 comments) thx for the reviews. http://gerrit.cloudera.org:8080/#/c/8775/1/be/src/exec/hdfs-parquet-scanner.h File be/src/exec/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/8775/1/be/src/exec/hdfs-parquet-scanner.h@452 PS1, Line 452: Only scalar columns can be : /// dictionary filtered > Can you mention in the comment why? Done. re-worded http://gerrit.cloudera.org:8080/#/c/8775/1/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/8775/1/be/src/exec/hdfs-scan-node-base.h@167 PS1, Line 167: return thrift_dict_filter_conjuncts_; > You could make the calling code cleaner by returning an empty vector here. its used in one place so don't think its worth it. http://gerrit.cloudera.org:8080/#/c/8775/1/be/src/exec/hdfs-scan-node-base.h@382 PS1, Line 382: /// Dictionary filtering eligible conjuncts for each slot. > Can you add to the comment that this can be nullptr? Done http://gerrit.cloudera.org:8080/#/c/8775/1/be/src/exec/hdfs-scanner.cc File be/src/exec/hdfs-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8775/1/be/src/exec/hdfs-scanner.cc@90 PS1, Line 90: auto > We usually only use auto for iterators and const& in for loops. I think thi Done http://gerrit.cloudera.org:8080/#/c/8775/1/be/src/exec/hdfs-scanner.cc@97 PS1, Line 97: entry.tuple_id > you could just use 'tuple_id' here (from L86). Done http://gerrit.cloudera.org:8080/#/c/8775/1/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/8775/1/common/thrift/PlanNodes.thrift@202 PS1, Line 202: between a TupleId, SlotId pair to a > nit: I think it should be "mapping between ... and ..." or "mapping from .. Done http://gerrit.cloudera.org:8080/#/c/8775/1/common/thrift/PlanNodes.thrift@209 PS1, Line 209: optional > Shouldn't they all be required? Done http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@187 PS1, Line 187: // Map from pairs of TupleDescriptor, SlotIds to indices in PlanNodes.conjuncts_ and > I think this comment could benefit from a description of what makes a slot/ min-max pruning is discussed in the header, but dictionary pruning is not. I added a few sentences there so that its less surprising when seeing these supporting data-structures for the first time. http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@188 PS1, Line 188: collectionConjuncts > nit: missing _ Done http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@188 PS1, Line 188: // collectionConjuncts that are eligible for dictionary filtering. The TupleDescriptor > nit: I'd phrase it as "slots in the TupleDescriptor of this scan node map t Done http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@624 PS1, Line 624: Preconditions.checkState(tupleIds.size() == 1); > It doesn't seem obvious to me why this is true. Can you add a comment? A bit unclear to me as well. However, if I understand this literally, we only support single slot conjuncts for dictionary pruning, so therefore should expect that that single slot only refers to a single tuple. I reduced the multiple checks against slot id and added a comment. http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@656 PS1, Line 656:* Walks through conjuncts and collectionConjuncts and populates > nit: trailing _s Done http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@668 PS1, Line 668: notEmptyCollections_.contains(entry.getKey()) > This could be factored into its own method together with the comment above. Done http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1010 PS1, Line 1010: for (Map.Entry, List> entry : dictionaryFilterConjuncts_.entrySet()) { > long line Done http://gerrit.cloudera.org:8080/#/c/8775/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1075 PS1, Line 1075: if (!dictionaryFilterConjuncts_.isEmpty()) { > This check looks like it's not needed. whoops... missed the clear simplification. Done. http://gerrit.cloudera.org:8080/#/c/8775/1/testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test File
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8775 to look at the new patch set (#2). Change subject: IMPALA-4993: extend dictionary filtering to collections .. IMPALA-4993: extend dictionary filtering to collections Currently, top-level scalar columns in parquet files can be used at runtime to prune row-groups by evaluating certain conjuncts over the column's dictionary (if available). This change extends such pruning to scalar values that are stored in collection type columns. Currently, dictionary pruning works by finding eligible conjuncts for top-level slots. Since only top-level slots are supported, the slots are implicitly part of the scan node's tuple descriptor. With this change, we track eligible conjuncts by slot as well as the tuple that contains the slot (either top-level or nested collection). Since collection conjuncts are already managed by a map that associates tuple descriptors to a list of their conjuncts, this extension follows the existing representation. The frontend builds the mapping ofto conjuncts that are dictionary filterable. The backend is adjusted to use the same representation. In addition, collection readers are decomposed into scalar filterable columns and other, non-dictionary filterable readers. When filtering a row group using a conjunct associated to a (possibly) nested collection type, an additional tuple buffer is allocated per tuple descriptor. Testing: - e2e test extended to illustrate row-groups that are pruned by nested collection dictionary filters. Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/runtime/collection-value-builder.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test 10 files changed, 561 insertions(+), 194 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/8775/2 -- To view, visit http://gerrit.cloudera.org:8080/8775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd Gerrit-Change-Number: 8775 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6286: Remove invalid runtime filter targets.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8783 ) Change subject: IMPALA-6286: Remove invalid runtime filter targets. .. Patch Set 1: (6 comments) still getting the hang of target/source, outer, nullable in the context of runtime filters, but here are some comments. http://gerrit.cloudera.org:8080/#/c/8783/1/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/8783/1/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@278 PS1, Line 278: if (LOG.isTraceEnabled()) { : LOG.trace("Generating runtime filter from predicate " + joinPredicate); : } move this to L297 when you know that the filter will be generated. http://gerrit.cloudera.org:8080/#/c/8783/1/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@284 PS1, Line 284: incorrectly alter the query results produce incorrect results. [more direct] http://gerrit.cloudera.org:8080/#/c/8783/1/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@285 PS1, Line 285: the expr the target expression http://gerrit.cloudera.org:8080/#/c/8783/1/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@285 PS1, Line 285: should nit: must http://gerrit.cloudera.org:8080/#/c/8783/1/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@287 PS1, Line 287: Literal not following what is the Literal part of this predicate. http://gerrit.cloudera.org:8080/#/c/8783/1/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test: http://gerrit.cloudera.org:8080/#/c/8783/1/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test@1483 PS1, Line 1483: s nit: produced -- To view, visit http://gerrit.cloudera.org:8080/8783 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88153eea9f4b5117df60366fad2bd91776b95298 Gerrit-Change-Number: 8783 Gerrit-PatchSet: 1 Gerrit-Owner: Alex BehmGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 07 Dec 2017 00:42:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8775 ) Change subject: IMPALA-4993: extend dictionary filtering to collections .. Patch Set 3: planner tests: adds a test and updates where needed. -- To view, visit http://gerrit.cloudera.org:8080/8775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd Gerrit-Change-Number: 8775 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 12 Dec 2017 00:12:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8775 to look at the new patch set (#3). Change subject: IMPALA-4993: extend dictionary filtering to collections .. IMPALA-4993: extend dictionary filtering to collections Currently, top-level scalar columns in parquet files can be used at runtime to prune row-groups by evaluating certain conjuncts over the column's dictionary (if available). This change extends such pruning to scalar values that are stored in collection type columns. Currently, dictionary pruning works by finding eligible conjuncts for top-level slots. Since only top-level slots are supported, the slots are implicitly part of the scan node's tuple descriptor. With this change, we track eligible conjuncts by slot as well as the tuple that contains the slot (either top-level or nested collection). Since collection conjuncts are already managed by a map that associates tuple descriptors to a list of their conjuncts, this extension follows the existing representation. The frontend builds the mapping ofto conjuncts that are dictionary filterable. The backend is adjusted to use the same representation. In addition, collection readers are decomposed into scalar filterable columns and other, non-dictionary filterable readers. When filtering a row group using a conjunct associated to a (possibly) nested collection type, an additional tuple buffer is allocated per tuple descriptor. Testing: - e2e test extended to illustrate row-groups that are pruned by nested collection dictionary filters. Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/runtime/collection-value-builder.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test 11 files changed, 638 insertions(+), 194 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/8775/3 -- To view, visit http://gerrit.cloudera.org:8080/8775 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd Gerrit-Change-Number: 8775 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] [PREVIEW] IMPALA-4886: Expose table metrics in the catalog web UI.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8529 ) Change subject: [PREVIEW] IMPALA-4886: Expose table metrics in the catalog web UI. .. Patch Set 3: Code-Review+1 (6 comments) mostly small comments and optional suggestions. http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1395 PS3, Line 1395: metris nit: metrics http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1395 PS3, Line 1395: pretty-prints them into a :* string. Returns the string representation of the table metrics condense: ... and returns a pretty-printed string representation. http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@252 PS3, Line 252: functionality nit: purposes http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/catalog/Table.java@579 PS3, Line 579: trully nit: truly http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/util/TopNCache.java File fe/src/main/java/org/apache/impala/util/TopNCache.java: http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/util/TopNCache.java@62 PS3, Line 62: function_.apply(t1).compareTo(function_.apply(t2)) nit: factor the two calls with a method called compareRanks(t1, t2) ? http://gerrit.cloudera.org:8080/#/c/8529/3/fe/src/main/java/org/apache/impala/util/TopNCache.java@78 PS3, Line 78: if (!heap_.remove(item)) easier to read? if (heap_.contains(item)) return; -- To view, visit http://gerrit.cloudera.org:8080/8529 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37d407979e6d3b1a444b6b6265900b148facde9e Gerrit-Change-Number: 8529 Gerrit-PatchSet: 3 Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 12 Dec 2017 02:42:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4168: Adopt Oracle-style hint placement for INSERT/UPSERT
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8676 ) Change subject: IMPALA-4168: Adopt Oracle-style hint placement for INSERT/UPSERT .. Patch Set 1: (7 comments) Thanks for the change! I have some initial feedback for the tests. http://gerrit.cloudera.org:8080/#/c/8676/1/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: http://gerrit.cloudera.org:8080/#/c/8676/1/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@376 PS1, Line 376: // Test insert hints. instead of repeating the queries, how about we vary the hint location. I'm thinking of something like: for (placement : {OracleStyle, DefaultStyle}) { hint = String.format("%sshuffle%s", prefix, suffix) defaultHint = hint; oracleHint = "" if (placement == oracleStyle) { defaultHint = ""; oracleHint = hint; } TestInsertHints(String.format("insert %s into t %s select * from t", oracleHint, defaultHint),...) TestInsertHints(...) ... } Another place to inject this variation may be in TestInsertHints (I see that the upsert just checks for parsing, not whether the hints were applied as expected-- would be useful to upgrade these while here). http://gerrit.cloudera.org:8080/#/c/8676/1/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@392 PS1, Line 392: Adopt nit: remove the "Adopt". http://gerrit.cloudera.org:8080/#/c/8676/1/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@392 PS1, Line 392: IMPALA-4168: include jira issue for bug fixes, not new features. http://gerrit.cloudera.org:8080/#/c/8676/1/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@409 PS1, Line 409: . we can remove redundancy here as well. http://gerrit.cloudera.org:8080/#/c/8676/1/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@414 PS1, Line 414: IMPALA-4168: same here http://gerrit.cloudera.org:8080/#/c/8676/1/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@483 PS1, Line 483: . and remove redundancy here as well. http://gerrit.cloudera.org:8080/#/c/8676/1/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@499 PS1, Line 499: IMPALA-4168: same here -- To view, visit http://gerrit.cloudera.org:8080/8676 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ied7629d70197a0270cdc0853e00cc021fdb4dc20 Gerrit-Change-Number: 8676 Gerrit-PatchSet: 1 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell Gerrit-Reviewer: Kim Jin Chul Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 12 Dec 2017 01:50:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1422: support a constant on LHS of IN predicates.
Hello Dimitris Tsirogiannis, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8322 to look at the new patch set (#12). Change subject: IMPALA-1422: support a constant on LHS of IN predicates. .. IMPALA-1422: support a constant on LHS of IN predicates. Currently, constant expressions for the LHS of the IN predicate are not supported. This patch adds this support as a rewrite in StmtRewriter (where subqueries are rewritten to joins). Since there is a nested-loop variant of left semijoin, support for IN is handled by not erring out. NOT IN is handled by a rewrite to corresponding NOT EXISTS predicate. Support for NOT IN with a correlated subquery is not included in this change. Re-organized the frontend subquery analysis tests to expand coverage. Testing: - added frontend subquery analysis tests - added e2e tests Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb --- M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test A testdata/workloads/functional-query/queries/QueryTest/subquery-in-constant-lhs.test M tests/query_test/test_queries.py 7 files changed, 908 insertions(+), 203 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/8322/12 -- To view, visit http://gerrit.cloudera.org:8080/8322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb Gerrit-Change-Number: 8322 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-1422: support a constant on LHS of IN predicates.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8322 ) Change subject: IMPALA-1422: support a constant on LHS of IN predicates. .. Patch Set 11: updates alias generation to use outer-block's alias generators. -- To view, visit http://gerrit.cloudera.org:8080/8322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb Gerrit-Change-Number: 8322 Gerrit-PatchSet: 11 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 01 Dec 2017 21:37:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1422: support a constant on LHS of IN predicates.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8322 ) Change subject: IMPALA-1422: support a constant on LHS of IN predicates. .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/8322/12/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test File testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test: http://gerrit.cloudera.org:8080/#/c/8322/12/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test@2277 PS12, Line 2277: (select int_col from functional.tinyinttable where > Make this one more complicated or add another test like this: done. we're getting a HJ for the inner and a nested-loop for the outer. the lhs constant is pushed down into the HJ of the nested join. -- To view, visit http://gerrit.cloudera.org:8080/8322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb Gerrit-Change-Number: 8322 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Sat, 02 Dec 2017 00:04:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1422: support a constant on LHS of IN predicates.
Hello Dimitris Tsirogiannis, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8322 to look at the new patch set (#13). Change subject: IMPALA-1422: support a constant on LHS of IN predicates. .. IMPALA-1422: support a constant on LHS of IN predicates. Currently, constant expressions for the LHS of the IN predicate are not supported. This patch adds this support as a rewrite in StmtRewriter (where subqueries are rewritten to joins). Since there is a nested-loop variant of left semijoin, support for IN is handled by not erring out. NOT IN is handled by a rewrite to corresponding NOT EXISTS predicate. Support for NOT IN with a correlated subquery is not included in this change. Re-organized the frontend subquery analysis tests to expand coverage. Testing: - added frontend subquery analysis tests - added e2e tests Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb --- M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test A testdata/workloads/functional-query/queries/QueryTest/subquery-in-constant-lhs.test M tests/query_test/test_queries.py 7 files changed, 932 insertions(+), 203 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/8322/13 -- To view, visit http://gerrit.cloudera.org:8080/8322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb Gerrit-Change-Number: 8322 Gerrit-PatchSet: 13 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range datastructures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/scheduler.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 5 files changed, 294 insertions(+), 166 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/3 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 3: all tests pass with s3 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 20 Dec 2017 17:19:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6948: Delete catalog update topic entries upon catalog restart
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10289 ) Change subject: IMPALA-6948: Delete catalog update topic entries upon catalog restart .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/10289/1/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/10289/1/be/src/statestore/statestore.cc@213 PS1, Line 213: topic_update_log_.clear(); > Good point. I updated the metrics and the related fields. I don't think res makes sense to not update. pls add a comment, perhaps in the header, to state that this bit of history is intentionally retained. http://gerrit.cloudera.org:8080/#/c/10289/1/be/src/statestore/statestore.cc@665 PS1, Line 665: subscriber->SetLastTopicVersionProcessed(topic_it->first, update.from_version); > Hm, not following. What do you suggest we dcheck? basically asking if from_version can be set here and would that worth checking? http://gerrit.cloudera.org:8080/#/c/10289/2/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/10289/2/be/src/statestore/statestore.cc@216 PS2, Line 216: key_size_metric_->SetValue(0); : topic_size_metric_->SetValue(0); : value_size_metric_->SetValue(0); these are shared across topics, so this should more accurately subtract what was cleared. since we keep track of the total per topic, perhaps subtract that off? -- To view, visit http://gerrit.cloudera.org:8080/10289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74a8ade8e498ac35cb56d3775d2c67a86988d9b6 Gerrit-Change-Number: 10289 Gerrit-PatchSet: 2 Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 04 May 2018 23:07:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6948: Delete catalog update topic entries upon catalog restart
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10289 ) Change subject: IMPALA-6948: Delete catalog update topic entries upon catalog restart .. Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/10289/1/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/10289/1/be/src/catalog/catalog-server.cc@249 PS1, Line 249: last_sent_catalog_version_ add to the header comments for this member its expected range ( [0, ...)) and lets explicitly state that 0 means no updates have been sent. http://gerrit.cloudera.org:8080/#/c/10289/1/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/10289/1/be/src/statestore/statestore.cc@213 PS1, Line 213: topic_update_log_.clear(); init the other members: total_key_size_bytes_, total_value_size_bytes_ (I see value bytes is decreased elsewhere) decrement the gauges? (I didn't see these decreased anywhere even though value bytes can decrease) why should last_version_ not be reset? http://gerrit.cloudera.org:8080/#/c/10289/1/be/src/statestore/statestore.cc@665 PS1, Line 665: // cleared. I didn't see that the case on L656 can happen when clearing. dcheck it? http://gerrit.cloudera.org:8080/#/c/10289/1/tests/statestore/test_statestore.py File tests/statestore/test_statestore.py: http://gerrit.cloudera.org:8080/#/c/10289/1/tests/statestore/test_statestore.py@525 PS1, Line 525: sets the clear_topic_entries flag in a topic update message. """ add the jira for context http://gerrit.cloudera.org:8080/#/c/10289/1/tests/statestore/test_statestore.py@535 PS1, Line 535: num_updates=1, : key_template= keep the same order of args as on L531. http://gerrit.cloudera.org:8080/#/c/10289/1/tests/statestore/test_statestore.py@548 PS1, Line 548: can some topic other than topic_name be called back here? similar question with the version. how can we ensure that this callback does not silently pass the test without checking the assertions on L546, 547? perhaps log and check that the log contains a line that proves the check was run? -- To view, visit http://gerrit.cloudera.org:8080/10289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74a8ade8e498ac35cb56d3775d2c67a86988d9b6 Gerrit-Change-Number: 10289 Gerrit-PatchSet: 1 Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 04 May 2018 19:02:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Hello Dimitris Tsirogiannis, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10291 to look at the new patch set (#3). Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. IMPALA-6948,IMPALA-6962: add end-to-end tests Adds end-to-end tests to validate that following various metadata operations, the catalog state in catalogd and impalads is the same. For IMPALA-6962, catalogd process restart for tests is fixed. Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 --- M be/src/catalog/catalog-server.cc M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_metadata_replicas.py M tests/metadata/test_hms_integration.py A tests/util/hive_utils.py 6 files changed, 243 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/10291/3 -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10291 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py File tests/custom_cluster/test_metadata_replicas.py: http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py@39 PS2, Line 39: def test_load_all(self, cursor): : """ Loads metadata for all tables and validates that they're all the same.""" : # Use describe to issue table loads. : c_objects = self.cluster.catalogd.service.get_catalog_objects() : for obj in c_objects: : if obj[1] == "TABLE": cursor.execute("describe %s" % obj[0]) : # Use a sentinel ddl with sync_ddl to make sure that all impalads : # have receieved the effect of the previous ddl's. : cursor.execute("set sync_ddl=true") : cursor.execute("invalidate metadata functional.alltypes") : : self.__validate_metadata() > This test is going to be very expensive and I am not sure how much extra co Removing this test... alex raised concerns about expense as well. It takes about 7-8 s, which is still less than 50% of the total time (remaining time needed for custom cluster setup/teardown. However, since its not the main point of this change, I'll remove it; we can add something like this later on if needed. http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py@69 PS2, Line 69: cursor.execute("invalidate metadata") > No need for the global one, you may do "invalidate metadata x". Done http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py@79 PS2, Line 79: wait_time_s = specific_build_type_timeout(10, slow_build_timeout=20) : sleep(wait_time_s) > I agree with the TODO and I think we should fix it now. Essentially, in ord Added a catalog version to the /catalog page. For this test, I check the catalog version and the latest catalog version from each impalad. When they are in agreement, the rest of the test can proceed. When starting up the catalogd, we wait until its subscribed to the statestore. Since subscription happens *after* the catalog is instantiated and loaded, its safe to access the catalogd catalog version at this point. http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py@81 PS2, Line 81: self.cluster.refresh() > Hm, why is this needed? refreshes the cluster process list after restarting a new catalogd process. needed to kill it later on L88. http://gerrit.cloudera.org:8080/#/c/10291/2/tests/custom_cluster/test_metadata_replicas.py@123 PS2, Line 123: def __dump_objects(self, catalog, impalads): : """ For debugging, prints all objects and their version.""" : print "CATALOG OBJECTS" : for k, v in catalog.items(): : print "%s, %s, %d" % (k, v[0], v[1]) : : for idx in xrange(0,len(impalads)): : print "IMPALAD %d OBJECTS" % idx : for k, v in impalads[idx].items(): : print "%s, %s, %d" % (k, v[0], v[1]) > Is this used anywhere? just using it for debugging so removed. -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 08 May 2018 20:32:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7017: deflake/fix test catalog restart test
Vuk Ercegovac has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10397 Change subject: IMPALA-7017: deflake/fix test_catalog_restart test .. IMPALA-7017: deflake/fix test_catalog_restart test The custom_cluster/test_metadata_replicas.py:test_catalog_restart test has been recently flaky/broken for two reasons: 1) Variable support for Hive and non-hdfs filesystems. Other tests that depend on Hive have disabled tests for non-hdfs filesystems. Since the functionality tested is not intended for all filesystems, this change disables this test for all filesystems other than hdfs. 2) Several builds have been flaky when looking up catalogd's version. This change adds a retry for obtaining the version. Change-Id: Iab6edb01f0bd7f5408cfef28fd05fdc95fb78469 --- M tests/common/impala_service.py M tests/custom_cluster/test_metadata_replicas.py 2 files changed, 22 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/10397/1 -- To view, visit http://gerrit.cloudera.org:8080/10397 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iab6edb01f0bd7f5408cfef28fd05fdc95fb78469 Gerrit-Change-Number: 10397 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6948: Delete catalog update topic entries upon catalog restart
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10289 ) Change subject: IMPALA-6948: Delete catalog update topic entries upon catalog restart .. Patch Set 3: (2 comments) looks good. one remaining question about tightening the protocol. http://gerrit.cloudera.org:8080/#/c/10289/1/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/10289/1/be/src/statestore/statestore.cc@665 PS1, Line 665: subscriber->SetLastTopicVersionProcessed(topic_it->first, update.from_version); > I see what you mean. Ideally, I want to keep that operation (deleting the t makes sense. however, if I were to test these code paths, I would add a case where from_version is set *and* we clear_topic_entries(). current code in the catalog (a subscriber) does not allow this combination afaict. main question from the standpoint of the protocol is whether we even want to consider this combination. if not, then lets document it with a dcheck. http://gerrit.cloudera.org:8080/#/c/10289/3/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/10289/3/be/src/statestore/statestore.cc@215 PS3, Line 215: - if all is correct, the metric won't be negative. however, what happens if it does go negative? -- To view, visit http://gerrit.cloudera.org:8080/10289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74a8ade8e498ac35cb56d3775d2c67a86988d9b6 Gerrit-Change-Number: 10289 Gerrit-PatchSet: 3 Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 07 May 2018 19:22:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6948: Delete catalog update topic entries upon catalog restart
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10289 ) Change subject: IMPALA-6948: Delete catalog update topic entries upon catalog restart .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I74a8ade8e498ac35cb56d3775d2c67a86988d9b6 Gerrit-Change-Number: 10289 Gerrit-PatchSet: 4 Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 07 May 2018 23:44:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10291 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc@304 PS3, Line 304: long current_catalog_version; > Why is this dependent on the success of GetDbs()? I think the version and t did it this way to share the error handling and avoid multiple "error" fields. I could have an error field and a list of reasons? http://gerrit.cloudera.org:8080/#/c/10291/3/tests/common/impala_service.py File tests/common/impala_service.py: http://gerrit.cloudera.org:8080/#/c/10291/3/tests/common/impala_service.py@151 PS3, Line 151: """ Extracts and returns the version of the catalog object's 'thrift_txt' representation.""" > I think the JSON field is called 'thrift_string' the value that passed in here is coming from get_catalog_object_dump, which is just html, not json. http://gerrit.cloudera.org:8080/#/c/10291/3/tests/custom_cluster/test_metadata_replicas.py File tests/custom_cluster/test_metadata_replicas.py: http://gerrit.cloudera.org:8080/#/c/10291/3/tests/custom_cluster/test_metadata_replicas.py@39 PS3, Line 39: @pytest.mark.xfail(reason="IMPALA-6948") > Issue is fixed now. Done http://gerrit.cloudera.org:8080/#/c/10291/3/tests/custom_cluster/test_metadata_replicas.py@54 PS3, Line 54: "create table if not exists %s.x (a string)" % db_name) > We should not do IF NOT EXISTS here and IF EXISTS below in the drop. If the Done http://gerrit.cloudera.org:8080/#/c/10291/3/tests/custom_cluster/test_metadata_replicas.py@55 PS3, Line 55: cursor.execute("set SYNC_DDL=true") > Not obvious why we need this. Can we remove it if we consistently use the s Done http://gerrit.cloudera.org:8080/#/c/10291/3/tests/custom_cluster/test_metadata_replicas.py@57 PS3, Line 57: assert "x" in self.client.execute("show tables in %s" % db_name).data > The use of both cursor and self.client is somewhat confusing. How many conn Done -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 3 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 09 May 2018 00:26:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10291 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 4: changed how catalogd process is restarted: now using os.system instead of the wrapper around popen. that change fixes an issue with zombie procs in the jenkins env (that unfortunately was not present in the dev env). -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 4 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 09 May 2018 05:35:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Hello Dimitris Tsirogiannis, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10291 to look at the new patch set (#5). Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. IMPALA-6948,IMPALA-6962: add end-to-end tests Adds end-to-end tests to validate that following various metadata operations, the catalog state in catalogd and impalads is the same. For IMPALA-6962, catalogd process restart for tests is fixed. Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 --- M be/src/catalog/catalog-server.cc M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_metadata_replicas.py M tests/metadata/test_hms_integration.py A tests/util/hive_utils.py 6 files changed, 240 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/10291/5 -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 5 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10291 ) Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/10291/3/be/src/catalog/catalog-server.cc@304 PS3, Line 304: void CatalogServer::CatalogUrlCallback(const Webserver::ArgumentMap& args, > I just think it's weird that we don't show the dbs if there was an error wi done. makes sense, I was treating it as all-or-nothing. http://gerrit.cloudera.org:8080/#/c/10291/5/tests/custom_cluster/test_metadata_replicas.py File tests/custom_cluster/test_metadata_replicas.py: http://gerrit.cloudera.org:8080/#/c/10291/5/tests/custom_cluster/test_metadata_replicas.py@50 PS5, Line 50: self.client.execute("invalidate metadata functional.alltypes") > check that the catalog version is at least 50 at this point to make debuggi Done -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 5 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 09 May 2018 18:23:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6948,IMPALA-6962: add end-to-end tests
Hello Dimitris Tsirogiannis, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10291 to look at the new patch set (#6). Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. IMPALA-6948,IMPALA-6962: add end-to-end tests Adds end-to-end tests to validate that following various metadata operations, the catalog state in catalogd and impalads is the same. For IMPALA-6962, catalogd process restart for tests is fixed. Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 --- M be/src/catalog/catalog-server.cc M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_metadata_replicas.py M tests/metadata/test_hms_integration.py A tests/util/hive_utils.py 6 files changed, 242 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/10291/6 -- To view, visit http://gerrit.cloudera.org:8080/10291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10291 Gerrit-PatchSet: 6 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR](2.x) IMPALA-6948,IMPALA-6962: add end-to-end tests
Vuk Ercegovac has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10363 Change subject: IMPALA-6948,IMPALA-6962: add end-to-end tests .. IMPALA-6948,IMPALA-6962: add end-to-end tests Adds end-to-end tests to validate that following various metadata operations, the catalog state in catalogd and impalads is the same. For IMPALA-6962, catalogd process restart for tests is fixed. Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Reviewed-on: http://gerrit.cloudera.org:8080/10291 Reviewed-by: Alex BehmTested-by: Impala Public Jenkins (cherry picked from commit 28c1f76529b79d437b66a80b954ec227e0ddc6cd) --- M be/src/catalog/catalog-server.cc M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_metadata_replicas.py M tests/metadata/test_hms_integration.py A tests/util/hive_utils.py 6 files changed, 242 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/10363/1 -- To view, visit http://gerrit.cloudera.org:8080/10363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: newchange Gerrit-Change-Id: Ic6c5b39e29b2885cd30fede18833cbf23fb755f5 Gerrit-Change-Number: 10363 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 11: (1 comment) getting back to this change... the upcoming update includes the idea to better separate the split specs vs. concrete scan ranges. rebase changes caught this in the middle so the update also includes rebase changes (only affected HdfsScanNode.java). http://gerrit.cloudera.org:8080/#/c/8523/11/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/8523/11/common/thrift/PlanNodes.thrift@228 PS11, Line 228: 4: optional THdfsFileSplitGeneratorSpec hdfs_file_split_generator_spec > will try it out. The additional changes are relatively mechanical. Since this design explicitly does not allow these specs to propagate further in the backend, I agree that we're better off going this route. -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 11 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 10 May 2018 21:42:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#12). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 21 files changed, 718 insertions(+), 233 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/12 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 14: running more tests... -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 14 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Sat, 19 May 2018 20:26:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/8523/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8523/12//COMMIT_MSG@13 PS12, Line 13: includes S3, ADLS, and local fs. > What about erasure coded HDFS data? there are todo's in this change to extend to hdfs. pls see the comment in PlanNodes.thrift. -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 15 May 2018 17:35:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 15: (3 comments) http://gerrit.cloudera.org:8080/#/c/8523/15/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/8523/15/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@783 PS15, Line 783: boolean checkMissingDiskIds = FileSystemUtil.supportsStorageIds(partitionFs); : boolean supportsBlocks = FileSystemUtil.supportsStorageIds(partitionFs); > these are equivalent. how about getting rid of the checkMissingDiskIds, and see reply to last comment for naming. http://gerrit.cloudera.org:8080/#/c/8523/15/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@812 PS15, Line 812: result.first > was changing the polarity of this a bug fix? yes. there was a bug here, as well as double-counting total_bytes from a prev. merge, and several npe's from that latest thrift refactor (java thrift list is null, not empty) http://gerrit.cloudera.org:8080/#/c/8523/15/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@914 PS15, Line 914: checkMissingDiskIds && !fileDesc.getIsEc() > this was confusing because when I read the name "checkMissingDiskIds" I tho renamed to: fsHasBlocks -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 15 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 23 May 2018 20:44:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Tianyi Wang, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#16). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 21 files changed, 676 insertions(+), 246 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/16 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 16 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Tianyi Wang, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#15). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 21 files changed, 677 insertions(+), 245 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/15 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 15 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 14: Next change handles the merge. It was straightforward; the ec files are handled as regular hdfs files, so do not go through the prev. synthetic block generation. For testing, I re-ran core and s3 tests. -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 14 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 23 May 2018 05:39:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 17: Code-Review+2 rebase and fix minor conflict. carrying dan's +2 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 17 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 24 May 2018 06:29:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Tianyi Wang, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#17). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 21 files changed, 676 insertions(+), 246 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/17 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 17 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-4025: Part 1: Generalize and cleanup StmtRewriter
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10495 ) Change subject: IMPALA-4025: Part 1: Generalize and cleanup StmtRewriter .. Patch Set 1: Code-Review+2 thanks for the cleanup -- To view, visit http://gerrit.cloudera.org:8080/10495 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e7a6108d3d49be12ae032fdb54b5c3c23152a47 Gerrit-Change-Number: 10495 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 24 May 2018 06:16:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6933: Avoids db name collisions for Kudu tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10513 ) Change subject: IMPALA-6933: Avoids db name collisions for Kudu tests .. Patch Set 1: went with the pid idea so to distinguish the same test method in separate processes. I added comments in conftest.py to state that the alternative api is "deprecated" and to prefer using the alternative unique_database fixture (it looks like these kudu tests are the only user of the other scheme). -- To view, visit http://gerrit.cloudera.org:8080/10513 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c2f8a35fec90ae0dabe80237d83954668b47f6e Gerrit-Change-Number: 10513 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Michael Brown Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 25 May 2018 23:15:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6933: Avoids db name collisions for Kudu tests
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10513 to look at the new patch set (#2). Change subject: IMPALA-6933: Avoids db name collisions for Kudu tests .. IMPALA-6933: Avoids db name collisions for Kudu tests Kudu tests generate temporary db names in a way so that its unlikely, yet possible to collide. A recent test failure indicates such a collision came up. The fix changes the way that the name is generated so that it includes the classes name for which the db name is generated. This db name will make it easier to see which test created it and the name will not collide with other names generated by other tests. Testing: - ran the updated test locally Change-Id: I7c2f8a35fec90ae0dabe80237d83954668b47f6e --- M tests/common/kudu_test_suite.py M tests/conftest.py 2 files changed, 29 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10513/2 -- To view, visit http://gerrit.cloudera.org:8080/10513 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c2f8a35fec90ae0dabe80237d83954668b47f6e Gerrit-Change-Number: 10513 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Michael Brown Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 17: hmm. looks like some dependency order for generating flat buffers. -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 17 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 24 May 2018 16:02:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 17: Odd that this happens with gvo, but not for internal builds. -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 17 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 24 May 2018 17:40:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 18: latest change explicitly adds a dep for flat buffers for the backend (in the same place that the be depends on thrift and protos). afaict, this should have been in there for this change-- not sure why some of the builds didn't need it. -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 18 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 24 May 2018 21:13:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Tianyi Wang, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#18). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M CMakeLists.txt M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 22 files changed, 677 insertions(+), 247 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/18 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 18 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6933: Avoids db name collisions for Kudu tests
Vuk Ercegovac has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10513 Change subject: IMPALA-6933: Avoids db name collisions for Kudu tests .. IMPALA-6933: Avoids db name collisions for Kudu tests Kudu tests generate temporary db names in a way so that its unlikely, yet possible to collide. A recent test failure indicates such a collision came up. The fix changes the way that the name is generated so that it includes the classes name for which the db name is generated. This db name will make it easier to see which test created it and the name will not collide with other names generated by other tests. Testing: - ran the updated test locally Change-Id: I7c2f8a35fec90ae0dabe80237d83954668b47f6e --- M tests/common/kudu_test_suite.py 1 file changed, 6 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10513/1 -- To view, visit http://gerrit.cloudera.org:8080/10513 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7c2f8a35fec90ae0dabe80237d83954668b47f6e Gerrit-Change-Number: 10513 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#13). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 21 files changed, 665 insertions(+), 240 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/13 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 13 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 12: (2 comments) http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/Frontend.thrift@375 PS12, Line 375: per_node_scan_ranges > Thanks, the new thrift structures make more sense to me. Maybe it's simpler yeah, it lets me revert to upfront expansion as I had couple of changes back (but without mixed concrete/spec) and get rid of that iterator. http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/PlanNodes.thrift@202 PS12, Line 202: THdfsFileSplitGeneratorSpec > update Done -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 18 May 2018 21:24:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4025: Part 1: Add percentile disc aggregation function
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/9777 ) Change subject: IMPALA-4025: Part 1: Add percentile_disc aggregation function .. Patch Set 7: (6 comments) http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java@99 PS7, Line 99: outputGroupingExprSmap_ is this used (or planned to be used) anywhere? http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@187 PS7, Line 187: void setContainsPercentile() { public? http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@217 PS7, Line 217: boolean containsPercentile = false; make public as well? http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@354 PS7, Line 354: boolean selectBlockContainsPercentile() { return containsPercentile; } intentional visibility? http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java: http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@108 PS7, Line 108: rewriteSelectStatementHook(stmt, analyzer); Any way to separate this refactor into its own change? first example I saw: where did the helper 'hasSubqueryInDisunction' go? I'm guessing this doesn't have anything to specifically do with the change for percentile. Looks like a good cleanup overall, but since there is some subtle existing functionality here, it would be helpful to separate clean-up from the actual change. http://gerrit.cloudera.org:8080/#/c/9777/5/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java File fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java: http://gerrit.cloudera.org:8080/#/c/9777/5/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java@253 PS5, Line 253: if (isLogical_) { change this to a precondition -- To view, visit http://gerrit.cloudera.org:8080/9777 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iacef7b3fcd74c4c73d88400ce27307c3baa0121e Gerrit-Change-Number: 9777 Gerrit-PatchSet: 7 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 17 May 2018 21:11:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/8523/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8523/12//COMMIT_MSG@13 PS12, Line 13: includes S3, ADLS, and local fs. > Does this change the dependency on fs.s3a.block.size? This change does not attempt to change the scan ranges that are produced for these file systems. Currently, we rely on the filestatus blocksize (see L177 of HdfsPartition) when synthesizing blocks to store in the catalog. This patch shifts that synthesis to the scheduler, so that its generated for each use instead of stored in memory. The block parameter used for this synthesis is set in this change on L784 of HdfsScanNode. Instead of filestatus blocksize, it uses the filesystem's default block size, which I think is the same thing for these file-systems (at least, from what I could tell from hadoop.fs.FileSystem). -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 16 May 2018 06:04:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118 PS6, Line 118: FileBlock.createFbFileBlock(fbb, : loc.getOffset(), loc.getLength(), : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS)); > The reason is not strong: Though EC reads are remote, we might still don't ok, so you were worried that synthetic division into blocks could differ from how a file is chunked into blocks on hdfs. Assuming the block size provided by FileStatus never changes (L186), when would you see that the two different chunking schemes differ? -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 22 May 2018 18:27:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 ) Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118 PS6, Line 118: FileBlock.createFbFileBlock(fbb, : loc.getOffset(), loc.getLength(), : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS)); I know this is merged, but just wanted to revisit why we're storing block locations for ec files this way rather than synthesizing them via L137? -- To view, visit http://gerrit.cloudera.org:8080/10413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84 Gerrit-Change-Number: 10413 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 22 May 2018 17:58:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 13: (4 comments) http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc@302 PS13, Line 302: for (const TScanRangeLocationList& range : entry.second.concrete_range) { : expanded_locations.push_back(range); : } > that could be just expanded_locations.insert(concrete_range.begin(), concre Done http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift File common/thrift/Planner.thrift: http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift@108 PS13, Line 108: concrete_range > plural since the field is a list: concrete_ranges, split_specs Done http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java: http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@224 PS13, Line 224: scanRangeSpecs_.getSplit_specSize() > this seems wrong - a single spec may result in multiple hosts, no? removed. this class never adds specs, but added a precondition check in init in the event that's changed for some reason. http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1126 PS13, Line 1126: scanRangeSpecs_.getSplit_specSize(); > shouldn't that do some calculation based on file size and blocks size and i good catch. needed to change other places that directly use the size of the spec list. -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 13 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Sat, 19 May 2018 01:39:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Dimitris Tsirogiannis, Alex Behm, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#14). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 21 files changed, 676 insertions(+), 241 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/14 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 14 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR](2.x) IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Impala Public Jenkins, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/10692 to review the following change. Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Reviewed-on: http://gerrit.cloudera.org:8080/8523 Reviewed-by: Vuk Ercegovac Tested-by: Impala Public Jenkins --- M CMakeLists.txt M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/Frontend.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 22 files changed, 677 insertions(+), 245 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/10692/1 -- To view, visit http://gerrit.cloudera.org:8080/10692 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: newchange Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 10692 Gerrit-PatchSet: 1 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-7144: Re-enable TestDescribeTableResults
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10643 ) Change subject: IMPALA-7144: Re-enable TestDescribeTableResults .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/10643/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java: http://gerrit.cloudera.org:8080/#/c/10643/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@1979 PS5, Line 1979: rowIdx++; // Ignore the first empty line. : Preconditions.checkElementIndex(rowIdx + 1, rows.size()); simplify: rowIdx += 2; // Skips over the first empty line Preconditions.checkElementIndex(rowIdx, rows.size()); cols = rows.get(rowIdx).getColVals(); http://gerrit.cloudera.org:8080/#/c/10643/5/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@1985 PS5, Line 1985: Preconditions.checkElementIndex(rowIdx + 1, rows.size()); simplify: ++rowIdx; Preconditions.checkElementIndex(rowIdx, rows.size()); cols = rows.get(rowIdx).getColVals(); -- To view, visit http://gerrit.cloudera.org:8080/10643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aeaecf5b6d906a66d338e165a6d506e3964563f Gerrit-Change-Number: 10643 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Jun 2018 15:59:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 19: rebased (clean). re-ran s3 tests (pass). -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 19 Gerrit-Owner: Vuk Ercegovac Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Jun 2018 06:43:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7144: Re-enable TestDescribeTableResults
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10643 ) Change subject: IMPALA-7144: Re-enable TestDescribeTableResults .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aeaecf5b6d906a66d338e165a6d506e3964563f Gerrit-Change-Number: 10643 Gerrit-PatchSet: 6 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Jun 2018 16:06:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7144: Re-enable TestDescribeTableResults
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10643 ) Change subject: IMPALA-7144: Re-enable TestDescribeTableResults .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/10643/4/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java: http://gerrit.cloudera.org:8080/#/c/10643/4/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@1974 PS4, Line 1974: location does it matter if location is set multiple times for a given set of rows? http://gerrit.cloudera.org:8080/#/c/10643/4/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@1979 PS4, Line 1979: rowIdx that's not the index that is accessed (++rowIdx). -- To view, visit http://gerrit.cloudera.org:8080/10643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aeaecf5b6d906a66d338e165a6d506e3964563f Gerrit-Change-Number: 10643 Gerrit-PatchSet: 4 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Jun 2018 06:54:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6929: Support multi-column range partitions for Kudu
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10441 ) Change subject: IMPALA-6929: Support multi-column range partitions for Kudu .. Patch Set 1: (4 comments) looks good, mainly nits and clarifying questions. http://gerrit.cloudera.org:8080/#/c/10441/1/fe/src/main/java/org/apache/impala/analysis/RangePartition.java File fe/src/main/java/org/apache/impala/analysis/RangePartition.java: http://gerrit.cloudera.org:8080/#/c/10441/1/fe/src/main/java/org/apache/impala/analysis/RangePartition.java@46 PS1, Line 46: Multi-value: : * PARTITION VALUE = (val1, val2, ..., valn) needs to updated? http://gerrit.cloudera.org:8080/#/c/10441/1/fe/src/main/java/org/apache/impala/analysis/RangePartition.java@81 PS1, Line 81: CREATE nit: and ALTER http://gerrit.cloudera.org:8080/#/c/10441/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test: http://gerrit.cloudera.org:8080/#/c/10441/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test@528 PS1, Line 528: '1' why was this change needed? http://gerrit.cloudera.org:8080/#/c/10441/1/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test: http://gerrit.cloudera.org:8080/#/c/10441/1/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test@575 PS1, Line 575: ('a', 0) <= values < ('b', 1)) just for my own info: what happens if the specified ranges overlap? what will happen to tuples that are inserted into the overlap? just wondering if analysis should attempt to ban such cases. -- To view, visit http://gerrit.cloudera.org:8080/10441 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0141dd3344a4f22b186f513b7406f286668ef1e7 Gerrit-Change-Number: 10441 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Jun 2018 17:35:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7128 (part 1) Refactor interfaces for Db, View, Table, Partition
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10611 ) Change subject: IMPALA-7128 (part 1) Refactor interfaces for Db, View, Table, Partition .. Patch Set 2: (14 comments) made an initial pass on what seems to be the non-mechanical parts of this change. http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@175 PS2, Line 175: SimpleCatalog nit: stale http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeCatalog.java File fe/src/main/java/org/apache/impala/catalog/FeCatalog.java: http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeCatalog.java@86 PS2, Line 86: HDFS nit: FS ... I realize that some of the existing classes were misnamed HDFS, which is too specific (could be local, s3, etc.). But since we're redo'ing this area, perhaps we should refer to FS instead of HDFS? http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeDb.java File fe/src/main/java/org/apache/impala/catalog/FeDb.java: http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeDb.java@27 PS2, Line 27: from the frontend nit: rephrase as "Frontend interface for interacting with a database." http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeDb.java@31 PS2, Line 31: @return Both styles exist (javadoc or not) in the code base, but would be preferable to be consistent per file (L39 method does not use @return, for example). http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeHdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/FeHdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeHdfsPartition.java@31 PS2, Line 31: HDFS FS http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeHdfsPartition.java@31 PS2, Line 31: between remove http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeHdfsPartition.java@34 PS2, Line 34: Hdfs I'd rename this to FS (couple comments about this) http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeHdfsPartition.java@61 PS2, Line 61: /** nit: add a newline. several other cases below. http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeHdfsTable.java File fe/src/main/java/org/apache/impala/catalog/FeHdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/FeHdfsTable.java@36 PS2, Line 36: Note: despite the "HDFS" nomenclature, this table type is also used for : * interacting with S3 or other Hadoop-compatible filesystems. yup! shall we adopt the more general, preferred naming now? http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@961 PS2, Line 961: Comparison nit: Comparator http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@964 PS2, Line 964: KeyValueComparator this change can introduce differences depending on how partitions are dealt with in various collections. why is this change needed in this patch? http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@563 PS2, Line 563: @Ov nit: add a line http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@581 PS2, Line 581: @O nit: add a line http://gerrit.cloudera.org:8080/#/c/10611/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2016 PS2, Line 2016: HdfsPartition.KV_COMPARATOR how did you find all places to make this type of update, Collections.sort or something broader? -- To view, visit http://gerrit.cloudera.org:8080/10611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id55f7d2e94d81e66ce720acb6315f15a89621b31 Gerrit-Change-Number: 10611 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 07 Jun 2018 17:00:35 +
[Impala-ASF-CR] IMPALA-6987: [DOCS] Refactor the INVALIDATE METADATA and REFRESH docs
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10339 ) Change subject: IMPALA-6987: [DOCS] Refactor the INVALIDATE METADATA and REFRESH docs .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2124e14900d0f82569c061cc46006447bb054b36 Gerrit-Change-Number: 10339 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Jun 2018 17:45:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7128 (part 2): add an interface for data sources
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10626 ) Change subject: IMPALA-7128 (part 2): add an interface for data sources .. Patch Set 2: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/10626/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10626/2//COMMIT_MSG@12 PS2, Line 12: IMPALA-7131 pls mark these with todo's that reference this jira. http://gerrit.cloudera.org:8080/#/c/10626/2/fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java File fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java: http://gerrit.cloudera.org:8080/#/c/10626/2/fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java@42 PS2, Line 42: Represents a table backed by an external data source. pull this part up into the interface. http://gerrit.cloudera.org:8080/#/c/10626/2/fe/src/main/java/org/apache/impala/catalog/FeDataSource.java File fe/src/main/java/org/apache/impala/catalog/FeDataSource.java: http://gerrit.cloudera.org:8080/#/c/10626/2/fe/src/main/java/org/apache/impala/catalog/FeDataSource.java@21 PS2, Line 21: * Interface for interacting with data sources from the frontend. nit: use consistent phrasing with the other interfaces "Frontend interface for interacting with data sources." -- To view, visit http://gerrit.cloudera.org:8080/10626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe704197dc2ad7c09b8340865f17567096aa630e Gerrit-Change-Number: 10626 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Jun 2018 18:01:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6929: Support multi-column range partitions for Kudu
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10441 ) Change subject: IMPALA-6929: Support multi-column range partitions for Kudu .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10441 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0141dd3344a4f22b186f513b7406f286668ef1e7 Gerrit-Change-Number: 10441 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Jun 2018 23:45:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6802 (part 4): Clean up authorization tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10442 ) Change subject: IMPALA-6802 (part 4): Clean up authorization tests .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/10442/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java: http://gerrit.cloudera.org:8080/#/c/10442/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@918 PS3, Line 918: @Test perhaps add a comment explaining what's being tested... from a first glance, it looks like certain privs can see certain fields, and others cannot? if there are docs or jiras that explain this more, pls reference them here. http://gerrit.cloudera.org:8080/#/c/10442/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@922 PS3, Line 922: consistent spacing (see comment below) http://gerrit.cloudera.org:8080/#/c/10442/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@938 PS3, Line 938: and several more places below http://gerrit.cloudera.org:8080/#/c/10442/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@946 PS3, Line 946: allExcept( : TPrivilegeLevel.ALL, TPrivilegeLevel.SELECT) factor out (repeated 3 times in this block, so it adds too much noise). http://gerrit.cloudera.org:8080/#/c/10442/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@952 PS3, Line 952: ng[]{"id"}, explain what is being tested here (missing why "id" is special) http://gerrit.cloudera.org:8080/#/c/10442/3/fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java@1181 PS3, Line 1181: use consistent spacing for this... from a brief look, seems like no space is the preference. -- To view, visit http://gerrit.cloudera.org:8080/10442 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic4cf3a4751b41908ef81ec35d89a2713d9fa0dc4 Gerrit-Change-Number: 10442 Gerrit-PatchSet: 3 Gerrit-Owner: Adam Holley Gerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 08 Jun 2018 22:02:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7140 (part 1). Support fetching schema info in LocalCatalog
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10630 ) Change subject: IMPALA-7140 (part 1). Support fetching schema info in LocalCatalog .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java File fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java: http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@37 PS4, Line 37: @param > why not? I agree with not slavish adherence to Javadoc when the parameters It was my understanding that flowing the param name in the description is preferable to using @param. I see that the project is inconsistent... some places use it, some don't, even in the same file. The several places where I've looked, it introduces duplication (e.g., Expr.castChild). Since we're far from being consistent, the @param doesn't seem to buy us much, so its terser to omit it. Open to alternatives though. http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java: http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@78 PS4, Line 78: schemaInfo; > the problem is I needed to use tblName on line 75, so I need to checkNotNul I think this looks fine: this.db_ = Preconditions.checkNotNull(db); this.name_ = Preconditions.checkNotNull(tblName); this.schemaInfo = Preconditions.checkNotNull(schemaInfo); Preconditions.checkArgument(tblName.toLowerCase().equals(tblName)); http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@190 PS4, Line 190: implementations > > Got confused about whether this is supposed to factor existing code thats fine, but pls leave a breadcrumb for where some of the lifting comes from. http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@191 PS4, Line 191: stats > does num bytes get cached in the HMS? I think I missed that yes, have a look at setTableStats in Table.java -- To view, visit http://gerrit.cloudera.org:8080/10630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I640f27e36198955e057da62a3ce25a858406e496 Gerrit-Change-Number: 10630 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 13 Jun 2018 00:09:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7140 (part 1). Support fetching schema info in LocalCatalog
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10630 ) Change subject: IMPALA-7140 (part 1). Support fetching schema info in LocalCatalog .. Patch Set 4: (12 comments) http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java File fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java: http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@37 PS4, Line 37: @param lets not use javadocs. http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@57 PS4, Line 57: full fully qualified? http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@77 PS4, Line 77: name fully qualified? http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java@80 PS4, Line 80: validateClusteringColumns based on HdfsTable::addColumnsFromFieldSchemas. pls add a todo for a potential refactor there (would be nice not to duplicate these types of exception strings). http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@119 PS4, Line 119: public todo: pull it up to FeFsTable or comment at site of use. http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java: http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@78 PS4, Line 78: schemaInfo; I prefer the alternative style you used for these: this.db_ = Preconditions.checkNotNull(db) http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@175 PS4, Line 175: HdfsTable Table? (so that the child class is not used) In either case, this is an odd dep. Seems like SchemaInfo can be the class that knows specifically about HMS, so the detail about what param to pull out would go there explicitly (then replace that method of Table) http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@175 PS4, Line 175: getRowCount I saw that -1 is used as a default/unknown value in catalog.Table... needed here as well? http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@190 PS4, Line 190: Table FeTable? http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@190 PS4, Line 190: implementations Got confused about whether this is supposed to factor existing code in Table. Looks like it partially does-- pls clarify whether the intent is to factor or leave the duplication. http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@191 PS4, Line 191: stats is num bytes needed as well? http://gerrit.cloudera.org:8080/#/c/10630/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@263 PS4, Line 263: private boolean isClusteringColumn(Column c) { duplicates the impl in Table, pls add a todo to refactor. -- To view, visit http://gerrit.cloudera.org:8080/10630 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I640f27e36198955e057da62a3ce25a858406e496 Gerrit-Change-Number: 10630 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 12 Jun 2018 23:26:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7128 (part 2): add an interface for data sources
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10626 ) Change subject: IMPALA-7128 (part 2): add an interface for data sources .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10626 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe704197dc2ad7c09b8340865f17567096aa630e Gerrit-Change-Number: 10626 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 12 Jun 2018 21:01:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7128 (part 1) Refactor interfaces for Db, View, Table, Partition
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10611 ) Change subject: IMPALA-7128 (part 1) Refactor interfaces for Db, View, Table, Partition .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10611 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id55f7d2e94d81e66ce720acb6315f15a89621b31 Gerrit-Change-Number: 10611 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 12 Jun 2018 21:02:32 + Gerrit-HasComments: No