[Impala-ASF-CR] IMPALA-4163: Add sortby() query hint
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4163: Add sortby() query hint .. Patch Set 3: (18 comments) http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: Line 370: nonterminal List opt_plan_hints, plan_hint_list; do you really need both? why not have a non-existing hint = empty hint list? Line 2397: | /* empty */ what's the point of this? http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: Line 127: // Contains the result exprs corresponding to the columns of the target table, that are what do you mean by result exprs? are these basetbl exprs? Line 774: private void analyzeSortByColumns(List columnNames) throws AnalysisException { use line breaks where it makes sense. also, describe the rules for sort-by columns somewhere. analyzeSortByHint is more general (and correct) Line 787: } else { you'd also take this branch for hbase http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/java/org/apache/impala/analysis/PlanHint.java File fe/src/main/java/org/apache/impala/analysis/PlanHint.java: Line 34: private final String hintName_; remove 'hint' from members, this class already contains 'hint' in its name. Line 62: public String toString() { toSql test missing http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java File fe/src/main/java/org/apache/impala/analysis/TableRef.java: Line 87: // ArrayList initialization each. yes, please do. http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: PS3, Line 511: clustered/noclustered plan :* hint and the sortby() "on the clustered/noclustered/sortby plan hint" is shorter Line 515:* will also sort by all columns specified in the sortby() hint. rewrite comment, instead of just tacking on http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/jflex/sql-scanner.flex File fe/src/main/jflex/sql-scanner.flex: Line 331: HintContent = [ ]* "+" [^\r\n*]* isn't [ ]* the same as " "*? Line 336: ContainsCommentEnd = [^]* "*/" [^]* what is [^]*? shouldn't 'anything' be simply ".*"? Line 342: TraditionalComment = "/*" !({HintContent}|{ContainsCommentEnd}) "*/" why do you need the ContainsCommentEnd here? wouldn't /* */ */ simply be a parse error? Line 434: yybegin(HINT); shouldn't this only apply to end-of-line hints? http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: Line 1776: "partition (year, month) %sshuffle,clustered,sortby(int_col,bool_col)%s " + mix up the order a bit http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: Line 216: ParserError("select 1 /*+ sortby(() */"); also check that this is illegal: /* +sortby()\n and -- +sortby() */ \n Line 477: "insert into t %sSoRtBy( a , , ,,, b )%s select * from t", prefix, suffix)); is the capitalization necessary? http://gerrit.cloudera.org:8080/#/c/5051/3/testdata/workloads/functional-planner/queries/PlannerTest/insert.test File testdata/workloads/functional-planner/queries/PlannerTest/insert.test: Line 758:/*+ clustered,sortby(int_col, bool_col) */ move clustered to end -- To view, visit http://gerrit.cloudera.org:8080/5051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I37a3ffab99aaa5d5a4fd1ac674b3e8b394a3c4c0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.
Alex Behm has posted comments on this change. Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that only run locally. .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4633: Change broken gflag default for Kudu client mem
Matthew Jacobs has uploaded a new patch set (#2). Change subject: IMPALA-4633: Change broken gflag default for Kudu client mem .. IMPALA-4633: Change broken gflag default for Kudu client mem We discovered that the current Kudu client defaults in the KuduTableSink are causing a large number of queries to timeout, failing the query. The current default value of the 'mutation buffer size' is 100MB which results in higher write throughput than Kudu can currently handle on large clusters. By decreasing the value of this flag, more RPCs will be sent for the same amount of data, i.e. throttling the load on Kudu. We found tests to be more successful on 200 nodes with a 10MB buffer size than the previous 100MB value where most queries couldn't complete due to timeouts. These queries were not timing out with the 10MB value. This appears to work well on 9 node stress tests as well. Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-table-sink.h 2 files changed, 5 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/5503/2 -- To view, visit http://gerrit.cloudera.org:8080/5503 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.
Alex Behm has posted comments on this change. Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that only run locally. .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Hello Jim Apple, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4715 to look at the new patch set (#14). Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. IMPALA-3200: Implement suballocator for splitting buffers This is useful for situations like hash tables, where we want to make multiple non-spillable allocations of variable size from buffer pool memory and not incur the overhead of interacting with the global buffer pool. The allocator subdivides buffers to service allocations and uses a buddy allocation algorithm to merge freed allocations into larger chunks. This helps avoid external fragmentation and is quite effective at reusing memory given the typical doubling allocation patterns of hash tables in partitioned aggs and joins. Testing: The allocator has fairly robust internal consistency checks via assertions and unique_ptrs. Includes a unit test that exercises various allocation patterns. I tested porting hash tables over to allocate memory using an earlier version of the suballocator, which worked well (was able to run a wide range of queries successfully with good performance). Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 --- M be/src/common/names.h M be/src/exec/hash-table.h M be/src/runtime/bufferpool/CMakeLists.txt A be/src/runtime/bufferpool/suballocator-test.cc A be/src/runtime/bufferpool/suballocator.cc A be/src/runtime/bufferpool/suballocator.h 6 files changed, 938 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/14 -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 14: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.
Hello Michael Brown, Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5446 to look at the new patch set (#4). Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that only run locally. .. IMPALA-4639: Add pytest option and xfail markers for tests that only run locally. As we're beginning to run Impala end-to-end tests on remote clusters, we're finding some tests that do not pass for infrastructure-related reasons (as opposed to product issues.) It would be useful to be able to xfail any tests that we know to be problematic within a given module, yet still run the others. This way, we can get passing test runs as we're ironing out those infrastructure issues. Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f --- M tests/conftest.py M tests/metadata/test_compute_stats.py M tests/query_test/test_mt_dop.py 3 files changed, 24 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/5446/4 -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3977: TransmitData() should not block
Lars Volker has posted comments on this change. Change subject: IMPALA-3977: TransmitData() should not block .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/5491/1/be/src/runtime/data-stream-sender.cc File be/src/runtime/data-stream-sender.cc: PS1, Line 226: FLAGS_datastream_sender_timeout_ms > rather than have the sleep period be a function of the total timeout, it ma Does it make sense to add some exponential back-off here, let's say up to a maximum value? That way, if the sender out-races the receiver we will try again sooner and bound delays. http://gerrit.cloudera.org:8080/#/c/5491/1/common/thrift/generate_error_codes.py File common/thrift/generate_error_codes.py: Line 233: ("DATASTREAM_SENDER_TIMEOUT", 72, "Sender timed out waiting for receiver fragment " Should we mark this as deprecated here? PS1, Line 314: DATASTREAM_RECEIVER_EAGAIN Would it make sense to name this more general, e.g. "RPC_TRY_AGAIN" so we can re-use it in similar situations? Line 315:"Datastream receiver is not yet ready"), nit: single line? -- To view, visit http://gerrit.cloudera.org:8080/5491 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I47f9d6c1de95a72ab73471dcd2a3118ef13e7db0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4630: Change broken gflag default for Kudu client mem
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4630: Change broken gflag default for Kudu client mem .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/5503/1//COMMIT_MSG Commit Message: Line 7: IMPALA-4630: Change broken gflag default for Kudu client mem > Wrong JIRA? Done Line 11: fail. The current mutation buffer size is 100MB, which is > can you qualify "fail"? did they time out? Done Line 15: successful on 200 node tests than the previous 100MB value. > Can you briefly describe the effect that this decrease has? For the same am Done -- To view, visit http://gerrit.cloudera.org:8080/5503 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4569: fuzz test fixes
Dan Hecht has posted comments on this change. Change subject: IMPALA-4569: fuzz test fixes .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5502/2//COMMIT_MSG Commit Message: PS2, Line 7: IMPALA-4569 wrong jira? -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4569: fuzz test fixes
Tim Armstrong has uploaded a new patch set (#2). Change subject: IMPALA-4569: fuzz test fixes .. IMPALA-4569: fuzz test fixes * Apply a 512m mem_limit to all fuzz tests. This limits aggregate memory consumption to ~5GB per daemon(assuming 10 concurrent tests). * Refactor the exec option handling to use the exec_option dimension. This avoids executing the test multiple times redundantly * Remove the xfails to reduce noise, since there is no immediate plan to fix the product bugs. Instead just pass the tests. Testing: Ran in a loop for ~1h to flush out flakiness. Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 41 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/5502/2 -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4631: don't use floating point operations for time unit conversions .. Patch Set 6: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/121/ -- To view, visit http://gerrit.cloudera.org:8080/5434 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4569: fuzz test fixes
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4569: fuzz test fixes .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5502/1/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: Line 59: 'disable_codegen' : cls.DISABLE_CODEGEN_VALUES, 'mem_limit' : cls.MEM_LIMITS})) > Should we add BATCH_SIZES here as well? I can see how you'd want to run sev That's a good point. Changed it to do that. -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4569: fuzz test fixes
Alex Behm has posted comments on this change. Change subject: IMPALA-4569: fuzz test fixes .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4569: fuzz test fixes
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4569: fuzz test fixes .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/122/ -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4633: Change broken gflag default for Kudu client mem
Dan Hecht has posted comments on this change. Change subject: IMPALA-4633: Change broken gflag default for Kudu client mem .. Patch Set 2: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/5503/2//COMMIT_MSG Commit Message: PS2, Line 11: The current default value of the : 'mutation buffer size' is 100MB which results in higher : write throughput than Kudu can currently handle on large : clusters. is there a kudu jira for making kudu tolerant of writers that are "too fast"? if so, can you mention it (or link the impala jira to it)? -- To view, visit http://gerrit.cloudera.org:8080/5503 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/4715/13/be/src/runtime/bufferpool/suballocator.h File be/src/runtime/bufferpool/suballocator.h: PS13, Line 173: Suballocation objects are owned by : /// either client code (if in use), the free list (if free) or its left child (if split). > I think this could use calling out even more explicitly - the client has to Tried making it more explicit. -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3977: TransmitData() should not block
Lars Volker has posted comments on this change. Change subject: IMPALA-3977: TransmitData() should not block .. Patch Set 1: I'll do the first round of reviews. -- To view, visit http://gerrit.cloudera.org:8080/5491 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I47f9d6c1de95a72ab73471dcd2a3118ef13e7db0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] Add all build targets to CMake and speed up builds
Tim Armstrong has posted comments on this change. Change subject: Add all build targets to CMake and speed up builds .. Patch Set 7: Code-Review+2 rebase -- To view, visit http://gerrit.cloudera.org:8080/4790 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Add all build targets to CMake and speed up builds
Impala Public Jenkins has posted comments on this change. Change subject: Add all build targets to CMake and speed up builds .. Patch Set 7: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/118/ -- To view, visit http://gerrit.cloudera.org:8080/4790 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4569: fuzz test fixes
Alex Behm has posted comments on this change. Change subject: IMPALA-4569: fuzz test fixes .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5502/1/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: Line 59: 'disable_codegen' : cls.DISABLE_CODEGEN_VALUES, 'mem_limit' : cls.MEM_LIMITS})) Should we add BATCH_SIZES here as well? I can see how you'd want to run several batch sizes on the same file, but the same would also apply to DISABLE_CODEGEN_VALUES. Not asking you to change anything, just curious what you think. -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4630: Change broken gflag default for Kudu client mem
Alex Behm has posted comments on this change. Change subject: IMPALA-4630: Change broken gflag default for Kudu client mem .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/5503/1//COMMIT_MSG Commit Message: Line 7: IMPALA-4630: Change broken gflag default for Kudu client mem Wrong JIRA? Line 11: fail. The current mutation buffer size is 100MB, which is can you qualify "fail"? did they time out? Line 15: successful on 200 node tests than the previous 100MB value. Can you briefly describe the effect that this decrease has? For the same amount of data will we do more RPCs? -- To view, visit http://gerrit.cloudera.org:8080/5503 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4640: Fix number of rows displayed by parquet-reader tool
Lars Volker has uploaded a new patch set (#2). Change subject: IMPALA-4640: Fix number of rows displayed by parquet-reader tool .. IMPALA-4640: Fix number of rows displayed by parquet-reader tool The variable just never got updated in the code. This change also adds verification that all columns contain the same number of rows. Change-Id: I281a784a0aa2df4ed1852dfb864587a0c1aa4d9a --- M be/src/util/parquet-reader.cc 1 file changed, 19 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/5453/2 -- To view, visit http://gerrit.cloudera.org:8080/5453 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I281a784a0aa2df4ed1852dfb864587a0c1aa4d9a Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-2605: Omit the sort and mini stress tests
Jim Apple has posted comments on this change. Change subject: IMPALA-2605: Omit the sort and mini stress tests .. Patch Set 2: > Maybe we should dump out the value of some of these metrics before > running the tests in tests/run-tests.py. That would help diagnose > in future. So you're suggesting we add logging for, say, impala.thrift-server.backend.connections-in-use, impala.thrift-server.beeswax-frontend.connections-in-use, and impala.thrift-server.hiveserver2-frontend.connections-in-use, abandon this patch, and see what we can diagnose the next time this happens? -- To view, visit http://gerrit.cloudera.org:8080/5401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibd30abf8215415e0f2830b725e43b005daa2bb2d Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2605: Omit the sort and mini stress tests
Tim Armstrong has posted comments on this change. Change subject: IMPALA-2605: Omit the sort and mini stress tests .. Patch Set 2: I guess I'm uneasy that we don't fully understand the cause of the problem. It seems like it's probably a test bug and I don't think we have a path to solving that. I looked again at the backtraces in the ticket and just see a bunch of threads waiting to read from the client connection. My other thought is that, even if we disable these tests, will the same problem just happen with the other enabled stress tests. Maybe we should dump out the value of some of these metrics before running the tests in tests/run-tests.py. That would help diagnose in future. -- To view, visit http://gerrit.cloudera.org:8080/5401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibd30abf8215415e0f2830b725e43b005daa2bb2d Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions
Dan Hecht has posted comments on this change. Change subject: IMPALA-4631: don't use floating point operations for time unit conversions .. Patch Set 6: Code-Review+2 Rebase -- To view, visit http://gerrit.cloudera.org:8080/5434 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3977: TransmitData() should not block
Henry Robinson has posted comments on this change. Change subject: IMPALA-3977: TransmitData() should not block .. Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/5491/1/be/src/runtime/data-stream-mgr.cc File be/src/runtime/data-stream-mgr.cc: PS1, Line 53: STREAM_EXPIRATION_TIME_MS Do you think we still need the stream cache? In the case where a receiver has completed before it receives an initial RPC, we should be able to rely on cancellation being detected on the sender side. The cost is that we might retry that initial RPC a couple of times before we receive cancellation from the coordinator. Can you think this through and work out whether there's a state where the stream cache is really worth the extra complexity? PS1, Line 99: if (acquire_lock) lock_.lock(); Can we just require now that the caller takes the lock? http://gerrit.cloudera.org:8080/#/c/5491/1/be/src/runtime/data-stream-sender.cc File be/src/runtime/data-stream-sender.cc: PS1, Line 209: TTransmitDataResult res; Let's move this inside the loop so it's freshly initialized on each iteration. That avoids any bugs where one RPC returns EAGAIN and sets a field in the result that - for some reason - a subsequent successful RPC does not overwrite. PS1, Line 217: DoTransmitDataRpc Since a row batch can potentially be many megabytes large, I wonder if it makes sense to have an initial 'probe' RPC that has no payload, but just confirms that the receiver is ready. Then we can limit the retry logic to that probe only, and if a subsequent RPC returns EAGAIN, we know that the receiver is closed and don't need to retry. Line 224: if (res.status.status_code != TErrorCode::DATASTREAM_RECEIVER_EAGAIN) break; Is cancellation detected during this loop? That's part of the point of moving control back to the sender on EAGAIN, so that cancellation can be detected. Otherwise this blocks from the sender's perspective in the same way that it did before. PS1, Line 225: VLOG_RPC << "Datastream receiver not yet ready. Retrying"; Not very informative without the fragment ID / destination address etc. PS1, Line 226: FLAGS_datastream_sender_timeout_ms rather than have the sleep period be a function of the total timeout, it makes more sense to have a fixed retry interval and exit this loop if the timeout has been exceeded. For example, if someone sets the timeout to ten minutes (because they're being ultra conservative on a heavy cluster), this will retry every two minutes. Line 228: I think it's a bit confusing to have this method return EAGAIN upon failure. Can you replace the status with timeout if it times-out of the loop? -- To view, visit http://gerrit.cloudera.org:8080/5491 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I47f9d6c1de95a72ab73471dcd2a3118ef13e7db0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4648: remove build thirdparty.sh
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4648: remove build_thirdparty.sh .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/116/ -- To view, visit http://gerrit.cloudera.org:8080/5477 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I36e11384877e5115430baf0170b31a508cb01ba3 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4648: remove build thirdparty.sh
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4648: remove build_thirdparty.sh .. Patch Set 2: Code-Review+2 rebase -- To view, visit http://gerrit.cloudera.org:8080/5477 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I36e11384877e5115430baf0170b31a508cb01ba3 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Add all build targets to CMake and speed up builds
Jim Apple has posted comments on this change. Change subject: Add all build targets to CMake and speed up builds .. Patch Set 7: > Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/120/ Sorry, that was me :-( -- To view, visit http://gerrit.cloudera.org:8080/4790 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 6: (7 comments) http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS6, Line 542: Expr::InlineConstants(AnyValUtil::ColumnTypeToTypeDesc(intermediate_type()), : AnyValUtil::ColumnTypesToTypeDescs(arg_types), codegen, *uda_fn); > I am not sure I understand the intention of this. This is preserving the pre-existing behaviour of the UDA interface, which treats the last (in-out) argument as the return type in the function context. Changing how the types are exposed could break UDAs that were written against the old interface. I updated the comment to hopefully make this clearer. I don't feel strongly about whether this is the right API (on one hand, it doesn't match the C++ function signature, on the other the intermediate type is logically the return type) but I don't think it's worth breaking compatibility for. The constants aren't inlined into the input expressions, only to the UDA function itself, so the second scenario isn't possible. http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/scalar-fn-call.cc File be/src/exprs/scalar-fn-call.cc: PS6, Line 327: !udf->isDeclaration() > May help to add a comment on why this can be a declaration only (i.e. no fu Done PS6, Line 328: InlineConstants(AnyValUtil::ColumnTypeToTypeDesc(type_), : AnyValUtil::ColumnTypesToTypeDescs(arg_types), codegen, udf); > Please see comments in AggFnEvaluator. It seems that we should be able to k That was the initial path I took but it's not possible because UDAs treat the final in/out argument as the return type. I could pass in an additional argument to determine which mode it should use if you prefer. PS6, Line 449: DCHECK(has_varargs || arg_types.size() == num_fixed_args); : DCHECK(!has_varargs || arg_types.size() > num_fixed_args); > DCHECK(arg_types.size() == num_fixed_args || (has_var_args && arg_types.si We actually want to reject the case of passing 0 varargs into a varargs function. The frontend currently rejects those cases and I added a comment to this function that specifically mentions this case. It would be reasonable to pass 0 args into varargs functions, but a bunch of code here doesn't handle that correctly - it assumes that having 0 varargs means that the function signature doesn't have a varargs argument. PS6, Line 478: codegen->void_type() : : CodegenAnyVal::GetLoweredType(codegen, *return_type); > nit: one line ? clang-format seems to prefer it this way. I don't feel strongly http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/scalar-fn-call.h File be/src/exprs/scalar-fn-call.h: PS6, Line 59: 'cache_entry' > May help to have a comment about what 'cache_entry' is. Something like the Done Line 60: /// updated to point to the library (or its use count is incremented if already loaded). > Please add a comment that the caller is expected to call FinalizeFunction() Done -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Add all build targets to CMake and speed up builds
Impala Public Jenkins has posted comments on this change. Change subject: Add all build targets to CMake and speed up builds .. Patch Set 7: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/120/ -- To view, visit http://gerrit.cloudera.org:8080/4790 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has uploaded a new patch set (#7). Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Testing: test_udfs.py tests UDAs with codegen enabled and disabled Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h 11 files changed, 163 insertions(+), 124 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/7 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.
Lars Volker has posted comments on this change. Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that only run locally. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/5446/3/tests/conftest.py File tests/conftest.py: Line 483: if "localhost" in pytest.config.option.impalad: > or "127.0.0.1" ? Any address starting with 127.x.x.x is the local machine I think. -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Add all build targets to CMake and speed up builds
Impala Public Jenkins has posted comments on this change. Change subject: Add all build targets to CMake and speed up builds .. Patch Set 6: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/117/ -- To view, visit http://gerrit.cloudera.org:8080/4790 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4569: fuzz test fixes
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/5502 Change subject: IMPALA-4569: fuzz test fixes .. IMPALA-4569: fuzz test fixes * Apply a 512m mem_limit to all fuzz tests. This limits aggregate memory consumption to ~5GB per daemon(assuming 10 concurrent tests). * Refactor the exec option handling to use the exec_option dimension. This avoids executing the test multiple times redundantly * Remove the xfails to reduce noise, since there is no immediate plan to fix the product bugs. Instead just pass the tests. Testing: Ran in a loop for ~1h to flush out flakiness. Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 23 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/5502/1 -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4630: Change broken gflag default for Kudu client mem
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/5503 Change subject: IMPALA-4630: Change broken gflag default for Kudu client mem .. IMPALA-4630: Change broken gflag default for Kudu client mem We discovered that the current kudu client defaults in the kudu table sink are causing a large number of queries to fail. The current mutation buffer size is 100MB, which is results in the higher write throughput than Kudu can currently handle on large clusters. By decreasing the default value of this flag, we found tests to be more successful on 200 node tests than the previous 100MB value. This value appeared to work well on 9 node stress tests as well. Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-table-sink.h 2 files changed, 4 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/5503/1 -- To view, visit http://gerrit.cloudera.org:8080/5503 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4630: Change broken gflag default for Kudu client mem
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4630: Change broken gflag default for Kudu client mem .. Patch Set 1: In addition to stress testing and the 200 node tests from Mostafa, this also passed a private test job. -- To view, visit http://gerrit.cloudera.org:8080/5503 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4659: fuzz test fixes
Hello Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5502 to look at the new patch set (#3). Change subject: IMPALA-4659: fuzz test fixes .. IMPALA-4659: fuzz test fixes * Apply a 512m mem_limit to all fuzz tests. This limits aggregate memory consumption to ~5GB per daemon(assuming 10 concurrent tests). * Refactor the exec option handling to use the exec_option dimension. This avoids executing the test multiple times redundantly * Remove the xfails to reduce noise, since there is no immediate plan to fix the product bugs. Instead just pass the tests. Testing: Ran in a loop for ~1h to flush out flakiness. Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 41 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/5502/3 -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/5505/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: PS1, Line 388: Preconditions.checkNotNull(value == null); Shouldn't this be checkNotNull(value)? -- To view, visit http://gerrit.cloudera.org:8080/5505 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates
Hello Dimitris Tsirogiannis, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5505 to look at the new patch set (#2). Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates .. IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates The KuduScanNode attempts to push IN list predicates to the Kudu scan, but NULL literals cannot be pushed. The code in KuduScanNode needed to check if the Literals in the InPredicate is a NullLiteral, in which case the entire IN list should not be pushed to Kudu. The same handling is already in place for binary predicate pushdown. Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b --- M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test 2 files changed, 20 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/5505/2 -- To view, visit http://gerrit.cloudera.org:8080/5505 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] CDH-48291: Fix flaky test TestRequestPoolService
Dimitris Tsirogiannis has uploaded a new change for review. http://gerrit.cloudera.org:8080/5507 Change subject: CDH-48291: Fix flaky test TestRequestPoolService .. CDH-48291: Fix flaky test TestRequestPoolService This commit fixes a flaky test caused by tight timeouts. The fix is to bump up the timeouts in this test. Change-Id: I71a036d5b1227508e2175808d7e92837ae247d99 --- M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/5507/1 -- To view, visit http://gerrit.cloudera.org:8080/5507 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I71a036d5b1227508e2175808d7e92837ae247d99 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis
[Impala-ASF-CR] Add all build targets to CMake and speed up builds
Impala Public Jenkins has submitted this change and it was merged. Change subject: Add all build targets to CMake and speed up builds .. Add all build targets to CMake and speed up builds Use CMake's dependency resolution always instead of serial execution of targets via shell scripts. This improves parallelism by building fe, be, and other targets at the same time and avoid some overhead from invoking "make" multiple times. This reduces the time taken for an incremental compilation of fe and be from 56s to 24s with this command: ./buildall.sh -debug -noclean -notests -skiptests -ninja Also use Impala-lzo's build script. This depends on the IMPALA-4277 fixes to the Impala-lzo build script. Log directory creation is also moved from impala-config.sh to buildall.sh. This means that impala-config.sh has no side-effects and can be run concurrently with no issues. Also make sure that "make" builds all the same artifacts as buildall.sh when run with no args. Testing: Ran a jenkins core job, also experimented locally. Ran a jenkins core job with distcc disabled - this exposed some concurrency bugs where impala-config.sh fails if run concurrently. Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26 Reviewed-on: http://gerrit.cloudera.org:8080/4790 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M CMakeLists.txt M be/src/benchmarks/CMakeLists.txt M be/src/service/CMakeLists.txt M bin/impala-config.sh M bin/make_impala.sh M buildall.sh M ext-data-source/CMakeLists.txt M fe/CMakeLists.txt 8 files changed, 85 insertions(+), 66 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4790 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Add all build targets to CMake and speed up builds
Impala Public Jenkins has posted comments on this change. Change subject: Add all build targets to CMake and speed up builds .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4790 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/5505 Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates .. IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates The KuduScanNode attempts to push IN list predicates to the Kudu scan, but NULL literals cannot be pushed. The code in KuduScanNode needed to check if the Literals in the InPredicate is a NullLiteral, in which case the entire IN list should not be pushed to Kudu. The same handling is already in place for binary predicate pushdown. Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b --- M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test 2 files changed, 15 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/5505/1 -- To view, visit http://gerrit.cloudera.org:8080/5505 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4467: Add support for DML statements in stress test .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/5093/4//COMMIT_MSG Commit Message: Line 24: following flag: --dml-mod-values 11 13 17. For each mod value 4 DML > I agree you can achieve the same thing with mod values in principle. Let me In order to calculate %rows, we don't actually need to know how many rows are in the table. %rows = 100 / mod_value. This is why I think it still makes sense to keep the mod_values. Another way to think of mod_value is if mod_value = N, then the DML statement should affect every Nth row. -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Taras Bobrovytsky has uploaded a new patch set (#7). Change subject: IMPALA-4467: Add support for DML statements in stress test .. IMPALA-4467: Add support for DML statements in stress test - Add support for insert, upsert, update and and delete statements. - Add support for compute stats with mt_dop query options. - Update impyla version in order to be able to have access to query error text for DML queries. - Made flake8 fixes. flake8 on this file is clean. For every Kudu table in the databases, we make a copy and add a '_original' suffix to the table name. The DML queries will only make modifications to the non original table, the original table will never be modified. The orignal tables could be used to bring the non-original table to the inital state. Two flags were added for doing this: --reset-databases-before-binary-search and --reset-databases-after-binary-search. The DML queries are generated based on the mod values passed in with the following flag: --dml-mod-values 11 13 17. For each mod value 4 DML queries are generated. The DML operations will touch table rows where primary_key % mod_value = 0. So, the larger the mod value, the more rows would be affected. The DML queries are generated in such a way that the data for the insert, upsert, and update queries is taken from the table with the _original suffix. The stress test generates DML queries for only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu --tpch-db=tpch_100 --generate-dml-queries would only generate queries for the tpch_100_kudu database. Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 --- M infra/python/deps/requirements.txt M tests/stress/concurrent_select.py M tests/util/parse_util.py 3 files changed, 689 insertions(+), 269 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/7 -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4659: fuzz test fixes
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4659: fuzz test fixes .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5502/2//COMMIT_MSG Commit Message: PS2, Line 7: IMPALA-4569 > wrong jira? Done -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4654: KuduScanner must return when ReachedLimit()
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4654: KuduScanner must return when ReachedLimit() .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5493 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaddd5a1b2647995d68e6d37d0500b3a322de Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions
Jim Apple has posted comments on this change. Change subject: IMPALA-4631: don't use floating point operations for time unit conversions .. Patch Set 6: Verified+1 http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/121/ -- To view, visit http://gerrit.cloudera.org:8080/5434 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/5505/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: PS1, Line 384: prediates > Typo: predicates Done http://gerrit.cloudera.org:8080/#/c/5505/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test: PS1, Line 319: select id from functional_kudu.alltypestiny where id in (1, null); > Is it necessary to add additional tests for other types? (string, bool, etc Sure -- To view, visit http://gerrit.cloudera.org:8080/5505 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Taras Bobrovytsky has uploaded a new patch set (#6). Change subject: IMPALA-4467: Add support for DML statements in stress test .. IMPALA-4467: Add support for DML statements in stress test - Add support for insert, upsert, update and and delete statements. - Add support for compute stats with mt_dop query options. - Update impyla version in order to be able to have access to query error text for DML queries. - Made flake8 fixes. flake8 on this file is clean. For every Kudu table in the databases, we make a copy and add a '_original' suffix to the table name. The DML queries will only make modifications to the non original table, the original table will never be modified. The orignal tables could be used to bring the non-original table to the inital state. Two flags were added for doing this: --reset-databases-before-binary-search and --reset-databases-after-binary-search. The DML queries are generated based on the mod values passed in with the following flag: --dml-mod-values 11 13 17. For each mod value 4 DML queries are generated. The DML operations will touch table rows where primary_key % mod_value = 0. So, the larger the mod value, the more rows would be affected. The DML queries are generated in such a way that the data for the insert, upsert, and update queries is taken from the table with the _original suffix. The stress test generates DML queries for only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu --tpch-db=tpch_100 --generate-dml-queries would only generate queries for the tpch_100_kudu database. Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 --- M infra/python/deps/requirements.txt M tests/stress/concurrent_select.py M tests/util/parse_util.py 3 files changed, 689 insertions(+), 269 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/6 -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5093 to look at the new patch set (#6). Change subject: IMPALA-4467: Add support for DML statements in stress test .. IMPALA-4467: Add support for DML statements in stress test - Add support for insert, upsert, update and and delete statements. - Add support for compute stats with mt_dop query options. - Update impyla version in order to be able to have access to query error text for DML queries. - Made flake8 fixes. flake8 on this file is clean. For every Kudu table in the databases, we make a copy and add a '_original' suffix to the table name. The DML queries will only make modifications to the non original table, the original table will never be modified. The orignal tables could be used to bring the non-original table to the inital state. Two flags were added for doing this: --reset-databases-before-binary-search and --reset-databases-after-binary-search. The DML queries are generated based on the mod values passed in with the following flag: --dml-mod-values 11 13 17. For each mod value 4 DML queries are generated. The DML operations will touch table rows where primary_key % mod_value = 0. So, the larger the mod value, the more rows would be affected. The DML queries are generated in such a way that the data for the insert, upsert, and update queries is taken from the table with the _original suffix. The stress test generates DML queries for only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu --tpch-db=tpch_100 --generate-dml-queries would only generate queries for the tpch_100_kudu database. Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 --- M infra/python/deps/requirements.txt M tests/stress/concurrent_select.py M tests/util/parse_util.py 3 files changed, 689 insertions(+), 269 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/6 -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4467: Add support for DML statements in stress test .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/5093/4/tests/stress/concurrent_select.py File tests/stress/concurrent_select.py: Line 1481: "UPDATE a SET {update_list} FROM {table_name} a JOIN {table_name}_original b " > To me it's not really about validating the results, but more about predicta Done. Skipping creation of DML queries for tables with more than 1 primary key column. -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4467: Add support for DML statements in stress test .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/5093/6/tests/stress/concurrent_select.py File tests/stress/concurrent_select.py: Line 1471: # TODO: IMPALA-: Add support for tables with multiple primary keys. > fill in real JIRA or just leave the TODO Yes, I'll fill in the Jira. Michael wants every TODO to have a Jira attached. -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4649: add a mechanism to pass flags into make
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4649: add a mechanism to pass flags into make .. Patch Set 2: Code-Review+1 Rebased, carry +1 -- To view, visit http://gerrit.cloudera.org:8080/5480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I17b13cbaf395f962762d5cff3d650ffb077934a4 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4648: remove build thirdparty.sh
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4648: remove build_thirdparty.sh .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5477 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I36e11384877e5115430baf0170b31a508cb01ba3 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Alex Behm has posted comments on this change. Change subject: IMPALA-4467: Add support for DML statements in stress test .. Patch Set 7: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/5093/4//COMMIT_MSG Commit Message: Line 24: following flag: --dml-mod-values 11 13 17. For each mod value 4 DML > In order to calculate %rows, we don't actually need to know how many rows a Ahh yes, you are right. Thanks for clarifying. Not sure what I was thinking. I was clearly wrong. -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates
Michael Brown has posted comments on this change. Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates .. Patch Set 2: Code-Review+1 Thanks for adding the additional test cases. My +1 means I'm satisfied that you addressed my comments, but obviously someone should take a look again at your FE changes. -- To view, visit http://gerrit.cloudera.org:8080/5505 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4659: fuzz test fixes
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4659: fuzz test fixes .. Patch Set 3: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4659: fuzz test fixes
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4659: fuzz test fixes .. IMPALA-4659: fuzz test fixes * Apply a 512m mem_limit to all fuzz tests. This limits aggregate memory consumption to ~5GB per daemon(assuming 10 concurrent tests). * Refactor the exec option handling to use the exec_option dimension. This avoids executing the test multiple times redundantly * Remove the xfails to reduce noise, since there is no immediate plan to fix the product bugs. Instead just pass the tests. Testing: Ran in a loop for ~1h to flush out flakiness. Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Reviewed-on: http://gerrit.cloudera.org:8080/5502 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M tests/query_test/test_scanners_fuzz.py 1 file changed, 41 insertions(+), 25 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4649: add a mechanism to pass flags into make
Hello Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5480 to look at the new patch set (#2). Change subject: IMPALA-4649: add a mechanism to pass flags into make .. IMPALA-4649: add a mechanism to pass flags into make Testing: Tested that buildall.sh works as expected. Built locally with IMPALA_MAKE_FLAGS unset to confirm I didn't break anything. Built locally with IMPALA_MAKE_FLAGS=--load-average=$IMPALA_BUILD_THREADS and looked at "ps auxf" output to confirm it's passed through. Change-Id: I17b13cbaf395f962762d5cff3d650ffb077934a4 --- M bin/impala-config.sh M bin/make_impala.sh M testdata/bin/copy-udfs-udas.sh 3 files changed, 8 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/5480/2 -- To view, visit http://gerrit.cloudera.org:8080/5480 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I17b13cbaf395f962762d5cff3d650ffb077934a4 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4659: fuzz test fixes
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4659: fuzz test fixes .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5502 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4654: KuduScanner must return when ReachedLimit()
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4654: KuduScanner must return when ReachedLimit() .. IMPALA-4654: KuduScanner must return when ReachedLimit() Fixes a bug in the KuduScanner where the scan node's limit was not respected and thus the scanner thread would continue executing until the scan range was fully consumed. This could result in completed queries leaving fragments running and those threads could be using significant CPU and memory. For example, the query 'select * from tpch_kudu.lineitem limit 90' when running in the minicluster and lineitem is partitioned into 3 hash partitions would end up leaving a scanner thread running for ~60 seconds. In real world scenarios this can cause unexpected resource consumption. This could build up over time leading to query failures if these queries are submitted frequently. The fix is to ensure KuduScanner::GetNext() returns with eos=true when it finds ReachedLimit=true. An unnecessary and somewhat confusing flag 'batch_done' was being returned by a helper function DecodeRowsIntoRowBatch, which isn't necessary and was removed in order to make it more clear how the code in GetNext() should behave. Change-Id: Iaddd5a1b2647995d68e6d37d0500b3a322de Reviewed-on: http://gerrit.cloudera.org:8080/5493 Reviewed-by: Alex BehmReviewed-by: Tim Armstrong Reviewed-by: Dan Hecht Tested-by: Internal Jenkins --- M be/src/exec/kudu-scanner.cc M be/src/exec/kudu-scanner.h M tests/query_test/test_kudu.py 3 files changed, 19 insertions(+), 28 deletions(-) Approvals: Internal Jenkins: Verified Alex Behm: Looks good to me, but someone else must approve Dan Hecht: Looks good to me, approved Tim Armstrong: Looks good to me, but someone else must approve -- To view, visit http://gerrit.cloudera.org:8080/5493 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Iaddd5a1b2647995d68e6d37d0500b3a322de Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-4648: remove build thirdparty.sh
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4648: remove build_thirdparty.sh .. IMPALA-4648: remove build_thirdparty.sh It is not needed by any build processes. Change-Id: I36e11384877e5115430baf0170b31a508cb01ba3 Reviewed-on: http://gerrit.cloudera.org:8080/5477 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- D bin/build_thirdparty.sh 1 file changed, 0 insertions(+), 244 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5477 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I36e11384877e5115430baf0170b31a508cb01ba3 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5505/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: PS1, Line 388: Preconditions.checkNotNull(value == null); > Shouldn't this be checkNotNull(value)? It depends, if you want this Precondition to do anything then yes :) Good catch. Fixing this too. -- To view, visit http://gerrit.cloudera.org:8080/5505 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates
Michael Brown has posted comments on this change. Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/5505/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: PS1, Line 384: prediates Typo: predicates http://gerrit.cloudera.org:8080/#/c/5505/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test: PS1, Line 319: select id from functional_kudu.alltypestiny where id in (1, null); Is it necessary to add additional tests for other types? (string, bool, etc.) -- To view, visit http://gerrit.cloudera.org:8080/5505 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Alex Behm has posted comments on this change. Change subject: IMPALA-4467: Add support for DML statements in stress test .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/5093/6/tests/stress/concurrent_select.py File tests/stress/concurrent_select.py: Line 1471: # TODO: IMPALA-: Add support for tables with multiple primary keys. fill in real JIRA or just leave the TODO -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Dan Hecht has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/4715/14/be/src/runtime/bufferpool/suballocator.h File be/src/runtime/bufferpool/suballocator.h: Line 104: static constexpr int64_t MAX_ALLOCATION_BYTES = 1L << LOG_MAX_ALLOCATION_BYTES; > this is really large (entire virtual address space on current x64 implement after reading the rest of the code I see it's used so to bound the depth of the tree and number of free lists. Maybe comment thats what it's for. -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Jim Apple has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/4715/14/be/src/runtime/bufferpool/suballocator.cc File be/src/runtime/bufferpool/suballocator.cc: Line 137: return Status::OK(); Now that I read one of Dan's comments, I wonder if this should be an error returned here. Otherwise, it looks like this method never returns an error and could just have a void return type. -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that only run locally. .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that only run locally. .. IMPALA-4639: Add pytest option and xfail markers for tests that only run locally. As we're beginning to run Impala end-to-end tests on remote clusters, we're finding some tests that do not pass for infrastructure-related reasons (as opposed to product issues.) It would be useful to be able to xfail any tests that we know to be problematic within a given module, yet still run the others. This way, we can get passing test runs as we're ironing out those infrastructure issues. Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Reviewed-on: http://gerrit.cloudera.org:8080/5446 Reviewed-by: Alex BehmTested-by: Internal Jenkins --- M tests/conftest.py M tests/metadata/test_compute_stats.py M tests/query_test/test_mt_dop.py 3 files changed, 24 insertions(+), 0 deletions(-) Approvals: Internal Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4633: Change broken gflag default for Kudu client mem
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4633: Change broken gflag default for Kudu client mem .. IMPALA-4633: Change broken gflag default for Kudu client mem We discovered that the current Kudu client defaults in the KuduTableSink are causing a large number of queries to timeout, failing the query. The current default value of the 'mutation buffer size' is 100MB which results in higher write throughput than Kudu can currently handle on large clusters. By decreasing the value of this flag, more RPCs will be sent for the same amount of data, i.e. throttling the load on Kudu. We found tests to be more successful on 200 nodes with a 10MB buffer size than the previous 100MB value where most queries couldn't complete due to timeouts. These queries were not timing out with the 10MB value. This appears to work well on 9 node stress tests as well. Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd Reviewed-on: http://gerrit.cloudera.org:8080/5503 Reviewed-by: Dan HechtTested-by: Internal Jenkins --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-table-sink.h 2 files changed, 5 insertions(+), 4 deletions(-) Approvals: Internal Jenkins: Verified Dan Hecht: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5503 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 19: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/123/ -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 19: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Hello Jim Apple, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4715 to look at the new patch set (#18). Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. IMPALA-3200: Implement suballocator for splitting buffers This is useful for situations like hash tables, where we want to make multiple non-spillable allocations of variable size from buffer pool memory and not incur the overhead of interacting with the global buffer pool. The allocator subdivides buffers to service allocations and uses a buddy allocation algorithm to merge freed allocations into larger chunks. This helps avoid external fragmentation and is quite effective at reusing memory given the typical doubling allocation patterns of hash tables in partitioned aggs and joins. Testing: The allocator has fairly robust internal consistency checks via assertions and unique_ptrs. Includes a unit test that exercises various allocation patterns. I tested porting hash tables over to allocate memory using an earlier version of the suballocator, which worked well (was able to run a wide range of queries successfully with good performance). Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 --- M be/src/common/names.h M be/src/exec/hash-table.h M be/src/runtime/bufferpool/CMakeLists.txt A be/src/runtime/bufferpool/suballocator-test.cc A be/src/runtime/bufferpool/suballocator.cc A be/src/runtime/bufferpool/suballocator.h 6 files changed, 857 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/18 -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 15: (3 comments) http://gerrit.cloudera.org:8080/#/c/4715/14/be/src/runtime/bufferpool/suballocator.cc File be/src/runtime/bufferpool/suballocator.cc: Line 107: > But once we switch to using NEVER, will we still need WHEN_NEEDED? I pulled it out of this patch. I'll keep the code around, would be easy to add back in in future. http://gerrit.cloudera.org:8080/#/c/4715/15/be/src/runtime/bufferpool/suballocator.cc File be/src/runtime/bufferpool/suballocator.cc: Line 50: } > missing space Done http://gerrit.cloudera.org:8080/#/c/4715/15/be/src/runtime/bufferpool/suballocator.h File be/src/runtime/bufferpool/suballocator.h: PS15, Line 128: bein > missing space Done -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Dan Hecht has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/4715/15/be/src/runtime/bufferpool/suballocator.cc File be/src/runtime/bufferpool/suballocator.cc: Line 50: } missing space -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4633: Change broken gflag default for Kudu client mem
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4633: Change broken gflag default for Kudu client mem .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5503 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3202,IMPALA-2298: rework scratch file I/O
Tim Armstrong has uploaded a new patch set (#15). Change subject: IMPALA-3202,IMPALA-2298: rework scratch file I/O .. IMPALA-3202,IMPALA-2298: rework scratch file I/O Refactor BufferedBlockMgr/TmpFileMgr to push more I/O logic into TmpFileMgr, in anticipation of it being shared with BufferPool. TmpFileMgr now handles: * Scratch space allocation and recycling * Read and write I/O The interface is also greatly changed so that it is built around Write() and Read() calls, abstracting away the details of temporary file allocation from clients. This means the TmpFileMgr::File class can be hidden from clients. Write error recovery: Also implement write error recovery in TmpFileMgr. If an error occurs while writing to scratch and we have multiple scratch directories, we will try one of the other directories before cancelling the query. File-level blacklisting is used to prevent excessive repeated attempts to resize a scratch file during a single query. Device-level blacklisting is still disabled because it is problematic to permanently take a scratch directory out of use. To reduce the number of error paths, all I/O errors are now handled asynchronously. Previously errors creating or extending the file were returned synchronously from WriteUnpinnedBlock(). This required modifying DiskIoMgr to create the file if not present when opened. Future Work: * Support for recycling variable-length scratch file ranges. I omitted this to avoid making the patch even large. Testing: Updated BufferedBlockMgr unit test to reflect changes in behaviour: * Scratch space is no longer permanently associated with a block, and is remapped every time a new block is written to disk . * Files are now blacklisted - updated existing tests and enable the disable blacklisting test. Added some basic testing of recycling of scratch file ranges in the TmpFileMgr unit test. I also manually tested the code in two ways. First by removing permissions for /tmp/impala-scratch and ensuring that a spilling query fails cleanly. Second, by creating a tiny ramdisk (16M) and running with two scratch directories: one on /tmp and one on the tiny ramdisk. When spilling, an out of space error is encountered for the tiny ramdisk and impala spills the remaining data (72M) to /tmp. Change-Id: I8c9c587df006d2f09d72dd636adafbd295fcdc17 --- M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-block-mgr.cc M be/src/runtime/buffered-block-mgr.h M be/src/runtime/disk-io-mgr-test.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M be/src/runtime/query-state.cc A be/src/runtime/tmp-file-mgr-internal.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h A be/src/util/buffer-ref.h M be/src/util/filesystem-util.cc M be/src/util/filesystem-util.h M common/thrift/ImpalaInternalService.thrift 15 files changed, 1,151 insertions(+), 506 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/5141/15 -- To view, visit http://gerrit.cloudera.org:8080/5141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c9c587df006d2f09d72dd636adafbd295fcdc17 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Hello Michael Brown, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5093 to look at the new patch set (#8). Change subject: IMPALA-4467: Add support for DML statements in stress test .. IMPALA-4467: Add support for DML statements in stress test - Add support for insert, upsert, update and and delete statements. - Add support for compute stats with mt_dop query options. - Update impyla version in order to be able to have access to query error text for DML queries. - Made flake8 fixes. flake8 on this file is clean. For every Kudu table in the databases, we make a copy and add a '_original' suffix to the table name. The DML queries will only make modifications to the non original table, the original table will never be modified. The orignal tables could be used to bring the non-original table to the inital state. Two flags were added for doing this: --reset-databases-before-binary-search and --reset-databases-after-binary-search. The DML queries are generated based on the mod values passed in with the following flag: --dml-mod-values 11 13 17. For each mod value 4 DML queries are generated. The DML operations will touch table rows where primary_key % mod_value = 0. So, the larger the mod value, the more rows would be affected. The DML queries are generated in such a way that the data for the insert, upsert, and update queries is taken from the table with the _original suffix. The stress test generates DML queries for only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu --tpch-db=tpch_100 --generate-dml-queries would only generate queries for the tpch_100_kudu database. Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 --- M infra/python/deps/requirements.txt M tests/stress/concurrent_select.py M tests/util/parse_util.py 3 files changed, 691 insertions(+), 269 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/8 -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Hello Jim Apple, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4715 to look at the new patch set (#17). Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. IMPALA-3200: Implement suballocator for splitting buffers This is useful for situations like hash tables, where we want to make multiple non-spillable allocations of variable size from buffer pool memory and not incur the overhead of interacting with the global buffer pool. The allocator subdivides buffers to service allocations and uses a buddy allocation algorithm to merge freed allocations into larger chunks. This helps avoid external fragmentation and is quite effective at reusing memory given the typical doubling allocation patterns of hash tables in partitioned aggs and joins. Testing: The allocator has fairly robust internal consistency checks via assertions and unique_ptrs. Includes a unit test that exercises various allocation patterns. I tested porting hash tables over to allocate memory using an earlier version of the suballocator, which worked well (was able to run a wide range of queries successfully with good performance). Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 --- M be/src/common/names.h M be/src/exec/hash-table.h M be/src/runtime/bufferpool/CMakeLists.txt A be/src/runtime/bufferpool/suballocator-test.cc A be/src/runtime/bufferpool/suballocator.cc A be/src/runtime/bufferpool/suballocator.h 6 files changed, 856 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/17 -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 14: (9 comments) http://gerrit.cloudera.org:8080/#/c/4715/14/be/src/runtime/bufferpool/suballocator.cc File be/src/runtime/bufferpool/suballocator.cc: Line 68: > extraneous blank Done Line 80: DCHECK(free_node != nullptr); > it looks like this dcheck can fire if SplitToSize() fails to allocate a nod Good point - SplitToSize() should return an error in that case. Fixed. Line 107: } > why do we need both policies? WHEN_NEEDED will be convenient for porting the existing exec nodes, and I think we'll need NEVER in future when exec nodes want to have better control over their reservations. If we want to remove NEVER for the time being that also works. Line 137: return Status::OK(); > Now that I read one of Dan's comments, I wonder if this should be an error Yep this was a mistake. Line 216: *ptr_from_prev = move(node->next_free_); > should this reset result->prev_free_ so we don't have a dangling reference, Good point - Done http://gerrit.cloudera.org:8080/#/c/4715/14/be/src/runtime/bufferpool/suballocator.h File be/src/runtime/bufferpool/suballocator.h: Line 31: /// allocation algorithm optimised for power-of-two allocations. Uses a binary buddy > does it require that the buffer pool buffers themselves are power of two? w Expanded to be clearer about how min_buffer_len plays into this and what is a power of two. We require that we can allocate power-of-two buffers from the buffer pool. Line 104: static constexpr int64_t MAX_ALLOCATION_BYTES = 1L << LOG_MAX_ALLOCATION_BYTES; > after reading the rest of the code I see it's used so to bound the depth of Done PS14, Line 125: 'node' should already be free and not in : /// any free list > what does this mean given that free nodes are in the free list? do you mean It's really referring to node->in_use so updated to reflect. PS14, Line 159: support > supported? Done -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Hello Jim Apple, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4715 to look at the new patch set (#15). Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. IMPALA-3200: Implement suballocator for splitting buffers This is useful for situations like hash tables, where we want to make multiple non-spillable allocations of variable size from buffer pool memory and not incur the overhead of interacting with the global buffer pool. The allocator subdivides buffers to service allocations and uses a buddy allocation algorithm to merge freed allocations into larger chunks. This helps avoid external fragmentation and is quite effective at reusing memory given the typical doubling allocation patterns of hash tables in partitioned aggs and joins. Testing: The allocator has fairly robust internal consistency checks via assertions and unique_ptrs. Includes a unit test that exercises various allocation patterns. I tested porting hash tables over to allocate memory using an earlier version of the suballocator, which worked well (was able to run a wide range of queries successfully with good performance). Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 --- M be/src/common/names.h M be/src/exec/hash-table.h M be/src/runtime/bufferpool/CMakeLists.txt A be/src/runtime/bufferpool/suballocator-test.cc A be/src/runtime/bufferpool/suballocator.cc A be/src/runtime/bufferpool/suballocator.h 6 files changed, 941 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/15 -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: Implement suballocator for splitting buffers .. Patch Set 15: Code-Review+1 carry +1 -- To view, visit http://gerrit.cloudera.org:8080/4715 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test
Taras Bobrovytsky has uploaded a new patch set (#8). Change subject: IMPALA-4467: Add support for DML statements in stress test .. IMPALA-4467: Add support for DML statements in stress test - Add support for insert, upsert, update and and delete statements. - Add support for compute stats with mt_dop query options. - Update impyla version in order to be able to have access to query error text for DML queries. - Made flake8 fixes. flake8 on this file is clean. For every Kudu table in the databases, we make a copy and add a '_original' suffix to the table name. The DML queries will only make modifications to the non original table, the original table will never be modified. The orignal tables could be used to bring the non-original table to the inital state. Two flags were added for doing this: --reset-databases-before-binary-search and --reset-databases-after-binary-search. The DML queries are generated based on the mod values passed in with the following flag: --dml-mod-values 11 13 17. For each mod value 4 DML queries are generated. The DML operations will touch table rows where primary_key % mod_value = 0. So, the larger the mod value, the more rows would be affected. The DML queries are generated in such a way that the data for the insert, upsert, and update queries is taken from the table with the _original suffix. The stress test generates DML queries for only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu --tpch-db=tpch_100 --generate-dml-queries would only generate queries for the tpch_100_kudu database. Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 --- M infra/python/deps/requirements.txt M tests/stress/concurrent_select.py M tests/util/parse_util.py 3 files changed, 691 insertions(+), 269 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/8 -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4654: KuduScanner must return when ReachedLimit()
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4654: KuduScanner must return when ReachedLimit() .. Patch Set 2: > Change looks good to me. > > As discussed, I think there is still one remaining bug with plans > like scan+topn where the coordinator may not necessarily cancel > fragments containing scans. In that case Close() is called on the > plan tree during tear down, but unlike other scan nodes the Kudu > scan node does not use done_ to terminate the scanner threads. For > non-Kudu scan nodes setting done_ triggers teardown of all scanner > threads. The scanner threads detect this indirectly via > ScannerContext::cancelled() which is checked on a per-row-batch > basis. I can't repro this yet (topn doesn't do it) so I'm hesitant to make the change in this CR, but I agree we should do it separately. It's worth digging into the topn case and seeing why it doesn't repro, but I can't spend too much time on that right now. I'll make a separate change to check done_ and that way we don't have to take it into a release branch until it has more bake time / time to find a repro. This does deserve more attention but given I can't find an obvious bug I can't work on it right now. > > Todd has a good point. Our limit check is broken. Agree to deal > with that everywhere in s separate change. Filed https://issues.cloudera.org/browse/IMPALA-4658 -- To view, visit http://gerrit.cloudera.org:8080/5493 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaddd5a1b2647995d68e6d37d0500b3a322de Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No