[Impala-ASF-CR] IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14592 ) Change subject: IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala. .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5203/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id03d8d4d41a2ac1b15e7060e2a013e334d044ee7 Gerrit-Change-Number: 14592 Gerrit-PatchSet: 12 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 12 Nov 2019 19:14:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14635 )

Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
..

Patch Set 3: Code-Review+1

(1 comment)

I think this change makes sense. I'm willing to bump this to +2. Take a look at Tim's comment about mentioning ABFS perf.

http://gerrit.cloudera.org:8080/#/c/14635/1/be/src/runtime/io/hdfs-file-reader.cc
File be/src/runtime/io/hdfs-file-reader.cc:

http://gerrit.cloudera.org:8080/#/c/14635/1/be/src/runtime/io/hdfs-file-reader.cc@225
PS1, Line 225: if (hdfsPreadFully(
: hdfs_fs_, hdfs_file, position_in_file, buffer, bytes_to_read) == -1) {
> Yeah, that is a good observation. Looking through the code you are right. F
Thinking about it another way: If someone is overwriting files and they overwrite with something larger, then I think we would not read the whole file. That would cause its own set of problems. So, I'm thinking that this is undefined behavior and this behavior is not something users should be relying on.

--
To view, visit http://gerrit.cloudera.org:8080/14635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb
Gerrit-Change-Number: 14635
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Sahil Takiar
Gerrit-Reviewer: Tim Armstrong
Gerrit-Comment-Date: Tue, 12 Nov 2019 21:06:23 +
Gerrit-HasComments: Yes
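For readers following the IMPALA-8525 thread, the difference between a plain positional read and a "read fully" call can be sketched as below. This is only an illustration under stated assumptions: `PreadFn`, `PreadFully`, and `MakeChunkedReader` are hypothetical names that stand in for the libhdfs calls, and the in-memory reader is a test double, not Impala code. The sketch treats an early end of file as an error, which is the situation discussed above when a file changes size underneath a reader.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>

// Hypothetical positional-read callback (not the libhdfs signature): copies up
// to `len` bytes starting at `offset` into `buf` and returns the number of
// bytes copied, 0 at end of file, or -1 on error.
using PreadFn = std::function<int64_t(int64_t offset, char* buf, int64_t len)>;

// Keeps calling `pread` until exactly `len` bytes have arrived.
// Returns 0 on success and -1 on error or if the file ends early.
int PreadFully(const PreadFn& pread, int64_t offset, char* buf, int64_t len) {
  int64_t done = 0;
  while (done < len) {
    const int64_t n = pread(offset + done, buf + done, len - done);
    if (n <= 0) return -1;  // error, or end of file before `len` bytes
    done += n;
  }
  return 0;
}

// Test double: an in-memory "file" that never returns more than `max_chunk`
// bytes per call, which is how a short read shows up to the caller.
PreadFn MakeChunkedReader(const std::string& data, int64_t max_chunk) {
  return [data, max_chunk](int64_t offset, char* buf, int64_t len) -> int64_t {
    const int64_t size = static_cast<int64_t>(data.size());
    if (offset < 0 || len < 0) return -1;
    if (offset >= size) return 0;  // end of file
    const int64_t n = std::min({len, max_chunk, size - offset});
    std::memcpy(buf, data.data() + offset, static_cast<std::size_t>(n));
    return n;
  };
}
```

A caller that used the raw callback once would have to handle the short read itself; wrapping it in a loop moves that handling into one place, at the cost of turning "file shorter than expected" into a hard failure.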
[Impala-ASF-CR] IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14697 )

Change subject: IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster
..

IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster

testdata/bin/kill-hbase.sh currently uses the generic kill-java-service.sh script to kill the region servers, then the master, and then the zookeeper. Recent versions of HBase become unusable after performing this type of shutdown. The master seems to get stuck trying to recover, even after restarting the minicluster. The root cause in HBase is unclear, but HBase provides the stop-hbase.sh script, which does a more graceful shutdown. This switches testdata/bin/kill-hbase.sh to use this script, which avoids the recovery problems.

Testing:
Ran the test-with-docker.py tests (which does a minicluster restart). Before the change, the HBase tests timed out due to HBase getting stuck recovering. After the change, tests ran normally.

Change-Id: I67283f9098c73c849023af8bfa7af62308bf3ed3
---
M testdata/bin/kill-hbase.sh
1 file changed, 4 insertions(+), 2 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/14697/1

--
To view, visit http://gerrit.cloudera.org:8080/14697
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I67283f9098c73c849023af8bfa7af62308bf3ed3
Gerrit-Change-Number: 14697
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell
[Impala-ASF-CR] IMPALA-9128: part 1: log on slow data stream RPCs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14662 ) Change subject: IMPALA-9128: part 1: log on slow data stream RPCs .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5202/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14662 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I258ac91b9fbbdbc86d0e8091c34f511f8957c4cd Gerrit-Change-Number: 14662 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 12 Nov 2019 19:05:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9128: part 1: log on slow data stream RPCs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14662 ) Change subject: IMPALA-9128: part 1: log on slow data stream RPCs .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14662 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I258ac91b9fbbdbc86d0e8091c34f511f8957c4cd Gerrit-Change-Number: 14662 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 12 Nov 2019 19:05:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8855: [DOCS] Document the generic VALUES clause
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/14661 ) Change subject: IMPALA-8855: [DOCS] Document the generic VALUES clause .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/14661/2/docs/topics/impala_values.xml File docs/topics/impala_values.xml: http://gerrit.cloudera.org:8080/#/c/14661/2/docs/topics/impala_values.xml@58 PS2, Line 58: The corresponding columns must have the same data > They can be different as long as they can be cast between each other. E.g. Done -- To view, visit http://gerrit.cloudera.org:8080/14661 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2568450993323236535a8f1d022dee7d09ecf62b Gerrit-Change-Number: 14661 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 19:45:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9128: part 2: dump traces for slow RPCs
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14668 ) Change subject: IMPALA-9128: part 2: dump traces for slow RPCs .. Patch Set 5: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/14668/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14668/5//COMMIT_MSG@38 PS5, Line 38: * basic perf testing Is tracing on by default? I assume the perf impact is negligible? -- To view, visit http://gerrit.cloudera.org:8080/14668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947 Gerrit-Change-Number: 14668 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 12 Nov 2019 20:02:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8855: [DOCS] Document the generic VALUES clause
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14661 ) Change subject: IMPALA-8855: [DOCS] Document the generic VALUES clause .. Patch Set 3: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/531/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/14661 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2568450993323236535a8f1d022dee7d09ecf62b Gerrit-Change-Number: 14661 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 20:09:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9128: part 1: log on slow data stream RPCs
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14662 )

Change subject: IMPALA-9128: part 1: log on slow data stream RPCs
..

Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14662/4/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/14662/4/be/src/runtime/krpc-data-stream-sender.cc@374
PS4, Line 374: if (IsSlowRpc(elapsed_time_ns)) {
> Are there rpcs that this will log for that we wouldn't have noticed were sl
I don't think so. I was thinking that it might be useful to directly track the amount of time spent waiting, since if this triggers it proves that it blocked the sender thread (in theory a slow RPC might not block the sender if the sender is running very slowly). But I guess that's probably not super-useful, since a slow RPC is likely to be a symptom of something regardless - I can remove it if you think it's not worth the extra noise.

--
To view, visit http://gerrit.cloudera.org:8080/14662
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I258ac91b9fbbdbc86d0e8091c34f511f8957c4cd
Gerrit-Change-Number: 14662
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Thomas Tauber-Marshall
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Tue, 12 Nov 2019 18:56:56 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9128: part 1: log on slow data stream RPCs
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14662 )

Change subject: IMPALA-9128: part 1: log on slow data stream RPCs
..

Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14662/4/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/14662/4/be/src/runtime/krpc-data-stream-sender.cc@374
PS4, Line 374: if (IsSlowRpc(elapsed_time_ns)) {
> I don't think so, I was thinking that it might be useful to directly track
Got it. I think it's probably fine - if we're triggering this logging, something is probably going on that needs to be addressed anyway, so a little extra noise in the logs isn't necessarily a bad thing.

--
To view, visit http://gerrit.cloudera.org:8080/14662
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I258ac91b9fbbdbc86d0e8091c34f511f8957c4cd
Gerrit-Change-Number: 14662
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Thomas Tauber-Marshall
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Tue, 12 Nov 2019 19:01:08 +
Gerrit-HasComments: Yes
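The check under discussion in this thread, `IsSlowRpc(elapsed_time_ns)`, is only quoted by name; its body does not appear in these messages. A minimal sketch of what such a check could look like follows, assuming a millisecond threshold in the spirit of the --impala_slow_rpc_threshold_ms flag mentioned in the part 2 commit message. The two-argument signature, the strict `>` comparison, and the `MaybeDescribeSlowRpc` helper are assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>

constexpr int64_t kNanosPerMilli = 1000 * 1000;

// Hypothetical version of the check: an RPC counts as slow once its elapsed
// time exceeds the threshold. The threshold is passed in explicitly here
// instead of being read from a process-wide flag.
bool IsSlowRpc(int64_t elapsed_time_ns, int64_t threshold_ms) {
  return elapsed_time_ns > threshold_ms * kNanosPerMilli;
}

// Hypothetical helper: returns the text a caller might log for a slow RPC, or
// an empty string when the RPC finished within the threshold.
std::string MaybeDescribeSlowRpc(
    const std::string& rpc_name, int64_t elapsed_time_ns, int64_t threshold_ms) {
  if (!IsSlowRpc(elapsed_time_ns, threshold_ms)) return "";
  return rpc_name + " took " + std::to_string(elapsed_time_ns / kNanosPerMilli) + "ms";
}
```

Keeping the predicate separate from the logging makes it cheap to call from more than one place, which is the trade-off the reviewers weigh above: a second call site adds some log noise but also shows when the sender thread itself was blocked.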
[Impala-ASF-CR] IMPALA-8855: [DOCS] Document the generic VALUES clause
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14661 to look at the new patch set (#3). Change subject: IMPALA-8855: [DOCS] Document the generic VALUES clause .. IMPALA-8855: [DOCS] Document the generic VALUES clause - Added a paragraph on implicit conversion in impala_datatypes.xml Change-Id: I2568450993323236535a8f1d022dee7d09ecf62b --- M docs/impala.ditamap M docs/topics/impala_datatypes.xml M docs/topics/impala_insert.xml M docs/topics/impala_langref_unsupported.xml A docs/topics/impala_values.xml 5 files changed, 156 insertions(+), 85 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/14661/3 -- To view, visit http://gerrit.cloudera.org:8080/14661 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2568450993323236535a8f1d022dee7d09ecf62b Gerrit-Change-Number: 14661 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8855: [DOCS] Document the generic VALUES clause
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14661 ) Change subject: IMPALA-8855: [DOCS] Document the generic VALUES clause .. Patch Set 3: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/531/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/14661 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2568450993323236535a8f1d022dee7d09ecf62b Gerrit-Change-Number: 14661 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 19:47:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14697 ) Change subject: IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5008/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67283f9098c73c849023af8bfa7af62308bf3ed3 Gerrit-Change-Number: 14697 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 12 Nov 2019 21:20:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9127: explicit probe state machine in hash join
Tim Armstrong has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/14688 )

Change subject: IMPALA-9127: explicit probe state machine in hash join
..

IMPALA-9127: explicit probe state machine in hash join

This refactors the main loop in PartitionedHashJoinNode::GetNext() to use an explicit state machine, rather than the hard-to-follow implicit state machine previously used. A new state variable 'probe_state_' is used to drive the loop, with DCHECKs enforcing invariants of other member variables.

I deliberately tried to minimise changes to other functions (including any attempts to factor logic out of GetNext()) to minimise the scope of this patch.

The new logic is mostly equivalent to the old logic, although there may be a different number of trips through the loop because of the way the cascading checks in the old version worked.

A few notable changes:
* DoneProbing() is consistently called when probing is finished, including in cases, like probing a single spilled partition, where it wasn't previously.
* The repeated AtCapacity() checks are consolidated into a single check that happens at the end of the loop. Resources attached to batches should still be flushed at the appropriate points, since each previous "if (out_batch->AtCapacity()) break;" corresponds to a new loop iteration in the new code.
* OutputNullAwareNullProbe() and OutputNullAwareProbeRows() now explicitly signal when they are done using an output argument, instead of implicitly via AtCapacity(), which is incredibly error-prone.

Testing:
We have adequate coverage for different join modes, including with spilling.
* Ran exhaustive tests.
* Ran a single node stress test with TPC-H and TPC-DS
* Ran a single node stress test with a debug action to force spilling:
  --impalad_args="-default_query_options=DEBUG_ACTION=-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5"

Change-Id: I32ebdf0054d2ce4562b851439e300323601fb064
---
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
3 files changed, 329 insertions(+), 201 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/14688/5

--
To view, visit http://gerrit.cloudera.org:8080/14688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I32ebdf0054d2ce4562b851439e300323601fb064
Gerrit-Change-Number: 14688
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong
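The shape of the refactoring described in the IMPALA-9127 commit message (one state variable driving the loop, cleanup that runs exactly once, and a single capacity check at the end of each iteration) can be illustrated with a toy loop. This is a sketch under heavy simplification, not the code from the patch: `ToyProbe`, its state names, and the "every probe row matches" behaviour are all hypothetical, and the output capacity is assumed to be greater than zero.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical states for the toy loop; the real node tracks more than this.
enum class ProbeState { NEED_BATCH, PROBING_BATCH, DONE_PROBING, EOS };

class ToyProbe {
 public:
  ToyProbe(std::vector<std::vector<int>> batches, std::size_t out_capacity)
    : batches_(std::move(batches)), out_capacity_(out_capacity) {}

  // Fills `out` with at most `out_capacity_` rows and returns true once the
  // end of the stream has been reached.
  bool GetNext(std::vector<int>* out) {
    out->clear();
    while (state_ != ProbeState::EOS) {
      switch (state_) {
        case ProbeState::NEED_BATCH:
          if (next_batch_ == batches_.size()) {
            state_ = ProbeState::DONE_PROBING;
          } else {
            row_ = 0;
            state_ = ProbeState::PROBING_BATCH;
          }
          break;
        case ProbeState::PROBING_BATCH: {
          // Toy "probe": every input row is copied to the output.
          const std::vector<int>& batch = batches_[next_batch_];
          while (row_ < batch.size() && out->size() < out_capacity_) {
            out->push_back(batch[row_++]);
          }
          if (row_ == batch.size()) {
            ++next_batch_;
            state_ = ProbeState::NEED_BATCH;
          }
          break;
        }
        case ProbeState::DONE_PROBING:
          ++done_probing_calls_;  // cleanup happens on exactly one transition
          state_ = ProbeState::EOS;
          break;
        case ProbeState::EOS:
          break;
      }
      // The only capacity check: evaluated once at the end of every iteration.
      if (out->size() >= out_capacity_) break;
    }
    return state_ == ProbeState::EOS;
  }

  int done_probing_calls() const { return done_probing_calls_; }

 private:
  std::vector<std::vector<int>> batches_;
  std::size_t out_capacity_;
  ProbeState state_ = ProbeState::NEED_BATCH;
  std::size_t next_batch_ = 0;
  std::size_t row_ = 0;
  int done_probing_calls_ = 0;
};
```

Because each case does one step and then falls through to the shared capacity check, a full output batch always ends the call at a well-defined state, and the next call resumes from that state rather than from a chain of cascading conditions.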
[Impala-ASF-CR] IMPALA-9082: make WebserverTest error checking stricter
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14672 ) Change subject: IMPALA-9082: make WebserverTest error checking stricter .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I820336271cf25130538ceae2eed10a72a73d2adc Gerrit-Change-Number: 14672 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 17:17:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9128: part 2: dump traces for slow RPCs
Hello Thomas Tauber-Marshall, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14668

to look at the new patch set (#4).

Change subject: IMPALA-9128: part 2: dump traces for slow RPCs
..

IMPALA-9128: part 2: dump traces for slow RPCs

This adds trace events for data stream RPCs and dumps them when they take longer than --impala_slow_rpc_threshold_ms.

I needed to modify the KRPC code to do this because it currently only dumps traces for RPCs with deadlines. I plan to add some version of this upstream in Kudu so that we don't diverge our KRPC implementation.

Example output from test_exchange_small_buffer:

I 08:38:53.732910 26509 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:42434 (request call id 43) took 7799ms. Request Metrics: {}
I 08:38:53.732928 26509 rpcz_store.cc:269] Trace:
08:38:45.933412 (+ 0us) impala-service-pool.cc:167] Inserting onto call queue
08:38:45.933449 (+37us) impala-service-pool.cc:254] Handling call
08:38:45.933470 (+21us) krpc-data-stream-mgr.cc:227] Added early sender
08:38:47.906542 (+1973072us) krpc-data-stream-recvr.cc:327] Enqueuing deferred RPC
08:38:53.732858 (+5826316us) krpc-data-stream-recvr.cc:506] Processing deferred RPC
08:38:53.732860 (+ 2us) krpc-data-stream-recvr.cc:399] Deserializing batch
08:38:53.732888 (+28us) krpc-data-stream-recvr.cc:426] Enqueuing deserialized batch
08:38:53.732895 (+ 7us) inbound_call.cc:162] Queueing success response

Testing:
* Ran exhaustive and ASAN tests

TODO:
* stress testing
* basic perf testing

Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947
---
M .clang-tidy
M be/src/kudu/rpc/rpcz_store.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M tests/custom_cluster/test_exchange_deferred_batches.py
M tests/custom_cluster/test_exchange_delays.py
6 files changed, 43 insertions(+), 10 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/14668/4

--
To view, visit http://gerrit.cloudera.org:8080/14668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947
Gerrit-Change-Number: 14668
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Thomas Tauber-Marshall
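The example output in the IMPALA-9128 part 2 commit message lists each trace event with the time elapsed since the previous one, and the trace is only dumped when the whole call was slow. A minimal sketch of that idea follows. It does not use the KRPC tracing API: `ToyTrace`, `TraceEvent`, and `DumpIfSlow` are hypothetical names, timestamps are plain microsecond offsets supplied by the caller, and the 2000 ms threshold used in the usage note is an illustrative value, not a default taken from the source.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// One recorded step of an RPC, stamped with a microsecond offset.
struct TraceEvent {
  int64_t time_us;
  std::string message;
};

// Hypothetical trace buffer: collects events as the RPC progresses and only
// renders them if the whole call turned out to be slow.
class ToyTrace {
 public:
  void Add(int64_t time_us, std::string message) {
    events_.push_back({time_us, std::move(message)});
  }

  // Time between the first and the last recorded event.
  int64_t ElapsedUs() const {
    return events_.empty() ? 0 : events_.back().time_us - events_.front().time_us;
  }

  // Returns one line per event, each prefixed with the delta since the
  // previous event, or an empty vector if the call stayed under the threshold.
  std::vector<std::string> DumpIfSlow(int64_t threshold_ms) const {
    std::vector<std::string> lines;
    if (events_.empty() || ElapsedUs() <= threshold_ms * 1000) return lines;
    int64_t prev = events_.front().time_us;
    for (const TraceEvent& e : events_) {
      lines.push_back("(+" + std::to_string(e.time_us - prev) + "us) " + e.message);
      prev = e.time_us;
    }
    return lines;
  }

 private:
  std::vector<TraceEvent> events_;
};
```

Feeding in the offsets implied by the example output (0, 37, 1973130 and 7799446 microseconds) with a 2000 ms threshold produces no output after the third event and a full dump after the fourth, whose last line carries the +5826316us gap that dominates the 7799 ms call.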
[Impala-ASF-CR] IMPALA-8138: Reintroduce rpc debugging options
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14641 )

Change subject: IMPALA-8138: Reintroduce rpc debugging options
..

Patch Set 3:

(3 comments)

A few more comments. Planning to take another look later today.

http://gerrit.cloudera.org:8080/#/c/14641/2/be/src/util/debug-util.cc
File be/src/util/debug-util.cc:

http://gerrit.cloudera.org:8080/#/c/14641/2/be/src/util/debug-util.cc@395
PS2, Line 395: }
: string error_msg = tokens.size() == 3 ?
: tokens[2] :
> I'm not sure what you mean. Its a property of this particular debug action
Yeah, I didn't see the check for "iequals(cmd, "FAIL")" above. However, I think the pattern in Impala is to document the return type for each method, so it would be nice to document that for this method as well.

http://gerrit.cloudera.org:8080/#/c/14641/2/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/14641/2/common/thrift/ImpalaService.thrift@92
PS2, Line 92: ::...::@@@..."
> Sure, since the debug action in this patch is being passed in as a command
Yeah, maybe passing in a path to a file that contains JSON is the right way to do it. Just a thought really. I think another way to make it easier to understand how to use DEBUG_ACTIONS would be to include some examples. Can you add a few? At least a few relevant to the changes you are making.

http://gerrit.cloudera.org:8080/#/c/14641/3/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/14641/3/common/thrift/ImpalaService.thrift@101
PS3, Line 101: and error should mention that the DebugAction might change code paths depending on the value of error - e.g. in impala-service-pool.cc QueueInboundCall, it either calls FailAndReleaseRpc or RejectTooBusy depending on the value of error. Right now, it sounds like it just returns a Status(error), and that's it.

--
To view, visit http://gerrit.cloudera.org:8080/14641
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c047ebce6d32c5ae461f70279391fa2df4c2029
Gerrit-Change-Number: 14641
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Tauber-Marshall
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Sahil Takiar
Gerrit-Reviewer: Thomas Tauber-Marshall
Gerrit-Comment-Date: Tue, 12 Nov 2019 17:31:00 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9127: explicit probe state machine in hash join
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14688 ) Change subject: IMPALA-9127: explicit probe state machine in hash join .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5006/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ebdf0054d2ce4562b851439e300323601fb064 Gerrit-Change-Number: 14688 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 12 Nov 2019 17:43:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9128: part 2: dump traces for slow RPCs
Tim Armstrong has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/14668 )

Change subject: IMPALA-9128: part 2: dump traces for slow RPCs
..

IMPALA-9128: part 2: dump traces for slow RPCs

This adds trace events for data stream RPCs and dumps them when they take longer than --impala_slow_rpc_threshold_ms.

I needed to modify the KRPC code to do this because it currently only dumps traces for RPCs with deadlines. I plan to add some version of this upstream in Kudu so that we don't diverge our KRPC implementation.

Example output from test_exchange_small_buffer:

I 08:38:53.732910 26509 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:42434 (request call id 43) took 7799ms. Request Metrics: {}
I 08:38:53.732928 26509 rpcz_store.cc:269] Trace:
08:38:45.933412 (+ 0us) impala-service-pool.cc:167] Inserting onto call queue
08:38:45.933449 (+37us) impala-service-pool.cc:254] Handling call
08:38:45.933470 (+21us) krpc-data-stream-mgr.cc:227] Added early sender
08:38:47.906542 (+1973072us) krpc-data-stream-recvr.cc:327] Enqueuing deferred RPC
08:38:53.732858 (+5826316us) krpc-data-stream-recvr.cc:506] Processing deferred RPC
08:38:53.732860 (+ 2us) krpc-data-stream-recvr.cc:399] Deserializing batch
08:38:53.732888 (+28us) krpc-data-stream-recvr.cc:426] Enqueuing deserialized batch
08:38:53.732895 (+ 7us) inbound_call.cc:162] Queueing success response

Testing:
* Ran exhaustive and ASAN tests

TODO:
* stress testing
* basic perf testing

Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947
---
M be/src/kudu/rpc/rpcz_store.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M tests/custom_cluster/test_exchange_deferred_batches.py
M tests/custom_cluster/test_exchange_delays.py
5 files changed, 42 insertions(+), 10 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/14668/3

--
To view, visit http://gerrit.cloudera.org:8080/14668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947
Gerrit-Change-Number: 14668
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-9090 Add name of table being scanned in HDFS scan node profile
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14660 ) Change subject: IMPALA-9090 Add name of table being scanned in HDFS scan node profile .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/14660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99 Gerrit-Change-Number: 14660 Gerrit-PatchSet: 1 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 17:14:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9090 Add name of table being scanned in HDFS scan node profile
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14660 ) Change subject: IMPALA-9090 Add name of table being scanned in HDFS scan node profile .. Patch Set 1: I probably should have asked in the JIRA to include it for all scan nodes (Kudu, HBase, etc). I can start the merge as-is, since this is a good improvement, but I wouldn't mind if you looked at what it would take to add for the other node types. -- To view, visit http://gerrit.cloudera.org:8080/14660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If5da1112bcf38ae55b89eccfd7c7fad860819a99 Gerrit-Change-Number: 14660 Gerrit-PatchSet: 1 Gerrit-Owner: Xiaomeng Zhang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 17:14:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9128: part 2: dump traces for slow RPCs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14668 ) Change subject: IMPALA-9128: part 2: dump traces for slow RPCs .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5007/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947 Gerrit-Change-Number: 14668 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 12 Nov 2019 18:22:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9128: part 1: log on slow data stream RPCs
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14662 )

Change subject: IMPALA-9128: part 1: log on slow data stream RPCs
..

Patch Set 4: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/runtime/krpc-data-stream-sender.h
File be/src/runtime/krpc-data-stream-sender.h:

http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/runtime/krpc-data-stream-sender.h@225
PS1, Line 225: RuntimeProfile::SummaryStatsCounter* recvr_time_stats_ = nullptr;
> I think you're right on that one - it looks like the timer starts when the
Looks like you've decided to tackle this by leaving 'receiver_latency_ns' alone and doing the tracing in a follow-up patch? That works for me.

http://gerrit.cloudera.org:8080/#/c/14662/4/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/14662/4/be/src/runtime/krpc-data-stream-sender.cc@374
PS4, Line 374: if (IsSlowRpc(elapsed_time_ns)) {
Are there RPCs that this will log for that we wouldn't have noticed were slow in ...CompleteCB?

--
To view, visit http://gerrit.cloudera.org:8080/14662
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I258ac91b9fbbdbc86d0e8091c34f511f8957c4cd
Gerrit-Change-Number: 14662
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Thomas Tauber-Marshall
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Tue, 12 Nov 2019 18:48:02 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9128: part 2: dump traces for slow RPCs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14668 ) Change subject: IMPALA-9128: part 2: dump traces for slow RPCs .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/5005/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947 Gerrit-Change-Number: 14668 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 12 Nov 2019 17:13:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8855: [DOCS] Document the generic VALUES clause
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14661 ) Change subject: IMPALA-8855: [DOCS] Document the generic VALUES clause .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/14661/2/docs/topics/impala_values.xml File docs/topics/impala_values.xml: http://gerrit.cloudera.org:8080/#/c/14661/2/docs/topics/impala_values.xml@58 PS2, Line 58: The corresponding columns must have the same data They can be different as long as they can be cast between each other. E.g. here's a query with mixed string and timestamp.
[localhost:21000] default> values (cast('2019-01-01' as timestamp)), ('2019-02-02');
Query: values (cast('2019-01-01' as timestamp)), ('2019-02-02')
+---------------------------------+
| cast('2019-01-01' as timestamp) |
+---------------------------------+
| 2019-01-01 00:00:00             |
| 2019-02-02 00:00:00             |
+---------------------------------+
Fetched 2 row(s) in 0.11s
Maybe reword to "must have compatible data types in all rows", or similar. I didn't see anywhere in the docs where we actually discuss what types can be implicitly cast to other types, but maybe I missed it. -- To view, visit http://gerrit.cloudera.org:8080/14661 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2568450993323236535a8f1d022dee7d09ecf62b Gerrit-Change-Number: 14661 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 17:22:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9128: part 2: dump traces for slow RPCs
Hello Thomas Tauber-Marshall, Lars Volker, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14668 to look at the new patch set (#5). Change subject: IMPALA-9128: part 2: dump traces for slow RPCs .. IMPALA-9128: part 2: dump traces for slow RPCs This adds trace events for data stream RPCs and dumps them when they take longer than --impala_slow_rpc_threshold_ms. I needed to modify the KRPC code to do this because it currently only dumps traces for RPCs with deadlines. I plan to add some version of this upstream in Kudu so that we don't diverge our KRPC implementation. Example output from test_exchange_small_buffer: I 08:38:53.732910 26509 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:42434 (request call id 43) took 7799ms. Request Metrics: {} I 08:38:53.732928 26509 rpcz_store.cc:269] Trace: 08:38:45.933412 (+ 0us) impala-service-pool.cc:167] Inserting onto call queue 08:38:45.933449 (+37us) impala-service-pool.cc:254] Handling call 08:38:45.933470 (+21us) krpc-data-stream-mgr.cc:227] Added early sender 08:38:47.906542 (+1973072us) krpc-data-stream-recvr.cc:327] Enqueuing deferred RPC 08:38:53.732858 (+5826316us) krpc-data-stream-recvr.cc:506] Processing deferred RPC 08:38:53.732860 (+ 2us) krpc-data-stream-recvr.cc:399] Deserializing batch 08:38:53.732888 (+28us) krpc-data-stream-recvr.cc:426] Enqueuing deserialized batch 08:38:53.732895 (+ 7us) inbound_call.cc:162] Queueing success response Disabled +-clang-diagnostic-gnu-zero-variadic-macro-arguments because it had false positives on the TRACE_TO invocations. 
Testing: * Ran exhaustive and ASAN tests TODO: * stress testing * basic perf testing Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947 --- M .clang-tidy M be/src/kudu/rpc/rpcz_store.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-recvr.cc M tests/custom_cluster/test_exchange_deferred_batches.py M tests/custom_cluster/test_exchange_delays.py 6 files changed, 43 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/14668/5 -- To view, visit http://gerrit.cloudera.org:8080/14668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947 Gerrit-Change-Number: 14668 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall
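The trace in the commit message above prints, for each event, the time elapsed since the previous event. A minimal Python sketch of how such a dump could be scanned for the slowest step follows. It is not part of the patch: the helper names `parse_trace` and `slowest_step` and the regular expression are assumptions made for illustration, and only the trace lines themselves come from the message.

```python
import re

# Trace lines copied from the example output in the commit message.
TRACE = """\
08:38:45.933412 (+ 0us) impala-service-pool.cc:167] Inserting onto call queue
08:38:45.933449 (+37us) impala-service-pool.cc:254] Handling call
08:38:45.933470 (+21us) krpc-data-stream-mgr.cc:227] Added early sender
08:38:47.906542 (+1973072us) krpc-data-stream-recvr.cc:327] Enqueuing deferred RPC
08:38:53.732858 (+5826316us) krpc-data-stream-recvr.cc:506] Processing deferred RPC
08:38:53.732860 (+ 2us) krpc-data-stream-recvr.cc:399] Deserializing batch
08:38:53.732888 (+28us) krpc-data-stream-recvr.cc:426] Enqueuing deserialized batch
08:38:53.732895 (+ 7us) inbound_call.cc:162] Queueing success response
"""

# Assumed line shape: "<time> (+<delta>us) <file>:<line>] <message>".
LINE_RE = re.compile(r"^\S+ \(\+\s*(\d+)us\) (\S+)\] (.*)$")


def parse_trace(text):
    """Return a list of (delta_us, source_location, message) tuples."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            events.append((int(match.group(1)), match.group(2), match.group(3)))
    return events


def slowest_step(events):
    """Return the event that was preceded by the longest gap."""
    return max(events, key=lambda event: event[0])


events = parse_trace(TRACE)
delta_us, location, message = slowest_step(events)
total_us = sum(event[0] for event in events)
print(f"longest gap: {delta_us}us before '{message}' ({location})")
print(f"sum of gaps: {total_us}us")
```

In this example the longest gap precedes "Processing deferred RPC", which suggests most of the time was spent waiting in the deferred queue, and the gaps sum to roughly the 7799ms reported for the call.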
[Impala-ASF-CR] IMPALA-9082: make WebserverTest error checking stricter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14672 ) Change subject: IMPALA-9082: make WebserverTest error checking stricter .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I820336271cf25130538ceae2eed10a72a73d2adc Gerrit-Change-Number: 14672 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 17:46:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9082: make WebserverTest error checking stricter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14672 ) Change subject: IMPALA-9082: make WebserverTest error checking stricter .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5201/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I820336271cf25130538ceae2eed10a72a73d2adc Gerrit-Change-Number: 14672 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 17:46:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9127: explicit probe state machine in hash join
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14688 ) Change subject: IMPALA-9127: explicit probe state machine in hash join .. Patch Set 5: This depends on https://gerrit.cloudera.org/#/c/14632/4 -- To view, visit http://gerrit.cloudera.org:8080/14688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ebdf0054d2ce4562b851439e300323601fb064 Gerrit-Change-Number: 14688 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 17:46:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14635 ) Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/14635/2/be/src/runtime/io/hdfs-file-reader.cc File be/src/runtime/io/hdfs-file-reader.cc: http://gerrit.cloudera.org:8080/#/c/14635/2/be/src/runtime/io/hdfs-file-reader.cc@224 PS2, Line 224: if (FLAGS_use_hdfs_pread || IsS3APath(scan_range_->file_string()->c_str())) { > Oddly enough, none of this makes a significant difference for ABFS. I plan If you did perf testing for ABFS (even if it was basic/ad-hoc), can you mention it in the commit message? This is really useful info. -- To view, visit http://gerrit.cloudera.org:8080/14635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Gerrit-Change-Number: 14635 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 16:05:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14635 ) Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/14635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Gerrit-Change-Number: 14635 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 16:06:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8205: Support number of true and false statistics for boolean column
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14666 ) Change subject: IMPALA-8205: Support number of true and false statistics for boolean column .. Patch Set 1: (8 comments) Thanks for fixing so many tests! I did a first round of review and this patch makes sense to me. Will look into it more deeply later. http://gerrit.cloudera.org:8080/#/c/14666/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14666/1//COMMIT_MSG@9 PS1, Line 9: This change compute the real number of true and false statistics information for boolean columns; Before this, impala used to set numTrues and numFalses to hardcoded -1 to indicate that its statistics is missing; Please wrap the commit message at 72 characters per line and replace ";" with "." http://gerrit.cloudera.org:8080/#/c/14666/1/be/src/exec/catalog-op-executor.cc File be/src/exec/catalog-op-executor.cc: http://gerrit.cloudera.org:8080/#/c/14666/1/be/src/exec/catalog-op-executor.cc@286 PS1, Line 286: , nit: . http://gerrit.cloudera.org:8080/#/c/14666/1/be/src/exec/incr-stats-util.cc File be/src/exec/incr-stats-util.cc: http://gerrit.cloudera.org:8080/#/c/14666/1/be/src/exec/incr-stats-util.cc@164 PS1, Line 164: num_trues num_new_trues? http://gerrit.cloudera.org:8080/#/c/14666/1/be/src/exec/incr-stats-util.cc@165 PS1, Line 165: num_falses num_new_falses? http://gerrit.cloudera.org:8080/#/c/14666/1/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/14666/1/common/thrift/CatalogObjects.thrift@172 PS1, Line 172: required Should this be optional? It's only set for boolean columns. http://gerrit.cloudera.org:8080/#/c/14666/1/common/thrift/CatalogObjects.thrift@198 PS1, Line 198: required optional too? 
http://gerrit.cloudera.org:8080/#/c/14666/1/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java File fe/src/main/java/org/apache/impala/catalog/ColumnStats.java: http://gerrit.cloudera.org:8080/#/c/14666/1/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@512 PS1, Line 512: numTrues nit: numTrues_ http://gerrit.cloudera.org:8080/#/c/14666/1/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@513 PS1, Line 513: numFalses nit: numFalses_ -- To view, visit http://gerrit.cloudera.org:8080/14666 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I991bee8e7fdc644d908289f5fe2ee8032cc2c431 Gerrit-Change-Number: 14666 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward <583424...@qq.com> Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 12 Nov 2019 23:32:10 + Gerrit-HasComments: Yes
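For readers unfamiliar with the statistic under review, the following Python sketch shows one way true/false counts for a boolean column could be computed, with -1 standing in for "statistics missing" as the quoted commit message describes. The function name `bool_column_stats`, the use of `None` for NULL values and for "stats not computed", and the choice to leave NULLs out of both counts are assumptions for illustration, not the patch's implementation.

```python
MISSING = -1  # sentinel for missing statistics, per the quoted commit message


def bool_column_stats(values):
    """Return (num_trues, num_falses) for a column of True/False/None values.

    Assumption: `values is None` models a column whose stats were never
    computed, and None entries (NULLs) are counted in neither total.
    """
    if values is None:
        return MISSING, MISSING
    num_trues = sum(1 for value in values if value is True)
    num_falses = sum(1 for value in values if value is False)
    return num_trues, num_falses


print(bool_column_stats([True, False, None, True]))
print(bool_column_stats(None))
```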
[Impala-ASF-CR] IMPALA-9128: part 1: log on slow data stream RPCs
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14662 ) Change subject: IMPALA-9128: part 1: log on slow data stream RPCs .. IMPALA-9128: part 1: log on slow data stream RPCs Allows modifying the threshold for KRPC's server-side slow RPC logging (which is enabled for all KRPCs). Added additional logging for data stream RPCs TransmitData and EndDataStream, and for slow waits that delay the query. Adds statistics for RPC time to provide some clues if there are slow data stream RPCs. I tested this with a low threshold and delays added: start-impala-cluster.py \ --impalad_args=--impala_slow_rpc_threshold_ms=1 \ --impalad_args=--debug_actions=END_DATA_STREAM_DELAY:JITTER@3000@1.0 Example Profile output: - NetworkThroughput: (Avg: 102.98 MB/sec ; Min: 5.58 MB/sec ; Max: 171.79 MB/sec ; Number of samples: 296) - RpcNetworkTime: (Avg: 13.468ms ; Min: 91.309us ; Max: 2s395ms ; Number of samples: 299) - RpcRecvrTime: (Avg: 13.406ms ; Min: 83.160us ; Max: 2s395ms ; Number of samples: 299) Example log output (with log threshold of 1ms): I1107 14:33:50.487251 24933 krpc-data-stream-sender.cc:363] ad4fa70619170ace:b58b2eba0006] Long delay waiting for RPC to 127.0.1.1:27000 (fragment_instance_id=ad4fa70619170ace:b58b2eba): took 451.036ms I1107 14:33:51.295518 21361 rpcz_store.cc:265] Call impala.DataStreamService.EndDataStream from 127.0.0.1:43952 (request call id 82) took 1259ms. Request Metrics: {} I1107 14:33:44.843204 21332 krpc-data-stream-sender.cc:342] Slow TransmitData RPC to 127.0.1.1:27000 (fragment_instance_id=ad4fa70619170ace:b58b2eba0006): took 2.194ms. Receiver time: 457.902us Network time: 1.736ms I1107 14:33:45.139068 21333 krpc-data-stream-sender.cc:342] Slow EndDataStream RPC to 127.0.1.1:27001 (fragment_instance_id=ad4fa70619170ace:b58b2eba0004): took 61.340ms. 
Receiver time: 81.908us Network time: 61.259ms Change-Id: I258ac91b9fbbdbc86d0e8091c34f511f8957c4cd Reviewed-on: http://gerrit.cloudera.org:8080/14662 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/common/global-flags.cc M be/src/rpc/rpc-mgr.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h 4 files changed, 87 insertions(+), 4 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/14662 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I258ac91b9fbbdbc86d0e8091c34f511f8957c4cd Gerrit-Change-Number: 14662 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon
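The threshold check and the time breakdown in the log lines above can be sketched as follows. This is a hedged illustration in Python, not the patch's C++ code: the comparison being strictly greater-than and the network time being "total minus receiver-reported time" are assumptions, the latter inferred only from the numbers in the example TransmitData log line (2.194ms total, 457.902us receiver, 1.736ms network).

```python
NANOS_PER_MILLI = 1_000_000


def is_slow_rpc(elapsed_ns, threshold_ms):
    """Assumed check: an RPC is slow when it exceeds the threshold."""
    return elapsed_ns > threshold_ms * NANOS_PER_MILLI


def network_time_ns(total_ns, receiver_ns):
    """Assumed breakdown: time not accounted for by the receiver."""
    return total_ns - receiver_ns


# Values taken from the example TransmitData log line, with a 1ms threshold.
total_ns = 2_194_000
receiver_ns = 457_902
if is_slow_rpc(total_ns, threshold_ms=1):
    network_ns = network_time_ns(total_ns, receiver_ns)
    print(f"slow RPC: network time {network_ns / NANOS_PER_MILLI:.3f}ms")
```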
[Impala-ASF-CR] IMPALA-9128: part 1: log on slow data stream RPCs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14662 ) Change subject: IMPALA-9128: part 1: log on slow data stream RPCs .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14662 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I258ac91b9fbbdbc86d0e8091c34f511f8957c4cd Gerrit-Change-Number: 14662 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 12 Nov 2019 23:36:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14592 ) Change subject: IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala. .. Patch Set 12: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5203/ -- To view, visit http://gerrit.cloudera.org:8080/14592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id03d8d4d41a2ac1b15e7060e2a013e334d044ee7 Gerrit-Change-Number: 14592 Gerrit-PatchSet: 12 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 12 Nov 2019 23:48:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7506: [DOCS] Global INVALIDATE is supported in local catalog mode
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14700 ) Change subject: IMPALA-7506: [DOCS] Global INVALIDATE is supported in local catalog mode .. Patch Set 1: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/532/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/14700 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30ca29995a1b6667e2738803fc0a0639f8f08fe9 Gerrit-Change-Number: 14700 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 12 Nov 2019 23:24:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7506: [DOCS] Global INVALIDATE is supported in local catalog mode
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14700 ) Change subject: IMPALA-7506: [DOCS] Global INVALIDATE is supported in local catalog mode .. IMPALA-7506: [DOCS] Global INVALIDATE is supported in local catalog mode Change-Id: I30ca29995a1b6667e2738803fc0a0639f8f08fe9 --- M docs/topics/impala_metadata.xml 1 file changed, 0 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14700/1 -- To view, visit http://gerrit.cloudera.org:8080/14700 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I30ca29995a1b6667e2738803fc0a0639f8f08fe9 Gerrit-Change-Number: 14700 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-7506: [DOCS] Global INVALIDATE is supported in local catalog mode
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14700 ) Change subject: IMPALA-7506: [DOCS] Global INVALIDATE is supported in local catalog mode .. Patch Set 1: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/532/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/14700 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30ca29995a1b6667e2738803fc0a0639f8f08fe9 Gerrit-Change-Number: 14700 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 12 Nov 2019 23:36:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9082: make WebserverTest error checking stricter
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14672 ) Change subject: IMPALA-9082: make WebserverTest error checking stricter .. IMPALA-9082: make WebserverTest error checking stricter WebserverTest::TestWithSpnego has been flaky lately, but I have been unable to repro it. This patch is an attempt to make it easier to debug the issue the next time it shows up in automated builds. The test performs a GET that is expected to fail and then checks that the metrics show the failed GET. The flaky failures occur when the metrics do not show a failed attempt. It appears that what's happening is the GET is failing before actually reaching the webserver. However, because the GET is expected to fail and because we only verify that it did fail by checking that HttpGet() returned some error status, whatever unexpected error is occurring is getting lost. This patch instead checks the actual text of the error that is returned by HttpGet() to make sure it is correct and logs the error if it isn't. Change-Id: I820336271cf25130538ceae2eed10a72a73d2adc Reviewed-on: http://gerrit.cloudera.org:8080/14672 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/testutil/gtest-util.h M be/src/util/webserver-test.cc 2 files changed, 13 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/14672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I820336271cf25130538ceae2eed10a72a73d2adc Gerrit-Change-Number: 14672 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
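The difference between the loose and strict checks described in the commit message above can be illustrated with a small Python sketch. The expected error text `"HTTP 401"` and both function names are hypothetical; the actual test is C++ and the real error string is not given in the message.

```python
# Hypothetical expected error text; the real string used by the test
# is not stated in the commit message.
EXPECTED_ERROR = "HTTP 401"


def check_loose(error):
    """Old-style check: any error at all counts as the expected failure."""
    return error is not None


def check_strict(error, expected=EXPECTED_ERROR):
    """New-style check: the error text must match, otherwise log it."""
    if error is None or expected not in error:
        print(f"Unexpected error from GET: {error!r}")
        return False
    return True


# A failure that never reached the webserver passes the loose check
# but is caught (and logged) by the strict one.
early_failure = "Couldn't connect to server"
print(check_loose(early_failure), check_strict(early_failure))
```

The design point is that a test expecting failure should pin down which failure it expects, so that an unrelated error is surfaced in the logs instead of being mistaken for success.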
[Impala-ASF-CR] IMPALA-9128: part 2: dump traces for slow RPCs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14668 ) Change subject: IMPALA-9128: part 2: dump traces for slow RPCs .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5204/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/14668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947 Gerrit-Change-Number: 14668 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 12 Nov 2019 22:19:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9082: make WebserverTest error checking stricter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14672 ) Change subject: IMPALA-9082: make WebserverTest error checking stricter .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I820336271cf25130538ceae2eed10a72a73d2adc Gerrit-Change-Number: 14672 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 12 Nov 2019 22:18:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9092 : Disable show create table tests on Kudu
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14664 ) Change subject: IMPALA-9092 : Disable show create table tests on Kudu .. Patch Set 3: I think it makes sense to move forward with this, since the fix for IMPALA-9068 needs a newer CDP GBN. We don't want to block IMPALA-9068 to wait for IMPALA-9092. -- To view, visit http://gerrit.cloudera.org:8080/14664 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37c0b6d82372bc6380285afcd94f0c1e123f2eda Gerrit-Change-Number: 14664 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 12 Nov 2019 22:20:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Hello Tim Armstrong, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14635 to look at the new patch set (#5). Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread .. IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread Modifies HdfsFileReader so that it calls hdfsPreadFully instead of hdfsPread. hdfsPreadFully is a new libhdfs API introduced by HDFS-14564 (Add libhdfs APIs for readFully; add readFully to ByteBufferPositionedReadable). hdfsPreadFully improves performance of preads, especially when reading data from S3. The major difference between hdfsPread and hdfsPreadFully is that hdfsPreadFully is guaranteed to read all the requested bytes, whereas hdfsPread is only guaranteed to read up to the number of requested bytes. hdfsPreadFully reduces the amount of JNI array allocations necessary when reading data from S3. When any read method in libhdfs is called, the method allocates an array whose size is equal to the amount of data requested. The issue is that Java's InputStream#read only guarantees that it will read up to the amount of data requested. This can lead to issues where a libhdfs read request allocates a large Java array, even though the read request only partially fills it up. PositionedReadable#readFully on the other hand, guarantees that all requested data will be read, thus preventing any unnecessary JNI array allocations. hdfsPreadFully improves the effectiveness of fs.s3a.experimental.input.fadvise=RANDOM (HADOOP-13203). S3A recommends setting fadvise=RANDOM when doing random reads, which is common in Impala when reading Parquet or ORC files. fadvise=RANDOM causes the HTTP GET request that reads the S3 data to simply request the data bounded by the parameters of the current read request (e.g. for 'read(long position, ..., int length)' it requests 'length' bytes). 
The chunk-size optimization in HdfsFileReader hurts performance when fadvise=RANDOM because each HTTP GET request will only request 'chunk-size' bytes at a time, which is why this patch also removes the chunk-size optimization. hdfsPreadFully helps here because all the data in the scan range will be requested by a single HTTP GET request. Since hdfsPreadFully improves S3 read performance, this patch enables preads for S3A files by default. Even if fadvise=SEQUENTIAL, hdfsPreadFully still improves performance since it avoids unnecessary JNI allocation overhead. The chunk-size optimization (added in https://gerrit.cloudera.org/#/c/63/) is no longer necessary after this patch. hdfsPreadFully prevents any unnecessary array allocations. Furthermore, it is likely the chunk-size optimization was added due to overhead fixed by HDFS-14285. Fixes a bug in IMPALA-8884 where the 'impala-server.io-mgr.queue-$i.read-size' statistics were being updated with the chunk-size passed to HdfsFileReader::ReadFromPosInternal, which is not necessarily equivalent to the amount of data actually read. 
Testing: * Ran core tests * Ran core tests on S3 * Ad-hoc functional and performance testing on ABFS; no perf regression observed; planning to further investigate the interaction between hdfsPreadFully + ABFS in a future JIRA Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb --- M be/src/common/global-flags.cc M be/src/runtime/io/hdfs-file-reader.cc M be/src/runtime/io/hdfs-file-reader.h M be/src/runtime/io/local-file-reader.cc M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc 6 files changed, 20 insertions(+), 52 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/14635/5 -- To view, visit http://gerrit.cloudera.org:8080/14635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Gerrit-Change-Number: 14635 Gerrit-PatchSet: 5 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
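The contract difference between a plain positioned read and a "read fully" described in the commit message above can be sketched in Python. This only models the semantics (a pread may return fewer bytes than requested, a read-fully returns all of them or fails); it says nothing about how libhdfs implements hdfsPreadFully. `ShortReader` and `pread_fully` are hypothetical names invented for the illustration.

```python
class ShortReader:
    """Hypothetical positioned reader that returns at most max_chunk bytes per call."""

    def __init__(self, data, max_chunk):
        self.data = data
        self.max_chunk = max_chunk
        self.calls = 0

    def pread(self, position, length):
        # Like a plain pread: may return fewer bytes than requested.
        self.calls += 1
        count = min(length, self.max_chunk)
        return self.data[position:position + count]


def pread_fully(reader, position, length):
    """Keep reading until exactly `length` bytes are available, or fail."""
    buffer = bytearray()
    while len(buffer) < length:
        chunk = reader.pread(position + len(buffer), length - len(buffer))
        if not chunk:
            raise EOFError("reached end of data before reading all requested bytes")
        buffer += chunk
    return bytes(buffer)


data = bytes(range(100))
reader = ShortReader(data, max_chunk=16)
partial = reader.pread(0, 64)          # short read: only part of the request
complete = pread_fully(reader, 10, 64) # all 64 bytes, or an exception
print(len(partial), len(complete))
```

A caller of the plain read has to handle the short-read case itself, whereas the read-fully contract lets it size its buffer once and know it will be filled.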
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14635 ) Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/14635/1/be/src/runtime/io/hdfs-file-reader.cc File be/src/runtime/io/hdfs-file-reader.cc: http://gerrit.cloudera.org:8080/#/c/14635/1/be/src/runtime/io/hdfs-file-reader.cc@225 PS1, Line 225: if (hdfsPreadFully( : hdfs_fs_, hdfs_file, position_in_file, buffer, bytes_to_read) == -1) { > Thinking about it another way: If someone is overwriting files and they ove Done http://gerrit.cloudera.org:8080/#/c/14635/2/be/src/runtime/io/hdfs-file-reader.cc File be/src/runtime/io/hdfs-file-reader.cc: http://gerrit.cloudera.org:8080/#/c/14635/2/be/src/runtime/io/hdfs-file-reader.cc@224 PS2, Line 224: if (FLAGS_use_hdfs_pread || IsS3APath(scan_range_->file_string()->c_str())) { > If you did perf testing for ABFS (even if it was basic/ad-hoc), can you men Done -- To view, visit http://gerrit.cloudera.org:8080/14635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Gerrit-Change-Number: 14635 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 13 Nov 2019 02:21:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14697 ) Change subject: IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5009/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67283f9098c73c849023af8bfa7af62308bf3ed3 Gerrit-Change-Number: 14697 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 13 Nov 2019 02:42:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9128: part 2: dump traces for slow RPCs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14668 ) Change subject: IMPALA-9128: part 2: dump traces for slow RPCs .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947 Gerrit-Change-Number: 14668 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 13 Nov 2019 02:48:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14635 ) Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5010/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Gerrit-Change-Number: 14635 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 13 Nov 2019 03:19:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14592 ) Change subject: IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala. .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id03d8d4d41a2ac1b15e7060e2a013e334d044ee7 Gerrit-Change-Number: 14592 Gerrit-PatchSet: 12 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 13 Nov 2019 06:19:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14592 ) Change subject: IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala. .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id03d8d4d41a2ac1b15e7060e2a013e334d044ee7 Gerrit-Change-Number: 14592 Gerrit-PatchSet: 12 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 13 Nov 2019 06:30:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14592 ) Change subject: IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala. .. Patch Set 13: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5207/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id03d8d4d41a2ac1b15e7060e2a013e334d044ee7 Gerrit-Change-Number: 14592 Gerrit-PatchSet: 13 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 13 Nov 2019 06:31:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14592 ) Change subject: IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala. .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id03d8d4d41a2ac1b15e7060e2a013e334d044ee7 Gerrit-Change-Number: 14592 Gerrit-PatchSet: 13 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 13 Nov 2019 06:31:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14697 ) Change subject: IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67283f9098c73c849023af8bfa7af62308bf3ed3 Gerrit-Change-Number: 14697 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 13 Nov 2019 06:31:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9092 : Disable show create table tests on Kudu
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14664 ) Change subject: IMPALA-9092 : Disable show create table tests on Kudu .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/14664/3/tests/metadata/test_ddl.py File tests/metadata/test_ddl.py: http://gerrit.cloudera.org:8080/#/c/14664/3/tests/metadata/test_ddl.py@684 PS3, Line 684: external.purge.table This should be "external.table.purge" http://gerrit.cloudera.org:8080/#/c/14664/3/tests/metadata/test_ddl.py@687 PS3, Line 687: del properties['external.purge.table'] This should be "external.table.purge" -- To view, visit http://gerrit.cloudera.org:8080/14664 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37c0b6d82372bc6380285afcd94f0c1e123f2eda Gerrit-Change-Number: 14664 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 13 Nov 2019 01:41:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14697 to look at the new patch set (#2). Change subject: IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster .. IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster testdata/bin/kill-hbase.sh currently uses the generic kill-java-service.sh script to kill the region servers, then the master, and then the zookeeper. Recent versions of HBase become unusable after performing this type of shutdown. The master seems to get stuck trying to recover, even after restarting the minicluster. The root cause in HBase is unclear, but HBase provides the stop-hbase.sh script, which does a more graceful shutdown. This switches testdata/bin/kill-hbase.sh to use this script, which avoids the recovery problems. Testing: - Ran the test-with-docker.py tests (which do a minicluster restart). Before the change, the HBase tests timed out due to HBase getting stuck recovering. After the change, tests ran normally. - Added a minicluster restart after dataload so that this is tested. Change-Id: I67283f9098c73c849023af8bfa7af62308bf3ed3 --- M testdata/bin/create-load-data.sh M testdata/bin/kill-hbase.sh 2 files changed, 9 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/14697/2 -- To view, visit http://gerrit.cloudera.org:8080/14697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I67283f9098c73c849023af8bfa7af62308bf3ed3 Gerrit-Change-Number: 14697 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14697 ) Change subject: IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5206/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/14697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67283f9098c73c849023af8bfa7af62308bf3ed3 Gerrit-Change-Number: 14697 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 13 Nov 2019 02:01:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14697 ) Change subject: IMPALA-9150: Use HBase's stop-hbase.sh script for minicluster .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67283f9098c73c849023af8bfa7af62308bf3ed3 Gerrit-Change-Number: 14697 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 13 Nov 2019 02:05:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8855: [DOCS] Document the generic VALUES clause
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14661 ) Change subject: IMPALA-8855: [DOCS] Document the generic VALUES clause .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14661 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2568450993323236535a8f1d022dee7d09ecf62b Gerrit-Change-Number: 14661 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 13 Nov 2019 02:06:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8138: Reintroduce rpc debugging options
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14641 ) Change subject: IMPALA-8138: Reintroduce rpc debugging options .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/rpc/impala-service-pool.cc File be/src/rpc/impala-service-pool.cc: http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/rpc/impala-service-pool.cc@207 PS3, Line 207: FailAndReleaseRpc perhaps not in this patch, but would it be good to have an option to just completely ignore the incoming RPC call? I think right now this responds with an error, but it would be good to test what happens when there is no RPC response as well. http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py File tests/custom_cluster/test_rpc_exception.py: http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py@63 PS3, Line 63: execute_test_query > it seems like this while loop in this method is necessary because there is Not sure how easy this would be, more of a suggestion / something to think about. -- To view, visit http://gerrit.cloudera.org:8080/14641 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c047ebce6d32c5ae461f70279391fa2df4c2029 Gerrit-Change-Number: 14641 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 13 Nov 2019 02:11:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9128: part 2: dump traces for slow RPCs
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14668 ) Change subject: IMPALA-9128: part 2: dump traces for slow RPCs .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/14668/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14668/5//COMMIT_MSG@38 PS5, Line 38: * basic perf testing > Is tracing on by default? I assume the perf impact is negligible? I think so in practice, but wanted to do a sanity test. It does always log the trace to a buffer, which requires allocating memory from an arena and substituting the message text. Should be negligible in the context of other overhead from the RPC, but wanted to be sure, since I did add several additional traces per RPC. -- To view, visit http://gerrit.cloudera.org:8080/14668 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947 Gerrit-Change-Number: 14668 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 13 Nov 2019 01:55:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala.
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/14592 ) Change subject: IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala. .. Patch Set 12: The GVD failed due to an unrelated test failure. Created https://issues.apache.org/jira/browse/IMPALA-9152 for it. Rerunning GVD. -- To view, visit http://gerrit.cloudera.org:8080/14592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id03d8d4d41a2ac1b15e7060e2a013e334d044ee7 Gerrit-Change-Number: 14592 Gerrit-PatchSet: 12 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 13 Nov 2019 01:54:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14592 ) Change subject: IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala. .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5205/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/14592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id03d8d4d41a2ac1b15e7060e2a013e334d044ee7 Gerrit-Change-Number: 14592 Gerrit-PatchSet: 12 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 13 Nov 2019 01:54:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8138: Reintroduce rpc debugging options
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14641 ) Change subject: IMPALA-8138: Reintroduce rpc debugging options .. Patch Set 3: (7 comments) http://gerrit.cloudera.org:8080/#/c/14641/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14641/3//COMMIT_MSG@26 PS3, Line 26: RPC_SERVICE_POOL IMPALA_SERVICE_POOL? http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/rpc/rpc-mgr.cc@193 PS3, Line 193: DCHECK(IsResolvedAddress(address_)); move this to Init? since that is when address is passed in? http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/util/debug-util.cc File be/src/util/debug-util.cc: http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/util/debug-util.cc@400 PS3, Line 400: if (ImpaladMetrics::DEBUG_ACTION_NUM_FAIL != nullptr) { this should never be nullptr right? add a DCHECK asserting it is never a nullptr? http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py File tests/custom_cluster/test_rpc_exception.py: http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py@63 PS3, Line 63: must_fail=True is this used anywhere? http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py@63 PS3, Line 63: execute_test_query it seems like this while loop in this method is necessary because there is only a certain probability that a failure is actually injected. the reason the probability is necessary is that we don't want all RPC attempts to fail. would a better way to do this be something similar to https://github.com/apache/impala/commit/19cb8dc1c1c2247e91adc4bf62cab27a7c1e4381#diff-ab4af79ee4df02bf95d708a1d207f79aR189-R201 maybe there is a generic way to create a FAIL_FIRST debug action? 
http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py@67 PS3, Line 67: while self._get_num_fails(impalad) == 0: should there be a limit or a timeout to the number of times a query is attempted? otherwise if this test breaks this loop may never exit http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py@76 PS3, Line 76: assert self._get_num_fails(impalad) > 0 is this necessary? if the while loop exits this should always be true, right? -- To view, visit http://gerrit.cloudera.org:8080/14641 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c047ebce6d32c5ae461f70279391fa2df4c2029 Gerrit-Change-Number: 14641 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 13 Nov 2019 01:56:42 + Gerrit-HasComments: Yes
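The review comment on line 67 above asks for a limit or timeout on the retry loop. A minimal sketch of one way to bound such a loop is given here; `retry_until`, `max_attempts`, and `timeout_s` are hypothetical names chosen for illustration and are not taken from test_rpc_exception.py, and the actual patch may address the comment differently.

```python
import time


def retry_until(condition, action, max_attempts=50, timeout_s=60.0):
    """Call action() until condition() is true; give up after a bound.

    Hypothetical helper: condition would wrap something like a
    "number of injected failures > 0" check, and action would run
    the test query once. Returns the number of attempts made.
    """
    deadline = time.monotonic() + timeout_s
    attempts = 0
    while not condition():
        if attempts >= max_attempts:
            raise RuntimeError(
                "condition not met after %d attempts" % attempts)
        if time.monotonic() > deadline:
            raise RuntimeError(
                "condition not met within %.1f seconds" % timeout_s)
        action()
        attempts += 1
    return attempts
```

With a bound like this, a broken failure-injection path makes the test fail with a clear message instead of looping forever. It also keeps a trailing assertion on the failure count redundant, since the loop can only exit normally once the condition holds.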
[Impala-ASF-CR] IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread
Hello Tim Armstrong, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14635 to look at the new patch set (#4). Change subject: IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread .. IMPALA-8525: preads should use hdfsPreadFully rather than hdfsPread Modifies HdfsFileReader so that it calls hdfsPreadFully instead of hdfsPread. hdfsPreadFully is a new libhdfs API introduced by HDFS-14564 (Add libhdfs APIs for readFully; add readFully to ByteBufferPositionedReadable). hdfsPreadFully improves performance of preads, especially when reading data from S3. The major difference between hdfsPread and hdfsPreadFully is that hdfsPreadFully is guaranteed to read all the requested bytes, whereas hdfsPread is only guaranteed to read up to the number of requested bytes. hdfsPreadFully reduces the amount of JNI array allocations necessary when reading data from S3. When any read method in libhdfs is called, the method allocates an array whose size is equal to the amount of data requested. The issue is that Java's InputStream#read only guarantees that it will read up to the amount of data requested. This can lead to issues where a libhdfs read request allocates a large Java array, even though the read request only partially fills it up. PositionedReadable#readFully on the other hand, guarantees that all requested data will be read, thus preventing any unnecessary JNI array allocations. hdfsPreadFully improves the effectiveness of fs.s3a.experimental.input.fadvise=RANDOM (HADOOP-13203). S3A recommends setting fadvise=RANDOM when doing random reads, which is common in Impala when reading Parquet or ORC files. fadvise=RANDOM causes the HTTP GET request that reads the S3 data to simply request the data bounded by the parameters of the current read request (e.g. for 'read(long position, ..., int length)' it requests 'length' bytes). 
The chunk-size optimization in HdfsFileReader hurts performance when fadvise=RANDOM because each HTTP GET request will only request 'chunk-size' bytes at a time, which is why this patch removes the chunk-size optimization as well. hdfsPreadFully helps here because all the data in the scan range will be requested by a single HTTP GET request. Since hdfsPreadFully improves S3 read performance, this patch enables preads for S3A files by default. Even if fadvise=SEQUENTIAL, hdfsPreadFully still improves performance since it avoids unnecessary JNI allocation overhead. The chunk-size optimization (added in https://gerrit.cloudera.org/#/c/63/) is no longer necessary after this patch. hdfsPreadFully prevents any unnecessary array allocations. Furthermore, it is likely the chunk-size optimization was added due to overhead fixed by HDFS-14285. Fixes a bug in IMPALA-8884 where the 'impala-server.io-mgr.queue-$i.read-size' statistics were being updated with the chunk-size passed to HdfsFileReader::ReadFromPosInternal, which is not necessarily equivalent to the amount of data actually read. 
Testing: * Ran core tests * Ran core tests on S3 * Ad-hoc functional and performance testing on ABFS; no perf regression observed; planning to further investigate the interaction between hdfsPreadFully + ABFS in a future JIRA Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb --- M be/src/common/global-flags.cc M be/src/runtime/io/hdfs-file-reader.cc M be/src/runtime/io/hdfs-file-reader.h M be/src/runtime/io/local-file-reader.cc M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc 6 files changed, 20 insertions(+), 52 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/14635/4 -- To view, visit http://gerrit.cloudera.org:8080/14635 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I29ea34897096bc790abdeb98073a47f1c4c10feb Gerrit-Change-Number: 14635 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
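The commit message above distinguishes a read that may return fewer bytes than requested from one that is guaranteed to return all of them. The following sketch illustrates only that semantic difference. `ShortReader` and `pread_fully` are hypothetical stand-ins written for this example; they are not the libhdfs API, and the caller-side loop here is an assumption made for illustration rather than a description of how hdfsPreadFully or PositionedReadable#readFully is implemented.

```python
class ShortReader:
    """Hypothetical positioned reader returning at most max_chunk bytes per call."""

    def __init__(self, data, max_chunk):
        self._data = data
        self._max_chunk = max_chunk
        self.calls = 0  # number of underlying read requests issued

    def pread(self, position, length):
        # "Up to length bytes" semantics: the result may be shorter.
        self.calls += 1
        n = min(length, self._max_chunk)
        return self._data[position:position + n]


def pread_fully(reader, position, length):
    """Return exactly length bytes starting at position, or raise EOFError."""
    buf = bytearray()
    while len(buf) < length:
        chunk = reader.pread(position + len(buf), length - len(buf))
        if not chunk:
            raise EOFError(
                "wanted %d bytes, got %d" % (length, len(buf)))
        buf += chunk
    return bytes(buf)


reader = ShortReader(b"0123456789", max_chunk=4)
partial = reader.pread(2, 6)          # may be short
full = pread_fully(reader, 2, 6)      # always 6 bytes, or an error
```

In this toy model each short read costs another request, which loosely mirrors the commit message's point that many small requests are more expensive than one request covering the whole range.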