[Impala-ASF-CR] IMPALA-5717: Support for reading ORC data files
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9134 to look at the new patch set (#4). Change subject: IMPALA-5717: Support for reading ORC data files .. IMPALA-5717: Support for reading ORC data files This patch integrates the orc library into Impala and implements HdfsOrcScanner as a middle layer between them. The HdfsOrcScanner supplies input needed from the orc-reader, tracks memory consumption of the reader and transfers the reader's output (orc::ColumnVectorBatch) into impala::RowBatch. The ORC version we used is release-1.4.3. Currently, we only support reading premitive types. Writing into ORC table has not been supported neither. Tests - Most of the end-to-end tests can run on ORC format. - Add tpcds, tpch tests for ORC. - Add some ORC specific tests. - Have passed all the tests. Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4 --- M CMakeLists.txt M be/CMakeLists.txt M be/src/exec/CMakeLists.txt A be/src/exec/hdfs-orc-scanner.cc A be/src/exec/hdfs-orc-scanner.h M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M bin/bootstrap_toolchain.py M bin/impala-config.sh A cmake_modules/FindOrc.cmake M common/thrift/CatalogObjects.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java M fe/src/main/java/org/apache/impala/catalog/HdfsStorageDescriptor.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/jflex/sql-scanner.flex M testdata/LineItemMultiBlock/README.dox A testdata/LineItemMultiBlock/lineitem_orc_multiblock_one_stripe.orc A testdata/LineItemMultiBlock/lineitem_sixblocks.orc A testdata/LineItemMultiBlock/lineitem_threeblocks.orc M testdata/bin/generate-schema-statements.py M testdata/bin/load_nested.py M testdata/bin/run-hive-server.sh M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-query/functional-query_core.csv M testdata/workloads/functional-query/functional-query_dimensions.csv M testdata/workloads/functional-query/functional-query_exhaustive.csv M testdata/workloads/functional-query/functional-query_pairwise.csv A testdata/workloads/functional-query/queries/QueryTest/nested-types-orc-basic.test M testdata/workloads/tpcds/tpcds_core.csv M testdata/workloads/tpcds/tpcds_dimensions.csv M testdata/workloads/tpcds/tpcds_exhaustive.csv M testdata/workloads/tpcds/tpcds_pairwise.csv M testdata/workloads/tpch/tpch_core.csv M testdata/workloads/tpch/tpch_dimensions.csv M testdata/workloads/tpch/tpch_exhaustive.csv M testdata/workloads/tpch/tpch_pairwise.csv M tests/common/test_dimensions.py M tests/comparison/cli_options.py M tests/query_test/test_decimal_queries.py M tests/query_test/test_nested_types.py M tests/query_test/test_scanners.py M tests/query_test/test_tpch_queries.py 49 files changed, 1,447 insertions(+), 157 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/9134/4 -- To view, visit http://gerrit.cloudera.org:8080/9134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4 Gerrit-Change-Number: 9134 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong HuangGerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5886 & IMPALA-4812 Update run-tests.py to handle exit code 5
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/9494 ) Change subject: IMPALA-5886 & IMPALA-4812 Update run-tests.py to handle exit_code 5 .. Patch Set 11: (9 comments) http://gerrit.cloudera.org:8080/#/c/9494/9/tests/run-tests.py File tests/run-tests.py: http://gerrit.cloudera.org:8080/#/c/9494/9/tests/run-tests.py@66 PS9, Line 66: tests_collected is a collection of items tests_collected actually winds up being a list of node ids, not test instances. http://gerrit.cloudera.org:8080/#/c/9494/11/tests/run-tests.py File tests/run-tests.py: http://gerrit.cloudera.org:8080/#/c/9494/11/tests/run-tests.py@64 PS11, Line 64: """ Custom pytest plugin to collect details of tests This comment is no longer seems accurate. This plugin really just keeps a record of the tests that were collected, and the tests that were run. I don't think we gather any other details than those. http://gerrit.cloudera.org:8080/#/c/9494/11/tests/run-tests.py@66 PS11, Line 66: tests_collected is a collection of items I don't think this is right. The elements of either list are pytest "nodeids", which is just a unique string identifier for each test that takes the form of: [] E.g., from a typical test run for Impala, an example nodeid would be something like: "query_test.test_insert.TestInsertQueries.test_insert[exec_option: {'sync_ddl': 0, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: text/none]" http://gerrit.cloudera.org:8080/#/c/9494/11/tests/run-tests.py@73 PS11, Line 73: https://docs.pytest.org/en/2.9.2/writing_plugins.html#_pytest.hookspec.pytest_collection_modifyitems This link is really specific to pytest_collection_modifyitems hook, not pytest_runtest_logreport. Can you move this to L81? http://gerrit.cloudera.org:8080/#/c/9494/11/tests/run-tests.py@78 PS11, Line 78: self.tests_executed = [] In a standard test run, one would expect that tests_collected and tests_executed would be same at the end. However, pytest reporting happens in multiple stages (in other words, pytest_runtest_logreport gets called multiple times), and you'll find that tests_executed winds up being exactly three times as long as tests_collected in your patch, because report.passed gets eval'ed three times. I did a quick test with some perf tests, and printed out tests_executed to show the multiple entries: [ 'test_impala_perf.py::test_cluster_workload[tpch:_300]', 'test_impala_perf.py::test_cluster_workload[tpch:_300]', 'test_impala_perf.py::test_cluster_workload[tpch:_300]', 'test_impala_perf.py::test_cluster_workload[tpcds-unmodified_kudu:_1000]', 'test_impala_perf.py::test_cluster_workload[tpcds-unmodified_kudu:_1000]', 'test_impala_perf.py::test_cluster_workload[tpcds-unmodified_kudu:_1000]', 'test_impala_perf.py::test_cluster_workload[tpch_kudu:_300]', 'test_impala_perf.py::test_cluster_workload[tpch_kudu:_300]', 'test_impala_perf.py::test_cluster_workload[tpch_kudu:_300] ] The quick way around this would be to make both of these sets(), rather than lists. http://gerrit.cloudera.org:8080/#/c/9494/11/tests/run-tests.py@85 PS11, Line 85: if report.passed: Add the link here that is specific to pytest_runtest_logreport: https://docs.pytest.org/en/2.9.2/_modules/_pytest/hookspec.html#pytest_runtest_logreport http://gerrit.cloudera.org:8080/#/c/9494/11/tests/run-tests.py@97 PS11, Line 97: exit_code = pytest.main(args, plugins=[testcounterplugin]) I think I asked earlier to have some distinction drawn between exit_code variables. E.g., this should be pytest_exit_code, and then later, when we determine how to exit this script, we should use runtest_exit_code. There have been a lot of comments though, so maybe it just got overlooked. (You might need to click show all messages at the top of the messages section.) http://gerrit.cloudera.org:8080/#/c/9494/11/tests/run-tests.py@107 PS11, Line 107: #though 5 is a normal return code, our jenkins handles non-zero RC as error in This is odd wording in this context. Don't forget that this code goes into the upstream Apache repo, where "our Jenkins" has no real significance. I would reword this comment to be more general, like: # EXIT_NOTESTSCOLLECTED is a pytest return code meaning "no tests collected" and is equal to 5. # This would typically be an error, but not necessarily in the case of this script, because it executes # pytest multiple times, and we may selectively choose not to run some of the tests for one or two of # those executions. http://gerrit.cloudera.org:8080/#/c/9494/11/tests/run-tests.py@112 PS11, Line 112: len(testcounterplugin.tests_executed) <= 0 I think this comment was missed from before as well. I don't thinks len(tests_executed) can ever
[Impala-ASF-CR] IMPALA-5717: Support for ORC data files
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/9134 ) Change subject: IMPALA-5717: Support for ORC data files .. Patch Set 3: (18 comments) http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@130 PS3, Line 130: ScanRange* HdfsOrcScanner::FindFooterSplit(HdfsFileDesc* file) { > We could move this to HdfsScanner and share the code - the logic is generic Done http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@142 PS3, Line 142: mem_tracker_.reset(new MemTracker(-1, "OrcReader", mem_tracker)); > I think we should track it against the HdfsScanNode's MemTracker instead of Done http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@151 PS3, Line 151: mem_tracker_->CloseAndUnregisterFromParent(); > Just add the comment since I made it crashed when use Close at first... Removed these since we use HdfsScanNode's MemTracker http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@162 PS3, Line 162: if (ImpaladMetrics::MEM_POOL_TOTAL_BYTES != nullptr) { > So the non-NULL checks in mem-pool.cc are redundant too? I learn this from To be consistent with logics in MemPool, let's update the metric. http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@333 PS3, Line 333: if (col_type.type == TYPE_ARRAY) { > Complex types will be skipped here. Their column ids won't be set into the Have added tests for this. Replaced these with DCHECK. http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@484 PS3, Line 484: if (start_with_first_stripe && misaligned_stripe_skipped) { > I think we had a specific Parquet test for this code path, so we're probabl Done http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@498 PS3, Line 498: // TODO ValidateColumnOffsets > Is this still needed? Removed this. The orc library will handle this so we just need to catch exceptions from the orc lib. http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@500 PS3, Line 500: google::uint64 stripe_offset = stripe.offset(); > Why not uint64_t? Are they not the same underlying type? Done http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@513 PS3, Line 513: // TODO: check if this stripe can be skipped by stats. > File a follow-on JIRA for this enhancement? Done http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@517 PS3, Line 517: unique_ptr input_stream(new ScanRangeInputStream(this)); > I haven't fully thought through this idea but I wonder if we should have an Though ORC-262 has no progress, I think we can still prefech data and let the ORC lib reading from an in-memory InputStream. Created a JIRA for this: IMPALA-6636. http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@677 PS3, Line 677: vector decompressed_footer_buffer; > You're right! Will review these logics deeper. The logics of parsing the file tail are now replaced by using the orc lib. http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@751 PS3, Line 751: // TODO > We should do this in the initial patch if it's possible to crash Impala wit Handled by the ORC lib http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@820 PS3, Line 820: while (row_id < capacity && ScratchBatchNotEmpty()) { > We should file a follow-on JIRA to codegen the runtime filter + conjunct ev Done http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@859 PS3, Line 859: bool HdfsOrcScanner::EvalRuntimeFilters(TupleRow* row) { > Can you add a TODO to combine this with the Parquet implementation? Done http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@881 PS3, Line 881: inline void HdfsOrcScanner::ReadRow(const orc::ColumnVectorBatch , int row_idx, > We don't need to do this in this patch, but it would probably be faster to Add a TODO for this. http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@884 PS3, Line 884: dynamic_cast > Sure! Excited to know that you start to test the scanner's performance! Done http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@975 PS3, Line 975: // TODO warn if slot_desc->type().GetByteSize() != 16 > You could DCHECK in that case, the only valid values are 4, 8 and 16 Add DCHECKs in default http://gerrit.cloudera.org:8080/#/c/9134/3/testdata/workloads/functional-query/functional-query_exhaustive.csv File testdata/workloads/functional-query/functional-query_exhaustive.csv: http://gerrit.cloudera.org:8080/#/c/9134/3/testdata/workloads/functional-query/functional-query_exhaustive.csv@25 PS3, Line 25: file_format: orc, dataset:
[Impala-ASF-CR] IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9527 ) Change subject: IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.cc@465 PS4, Line 465: // TakeOverEarlySender() is called by the same thread which calls Close(). : // The receiver cannot be closed while this function is in progress so : // 'recvr_->mgr_' shouldn't be NULL. : DCHECK(TestInfo::is_test() || FragmentInstanceState::IsFragmentExecThread()); : DCHECK(!recvr_->closed_ && recvr_->mgr_ != nullptr); > Moved the DCHECKs. can still cancel* this sender queue -- To view, visit http://gerrit.cloudera.org:8080/9527 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3378496e2201b16c662b9daa634333480705c61a Gerrit-Change-Number: 9527 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 10 Mar 2018 02:51:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9527 ) Change subject: IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9527/3/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/9527/3/be/src/runtime/krpc-data-stream-recvr.cc@193 PS3, Line 193: DCHECK(FragmentInstanceState::IsFragmentExecThread()); > This may not work in BE test. Done -- To view, visit http://gerrit.cloudera.org:8080/9527 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3378496e2201b16c662b9daa634333480705c61a Gerrit-Change-Number: 9527 Gerrit-PatchSet: 3 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 10 Mar 2018 02:51:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9527 ) Change subject: IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr .. Patch Set 4: (10 comments) http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/fragment-instance-state.h File be/src/runtime/fragment-instance-state.h: http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/fragment-instance-state.h@136 PS4, Line 136: current_thread_name(); > rather than introduce this, maybe we can infer whether this is a FIS from G Done http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.h File be/src/runtime/krpc-data-stream-recvr.h: http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.h@73 PS4, Line 73: /// the recvr instance from the tracking structure of its KrpcDataStreamMgr in all cases. > Let's add an implementation comment explaining the synchronization rules, w Done http://gerrit.cloudera.org:8080/#/c/9527/3/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/9527/3/be/src/runtime/krpc-data-stream-recvr.cc@193 PS3, Line 193: SCOPED_TIMER(recvr_->queue_get_batch_time_); This may not work in BE test. http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.cc@194 PS4, Line 194: DCHECK(TestInfo::is_test() || FragmentInstanceState::IsFragmentExecThread()); > should this also check !is_closed_ or is that not actually required here? Done http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.cc@400 PS4, Line 400: Returns if the deferred RPCs queue is empty. > This is not necessary. It's quite clear from the code. Done http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.cc@400 PS4, Line 400: The deferred RPC queue is drained : // in Cancel(). If the sender queue is already cancelled, we should return at this : // point. There should be no insertion into 'deferred_rpcs_' after cancellation. > I think this can be moved below by 1 line just above DCHECK(!is_cancelled_) Done http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.cc@465 PS4, Line 465: // TakeOverEarlySender() is called by the same thread which calls Close(). : // The receiver cannot be closed while this function is in progress so : // 'recvr_->mgr_' shouldn't be NULL. : DCHECK(TestInfo::is_test() || FragmentInstanceState::IsFragmentExecThread()); : DCHECK(!recvr_->closed_ && recvr_->mgr_ != nullptr); > how about moving this to the top of the function. That way, we check it mor Moved the DCHECKs. We do need to check the deferred_rpc_tracker_ under the lock. because query cancellation can still the receiver while this function is in progress. http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.cc@582 PS4, Line 582: // Add the receiver and sender sides' profiles as children of the owning exchange : // node's profile. The parent profile has the same lifetime as the exchange node while : // these profiles need to outlive the parent profile so that they can be accessed even : // after the owning exchange node has been closed. > This is already explained in the header file, so I think we can remove it f Done http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.cc@664 PS4, Line 664: // All threads which try to consume from / insert into a sender queues needs to hold : // the queue's lock and check for 'is_cancelled_' flag before proceeding. No deferred : // RPCs should be inserted or processed after a queue is cancelled. > This is explained in multiple places. Do you think it makes sense just to a Done http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.cc@672 PS4, Line 672: owners' resources > unowned resources? Done -- To view, visit http://gerrit.cloudera.org:8080/9527 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3378496e2201b16c662b9daa634333480705c61a Gerrit-Change-Number: 9527 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 10 Mar 2018 02:50:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr
Hello Sailesh Mukil, Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9527 to look at the new patch set (#5). Change subject: IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr .. IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr A KrpcDataStreamRecvr is co-owned by the singleton KrpcDataStreamMgr and an exchange node. It's possible that some threads (e.g. RPC service threads) may still retain reference to the KrpcDataStreamRecvr after its owning exchange node has been closed. This is problematic as some members in the receiver (e.g. sender/receiver profiles) are actually owned by the exchange node so accessing them after the exchange node is closed and possibly deleted may lead to user-after-free. This patch changes the ownership of some members in KrpcDataStreamRecvr to the receiver. In particular, the profiles are now owned by the receiver and various stat counters and timers are in turn owned by these profiles. This prevents the use-after-free problem described above. This patch also moves the access to deferred_rpc_tracker_ in TakeOverEarlySenders() to be under the sender queue's 'lock_' to prevent another instance of IMPALA-6554. Testing done: core debug build. Change-Id: I3378496e2201b16c662b9daa634333480705c61a --- M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/query-state.cc 5 files changed, 104 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/9527/5 -- To view, visit http://gerrit.cloudera.org:8080/9527 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3378496e2201b16c662b9daa634333480705c61a Gerrit-Change-Number: 9527 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9562 ) Change subject: IMPALA-6576: Add metrics for data stream service memory usage .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc@58 PS3, Line 58: treamService"); > nit: The naming conventions for all the other metrics seems to be like this I noticed that, too, but the one we added in IMPALA-6269 is camel case. https://gerrit.cloudera.org/#/c/9292/ Looking at /metrics the other service names are camel case, too, e.g. rpc-method.backend.ImpalaInternalService.UpdateFilter.call_duration. I'm open to changing this to data-stream-service, but then we should also change the one from IMPALA-6269. That will require to change it in CM, too, which we can do in the change to expose the ones from this current change. Let me know which way you prefer. http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/util/memory-metrics.h File be/src/util/memory-metrics.h: http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/util/memory-metrics.h@203 PS3, Line 203: static BufferPoolMetric* LIMIT; : static BufferPoolMetric* SYSTEM_ALLOCATED; : static BufferPoolMetric* RESERVED; : static BufferPoolMetric* UNUSED_RESERVATION_BYTES; : static BufferPoolMetric* NUM_FREE_BUFFERS; : static BufferPoolMetric* FREE_BUFFER_BYTES; : static BufferPoolMetric* CLEAN_PAGES_LIMIT; : static BufferPoolMetric* NUM_CLEAN_PAGES; : static BufferPoolMetric* CLEAN_PAGE_BYTES; > Do we need to make the metrics for MemTrackerMetric, globally visible like I didn't see why. Once they are registered we don't need to access them anymore. The ones here are globally unique, whereas the MemTrackerMetrics could be instantiated for multiple MemTrackers. http://gerrit.cloudera.org:8080/#/c/9562/2/tests/custom_cluster/test_krpc_metrics.py File tests/custom_cluster/test_krpc_metrics.py: http://gerrit.cloudera.org:8080/#/c/9562/2/tests/custom_cluster/test_krpc_metrics.py@93 PS2, Line 93: metric_name = 'rpc.impala.DataStreamService.rpcs_queue_overflow' : before = self.get_metric(metric_name) : assert before == 0 : : self.client.execute(self.TEST_QUERY) : after = self.get_metric(metric_name) : assert before < after > How about making this a helper function? And then taking metric_name as a p For this change here, making it a helper won't benefit us, since the other metrics are gauges that don't necessarily increase monotonically. I think it's better to defer refactoring to a future change when we can benefit from it. That way we'll know which interface to pick. -- To view, visit http://gerrit.cloudera.org:8080/9562 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3 Gerrit-Change-Number: 9562 Gerrit-PatchSet: 2 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 10 Mar 2018 02:21:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix test dimensions in test errorlog.py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9546 ) Change subject: Fix test dimensions in test_errorlog.py .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9546 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icac90d308337f2cfb51e7de5bd23d410da073a73 Gerrit-Change-Number: 9546 Gerrit-PatchSet: 4 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Sat, 10 Mar 2018 01:33:02 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix test dimensions in test errorlog.py
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9546 ) Change subject: Fix test dimensions in test_errorlog.py .. Fix test dimensions in test_errorlog.py Bug: The test dimensions were set up such that the test ran for all text-format variants (including compressed text) The test is file-format independent and slow (>40s), so we should only run it once. Change-Id: Icac90d308337f2cfb51e7de5bd23d410da073a73 Reviewed-on: http://gerrit.cloudera.org:8080/9546 Reviewed-by: Alex BehmTested-by: Impala Public Jenkins --- M tests/query_test/test_errorlog.py 1 file changed, 6 insertions(+), 5 deletions(-) Approvals: Alex Behm: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9546 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Icac90d308337f2cfb51e7de5bd23d410da073a73 Gerrit-Change-Number: 9546 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9563 ) Change subject: IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format .. IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 Reviewed-on: http://gerrit.cloudera.org:8080/9563 Reviewed-by: Alex BehmTested-by: Impala Public Jenkins --- M docs/topics/impala_langref_unsupported.xml 1 file changed, 8 insertions(+), 0 deletions(-) Approvals: Alex Behm: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9563 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 Gerrit-Change-Number: 9563 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9563 ) Change subject: IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9563 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 Gerrit-Change-Number: 9563 Gerrit-PatchSet: 3 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 10 Mar 2018 01:31:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9563 ) Change subject: IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/209/ -- To view, visit http://gerrit.cloudera.org:8080/9563 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 Gerrit-Change-Number: 9563 Gerrit-PatchSet: 3 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 10 Mar 2018 01:27:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9563 ) Change subject: IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9563 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 Gerrit-Change-Number: 9563 Gerrit-PatchSet: 3 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Comment-Date: Sat, 10 Mar 2018 01:26:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format
Hello Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9563 to look at the new patch set (#3). Change subject: IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format .. IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 --- M docs/topics/impala_langref_unsupported.xml 1 file changed, 8 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/9563/3 -- To view, visit http://gerrit.cloudera.org:8080/9563 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 Gerrit-Change-Number: 9563 Gerrit-PatchSet: 3 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni
[Impala-ASF-CR] IMPALA-3866 Improve error reporting for scratch write errors
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9420 ) Change subject: IMPALA-3866 Improve error reporting for scratch write errors .. Patch Set 7: (16 comments) http://gerrit.cloudera.org:8080/#/c/9420/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9420/7//COMMIT_MSG@7 PS7, Line 7: IMPALA-3866 Improve error reporting for scratch write errors nit: colon after JIRA number http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc File be/src/runtime/io/disk-io-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc@347 PS7, Line 347: nit: extra blank line. http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc@355 PS7, Line 355: int32_t* data = pool_.Add(new int32_t); Can't we stack allocate this? The memory shouldn't be referenced once writer is unregistered. I.e. int32_t data = rand(); AddWriteRange(..., , ...); http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc@362 PS7, Line 362: WriteRange** new_range = pool_.Add(new WriteRange*); Same with new_range - can't we stack allocate it? http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc@380 PS7, Line 380: write_range = pool_.Add(new WriteRange*); I don't think allocated the pointer on the heap is needed, is it? http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc@385 PS7, Line 385: *write_range = pool_.Add(new WriteRange(file_name, offset, 0, callback)); We already overwrote the value of 'write_range' on l380, so this doesn't actually return anything to the caller. Maybe we don't need 'write_range' to be an argument, even? http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-with-fault-injection.h File be/src/runtime/io/disk-io-mgr-with-fault-injection.h: http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-with-fault-injection.h@46 PS7, Line 46: virtual int Open(const char* file, int option1, int option2); Let's document explicitly which functions they're wrapping. http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-with-fault-injection.h@48 PS7, Line 48: ); Let's add an "override" specifier to the end of overriding functions. http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-with-fault-injection.h@58 PS7, Line 58: function_ Our convention is to not use underscores on struct members, since they're meant to be public. http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr.h@401 PS7, Line 401: // Wrapper functions around open(), fdopen(), fseek(), fwrite() and fclose(). The I feel like maybe there's already too much functionality in the DiskIoMgr class. Can we pull out these functions into a helper class that we can subclass? Maybe something like LocalFilesystem? http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr.h@404 PS7, Line 404: virtual int Open(const char* file, int option1, int option2); I also think if we're going to pull these out into functions, we should just make them return Status so they're consistent with the usual Impala calling conventions. This would also slightly reduce the amount of logic in the core DiskIoMgr class, which would be a win. http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr.cc@1120 PS7, Line 1120: } else { We can unnest the body of this "else" now that we're doing an early return on the other branch. http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.h File be/src/runtime/io/errno-to-error-status-converter.h: http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.h@34 PS7, Line 34: ErrnoToErrorStatusConverter Maybe we can just name this ErrorConverter to be more concise? http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.h@43 PS7, Line 43: params Can we document what 'params' does? http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.cc File be/src/runtime/io/errno-to-error-status-converter.cc: http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.cc@23 PS7, Line 23: using std::unordered_map; We can just do #include "common/names.h" under the other includes and remove these using statements. http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.cc@68 PS7, Line 68: for(const auto& item : params) { nit: needs space -- To view, visit
[Impala-ASF-CR] IMPALA-6621: Improve set lookup performance for in-predicate evaluation
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/9570 ) Change subject: IMPALA-6621: Improve set lookup performance for in-predicate evaluation .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9570/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9570/1//COMMIT_MSG@23 PS1, Line 23: | Data Type | Llvm 3 hybrid | Llvm 3 | Llvm 5 hybrid | Llvm 5 | > Which strategy is implemented by this patch? The strategy implemented by this patch is the hybrid one (on llvm3). The approach is detailed in the first paragraph of the commit message: "This patch takes a hybrid approach based on benchmarking results and uses boost::flat_set for int, big int, and float datatypes and boost::unordered_set for the rest (tiny int, small int, double, string, timestamp, decimal)." -- To view, visit http://gerrit.cloudera.org:8080/9570 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifd1627d779d10a16468cc3c2d0bc26a497e048df Gerrit-Change-Number: 9570 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Comment-Date: Sat, 10 Mar 2018 00:43:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6629: Clean up catalog update logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9566 ) Change subject: IMPALA-6629: Clean up catalog update logging .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9566 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa096b8c84304f28b37ac5e6794d688ba0a949a7 Gerrit-Change-Number: 9566 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Sat, 10 Mar 2018 00:42:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6629: Clean up catalog update logging
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9566 ) Change subject: IMPALA-6629: Clean up catalog update logging .. IMPALA-6629: Clean up catalog update logging IMPALA-5990 introduced some redundant and unclear logging in the process of assembling and sending catalog updates. This patch removes the duplication, rewords some logs, and adds a log message when a catalog update is fully assembled. Change-Id: Iaa096b8c84304f28b37ac5e6794d688ba0a949a7 Reviewed-on: http://gerrit.cloudera.org:8080/9566 Reviewed-by: Tianyi WangTested-by: Impala Public Jenkins --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/service/fe-support.cc M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/FeSupport.java 6 files changed, 25 insertions(+), 24 deletions(-) Approvals: Tianyi Wang: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9566 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iaa096b8c84304f28b37ac5e6794d688ba0a949a7 Gerrit-Change-Number: 9566 Gerrit-PatchSet: 3 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-6621: Improve set lookup performance for in-predicate evaluation
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9570 ) Change subject: IMPALA-6621: Improve set lookup performance for in-predicate evaluation .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9570/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9570/1//COMMIT_MSG@23 PS1, Line 23: | Data Type | Llvm 3 hybrid | Llvm 3 | Llvm 5 hybrid | Llvm 5 | Which strategy is implemented by this patch? -- To view, visit http://gerrit.cloudera.org:8080/9570 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifd1627d779d10a16468cc3c2d0bc26a497e048df Gerrit-Change-Number: 9570 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Alex Behm Gerrit-Comment-Date: Sat, 10 Mar 2018 00:29:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5191, IMPALA-6415: [DOCS] Document breaking change of alias and ordinal substitution
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/9211 ) Change subject: IMPALA-5191, IMPALA-6415: [DOCS] Document breaking change of alias and ordinal substitution .. Patch Set 1: > Alex, did you have some time to look at this change? The texts were fine. Let me build this locally and check the conref. -- To view, visit http://gerrit.cloudera.org:8080/9211 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I558230d07212da62d2cd12e07a52ceba03e980a8 Gerrit-Change-Number: 9211 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: John Russell Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sat, 10 Mar 2018 00:25:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6610: Impala shell fetches the value of ldap password cmd incorrectly
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9506 ) Change subject: IMPALA-6610: Impala shell fetches the value of ldap_password_cmd incorrectly .. Patch Set 4: I agree with Bharath that we should not do magic transformations on the output produced by the script. However, I think it's reasonable to inspect the password for common pitfalls and issue a warning in the shell, something like: "Warning: Ldap password contains trailing newline. Did you use echo instead of echo -n?" Improvements to the description of this parameter are also welcome. -- To view, visit http://gerrit.cloudera.org:8080/9506 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie570166aea62af223905b7f0124e9efb15a88ac7 Gerrit-Change-Number: 9506 Gerrit-PatchSet: 4 Gerrit-Owner: Donghui XuGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Fredy Wijaya Gerrit-Comment-Date: Sat, 10 Mar 2018 00:25:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6621: Improve set lookup performance for in-predicate evaluation
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9570 Change subject: IMPALA-6621: Improve set lookup performance for in-predicate evaluation .. IMPALA-6621: Improve set lookup performance for in-predicate evaluation Currently when using a SET_LOOKUP strategy for in-predicates in impala we use a std:set object for checking membership. This patch takes a hybrid approach based on benchmarking results and uses boost::flat_set for int, big int, and float datatypes and boost::unordered_set for the rest (tiny int, small int, double, string, timestamp, decimal). The intent of this change is to fix a regression when upgrading the toolchain to use LLVM 5.0.1 (IMPALA-5980). Performance: Ran a query for each data type with a large in predicate containing 500 elements on a single node with mt_dop set to 1. +---+---+--+---+--+ | Data Type | Llvm 3 hybrid | Llvm 3 | Llvm 5 hybrid | Llvm 5 | +---+---+--+---+--+ | Table used: tpch100_parquet.lineitem | +---+---+--+--+---+ | big int | 17s782ms | 13s941ms | 13s201ms | 25s604ms | | string| 40s750ms | 64s | 40s723ms | 73s | | decimal | 13s929ms | 22s272ms | 13s710ms | 34s338ms | | int | 19s368ms | 11s308ms | 9s169ms | 15s254ms | +---+---+--+--+---+ | Table used: alltypes with 33638400 rows | +---+---+--+--+---+ | double| 5s726ms | 5s894ms | 5s595ms | 6s592ms | | small int | 4s776ms | 5s057ms | 4s740ms | 5s358ms | | float | 7s223ms | 6s397ms | 6s287ms | 6s926ms | +---+---+--+---+--+ Also added a targeted perf query that uses a large in-predicate over a decimal column. Testing: - Ran expr-test and test_exprs successfully. Change-Id: Ifd1627d779d10a16468cc3c2d0bc26a497e048df --- M be/src/benchmarks/in-predicate-benchmark.cc M be/src/exprs/in-predicate.h M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M testdata/workloads/targeted-perf/queries/primitive_filter_in_predicate.test 4 files changed, 694 insertions(+), 189 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/9570/1 -- To view, visit http://gerrit.cloudera.org:8080/9570 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ifd1627d779d10a16468cc3c2d0bc26a497e048df Gerrit-Change-Number: 9570 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig
[Impala-ASF-CR] IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9563 ) Change subject: IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format .. Patch Set 2: -Code-Review New patch? -- To view, visit http://gerrit.cloudera.org:8080/9563 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 Gerrit-Change-Number: 9563 Gerrit-PatchSet: 2 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Comment-Date: Sat, 10 Mar 2018 00:13:55 + Gerrit-HasComments: No
[native-toolchain-CR] Bump LLVM to 5.0.1
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8932 ) Change subject: Bump LLVM to 5.0.1 .. Patch Set 2: Code-Review+2 This seems fine to merge at the same time as we switch it over in Impala. -- To view, visit http://gerrit.cloudera.org:8080/8932 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9b7b97cb135d202eaa9a0bae03e722a2505b712 Gerrit-Change-Number: 8932 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 10 Mar 2018 00:08:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format
Hello Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9563 to look at the new patch set (#2). Change subject: IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format .. IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 --- M docs/topics/impala_langref_unsupported.xml 1 file changed, 7 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/9563/2 -- To view, visit http://gerrit.cloudera.org:8080/9563 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 Gerrit-Change-Number: 9563 Gerrit-PatchSet: 2 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm
[Impala-ASF-CR] IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/9563 ) Change subject: IMPALA-6627: [DOCS] Hive incompatibility with serialization.null.format .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9563/1/docs/topics/impala_langref_unsupported.xml File docs/topics/impala_langref_unsupported.xml: http://gerrit.cloudera.org:8080/#/c/9563/1/docs/topics/impala_langref_unsupported.xml@191 PS1, Line 191: serialization.null.format property for Parquet and > weird indentation? Done -- To view, visit http://gerrit.cloudera.org:8080/9563 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic412c2fe2eba03d4493defee497656d6c74936f3 Gerrit-Change-Number: 9563 Gerrit-PatchSet: 1 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Comment-Date: Sat, 10 Mar 2018 00:07:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3193: Show table's comment on show tables
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8851 ) Change subject: IMPALA-3193: Show table's comment on show tables .. Patch Set 19: (3 comments) Nice work! Thanks for your patience with reviews. Dimitris is on vacation this week, and this change somehow slipped my radar. Thanks Jim for bringing this to my attention. This patch is ready to merge after my minor remaining comments. http://gerrit.cloudera.org:8080/#/c/8851/19/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/8851/19/be/src/service/client-request-state.cc@269 PS19, Line 269: DCHECK(names.size() == comments.size()); no need for this, already checked in SetResultSet() http://gerrit.cloudera.org:8080/#/c/8851/19/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java: http://gerrit.cloudera.org:8080/#/c/8851/19/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java@128 PS19, Line 128: public Table getTable(String dbName, String tableName) { I think we should remove these overrides because they can be super hard to reason about. If there are FE unit tests that need tables loaded before running, then I think those tests should do what is done in getTables() here. For an example where we follow that preferred practice, take a look at FrontendTest.TestGetTablesTypeTable() http://gerrit.cloudera.org:8080/#/c/8851/19/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java@136 PS19, Line 136: public List getTables(String dbName, PatternMatcher matcher) Please remove if possible, see above. -- To view, visit http://gerrit.cloudera.org:8080/8851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I44f814af05db6f3c027718ade9f474f8b8153bcd Gerrit-Change-Number: 8851 Gerrit-PatchSet: 19 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Sat, 10 Mar 2018 00:03:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6240: [DOCS] Document PARQUET ARRAY RESOLUTION query option
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9534 ) Change subject: IMPALA-6240: [DOCS] Document PARQUET_ARRAY_RESOLUTION query option .. IMPALA-6240: [DOCS] Document PARQUET_ARRAY_RESOLUTION query option Cherry-picks: not for 2.x Change-Id: I12696b609609ea16c05d8b7e84b2bae0be6d6cb5 Reviewed-on: http://gerrit.cloudera.org:8080/9534 Reviewed-by: Alex BehmTested-by: Impala Public Jenkins --- M docs/impala.ditamap M docs/impala_keydefs.ditamap A docs/topics/impala_parquet_array_resolution.xml 3 files changed, 208 insertions(+), 0 deletions(-) Approvals: Alex Behm: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I12696b609609ea16c05d8b7e84b2bae0be6d6cb5 Gerrit-Change-Number: 9534 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell
[Impala-ASF-CR] IMPALA-6240: [DOCS] Document PARQUET ARRAY RESOLUTION query option
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9534 ) Change subject: IMPALA-6240: [DOCS] Document PARQUET_ARRAY_RESOLUTION query option .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I12696b609609ea16c05d8b7e84b2bae0be6d6cb5 Gerrit-Change-Number: 9534 Gerrit-PatchSet: 4 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Comment-Date: Fri, 09 Mar 2018 23:37:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6240: [DOCS] Document PARQUET ARRAY RESOLUTION query option
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9534 ) Change subject: IMPALA-6240: [DOCS] Document PARQUET_ARRAY_RESOLUTION query option .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/208/ -- To view, visit http://gerrit.cloudera.org:8080/9534 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I12696b609609ea16c05d8b7e84b2bae0be6d6cb5 Gerrit-Change-Number: 9534 Gerrit-PatchSet: 4 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Comment-Date: Fri, 09 Mar 2018 23:29:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. IMPALA-6394: Restart HDFS when blocks are under replicated HDFS sometimes fails to fully replicate all the blocks in 30 seconds and no progress is made. This patch tries to restart HDFS several times before aborting the data loading. Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Reviewed-on: http://gerrit.cloudera.org:8080/9469 Reviewed-by: Alex BehmTested-by: Impala Public Jenkins --- M testdata/bin/create-load-data.sh 1 file changed, 14 insertions(+), 8 deletions(-) Approvals: Alex Behm: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 5 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 09 Mar 2018 22:54:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5886 & IMPALA-4812 Update run-tests.py to handle exit code 5
Nithya Janarthanan has posted comments on this change. ( http://gerrit.cloudera.org:8080/9494 ) Change subject: IMPALA-5886 & IMPALA-4812 Update run-tests.py to handle exit_code 5 .. Patch Set 11: https://jenkins.impala.io/job/gerrit-verify-dryrun-external/90/console - gerrit-dry-run was successful -- To view, visit http://gerrit.cloudera.org:8080/9494 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If82f974cc2d1e917464d4053563eaf4afc559150 Gerrit-Change-Number: 9494 Gerrit-PatchSet: 11 Gerrit-Owner: Nithya JanarthananGerrit-Reviewer: David Knupp Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nithya Janarthanan Gerrit-Comment-Date: Fri, 09 Mar 2018 22:13:54 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix test dimensions in test errorlog.py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9546 ) Change subject: Fix test dimensions in test_errorlog.py .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2079/ -- To view, visit http://gerrit.cloudera.org:8080/9546 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icac90d308337f2cfb51e7de5bd23d410da073a73 Gerrit-Change-Number: 9546 Gerrit-PatchSet: 4 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 09 Mar 2018 21:44:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9395 ) Change subject: IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9395/2/testdata/workloads/functional-query/queries/QueryTest/semi-joins-exhaustive.test File testdata/workloads/functional-query/queries/QueryTest/semi-joins-exhaustive.test: http://gerrit.cloudera.org:8080/#/c/9395/2/testdata/workloads/functional-query/queries/QueryTest/semi-joins-exhaustive.test@9 PS2, Line 9: # of the relevant fragment is 3.6GB when tested.) It might be worth reducing it further, since we could potentially still have problems if it was running with another test that had high memory consumption. Anyway, this is clearly an improvement. -- To view, visit http://gerrit.cloudera.org:8080/9395 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabca7a95560bd27c2de2b0a147ee9a3c45199db7 Gerrit-Change-Number: 9395 Gerrit-PatchSet: 2 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 21:31:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9395 ) Change subject: IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster. .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9395 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabca7a95560bd27c2de2b0a147ee9a3c45199db7 Gerrit-Change-Number: 9395 Gerrit-PatchSet: 2 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 21:30:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/9484 ) Change subject: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9484/3/testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test File testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test: http://gerrit.cloudera.org:8080/#/c/9484/3/testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test@339 PS3, Line 339:kudu predicates: l_shipdate < '1995-01-01', l_shipdate >= '1994-01-01' > To be clear: we should handle this in a separate patch I will add a follow on patch right now. -- To view, visit http://gerrit.cloudera.org:8080/9484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f7e4464dc6705cadd610a82c459390a9c0dfe4f Gerrit-Change-Number: 9484 Gerrit-PatchSet: 3 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 21:21:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9562 ) Change subject: IMPALA-6576: Add metrics for data stream service memory usage .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc@58 PS3, Line 58: DataStreamService nit: The naming conventions for all the other metrics seems to be like this: "data-stream-service" Let's be consistent with that. http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/util/memory-metrics.h File be/src/util/memory-metrics.h: http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/util/memory-metrics.h@203 PS3, Line 203: static BufferPoolMetric* LIMIT; : static BufferPoolMetric* SYSTEM_ALLOCATED; : static BufferPoolMetric* RESERVED; : static BufferPoolMetric* UNUSED_RESERVATION_BYTES; : static BufferPoolMetric* NUM_FREE_BUFFERS; : static BufferPoolMetric* FREE_BUFFER_BYTES; : static BufferPoolMetric* CLEAN_PAGES_LIMIT; : static BufferPoolMetric* NUM_CLEAN_PAGES; : static BufferPoolMetric* CLEAN_PAGE_BYTES; Do we need to make the metrics for MemTrackerMetric, globally visible like this? http://gerrit.cloudera.org:8080/#/c/9562/2/tests/custom_cluster/test_krpc_metrics.py File tests/custom_cluster/test_krpc_metrics.py: http://gerrit.cloudera.org:8080/#/c/9562/2/tests/custom_cluster/test_krpc_metrics.py@93 PS2, Line 93: metric_name = 'rpc.impala.DataStreamService.rpcs_queue_overflow' : before = self.get_metric(metric_name) : assert before == 0 : : self.client.execute(self.TEST_QUERY) : after = self.get_metric(metric_name) : assert before < after How about making this a helper function? And then taking metric_name as a parameter? We'll be adding more metrics in the future, so having this helper function would be nice. The 'before' check may have to go though, since it may not be 0 for anything after the first query is run. -- To view, visit http://gerrit.cloudera.org:8080/9562 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3 Gerrit-Change-Number: 9562 Gerrit-PatchSet: 3 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 21:02:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/9484 ) Change subject: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9484/3/testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test File testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test: http://gerrit.cloudera.org:8080/#/c/9484/3/testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test@339 PS3, Line 339:kudu predicates: l_shipdate < '1995-01-01', l_shipdate >= '1994-01-01' > Does the Kudu Java API support pushing predicates on DECIMAL columns? The p To be clear: we should handle this in a separate patch -- To view, visit http://gerrit.cloudera.org:8080/9484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f7e4464dc6705cadd610a82c459390a9c0dfe4f Gerrit-Change-Number: 9484 Gerrit-PatchSet: 3 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 20:50:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6629: Clean up catalog update logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9566 ) Change subject: IMPALA-6629: Clean up catalog update logging .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2077/ -- To view, visit http://gerrit.cloudera.org:8080/9566 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa096b8c84304f28b37ac5e6794d688ba0a949a7 Gerrit-Change-Number: 9566 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 09 Mar 2018 20:56:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6629: Clean up catalog update logging
Tianyi Wang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/9566 ) Change subject: IMPALA-6629: Clean up catalog update logging .. IMPALA-6629: Clean up catalog update logging IMPALA-5990 introduced some redundant and unclear logging in the process of assembling and sending catalog updates. This patch removes the duplication, rewords some logs, and adds a log message when a catalog update is fully assembled. Change-Id: Iaa096b8c84304f28b37ac5e6794d688ba0a949a7 --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/service/fe-support.cc M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/FeSupport.java 6 files changed, 25 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/9566/2 -- To view, visit http://gerrit.cloudera.org:8080/9566 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iaa096b8c84304f28b37ac5e6794d688ba0a949a7 Gerrit-Change-Number: 9566 Gerrit-PatchSet: 2 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] Fix test dimensions in test errorlog.py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9546 ) Change subject: Fix test dimensions in test_errorlog.py .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2074/ -- To view, visit http://gerrit.cloudera.org:8080/9546 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icac90d308337f2cfb51e7de5bd23d410da073a73 Gerrit-Change-Number: 9546 Gerrit-PatchSet: 4 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 09 Mar 2018 20:54:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/9484 ) Change subject: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9484/3/testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test File testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test: http://gerrit.cloudera.org:8080/#/c/9484/3/testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test@339 PS3, Line 339:kudu predicates: l_shipdate < '1995-01-01', l_shipdate >= '1994-01-01' Does the Kudu Java API support pushing predicates on DECIMAL columns? The predicates listed under "Kudu predicate" are pushed down into Kudu, reducing the number of rows Kudu has to return to Impala for the scan, where the ones listed under "predicates" are evaluated on the Impala side. Look at KuduScanNode.java and tryConvertBinaryKuduPredicate(), can we add DECIMAL to the switch there? Otherwise, we're missing an opportunity to improve perf in situations like this. I also happened to notice a bug while looking through KuduScanNode.java, filed: IMPALA-6635 -- To view, visit http://gerrit.cloudera.org:8080/9484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f7e4464dc6705cadd610a82c459390a9c0dfe4f Gerrit-Change-Number: 9484 Gerrit-PatchSet: 3 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 20:49:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/9195 ) Change subject: IMPALA-6337: Fix infinite loop in Impala shell .. Patch Set 10: > Patch Set 10: > > (2 comments) I decided to fix the bug in sqlparse instead: https://github.com/andialbrecht/sqlparse/pull/396 -- To view, visit http://gerrit.cloudera.org:8080/9195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b Gerrit-Change-Number: 9195 Gerrit-PatchSet: 10 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: David Knupp Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 09 Mar 2018 20:48:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9562 ) Change subject: IMPALA-6576: Add metrics for data stream service memory usage .. Patch Set 2: (6 comments) Thanks for the review. Please see my inline comments and PS3. http://gerrit.cloudera.org:8080/#/c/9562/2/be/src/service/data-stream-service.h File be/src/service/data-stream-service.h: http://gerrit.cloudera.org:8080/#/c/9562/2/be/src/service/data-stream-service.h@43 PS2, Line 43: metrics > nit: May be easier to follow to call it metrics_group ? Done http://gerrit.cloudera.org:8080/#/c/9562/2/be/src/util/memory-metrics.h File be/src/util/memory-metrics.h: http://gerrit.cloudera.org:8080/#/c/9562/2/be/src/util/memory-metrics.h@247 PS2, Line 247: each type > What does "each type" mean ? May want to elaborate exactly what this functi Done http://gerrit.cloudera.org:8080/#/c/9562/2/be/src/util/memory-metrics.h@254 PS2, Line 254: class > why not just enum MemTrackerMetricType ? I mostly used the new C++11 enum class for consistency, since the rest of the file does so (see L218 above). They also provide stronger type guarantees. http://gerrit.cloudera.org:8080/#/c/9562/2/be/src/util/memory-metrics.h@262 PS2, Line 262: MemTrackerMetricType type_ > Can this be const ? Yes, done. http://gerrit.cloudera.org:8080/#/c/9562/2/be/src/util/memory-metrics.h@263 PS2, Line 263: MemTracker* mem_tracker_ > What's the story with the lifecycle management of the metrics wrt mem_track They both are owned transitively by the exec env singleton which lives until the end of the process. If we were ever to destruct the exec env instance, the memtracker would get destructed before the webserver and before the metrics so there could be an issue. However, we seem to ignore this in general, e.g. rendering the /memz page during shutdown has the same issues. I updated the comment to make the requirement explicit. http://gerrit.cloudera.org:8080/#/c/9562/2/be/src/util/memory-metrics.cc File be/src/util/memory-metrics.cc: http://gerrit.cloudera.org:8080/#/c/9562/2/be/src/util/memory-metrics.cc@305 PS2, Line 305: void MemTrackerMetric::InitMetrics(MetricGroup* metrics, MemTracker* mem_tracker, const string& name) { > nit: long line Done -- To view, visit http://gerrit.cloudera.org:8080/9562 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3 Gerrit-Change-Number: 9562 Gerrit-PatchSet: 2 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 20:18:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage
Hello Michael Ho, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9562 to look at the new patch set (#3). Change subject: IMPALA-6576: Add metrics for data stream service memory usage .. IMPALA-6576: Add metrics for data stream service memory usage This change adds metrics for the data stream service memory usage. Both current and peak usage are exposed. It adds a test to test_krpc_metrics.py to make sure that the expected metrics are present and that the peak usage shows a non-zero value after running a query. Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3 --- M be/src/runtime/exec-env.cc M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M common/thrift/metrics.json M tests/custom_cluster/test_krpc_metrics.py 7 files changed, 104 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/9562/3 -- To view, visit http://gerrit.cloudera.org:8080/9562 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3 Gerrit-Change-Number: 9562 Gerrit-PatchSet: 3 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6610: Impala shell fetches the value of ldap password cmd incorrectly
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/9506 ) Change subject: IMPALA-6610: Impala shell fetches the value of ldap_password_cmd incorrectly .. Patch Set 4: Maybe we could fix (a) or (c) separately (or add more documentation if you think that is lacking) but my opinion is that the onus is on the end user to provide the correct command and the shell shouldn't make any changes like stripping newlines etc. Wondering if others have a different opinion on this? -- To view, visit http://gerrit.cloudera.org:8080/9506 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie570166aea62af223905b7f0124e9efb15a88ac7 Gerrit-Change-Number: 9506 Gerrit-PatchSet: 4 Gerrit-Owner: Donghui XuGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Fredy Wijaya Gerrit-Comment-Date: Fri, 09 Mar 2018 20:03:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6629: Clean up catalog update logging
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9566 ) Change subject: IMPALA-6629: Clean up catalog update logging .. Patch Set 1: Code-Review+2 (1 comment) I tried this patch and the new logging looks really nice! Thank you. http://gerrit.cloudera.org:8080/#/c/9566/1/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/9566/1/be/src/catalog/catalog-server.cc@463 PS1, Line 463: const uint8_t *item_data, uint32_t size, bool deleted) { move star one char to the right (const uint8_t* item_data) -- To view, visit http://gerrit.cloudera.org:8080/9566 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaa096b8c84304f28b37ac5e6794d688ba0a949a7 Gerrit-Change-Number: 9566 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Comment-Date: Fri, 09 Mar 2018 19:46:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/9395 ) Change subject: IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster. .. Patch Set 2: Tim, Please take a look at semi-joins-exhaustive.test? Thanks! -- To view, visit http://gerrit.cloudera.org:8080/9395 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabca7a95560bd27c2de2b0a147ee9a3c45199db7 Gerrit-Change-Number: 9395 Gerrit-PatchSet: 2 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 19:45:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/9540 ) Change subject: IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0 .. Patch Set 2: (1 comment) > We should also document the following new option which changes the > set reserved words for compatibility. We should be explicit that > this option is intended to buy existing users some time, and that > we plan to eventually remove this option altogether. > > DEFINE_string(reserved_words_version, "3.0.0", "Reserved words > compatibility version. " > "Reserved words cannot be used as identifiers in SQL. This flag > determines the impala" > " version from which the reserved word list is taken. The value > must be one of " > "[\"2.11.0\", \"3.0.0\"]."); > We should also document the following new option which changes the > set reserved words for compatibility. We should be explicit that > this option is intended to buy existing users some time, and that > we plan to eventually remove this option altogether. > > DEFINE_string(reserved_words_version, "3.0.0", "Reserved words > compatibility version. " > "Reserved words cannot be used as identifiers in SQL. This flag > determines the impala" > " version from which the reserved word list is taken. The value > must be one of " > "[\"2.11.0\", \"3.0.0\"]."); I mentioned this option in the upgrade guide. Since it is only a temporary option, I don't think it should be in this doc. http://gerrit.cloudera.org:8080/#/c/9540/2/docs/topics/impala_reserved_words.xml File docs/topics/impala_reserved_words.xml: http://gerrit.cloudera.org:8080/#/c/9540/2/docs/topics/impala_reserved_words.xml@71 PS2, Line 71: > Please ignore, sorry. Done -- To view, visit http://gerrit.cloudera.org:8080/9540 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09dbaf5e2d22f07a8bca8a601c04e94f3a4b36a0 Gerrit-Change-Number: 9540 Gerrit-PatchSet: 2 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 09 Mar 2018 19:45:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5690: Part 2: Upgrade thrift to 0.9.3-p3
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/9300 ) Change subject: IMPALA-5690: Part 2: Upgrade thrift to 0.9.3-p3 .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/9300/7/be/src/rpc/thrift-server.h File be/src/rpc/thrift-server.h: http://gerrit.cloudera.org:8080/#/c/9300/7/be/src/rpc/thrift-server.h@323 PS7, Line 323: TLSv1_0 > Done I don't understand why it's equivalent. In 0.9.0 TLSv1_method() is called here https://github.com/apache/thrift/blob/0.9.x/lib/cpp/src/thrift/transport/TSSLSocket.cpp#L60, but SSLTLS in 0.9.3 correspond to TLSv23_method(). http://gerrit.cloudera.org:8080/#/c/9300/7/be/src/rpc/thrift-server.cc File be/src/rpc/thrift-server.cc: http://gerrit.cloudera.org:8080/#/c/9300/7/be/src/rpc/thrift-server.cc@99 PS7, Line 99: return protocol != SSLv3 && protocol != TLSv1_2; > I think we still need SSLTLS at https://gerrit.cloudera.org/c/9300/7/be/src Do we still need SSLTLS here? https://gerrit.cloudera.org/c/9300/7/be/src/rpc/thrift-server.h#323 -- To view, visit http://gerrit.cloudera.org:8080/9300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I639227721502eaa10398d9490ff6ac63aa71b3a6 Gerrit-Change-Number: 9300 Gerrit-PatchSet: 7 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Fri, 09 Mar 2018 19:43:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster.
Hello Tim Armstrong, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9395 to look at the new patch set (#2). Change subject: IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster. .. IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster. We've observed empirically that giving Impala 80% of system memory doesn't leave enough room for the minicluster and ASAN overhead, leading to the OOM killer striking during test runs (sometimes). This commit reduces the threshold to 70%. This commit also reduces the memory usage of semi-joins-exhaustive.test by roughly halving the number of records it deals with. This was necessary for tests to pass on a machine with 32GB of RAM. Testing: I've run the ASAN build (more) happily with this change. I've run exhaustive tests on a 32GB machine. Change-Id: Iabca7a95560bd27c2de2b0a147ee9a3c45199db7 --- M bin/start-impala-cluster.py M testdata/workloads/functional-query/queries/QueryTest/semi-joins-exhaustive.test 2 files changed, 13 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/9395/2 -- To view, visit http://gerrit.cloudera.org:8080/9395 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iabca7a95560bd27c2de2b0a147ee9a3c45199db7 Gerrit-Change-Number: 9395 Gerrit-PatchSet: 2 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6573: Create consistent response on column access failures
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9509 ) Change subject: IMPALA-6573: Create consistent response on column access failures .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/9509/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/9509/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2394 PS2, Line 2394: .setColumnVisible(false).toRequest()); I don't think this is the right fix. Based on your tests the discrepancy in DESCRIBE is as follows: describe functional.complextypes_fileformat.id Reports: .* versus describe functional.complextypes_fileformat.zz Reports: Imo the bug is in DescribeTableStmt.java in L121 in the handling of the AnalysisException returned from path.resolve(). The right fix is to change the handling of rawPath_.size() > 1 to the following slightly different snippet: if (rawPath_.size() > 1) { analyzer.registerPrivReq(new PrivilegeRequestBuilder() .any().onAnyColumn(rawPath_.get(0), rawPath_.get(1)).toRequest()); } After that change both queries above consistently report .* http://gerrit.cloudera.org:8080/#/c/9509/2/fe/src/main/java/org/apache/impala/authorization/AuthorizeableColumn.java File fe/src/main/java/org/apache/impala/authorization/AuthorizeableColumn.java: http://gerrit.cloudera.org:8080/#/c/9509/2/fe/src/main/java/org/apache/impala/authorization/AuthorizeableColumn.java@63 PS2, Line 63: if (isColumnVisible()) { For changes like this we need to be really convinced that we are fixing the right thing. Changing the output of getName() for a AuthorizeableColumn seems really odd. Why would the name of the column change depending on some flag? Semantics seem weird to me. Anyway, see my other comment on why I think this isn't even the right fix. http://gerrit.cloudera.org:8080/#/c/9509/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java: http://gerrit.cloudera.org:8080/#/c/9509/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@574 PS2, Line 574: AuthzError("select 1 from functional.complex_nested_struct_col.f1", As far as I can tell, these tests also worked as expected before the fix (I tried on my machine). Did you just add the for completeness? http://gerrit.cloudera.org:8080/#/c/9509/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@1489 PS2, Line 1489: AuthzError("describe functional.complextypes_fileformat.id", These tests also pass without your fix because we prefix match the expected error message, so they don't really cover this bug. -- To view, visit http://gerrit.cloudera.org:8080/9509 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9676a21a171862bb1f664e82d0a7d1db0d36462a Gerrit-Change-Number: 9509 Gerrit-PatchSet: 2 Gerrit-Owner: Adam HolleyGerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Comment-Date: Fri, 09 Mar 2018 19:34:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6629: Clean up catalog update logging
Tianyi Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9566 Change subject: IMPALA-6629: Clean up catalog update logging .. IMPALA-6629: Clean up catalog update logging IMPALA-5990 introduced some redundant and unclear logging in the process of assembling and sending catalog updates. This patch removes the duplication, rewords some logs, and adds a log message when a catalog update is fully assembled. Change-Id: Iaa096b8c84304f28b37ac5e6794d688ba0a949a7 --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/service/fe-support.cc M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/service/FeSupport.java 6 files changed, 25 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/9566/1 -- To view, visit http://gerrit.cloudera.org:8080/9566 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iaa096b8c84304f28b37ac5e6794d688ba0a949a7 Gerrit-Change-Number: 9566 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang
[Impala-ASF-CR] IMPALA-6394: Restart HDFS when blocks are under replicated
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9469 ) Change subject: IMPALA-6394: Restart HDFS when blocks are under replicated .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2076/ -- To view, visit http://gerrit.cloudera.org:8080/9469 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iefd4c2fc6c287f054e385de52bdc42b0bdbd7915 Gerrit-Change-Number: 9469 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 09 Mar 2018 19:16:37 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6405: Error when string to decimal cast overflows
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9530 ) Change subject: IMPALA-6405: Error when string to decimal cast overflows .. Patch Set 1: Verified-1 Both builds have failed. Can you have a look? -- To view, visit http://gerrit.cloudera.org:8080/9530 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: Icffccac1c1c2361447ae4b0de9b6c2ec7de071db Gerrit-Change-Number: 9530 Gerrit-PatchSet: 1 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Comment-Date: Fri, 09 Mar 2018 18:43:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9540 ) Change subject: IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0 .. Patch Set 2: We should also document the following new option which changes the set reserved words for compatibility. We should be explicit that this option is intended to buy existing users some time, and that we plan to eventually remove this option altogether. DEFINE_string(reserved_words_version, "3.0.0", "Reserved words compatibility version. " "Reserved words cannot be used as identifiers in SQL. This flag determines the impala" " version from which the reserved word list is taken. The value must be one of " "[\"2.11.0\", \"3.0.0\"]."); -- To view, visit http://gerrit.cloudera.org:8080/9540 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09dbaf5e2d22f07a8bca8a601c04e94f3a4b36a0 Gerrit-Change-Number: 9540 Gerrit-PatchSet: 2 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 09 Mar 2018 17:57:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9540 ) Change subject: IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0 .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9540/2/docs/topics/impala_reserved_words.xml File docs/topics/impala_reserved_words.xml: http://gerrit.cloudera.org:8080/#/c/9540/2/docs/topics/impala_reserved_words.xml@71 PS2, Line 71: > Without looking at Please ignore, sorry. -- To view, visit http://gerrit.cloudera.org:8080/9540 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09dbaf5e2d22f07a8bca8a601c04e94f3a4b36a0 Gerrit-Change-Number: 9540 Gerrit-PatchSet: 2 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 09 Mar 2018 17:47:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9540 ) Change subject: IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0 .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/9540/2/docs/topics/impala_reserved_words.xml File docs/topics/impala_reserved_words.xml: http://gerrit.cloudera.org:8080/#/c/9540/2/docs/topics/impala_reserved_words.xml@22 PS2, Line 22: Impala Reserved Words I think this page could use some general revamping. In particular, we should show how the list of Impala keywords across different Impala versions stacks up against the SQL2016 list of reserved words. Tianyi has a nice spreadsheet. That table will show that some SQL2016 reserved words are not reserved in Impala and that some Impala reserved words are not part of SQL2016. Given that discrepancy, we should explain Impala's process for deciding whether a particular SQL2016 word should be reserved or not in Impala. Tianyi can provide more details, but I can summarize the process roughly here: * By default, Impala wants to be standard conforming, so we wish to have the same list of reserved words as SQL2016 * At the same time, Impala wants to be compatible with earlier versions of Impala to avoid breaking existing tables/workloads. So we remove from the SQL2016 list of reserved words those that are Impala built-in functions that do not need to be reserved (COUNT, AVG are examples) * For those remaining SQL2016 words we use our gut feeling to decide whether user are likely to have databases/tables/columns with those names. If we feel that a word is likely to be used already then we look whether Impala is going to need to reserve it to implement new SQL features that use it. If a word is likely to be used and there is a low chance of Impala need it to be reserved in the future, then we do not reserve that word. Otherwise, we reserve that SQL2016 word in Impala. http://gerrit.cloudera.org:8080/#/c/9540/2/docs/topics/impala_reserved_words.xml@71 PS2, Line 71: Without looking at -- To view, visit http://gerrit.cloudera.org:8080/9540 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I09dbaf5e2d22f07a8bca8a601c04e94f3a4b36a0 Gerrit-Change-Number: 9540 Gerrit-PatchSet: 2 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 09 Mar 2018 17:47:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/9005 ) Change subject: IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries .. Patch Set 13: PS13 was only a rebase. -- To view, visit http://gerrit.cloudera.org:8080/9005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0f52b93a60eeacedd242a2f17fa6b99c4fc38e06 Gerrit-Change-Number: 9005 Gerrit-PatchSet: 13 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 09 Mar 2018 17:38:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries
Hello Attila Jeges, Dimitris Tsirogiannis, Tim Armstrong, Csaba Ringhofer, Alex Behm, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9005 to look at the new patch set (#13). Change subject: IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries .. IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries If a scalar subquery is used with a binary predicate, or, used in an arithmetic expression, it must return only one row/column to be valid. If this cannot be guaranteed at parse time through a single row aggregate or limit clause, Impala fails the query like such. E.g., currently the following query is not allowed: SELECT bigint_col FROM alltypesagg WHERE id = (SELECT id FROM alltypesagg WHERE id = 1) However, it would be allowed if the query contained a LIMIT 1 clause, or instead of id it was max(id). This commit makes the example valid by introducing a runtime check to test if the subquery returns a single row. If the subquery returns more than one row, it aborts the query with an error. I added a new node type, called CardinalityCheckNode. It is created during planning on top of the subquery when needed, then during execution it checks if its child only returns a single row. I extended the frontend tests and e2e tests as well. Change-Id: I0f52b93a60eeacedd242a2f17fa6b99c4fc38e06 --- M be/src/exec/CMakeLists.txt A be/src/exec/cardinality-check-node.cc A be/src/exec/cardinality-check-node.h M be/src/exec/exec-node.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/StatementBase.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java A fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test M testdata/workloads/functional-query/queries/QueryTest/subquery.test 28 files changed, 958 insertions(+), 78 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/9005/13 -- To view, visit http://gerrit.cloudera.org:8080/9005 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0f52b93a60eeacedd242a2f17fa6b99c4fc38e06 Gerrit-Change-Number: 9005 Gerrit-PatchSet: 13 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-6324: Support reading RLE-encoded boolean values in Parquet scanner
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9403 ) Change subject: IMPALA-6324: Support reading RLE-encoded boolean values in Parquet scanner .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9403/3/be/src/util/rle-encoding.h File be/src/util/rle-encoding.h: http://gerrit.cloudera.org:8080/#/c/9403/3/be/src/util/rle-encoding.h@674 PS3, Line 674:memset(values + num_consumed, repeated_value, num_repeats_to_set); > I have created a benchmark and it turned out that memset and the for loop Yes, please do add the benchmark. The old benchmark was comparing two old implementations that were no longer relevant. -- To view, visit http://gerrit.cloudera.org:8080/9403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4644bf8cf5d2b7238b05076407fbf78ab5d2c14f Gerrit-Change-Number: 9403 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 17:34:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6324: Support reading RLE-encoded boolean values in Parquet scanner
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/9403 ) Change subject: IMPALA-6324: Support reading RLE-encoded boolean values in Parquet scanner .. Patch Set 3: (2 comments) There is something wrong with the decoding, so I have no patch ready for review, but I have a question related to this commit in the comment for rle-encoding.h. http://gerrit.cloudera.org:8080/#/c/9403/3/be/src/util/rle-encoding.h File be/src/util/rle-encoding.h: http://gerrit.cloudera.org:8080/#/c/9403/3/be/src/util/rle-encoding.h@674 PS3, Line 674:memset(values + num_consumed, repeated_value, num_repeats_to_set); > We could also be lazy and not implement it for sizeof(T) > 1 until we actua I have created a benchmark and it turned out that memset and the for loop are actually different, memset seems to be a few percent faster for short runs (<=32), and the loop is a few percent faster for long runs (>=128). These results are surprising to me, I have expected the opposite. Should I add the benchmark (rle-benchmark.cc) to this commit? I saw that you have removed a file with similar name in https://gerrit.cloudera.org/#/c/8267/ http://gerrit.cloudera.org:8080/#/c/9403/3/testdata/workloads/functional-query/queries/QueryTest/parquet-rle-encoded-bool.test File testdata/workloads/functional-query/queries/QueryTest/parquet-rle-encoded-bool.test: http://gerrit.cloudera.org:8080/#/c/9403/3/testdata/workloads/functional-query/queries/QueryTest/parquet-rle-encoded-bool.test@4 PS3, Line 4: select count(*) from rle_encoded_bool where b; > This wouldn't detect if the boolean values were returned in the wrong order Thanks for the tip, I have created such a table, and something is actually wrong with the decoded values. I am still investigating the issue. -- To view, visit http://gerrit.cloudera.org:8080/9403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4644bf8cf5d2b7238b05076407fbf78ab5d2c14f Gerrit-Change-Number: 9403 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 17:25:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6605: Exception hidden on complex types
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9514 ) Change subject: IMPALA-6605: Exception hidden on complex types .. Patch Set 12: Fredy and I discussed this change and we concluded that we are not convinced that it works as intended in all cases, or whether it introduces unintended consequences or inconsistency in behavior. This code is very tricky and we need to tread carefully. To move forward with this change, I think we need to first add a auth test suite that systematically enumerates the cases and tests their expected behavior. Once we have a complete picture of the challenging cases, we can talk about modifying this code. My gut feeling is that some internal interfaces are not quite right for serving this new desired behavior, and making such a drastic change only makes sense when all cases are well understood and tested. The different combinations of cases in my mind (there might be more): * All path types: table ref, slot ref, star * Absolute table/slot/star paths * Relative table/slot/star paths (rooted a registered alias) * Failure cases: 1. path has no resolution 2. path has one or more resolutions, but they are not legal for the given path type 3. path has multiple legal resolutions (path is ambiguous) -- To view, visit http://gerrit.cloudera.org:8080/9514 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3ae62fe63b82899d1467a23422d4a73f1471aa84 Gerrit-Change-Number: 9514 Gerrit-PatchSet: 12 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Comment-Date: Fri, 09 Mar 2018 17:22:30 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix test dimensions in test errorlog.py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9546 ) Change subject: Fix test dimensions in test_errorlog.py .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2074/ -- To view, visit http://gerrit.cloudera.org:8080/9546 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icac90d308337f2cfb51e7de5bd23d410da073a73 Gerrit-Change-Number: 9546 Gerrit-PatchSet: 4 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 09 Mar 2018 17:13:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL
Hello Thomas Tauber-Marshall, Dimitris Tsirogiannis, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9484 to look at the new patch set (#3). Change subject: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL .. IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL Before Kudu supported DECIMAL columns the TPCDS and TPCH columns were djusted to use DOUBLE in place of DECIMAL. This patch undoes that change now that Kudu supports DECIMAL. Testing: - Updated the Kudu planner tests - Updated concurrent_select.py - Updated test_tpch_queries.py Change-Id: I2f7e4464dc6705cadd610a82c459390a9c0dfe4f --- M testdata/datasets/tpcds/tpcds_kudu_template.sql M testdata/datasets/tpch/tpch_kudu_template.sql M testdata/datasets/tpch/tpch_schema_template.sql M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test D testdata/workloads/tpcds/queries/tpcds-kudu-q19.test D testdata/workloads/tpcds/queries/tpcds-kudu-q27.test D testdata/workloads/tpcds/queries/tpcds-kudu-q3.test D testdata/workloads/tpcds/queries/tpcds-kudu-q34.test D testdata/workloads/tpcds/queries/tpcds-kudu-q42.test D testdata/workloads/tpcds/queries/tpcds-kudu-q43.test D testdata/workloads/tpcds/queries/tpcds-kudu-q46.test D testdata/workloads/tpcds/queries/tpcds-kudu-q47.test D testdata/workloads/tpcds/queries/tpcds-kudu-q52.test D testdata/workloads/tpcds/queries/tpcds-kudu-q53.test D testdata/workloads/tpcds/queries/tpcds-kudu-q55.test D testdata/workloads/tpcds/queries/tpcds-kudu-q59.test D testdata/workloads/tpcds/queries/tpcds-kudu-q6.test D testdata/workloads/tpcds/queries/tpcds-kudu-q61.test D testdata/workloads/tpcds/queries/tpcds-kudu-q63.test D testdata/workloads/tpcds/queries/tpcds-kudu-q65.test D testdata/workloads/tpcds/queries/tpcds-kudu-q68.test D testdata/workloads/tpcds/queries/tpcds-kudu-q7.test D testdata/workloads/tpcds/queries/tpcds-kudu-q73.test D testdata/workloads/tpcds/queries/tpcds-kudu-q79.test D testdata/workloads/tpcds/queries/tpcds-kudu-q8.test D testdata/workloads/tpcds/queries/tpcds-kudu-q88.test D testdata/workloads/tpcds/queries/tpcds-kudu-q89.test D testdata/workloads/tpcds/queries/tpcds-kudu-q96.test D testdata/workloads/tpcds/queries/tpcds-kudu-q98.test D testdata/workloads/tpch/queries/tpch-kudu-q1.test D testdata/workloads/tpch/queries/tpch-kudu-q10.test D testdata/workloads/tpch/queries/tpch-kudu-q11.test D testdata/workloads/tpch/queries/tpch-kudu-q12.test D testdata/workloads/tpch/queries/tpch-kudu-q13.test D testdata/workloads/tpch/queries/tpch-kudu-q14.test D testdata/workloads/tpch/queries/tpch-kudu-q15.test D testdata/workloads/tpch/queries/tpch-kudu-q16.test D testdata/workloads/tpch/queries/tpch-kudu-q17.test D testdata/workloads/tpch/queries/tpch-kudu-q18.test D testdata/workloads/tpch/queries/tpch-kudu-q19.test D testdata/workloads/tpch/queries/tpch-kudu-q2.test D testdata/workloads/tpch/queries/tpch-kudu-q20.test D testdata/workloads/tpch/queries/tpch-kudu-q21.test D testdata/workloads/tpch/queries/tpch-kudu-q22.test D testdata/workloads/tpch/queries/tpch-kudu-q3.test D testdata/workloads/tpch/queries/tpch-kudu-q4.test D testdata/workloads/tpch/queries/tpch-kudu-q5.test D testdata/workloads/tpch/queries/tpch-kudu-q6.test D testdata/workloads/tpch/queries/tpch-kudu-q7.test D testdata/workloads/tpch/queries/tpch-kudu-q8.test D testdata/workloads/tpch/queries/tpch-kudu-q9.test M tests/query_test/test_tpch_queries.py M tests/stress/concurrent_select.py 53 files changed, 111 insertions(+), 22,160 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/9484/3 -- To view, visit http://gerrit.cloudera.org:8080/9484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f7e4464dc6705cadd610a82c459390a9c0dfe4f Gerrit-Change-Number: 9484 Gerrit-PatchSet: 3 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Fix test dimensions in test errorlog.py
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9546 ) Change subject: Fix test dimensions in test_errorlog.py .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9546 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icac90d308337f2cfb51e7de5bd23d410da073a73 Gerrit-Change-Number: 9546 Gerrit-PatchSet: 4 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 09 Mar 2018 17:10:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/9484 ) Change subject: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL .. Patch Set 3: Fixed and ran the FE tests locally. I also kicked off an exhaustive pre-review test to make sure I didn't miss anything here: https://jenkins.impala.io/job/pre-review-test/149/ -- To view, visit http://gerrit.cloudera.org:8080/9484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f7e4464dc6705cadd610a82c459390a9c0dfe4f Gerrit-Change-Number: 9484 Gerrit-PatchSet: 3 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 17:06:44 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix test dimensions in test errorlog.py
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/9546 ) Change subject: Fix test dimensions in test_errorlog.py .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9546 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icac90d308337f2cfb51e7de5bd23d410da073a73 Gerrit-Change-Number: 9546 Gerrit-PatchSet: 3 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Fri, 09 Mar 2018 16:26:13 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-5717: Build ORC C++ lib in toolchain
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/9274 ) Change subject: IMPALA-5717: Build ORC C++ lib in toolchain .. Patch Set 5: Code-Review+1 OK, thank you so much! -- To view, visit http://gerrit.cloudera.org:8080/9274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e Gerrit-Change-Number: 9274 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 09 Mar 2018 09:11:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6605: Exception hidden on complex types
Fredy Wijaya has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/9514 ) Change subject: IMPALA-6605: Exception hidden on complex types .. IMPALA-6605: Exception hidden on complex types This patch fixes the issue when we have a column privilege on a particular table, but the statement has an analysis error. Example: > grant select(int_struct_col) on table functional.allcomplextypes > to role test_role; > select 1 from functional.allcomplextypes.int_struct_col; The select statement above should throw an AnalysisException and not an AuthorizationException since we have int_struct_col column privilege on functional.allcomplextypes table. Testing: - Ran all front-end tests Change-Id: I3ae62fe63b82899d1467a23422d4a73f1471aa84 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/common/PathResolutionException.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java 3 files changed, 115 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/9514/12 -- To view, visit http://gerrit.cloudera.org:8080/9514 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3ae62fe63b82899d1467a23422d4a73f1471aa84 Gerrit-Change-Number: 9514 Gerrit-PatchSet: 12 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Adam Holley Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya