[Impala-ASF-CR] IMPALA-4277: remove references for unsupported s3/s3n connectors
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4277: remove references for unsupported s3/s3n connectors .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibfadd2bc91c7dbcb6f2bc962c404caea30f9b776 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3739: Enable stress tests on Kudu
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-3739: Enable stress tests on Kudu .. Patch Set 10: Code-Review+2 Fix data loading for Kudu tables and rebase. Carry MJ's +2. -- To view, visit http://gerrit.cloudera.org:8080/4327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3739: Enable stress tests on Kudu
Hello Michael Brown, Matthew Jacobs, Internal Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4327 to look at the new patch set (#10). Change subject: IMPALA-3739: Enable stress tests on Kudu .. IMPALA-3739: Enable stress tests on Kudu This commit modifies the stress test framework to run TPC-H and TPC-DS workloads against Kudu. The follwing changes are included in this commit: 1. Created template files with DDL and DML statements for loading TPC-H and TPC-DS data in Kudu 2. Created a script (load-tpc-kudu.py) to load data in Kudu. The script is invoked by the stress test runner to load test data in an existing Impala/Kudu cluster (both local and CM-managed clusters are supported). 3. Created SQL files with TPC-DS queries to be executed in Kudu. SQL files with TPC-H queries for Kudu were added in a previous patch. 4. Modified the stress test runner to take additional parameters specific to Kudu (e.g. kudu master addr) The stress test runner for Kudu was tested on EC2 clusters for both TPC-H and TPC-DS workloads. Missing functionality: * No CRUD operations in the existing TPC-H/TPC-DS workloads for Kudu. * Not all supported TPC-DS queries are included. Currently, only the TPC-DS queries from the testdata/workloads/tpcds/queries directory were modified to run against Kudu. Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34 --- A testdata/bin/load-tpc-kudu.py M testdata/datasets/functional/functional_schema_template.sql A testdata/datasets/tpcds/tpcds_kudu_template.sql A testdata/datasets/tpch/tpch_kudu_template.sql A testdata/workloads/tpcds/queries/tpcds-kudu-q19.test A testdata/workloads/tpcds/queries/tpcds-kudu-q27.test A testdata/workloads/tpcds/queries/tpcds-kudu-q3.test A testdata/workloads/tpcds/queries/tpcds-kudu-q34.test A testdata/workloads/tpcds/queries/tpcds-kudu-q42.test A testdata/workloads/tpcds/queries/tpcds-kudu-q43.test A testdata/workloads/tpcds/queries/tpcds-kudu-q46.test A testdata/workloads/tpcds/queries/tpcds-kudu-q47.test A testdata/workloads/tpcds/queries/tpcds-kudu-q52.test A testdata/workloads/tpcds/queries/tpcds-kudu-q53.test A testdata/workloads/tpcds/queries/tpcds-kudu-q55.test A testdata/workloads/tpcds/queries/tpcds-kudu-q59.test A testdata/workloads/tpcds/queries/tpcds-kudu-q6.test A testdata/workloads/tpcds/queries/tpcds-kudu-q61.test A testdata/workloads/tpcds/queries/tpcds-kudu-q63.test A testdata/workloads/tpcds/queries/tpcds-kudu-q65.test A testdata/workloads/tpcds/queries/tpcds-kudu-q68.test A testdata/workloads/tpcds/queries/tpcds-kudu-q7.test A testdata/workloads/tpcds/queries/tpcds-kudu-q73.test A testdata/workloads/tpcds/queries/tpcds-kudu-q79.test A testdata/workloads/tpcds/queries/tpcds-kudu-q8.test A testdata/workloads/tpcds/queries/tpcds-kudu-q88.test A testdata/workloads/tpcds/queries/tpcds-kudu-q89.test A testdata/workloads/tpcds/queries/tpcds-kudu-q96.test A testdata/workloads/tpcds/queries/tpcds-kudu-q98.test M tests/comparison/db_connection.py M tests/stress/concurrent_select.py 31 files changed, 2,459 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/4327/10 -- To view, visit http://gerrit.cloudera.org:8080/4327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4329: Prevent crash in scheduler when no backends are registered
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4329: Prevent crash in scheduler when no backends are registered .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6d93158f34841ea66dc3682290266262c87ea7ff Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4277: remove references for unsupported s3/s3n connectors
Alex Behm has posted comments on this change. Change subject: IMPALA-4277: remove references for unsupported s3/s3n connectors .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibfadd2bc91c7dbcb6f2bc962c404caea30f9b776 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT DOP > 0.
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4767 to look at the new patch set (#4). Change subject: IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT_DOP > 0. .. IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT_DOP > 0. IMPALA-4258: The problem was that there was a reference to HdfsScanner::batch_ hidden inside WriteEmptyTuples(). The batch_ reference is NULL when the scanner is run with MT_DOP > 1. IMPALA-4286: When there are no scan ranges HdfsScanNodeBase::Open() exits early without initializing the reader context. This lead to a DCHECK in IoMgr::GetNextRange() called from HdfsScanNodeMt. The fix is to short-circuit the empty scan range case in HdfsScanNodeMt::GetNext(). I combined these two bugfixes because the new basic test covers both cases. Testing: Added a new test_mt_dop.py test. Change-Id: I79c0f6fd2aeb4bc6fa5f87219a485194fef2db1b --- M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-rcfile-scanner.cc M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc A testdata/workloads/functional-query/queries/QueryTest/mt-dop.test A tests/query_test/test_mt_dop.py 10 files changed, 88 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/4767/4 -- To view, visit http://gerrit.cloudera.org:8080/4767 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I79c0f6fd2aeb4bc6fa5f87219a485194fef2db1b Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] IMPALA-4155: Update default partition when table is altered
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4155: Update default partition when table is altered .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4750 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I59bf21caa5c5e7867d07d87cda0c0a5b4b994859 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3342: Add thread counters to monitor plan fragment execution .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/4633/6//COMMIT_MSG Commit Message: Line 9: This change removes the use of total_cpu_timer which incorrectly > Do you mean provide a snippet of the profile in the commit message? Yes http://gerrit.cloudera.org:8080/#/c/4633/6/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: Line 138 Why don't we create the thread counters here like before? This seems much cleaner rather than having PlanFragmentExecutor do it. http://gerrit.cloudera.org:8080/#/c/4633/6/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: Line 242: RuntimeProfile::ThreadCounters* plan_fragment_counters_; > We access this object variable in plan-fragment-executor.Hence I kept it p The convention is that the trailing underscore means that the member is private. Generally all class members should be private and exposed if needed via accessor functions. Sometimes we use friend classes if two classes are very tightly coupled by design. -- To view, visit http://gerrit.cloudera.org:8080/4633 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadkeGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: anujphadke Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
Michael Ho has posted comments on this change. Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions .. Patch Set 4: (13 comments) http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/codegen/codegen-anyval.h File be/src/codegen/codegen-anyval.h: PS4, Line 242: if (val.is_null) goto null_block; Can you please elaborate this comment with code snippet like the following ? if (val.is_null) goto null_block; non_null_block: //default entry point after calling this function null_block: http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: PS4, Line 1065: can ignore the NULL bit of its destination value is : // ignored ? PS4, Line 1067: set did you mean with NULL bit cleared ? Line 1599: RETURN_IF_ERROR(agg_expr->GetCodegendComputeFn(state_, _expr_fn)); DCHECK(agg_expr_fn != NULL); PS4, Line 1620: src.CodegenBranchIfNull(, ret_block); It seems that we emit this check for all paths except for the UDA path. Why is it safe to make the assumption that 'src' is not NULL for the UDA path ? Also, it appears that we may consider doing some refactoring by computing 'bool use_uda_interface' up front and only emit this cmp-and-branch if that's false. Please see comments below about 'use_uda_interface'. PS4, Line 1628: dst_type.type != TYPE_TIMESTAMP : && !dst_type.IsStringType() This seems to be reused in multiple places. May be it's worth factoring this expression out. PS4, Line 1637: dst_type.type != TYPE_TIMESTAMP : && !dst_type.IsStringType() It may be easier to read if we do the check once up front: agg_op = evaluator->agg_op(); bool use_uda_interface = agg_op != COUNT && (dst_type.type == TYPE_TIMESTAMP || dst_type.IsStringType || (dst_type.type == TYPE_DECIMAL && agg_op() == AggFnEvaluator::SUM); if (use_uda_interface) { } else { switch (agg_op) { } } PS4, Line 1676: src.CodegenBranchIfNull(, ret_block); Actually, referring to the comment above again, it seems that we should just do the check before the if-stmt sequence and emit the cmp-and-branch instructions up front for these cases. IMHO, I find the new code harder to follow because the basic blocks in the emitted code are not as explicitly laid out as the old code. PS4, Line 1685: resulting in with results stored in PS4, Line 1725: codegen->GetFunction(symbol); Should we consider adding "bool clone" as the second argument similar to the other variant of GetFunction() ? PS4, Line 1745: Value* input_lowered_ptr = : codegen->CreateEntryBlockAlloca(builder->GetInsertBlock()->getParent(), : LlvmCodeGen::NamedVariable("input_lowered_ptr", : input_vals[i].value()->getType())); : builder->CreateStore(input_vals[i].value(), input_lowered_ptr); : Type* input_unlowered_ptr_type = CodegenAnyVal::GetUnloweredPtrType( : codegen, evaluator->input_expr_ctxs()[i]->root()->type()); : Value* input_unlowered_ptr = builder->CreateBitCast( : input_lowered_ptr, input_unlowered_ptr_type, "input_unlowered_ptr"); Do you think it's worth factoring this logic out to a function given it's repeated again from line 1757-1764 ? http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/exprs/compound-predicates.cc File be/src/exprs/compound-predicates.cc: PS4, Line 68: Please consider removing these extra blank spaces too. http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/runtime/descriptors.cc File be/src/runtime/descriptors.cc: PS4, Line 603: Value* tuple_bytes = builder->CreateBitCast(tuple, codegen->ptr_type()); : Constant* null_byte_offset = : ConstantInt::get(codegen->int_type(), null_indicator_offset_.byte_offset); : Value* null_byte_ptr = : builder->CreateInBoundsGEP(tuple_bytes, null_byte_offset, "null_byte_ptr"); What do you think about factoring this code snippet as a utility function for generating code to return null_byte_ptr ? This seems to be useful for CodegenIsNull() too. -- To view, visit http://gerrit.cloudera.org:8080/4655 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4277: remove references for unsupported s3/s3n connectors
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4277: remove references for unsupported s3/s3n connectors .. Patch Set 1: (1 comment) Carry +1 http://gerrit.cloudera.org:8080/#/c/4778/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: Line 281: return !(fs instanceof S3AFileSystem|| fs instanceof LocalFileSystem); > nit: missing space before || Done -- To view, visit http://gerrit.cloudera.org:8080/4778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibfadd2bc91c7dbcb6f2bc962c404caea30f9b776 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: PS1, Line 169: MAX_INCREMENTAL_STATS_SIZE_BYTES > I think he can try to use runtimeEnv as well. or have this in catalog not i To my best understanding, MAX_INC_*_BYTES is used in both of impalad and catalogd. So, RuntimeEnv solely could not be used. (libfesupport.so would complain during runtime.) Initially, I also thought BackendConfig is the best option for this because we can remove MAX_INC_*_BYTES from HdfsTable and consolidate BE options in one place. :) Alex/Huaisi/Bharath, if we decide to use BackendConfig, I will abandon this change and re-do the work, which would take few days. Please let me have your inputs. -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun HwangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4277: remove references for unsupported s3/s3n connectors
Dan Hecht has posted comments on this change. Change subject: IMPALA-4277: remove references for unsupported s3/s3n connectors .. Patch Set 1: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/4778/1/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: Line 281: return !(fs instanceof S3AFileSystem|| fs instanceof LocalFileSystem); nit: missing space before || -- To view, visit http://gerrit.cloudera.org:8080/4778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibfadd2bc91c7dbcb6f2bc962c404caea30f9b776 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4285: Fix Parquet scanner with MT DOP > 0 and no materialized slots.
Alex Behm has posted comments on this change. Change subject: IMPALA-4285: Fix Parquet scanner with MT_DOP > 0 and no materialized slots. .. Patch Set 3: Hold off on reviewing, found some issues. -- To view, visit http://gerrit.cloudera.org:8080/4767 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79c0f6fd2aeb4bc6fa5f87219a485194fef2db1b Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4277: remove references for unsupported s3/s3n connectors
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/4778 Change subject: IMPALA-4277: remove references for unsupported s3/s3n connectors .. IMPALA-4277: remove references for unsupported s3/s3n connectors We only support s3a://. Support will be removed for s3:// in Hadoop 3.0 by HADOOP-12709 Change-Id: Ibfadd2bc91c7dbcb6f2bc962c404caea30f9b776 Reviewed-on: http://gerrit.cloudera.org:8080/4748 Reviewed-by: Tim ArmstrongTested-by: Tim Armstrong (cherry picked from commit 3cb3f34d6d65bf52b2b8ba57a02d9ac785c8a937) --- M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java 1 file changed, 1 insertion(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4778/1 -- To view, visit http://gerrit.cloudera.org:8080/4778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ibfadd2bc91c7dbcb6f2bc962c404caea30f9b776 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4329: Prevent crash in scheduler when no backends are registered
Henry Robinson has posted comments on this change. Change subject: IMPALA-4329: Prevent crash in scheduler when no backends are registered .. Patch Set 3: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/4776/3//COMMIT_MSG Commit Message: PS3, Line 12: the top of assignment_ctx.assignment_heap SelectRemoteBackendHost() grammar, plus maybe say ", called from ComputeScanRangeAssignment()", because that's where the bug actually is. -- To view, visit http://gerrit.cloudera.org:8080/4776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6d93158f34841ea66dc3682290266262c87ea7ff Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4329: Prevent crash in scheduler when no backends are registered
Lars Volker has uploaded a new patch set (#3). Change subject: IMPALA-4329: Prevent crash in scheduler when no backends are registered .. IMPALA-4329: Prevent crash in scheduler when no backends are registered The scheduler crashed with a segmentation fault when there were no backends registered: After not being able to find a local backend (none are configured at all), the previous code would eventually try to return the top of assignment_ctx.assignment_heap SelectRemoteBackendHost(), but that heap would be empty. Subsequently, when using the IP address of that heap node, a segmentation fault would occur. This change adds a check and aborts scheduling with an error. It also contains a test. Change-Id: I6d93158f34841ea66dc3682290266262c87ea7ff --- M be/src/scheduling/simple-scheduler-test-util.cc M be/src/scheduling/simple-scheduler-test-util.h M be/src/scheduling/simple-scheduler-test.cc M be/src/scheduling/simple-scheduler.cc M common/thrift/generate_error_codes.py 5 files changed, 35 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/4776/3 -- To view, visit http://gerrit.cloudera.org:8080/4776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6d93158f34841ea66dc3682290266262c87ea7ff Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson
[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-3725 Support Kudu UPSERT in Impala .. Patch Set 8: (6 comments) http://gerrit.cloudera.org:8080/#/c/4047/8/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: Line 689: throw new AnalysisException("UPSERT does not currently support any plan hints."); > make this a warning instead like we do for unrecognized hints Done http://gerrit.cloudera.org:8080/#/c/4047/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: Line 3456: public void TestUpsert() { > Let's move this into a new AnalyzeUpsertStmt.java file. This file is alread Done http://gerrit.cloudera.org:8080/#/c/4047/8/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: Line 1653: ParsesOk(String.format("upsert into %s t [shuffle] select a from src", optTbl)); > move into TestPlanHints() Done http://gerrit.cloudera.org:8080/#/c/4047/8/testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test: Line 40:runtime filters: RF000 -> a.string_col > add DISTRIBUTEDPLAN here also just to make sure we can generate one Done http://gerrit.cloudera.org:8080/#/c/4047/8/testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test File testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test: Line 289: upsert into table tdata (id, valf) values (2, 10), (120, 20), (0, 0) > throw in a few NULLs somewhere Great catch! This actually didn't work as written. The latest version fixes it, but the fix is unfortunately somewhat complicated. Line 306: upsert into table tdata (valb, name, id) > add an upsert without a query stmt I assume that you're talking about how the query statement is optional for the parser when the permutation is present, for consistency with insert? If so, its not actually possible to successfully run such a query (at least at the moment), since either you list some columns in the permutation, in which case you will get the error that the number of columns in the permutation don't match the number of columns in the query stmt (which is 0), or else you don't list columns in the permutation, in which case you will get the error that you must list all of the primary key columns, and we already have parser/analyzer tests for those situations. -- To view, visit http://gerrit.cloudera.org:8080/4047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Huaisi Xu has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: PS1, Line 169: MAX_INCREMENTAL_STATS_SIZE_BYTES > - Actually implementing this way seems to be little confusing to me. Reason I think he can try to use runtimeEnv as well. or have this in catalog not in a table. -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun HwangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4329: Prevent crash in scheduler when no backends are registered
Lars Volker has uploaded a new patch set (#2). Change subject: IMPALA-4329: Prevent crash in scheduler when no backends are registered .. IMPALA-4329: Prevent crash in scheduler when no backends are registered The scheduler crashed with a segmentation fault when there were no backends registered. This change adds a check and aborts scheduling with an error. It also contains a test. Change-Id: I6d93158f34841ea66dc3682290266262c87ea7ff --- M be/src/scheduling/simple-scheduler-test-util.cc M be/src/scheduling/simple-scheduler-test-util.h M be/src/scheduling/simple-scheduler-test.cc M be/src/scheduling/simple-scheduler.cc M common/thrift/generate_error_codes.py 5 files changed, 35 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/4776/2 -- To view, visit http://gerrit.cloudera.org:8080/4776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6d93158f34841ea66dc3682290266262c87ea7ff Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/4772/1//COMMIT_MSG Commit Message: PS1, Line 7: make nit: Make PS1, Line 16: impala Impala http://gerrit.cloudera.org:8080/#/c/4772/1/be/src/common/global-flags.cc File be/src/common/global-flags.cc: PS1, Line 119: inc_stat_max_size whats the unit here? May be rename it to incremental_stats_max_size_mb or incremental_stats_max_size_bytes to make it more self explanatory? http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: PS1, Line 289: MAX_INCREMENTAL_STATS_SIZE_BYTES, RuntimeEnv.INSTANCE.incStatMaxSize() Please replace with --incremental_stats_size_ http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: PS1, Line 169: MAX_INCREMENTAL_STATS_SIZE_BYTES - Actually implementing this way seems to be little confusing to me. Reason being, this variable is set only set in the Catalog code and remains un-initialized in the frontend code. Given we share classes between the Catalog and fe code, I think its confusing to the readers why we source it from RuntimeEnv at one place and MAX_INCREMENTAL_STATS_SIZE_BYTES in the other. - Initially when you asked my suggestion on the best way to pass this flag, I thought its only being passed to the Catalog and I suggested not to use the BackendConfig way. Now that I read the patch, I realized this is being used in the ComputeStats as well. How about falling back to the BackendConfig way to do this? Although its hacky, its easier to read and it can be set from both JniCatalog and JniFrontend (Look for --load_auth_to_local flag which does the same). - Sorry for the back and forth on this but after reading the patch, I think the hacky BackendConfig is more readable. Alex/Huaisi/Yonghyun thoughts? -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun HwangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Huaisi Xu has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/4772/1/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: PS1, Line 33: inc_stat_max_size > thanks. I will add the unit in option description. (in DEFINE_uint64(...) Sorry. I meant from the name we should know which unit it is. http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS1, Line 176: 0 > Actually, HdfsTable.MAX_INCREMENTAL_STATS_SIZE_BYTES is set in CatalogServi that is set by you right? I think a static member should not be set like this anyway. http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/common/RuntimeEnv.java File fe/src/main/java/org/apache/impala/common/RuntimeEnv.java: PS1, Line 71: incStatMaxSize > Sure. :) Do you recommend me to update computeLineage() as well? do not care about the rest.. :-) -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun HwangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4329: Prevent crash in scheduler when no backends are registered
Henry Robinson has posted comments on this change. Change subject: IMPALA-4329: Prevent crash in scheduler when no backends are registered .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4776/1/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: PS1, Line 541: indentation - did you clang-format? PS1, Line 541: " I think you should add an error code here (NO_REGISTERED_BACKENDS). -- To view, visit http://gerrit.cloudera.org:8080/4776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6d93158f34841ea66dc3682290266262c87ea7ff Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4329: Prevent crash in scheduler when no backends are registered
Lars Volker has uploaded a new change for review. http://gerrit.cloudera.org:8080/4776 Change subject: IMPALA-4329: Prevent crash in scheduler when no backends are registered .. IMPALA-4329: Prevent crash in scheduler when no backends are registered The scheduler crashed with a segmentation fault when there were no backends registered. This change adds a check and aborts scheduling with an error. It also contains a test. Change-Id: I6d93158f34841ea66dc3682290266262c87ea7ff --- M be/src/scheduling/simple-scheduler-test-util.cc M be/src/scheduling/simple-scheduler-test-util.h M be/src/scheduling/simple-scheduler-test.cc M be/src/scheduling/simple-scheduler.cc 4 files changed, 32 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/4776/1 -- To view, visit http://gerrit.cloudera.org:8080/4776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6d93158f34841ea66dc3682290266262c87ea7ff Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/4772/1/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: PS1, Line 33: inc_stat_max_size > could you be explicit on the unit? thanks. I will add the unit in option description. (in DEFINE_uint64(...) http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: Line 164: SentryConfig sentryConfig, TUniqueId catalogServiceId, String kerberosPrincipal, long incStatMaxSize) { > long line I will take care of this. http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS1, Line 176: 0 > sorry I meant should not be a public static.. Actually, HdfsTable.MAX_INCREMENTAL_STATS_SIZE_BYTES is set in CatalogServiceCatalog. Let me make this as a private var and create a method to set it. http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/common/RuntimeEnv.java File fe/src/main/java/org/apache/impala/common/RuntimeEnv.java: PS1, Line 71: incStatMaxSize > getIncStatMaxSize Sure. :) Do you recommend me to update computeLineage() as well? http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: Line 84: boolean allowAuthToLocal, String kerberosPrincipal, long incStatMaxSize) throws InternalException { > long line will take care of it. Line 101: numMetadataLoadingThreads, sentryConfig, getServiceId(), kerberosPrincipal, incStatMaxSize); > long will take care of it. http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java: Line 34: super(loadInBackground, numLoadingThreads, sentryConfig, catalogServiceId, null, 200*1024*1024); > long will take care of it. thanks. :) -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun HwangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function .. Patch Set 3: Code-Review+1 (1 comment) Thanks! It'll be good to run this through the query generator. http://gerrit.cloudera.org:8080/#/c/4740/3/be/src/exprs/agg-fn-evaluator.h File be/src/exprs/agg-fn-evaluator.h: PS3, Line 153: beyond the scope of the call site I'm still not sure this is precise enough... this sounds like it's scoped at the call site. It's memory that lives until QueryMaintenance frees the function's memory, right? -- To view, visit http://gerrit.cloudera.org:8080/4740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables
Alex Behm has posted comments on this change. Change subject: IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables .. Patch Set 12: (2 comments) http://gerrit.cloudera.org:8080/#/c/4414/12/fe/src/main/java/org/apache/impala/analysis/TableDef.java File fe/src/main/java/org/apache/impala/analysis/TableDef.java: Line 280: if (options_.fileFormat == THdfsFileFormat.KUDU) { This doesn't seem right. We used to be able to change the ROW FORMAT for TEXT and SEQUENCE. What was wrong with the previous code? http://gerrit.cloudera.org:8080/#/c/4414/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: Line 1209: org.apache.hadoop.hive.metastore.api.Table msTbl = existingTbl.getMetaStoreTable(); Should we only do this if existingTbl is not loaded? Some possible inconsistency issues that we should probably figure out sometime: The HMS table may have already been dropped, and then we'll not drop the corresponding Kudu table. -- To view, visit http://gerrit.cloudera.org:8080/4414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-1169: Admission control info on the queries debug webpage .. Patch Set 2: (9 comments) As we discussed in person, we should call out how this should behave for other "query types" E.g. DDL always is considered to be admitted and bypasses the new preparing(?) list. (Is that true?) What about COMPUTE STATS? Can you check for the COMPUTE STATS statement itself and see what happens with the child queries it generates (there should be 2). I'd like to capture this information clearly so we can work with the CM folks to expose things the same way in their UI. http://gerrit.cloudera.org:8080/#/c/4756/2/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: PS2, Line 879: Status SimpleScheduler::Schedule Can you comment in the header the post-conditions: returns with schedule is_admitted==true OR an error PS2, Line 900: // TODO-MT: call AdmitQuery() this should also have set_is_admitted(true) until this TODO is resolved. http://gerrit.cloudera.org:8080/#/c/4756/2/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: PS2, Line 348: waiting now 'waiting' is a bit ambiguous... can you rename to waiting_for_close now? http://gerrit.cloudera.org:8080/#/c/4756/2/be/src/service/query-exec-state.h File be/src/service/query-exec-state.h: PS2, Line 157: bool is_admitted() const { : if (exec_request_.stmt_type == TStmtType::QUERY || : exec_request_.stmt_type == TStmtType::DML || : (catalog_op_type() == TCatalogOpType::DDL && : ddl_type() == TDdlType::CREATE_TABLE_AS_SELECT)) { : return schedule_ == NULL ? false : schedule_->is_admitted(); : } : return true; : } I think this is big enough to move it out of the header. Also, please add a comment and mention that this is always true for X query types. PS2, Line 166: std::string request_pool() const { : return schedule_ == NULL ? "" : schedule_->request_pool(); : } comment here too, when does this return an empty string? http://gerrit.cloudera.org:8080/#/c/4756/2/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: PS2, Line 602: _get_json_metrics the metrics page exposes json too (and 'metrics' is a bit overloaded), so we should indicate this is the queries page. e.g. _get_queries_page_counts() . not the best name ever but differentiates it from metrics and its json output. PS2, Line 603: the number of 'admitted' and 'queued' queries the number of queries currently in the 'admitted' and 'queued' states (Indicates these are the current values, the metrics are monotonically increasing counters. In more metrics-y terminology that's a gauge vs a counter.) PS2, Line 710: json_metrics = self._get_json_metrics() : assert json_metrics['admitted'] == 0 : assert json_metrics['queued'] == 0 here's where it'd be helpful to differentiate between current counts (i.e. should be 0) and the counter deltas computed above (which are > 0); it's not clear why this makes it look like admitted is 0 but it is nonzero in the checks just a few lines above. Can you rename the variables and brief comment? http://gerrit.cloudera.org:8080/#/c/4756/1/www/queries.tmpl File www/queries.tmpl: PS1, Line 26: {{num_not_scheduled_queries}} queries not scheduled yet > Sure, maybe "not yet executing", "pending execution", or "setting up"? I like "Preparing for Execution", but let's wait to hear from others as well since the naming of some of the plumbing should reflect the same names as well. -- To view, visit http://gerrit.cloudera.org:8080/4756 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4155: Update default partition when table is altered
Alex Behm has posted comments on this change. Change subject: IMPALA-4155: Update default partition when table is altered .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4750 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I59bf21caa5c5e7867d07d87cda0c0a5b4b994859 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4155: Update default partition when table is altered
Hello Internal Jenkins, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4750 to look at the new patch set (#5). Change subject: IMPALA-4155: Update default partition when table is altered .. IMPALA-4155: Update default partition when table is altered If the table format is changed by the Alter Table statement, the default partition in partitioned tables used to not get updated. This caused a problem because Insert picks up the file format for new partitions from the default partition. This patch fixes the problem by calling addDefaultPartition(). Also removed "drop table if not exists" in tests in alter-table.test because we already have the unique_database fixture. Change-Id: I59bf21caa5c5e7867d07d87cda0c0a5b4b994859 --- M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M testdata/workloads/functional-query/queries/QueryTest/alter-table.test 3 files changed, 39 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/4750/5 -- To view, visit http://gerrit.cloudera.org:8080/4750 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I59bf21caa5c5e7867d07d87cda0c0a5b4b994859 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Huaisi Xu has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/4772/1/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: PS1, Line 33: inc_stat_max_size could you be explicit on the unit? http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: Line 164: SentryConfig sentryConfig, TUniqueId catalogServiceId, String kerberosPrincipal, long incStatMaxSize) { long line http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS1, Line 176: 0 This should not be a static variable since no one outside this class is using this anymore. http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/common/RuntimeEnv.java File fe/src/main/java/org/apache/impala/common/RuntimeEnv.java: PS1, Line 71: incStatMaxSize getIncStatMaxSize http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: Line 84: boolean allowAuthToLocal, String kerberosPrincipal, long incStatMaxSize) throws InternalException { long line Line 101: numMetadataLoadingThreads, sentryConfig, getServiceId(), kerberosPrincipal, incStatMaxSize); long http://gerrit.cloudera.org:8080/#/c/4772/1/fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java: Line 34: super(loadInBackground, numLoadingThreads, sentryConfig, catalogServiceId, null, 200*1024*1024); long -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun HwangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4086: Add benchmark for simple scheduler
Dan Hecht has posted comments on this change. Change subject: IMPALA-4086: Add benchmark for simple scheduler .. Patch Set 1: Lars, are you still working on this? Anything blocking from making progress on the review? -- To view, visit http://gerrit.cloudera.org:8080/4554 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4057:fix webserver interface with 127.0.0.1 when start impala process
Dan Hecht has abandoned this change. Change subject: IMPALA-4057:fix webserver_interface with 127.0.0.1 when start impala process .. Abandoned It looks like this is redundant with https://gerrit.cloudera.org/#/c/4553/. You can restore if that's not the case. -- To view, visit http://gerrit.cloudera.org:8080/4202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Idcc29853b6d5db23b82a7000e0d2b80ac7403f87 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: hewentingGerrit-Reviewer: Henry Robinson
[Impala-ASF-CR] IMPALA-4050: Support starting webserver specified by hostname
Dan Hecht has abandoned this change. Change subject: IMPALA-4050: Support starting webserver specified by hostname .. Abandoned It looks like this is redundant with https://gerrit.cloudera.org/#/c/4553/. You can restore if that's not the case. -- To view, visit http://gerrit.cloudera.org:8080/4314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I9131c6e47ce8e56eda20e9eb82d68b85de2fa866 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: hewentingGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson
[Impala-ASF-CR] Minor fixes to remove more "cloudera"s from the code.
Jim Apple has uploaded a new change for review. http://gerrit.cloudera.org:8080/4774 Change subject: Minor fixes to remove more "cloudera"s from the code. .. Minor fixes to remove more "cloudera"s from the code. Change-Id: I27687a3b677def72f1867df54c8e50c93d5cab3a --- M tests/benchmark/report_benchmark_results.py M tests/test-hive-udfs/src/main/java/org/apache/impala/UnresolvedUdf.java 2 files changed, 15 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/4774/1 -- To view, visit http://gerrit.cloudera.org:8080/4774 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I27687a3b677def72f1867df54c8e50c93d5cab3a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has posted comments on this change. Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. Patch Set 1: local tests as well as jenkins are passing. http://sandbox.jenkins.cloudera.com/job/impala-private-build-and-test/4447/ -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun HwangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping
Michael Brown has posted comments on this change. Change subject: IMPALA-3211: provide toolchain build id for bootstrapping .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/4771 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool
Jim Apple has uploaded a new patch set (#4). Change subject: IMPALA-3676: Use clang as a static analysis tool .. IMPALA-3676: Use clang as a static analysis tool This patch adds a script to run clang-tidy over the whole code base. It is a first step towards running clang-tidy over patches as a tool to help users spot bugs before code review. Because of the number of clang-tidy checks, this patch only addresses some of them. In particular, only checks starting with 'clang' are considered. Many of them which are flaky or not part of our style are excluded from the analysis. This patch also exlcudes some checks which are part of our current style but which would be too laborious to fix over the entire codebase, like using nullptr rather than NULL. This patch also fixes a number of small bugs found by clang-tidy. Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06 --- A .clang-tidy M CMakeLists.txt M be/CMakeLists.txt M be/src/benchmarks/bloom-filter-benchmark.cc M be/src/benchmarks/in-predicate-benchmark.cc M be/src/benchmarks/parse-timestamp-benchmark.cc M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/codegen/codegen-anyval.cc M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.cc M be/src/common/init.cc M be/src/common/logging.h M be/src/common/status.h M be/src/exec/delimited-text-parser.cc M be/src/exec/delimited-text-parser.h M be/src/exec/exec-node.h M be/src/exec/external-data-source-executor.h M be/src/exec/hash-table-test.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/hdfs-table-writer.h M be/src/exec/parquet-column-readers.h M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/scanner-context.inline.h M be/src/exec/write-stream.h M be/src/experiments/bit-stream-utils.8byte.inline.h M be/src/experiments/tuple-splitter-test.cc M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h M be/src/exprs/compound-predicates.h M be/src/exprs/expr-test.cc M be/src/exprs/slot-ref.cc M be/src/exprs/string-functions-ir.cc M be/src/rpc/TAcceptQueueServer.h M be/src/rpc/authentication.cc M be/src/rpc/authentication.h M be/src/rpc/thrift-server.h M be/src/runtime/buffered-block-mgr.h M be/src/runtime/buffered-tuple-stream.cc M be/src/runtime/buffered-tuple-stream.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/runtime/exec-env.h M be/src/runtime/free-pool-test.cc M be/src/runtime/hbase-table.cc M be/src/runtime/hdfs-fs-cache.h M be/src/runtime/multi-precision-test.cc M be/src/runtime/runtime-filter-bank.h M be/src/runtime/runtime-filter.h M be/src/runtime/scoped-buffer.h M be/src/runtime/string-buffer.h M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-value.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/simple-scheduler-test.cc M be/src/service/fe-support.cc M be/src/service/impala-beeswax-server.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h M be/src/service/query-result-set.cc M be/src/statestore/failure-detector.h M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore.cc M be/src/statestore/statestore.h M be/src/testutil/test-udas.cc M be/src/testutil/test-udas.h M be/src/transport/TSasl.h M be/src/transport/TSaslClientTransport.h M be/src/transport/TSaslServerTransport.h M be/src/transport/TSaslTransport.h M be/src/udf/uda-test-harness.h M be/src/udf/uda-test.cc M be/src/udf/udf-ir.cc M be/src/udf/udf.cc M be/src/udf/udf.h M be/src/udf_samples/uda-sample.cc A be/src/util/aligned-new.h M be/src/util/benchmark-test.cc M be/src/util/bit-stream-utils.inline.h M be/src/util/bit-util-test.cc M be/src/util/blocking-queue-test.cc M be/src/util/blocking-queue.h M be/src/util/bloom-filter-test.cc M be/src/util/buffer-builder.h M be/src/util/compress.cc M be/src/util/decompress.cc M be/src/util/metrics.h M be/src/util/minidump.cc M be/src/util/network-util.cc M be/src/util/parquet-reader.cc M be/src/util/periodic-counter-updater.h M be/src/util/pprof-path-handlers.cc M be/src/util/progress-updater.cc M be/src/util/rle-test.cc M be/src/util/runtime-profile.h M be/src/util/thread-pool.h M be/src/util/webserver.cc A bin/run_clang_tidy.sh M buildall.sh M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java 114 files changed, 580 insertions(+), 352 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/4758/4 -- To view, visit
[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable
Yonghyun Hwang has uploaded a new change for review. http://gerrit.cloudera.org:8080/4772 Change subject: IMPALA-3552: make incremental stats max serialized size configurable .. IMPALA-3552: make incremental stats max serialized size configurable The fix fox "IMPALA-2648/IMPALA-2664" introduced a conservative limitation on the maximum serialized size of incremental stats. As a side effect, some users with very large tables are experiencing regressions especially when upgrading. To mitigate the issue, the change introduces a new gflag, 'inc_stat_max_size' to make the max serialized size configurable, which allows impala users to specify their own maximum serialized size. Default value for inc_stat_max_size is 200MB. Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 --- M be/src/catalog/catalog.cc M be/src/common/global-flags.cc M be/src/service/fe-support.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java 10 files changed, 33 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/4772/1 -- To view, visit http://gerrit.cloudera.org:8080/4772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I4d189ffc1c56e7a4e27e6233e9119250a2395a19 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yonghyun Hwang
[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool
Jim Apple has posted comments on this change. Change subject: IMPALA-3676: Use clang as a static analysis tool .. Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/4758/2/be/CMakeLists.txt File be/CMakeLists.txt: Line 106: SET(CXX_FLAGS_TIDY "${CXX_FLAGS_TIDY} --gcc-toolchain=${GCC_ROOT}") > Ah right, I didn't realise the ASAN build got the --gcc-toolchain flag in a I think that might be just for building the toolchain. I tried changing it and compile_commands.json didn't change. http://gerrit.cloudera.org:8080/#/c/4758/2/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: Line 175 > I think it might manifest as a link-time error if you do anything with the Done http://gerrit.cloudera.org:8080/#/c/4758/2/be/src/exprs/slot-ref.cc File be/src/exprs/slot-ref.cc: PS2, Line 176: static > It doesn't make a difference for correctness, and I don't see any reason we Done http://gerrit.cloudera.org:8080/#/c/4758/1/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: > See my other commetn about space savings I'm interested in your thoughts on which struct reordering s are, in your opinion, worth it http://gerrit.cloudera.org:8080/#/c/4758/1/be/src/util/aligned-new.h File be/src/util/aligned-new.h: Line 42: throw std::bad_alloc(); > Missed responding to this one - no callers in the Impala codebase catch the Done http://gerrit.cloudera.org:8080/#/c/4758/1/be/src/util/runtime-profile.h File be/src/util/runtime-profile.h: Line 335 > I don't think there are enough RuntimeProfile objects to produce a noticabl I have put these variables back in the logical (but suboptimaly packed) order and added a NOLINT annotation. I think the default should be to pack your structs, but that it's fine not to do so sometimes. Thoughts? -- To view, visit http://gerrit.cloudera.org:8080/4758 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3211: provide toolchain build id for bootstrapping .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/4771 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3211: provide toolchain build id for bootstrapping .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4771/1/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: PS1, Line 91: file_name = "{0}-{1}-{2}-{3}.tar.gz".format(product, version, compiler, label) : url_path = "/{0}/{1}/{2}-{3}/{1}-{2}-{3}-{4}.tar.gz".format(toolchain_build_id, product, : version, compiler, label) > Would you be agreeable to using kwargs in str.format()? That would allow yo Good idea. -- To view, visit http://gerrit.cloudera.org:8080/4771 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping
Michael Brown has posted comments on this change. Change subject: IMPALA-3211: provide toolchain build id for bootstrapping .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4771/1/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: PS1, Line 91: file_name = "{0}-{1}-{2}-{3}.tar.gz".format(product, version, compiler, label) : url_path = "/{0}/{1}/{2}-{3}/{1}-{2}-{3}-{4}.tar.gz".format(toolchain_build_id, product, : version, compiler, label) Would you be agreeable to using kwargs in str.format()? That would allow you to have more readable format strings. formatted_str = '{key1}/{key2}'.format(key1=value1, key2=value2) -- To view, visit http://gerrit.cloudera.org:8080/4771 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4241: remove spurious child queries event
Henry Robinson has posted comments on this change. Change subject: IMPALA-4241: remove spurious child queries event .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4768/1/be/src/service/query-exec-state.cc File be/src/service/query-exec-state.cc: PS1, Line 623: child_queries remind me why we don't put line 616->622 in this kind of conditional? ISTR there was a reason we wanted to run child_query_executor_->WaitForAll() in all cases, but it escapes me now. -- To view, visit http://gerrit.cloudera.org:8080/4768 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3881d032622750444d750f161ad6843bdbd16c30 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/4770 Change subject: IMPALA-3872: allow providing PyPi mirror for python packages .. IMPALA-3872: allow providing PyPi mirror for python packages We still rely on the python.org json API, which doesn't seem to be mirrored (instead there's a html-based index format implemented by the mirrors). Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c --- M infra/python/deps/pip_download.py 1 file changed, 8 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/4770/1 -- To view, visit http://gerrit.cloudera.org:8080/4770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-3211: provide toolchain build id for bootstrapping
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/4771 Change subject: IMPALA-3211: provide toolchain build id for bootstrapping .. IMPALA-3211: provide toolchain build id for bootstrapping Testing: Ran a private build, which succeeded. Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb --- M bin/bootstrap_toolchain.py M bin/impala-config.sh 2 files changed, 7 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/4771/1 -- To view, visit http://gerrit.cloudera.org:8080/4771 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ibcc25ae82511713d0ff05ded37ef162925f2f0fb Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4241: remove spurious child queries event
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/4768 Change subject: IMPALA-4241: remove spurious child queries event .. IMPALA-4241: remove spurious child queries event "IMPALA-4037,IMPALA-4038: fix locking during query cancellation" accidentally added the "Child queries finished" event unconditionally. We should only do this if there are actually child queries. Change-Id: I3881d032622750444d750f161ad6843bdbd16c30 --- M be/src/service/query-exec-state.cc 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/4768/1 -- To view, visit http://gerrit.cloudera.org:8080/4768 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3881d032622750444d750f161ad6843bdbd16c30 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION
Attila Jeges has posted comments on this change. Change subject: IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/4144/14/tests/metadata/test_hms_integration.py File tests/metadata/test_hms_integration.py: PS14, Line 714: # Sometimes metadata load is triggered here, so compare only the first three : # returned partitions. > I am not sure this explanation and the fix is correct. SYNC_DDL causes a me I did some additional investigation. You are correct that setting SYNC_DDL is not directly related to the issue, just made it easier to reproduce. (A) The issue can be reproduced manually with Hive and Impala shells, so this is definitely not a bug in the python tests. Steps to reproduce: [0] Start Impala and Hive shells side-by-side. [1] impala> create table ptbl (a int) partitioned by (x int); [2] impala> show partitions ptbl; [3] hive> alter table ptbl add partition (x=1); [4] impala> show partitions ptbl; [5] impala> set sync_ddl=1; (OR wait for few seconds between [6] and [7]) [6] impala> alter table ptbl add partition (x=2) cached in 'testPool'; [7] impala> show partitions ptbl; In [2] and [4] the list of partitions is empty. In [7] the list of partitions should contain (x=2) only, but it contains both (x=1) and (x=2). (A.1) This behavior can be reproduced without the IMPALA-1670 related changes, so this is not something that IMPALA-1670 introduced. (A.2) Setting SYNC_DDL in [5] is not necessary, we get the same behavior if we wait for a few seconds instead between [6] and [7]. (A.3) If we run [6] without any caching options, Impala works as expected and in [7] the list of partitions contains only (x=2) (B) In asf-gerrit/master branch (without the IMPALA-1670 change-set): CatalogOpExecutor.alterTableAddPartition() calls CatalogServiceCatalog.watchCacheDirs() with cache directive IDs (L1704). The comments for CatalogServiceCatalog.watchCacheDirs() state: "Adds a list of cache directive IDs for the given table name. Asynchronously refreshes the table metadata once all cache directives complete." So, this is what triggers the metadata load. How do you think I should proceed? -- To view, visit http://gerrit.cloudera.org:8080/4144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila JegesGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate .. Patch Set 4: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/326/ -- To view, visit http://gerrit.cloudera.org:8080/4753 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79bfc67605206e0e026293bf7032a88227a95623 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yuanhao LuoGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Yuanhao Luo Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4301: Fix IGNORE NULLS with subquery rewriting.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4301: Fix IGNORE NULLS with subquery rewriting. .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4732 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I708de7925fe6aeef582fd7510da93d24c71229d9 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4301: Fix IGNORE NULLS with subquery rewriting.
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4301: Fix IGNORE NULLS with subquery rewriting. .. IMPALA-4301: Fix IGNORE NULLS with subquery rewriting. AnayticExpr.analyze() replaces the original FIRST/LAST_VALUE function with a FIRST/LAST_VALUE_IGNORE_NULLS function if the IGNORE NULLS clause is specified. The bug was that several places in AnalyticExpr.analyze() assumed and asserted that only the original FIRST/LAST_VALUE function could be encountered during analysis. However, with subquery rewriting the IGNORE NULLS version of the function may also be seen because the whole statement is re-analyzed after rewriting. The fix is to unset the IGNORE NULLS flag of the function params after changing the analytic function name. Change-Id: I708de7925fe6aeef582fd7510da93d24c71229d9 Reviewed-on: http://gerrit.cloudera.org:8080/4732 Reviewed-by: Alex BehmTested-by: Internal Jenkins --- M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/FunctionParams.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 3 files changed, 17 insertions(+), 9 deletions(-) Approvals: Internal Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4732 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I708de7925fe6aeef582fd7510da93d24c71229d9 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting classes. .. Patch Set 4: (13 comments) Flushing out some comments. Haven't looked at tests yet. http://gerrit.cloudera.org:8080/#/c/4746/4//COMMIT_MSG Commit Message: PS4, Line 13: need remove http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java: PS4, Line 378: rewriter_.reset(); : analysisResult_.stmt_.rewriteExprs(rewriter_); : reAnalyze = rewriter_.changed(); I haven't looked at the rewriter_ class yet, but it's not intuitive to me that the rewriter maintains state to indicate that it modified whatever entity it was applied on (Exprs, subqueries, etc). It should either return the modified entity or the entity itself should track modification. Again, I haven't gone through all the classes, just wanted to add a note. PS4, Line 383: StmtRewriter.rewrite(analysisResult_); This is the dual of L379 :). Here, the rewriter applies rewrite() on the analysisResult_ whereas in L379 the analysisResult_ calls rewrite by taking the rewriter as a param. Maybe at some point we need to make it consistent. http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: PS4, Line 60: rewrite Maybe "rewriter"? Seeing a verb here is kind of weird :) PS4, Line 1037: if (conjunct instanceof BetweenPredicate) { : // We might be in the first analysis pass, so the conjunct may not have been : // rewritten yet. : conjunct = new BetweenToCompoundRule().rewrite(conjunct, this); : } Maybe comment that this is needed for the EvalPredicate because the BE can't handle the original between predicate. http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java File fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java: PS4, Line 37: : : : : : : So nice!!! PS4, Line 152: : : : : : I think it's best to add a comment on why BetweenPredicate doesn't need reset any more. PS4, Line 24: predicates predicate PS4, Line 25: there remove http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java: PS4, Line 220: createStmt_.reset(); We reset a createTableStmt? http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java: PS4, Line 206: stmt.whereClause_ = new BetweenToCompoundRule().rewrite(stmt.whereClause_, analyzer); Shouldn't this happen already by the rewriteExprs() call at the top-level select stmt? http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java File fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java: PS4, Line 118: Expr clonedConjunct = betweenToCompoundRule_.rewrite(conjunct.clone(), analyzer); Maybe I am missing something, but shouldn't the conjunct already be rewritten? http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java File fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java: PS4, Line 35: returns the :* transformed Expr tree What happens if no changes occurred? Is the return value the same as 'expr'. Maybe expand the comment. -- To view, visit http://gerrit.cloudera.org:8080/4746 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate
Alex Behm has posted comments on this change. Change subject: IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate .. Patch Set 3: Great, thanks! I'll get this merged for you. -- To view, visit http://gerrit.cloudera.org:8080/4753 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79bfc67605206e0e026293bf7032a88227a95623 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yuanhao LuoGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Yuanhao Luo Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate
Yuanhao Luo has posted comments on this change. Change subject: IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate .. Patch Set 3: Yes, I have walk around other planner tests in this file(but not other files), and the expected output of them would not change. Only the output of the query which there were parentheses in where clause of sub-query would be changed. -- To view, visit http://gerrit.cloudera.org:8080/4753 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79bfc67605206e0e026293bf7032a88227a95623 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yuanhao LuoGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Yuanhao Luo Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate
Alex Behm has posted comments on this change. Change subject: IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate .. Patch Set 4: Code-Review+2 rebased -- To view, visit http://gerrit.cloudera.org:8080/4753 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79bfc67605206e0e026293bf7032a88227a95623 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yuanhao LuoGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Yuanhao Luo Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4285: Fix Parquet scanner with MT DOP > 0 and no materialized slots.
Alex Behm has uploaded a new change for review. http://gerrit.cloudera.org:8080/4767 Change subject: IMPALA-4285: Fix Parquet scanner with MT_DOP > 0 and no materialized slots. .. IMPALA-4285: Fix Parquet scanner with MT_DOP > 0 and no materialized slots. The problem was that there was a reference to HdfsScanner::batch_ hidden inside WriteEmptyTuples(). The batch_ reference is NULL when the scanner is run in MT mode (i.e. with MT_DOP > 1). Testing: Did not add a test since we don't have a good place to add MT tests yet. Change-Id: I79c0f6fd2aeb4bc6fa5f87219a485194fef2db1b --- M be/src/exec/hdfs-scanner.h 1 file changed, 4 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/4767/1 -- To view, visit http://gerrit.cloudera.org:8080/4767 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I79c0f6fd2aeb4bc6fa5f87219a485194fef2db1b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm
[Impala-ASF-CR] IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate
Alex Behm has posted comments on this change. Change subject: IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate .. Patch Set 3: Code-Review+2 Did you check whether the expected output of other planner tests changed? -- To view, visit http://gerrit.cloudera.org:8080/4753 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79bfc67605206e0e026293bf7032a88227a95623 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yuanhao LuoGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Yuanhao Luo Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate
Yuanhao Luo has uploaded a new patch set (#3). Change subject: IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate .. IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate StmtRewrite lost parentheses of CompoundPredicate in pushNegationToOperands() and leads to incorrect toSql() result. Even though this issue would not leads to incorrect result of query, it makes user confuse of the logical operator precedence of predicates shown in EXPLAIN statement. Change-Id: I79bfc67605206e0e026293bf7032a88227a95623 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test 2 files changed, 28 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/4753/3 -- To view, visit http://gerrit.cloudera.org:8080/4753 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I79bfc67605206e0e026293bf7032a88227a95623 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yuanhao LuoGerrit-Reviewer: Alex Behm
[Impala-ASF-CR] IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate
Yuanhao Luo has posted comments on this change. Change subject: IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate .. Patch Set 3: Thank you for your patience. -- To view, visit http://gerrit.cloudera.org:8080/4753 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79bfc67605206e0e026293bf7032a88227a95623 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Yuanhao LuoGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Yuanhao Luo Gerrit-HasComments: No