[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-3200: move bufferpool under runtime .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-3200: move bufferpool under runtime .. IMPALA-3200: move bufferpool under runtime It is arguably a subcomponent of the runtime system Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Reviewed-on: http://gerrit.cloudera.org:8080/5165 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M .clang-tidy M be/CMakeLists.txt M be/src/runtime/CMakeLists.txt R be/src/runtime/bufferpool/CMakeLists.txt R be/src/runtime/bufferpool/buffer-allocator.cc R be/src/runtime/bufferpool/buffer-allocator.h R be/src/runtime/bufferpool/buffer-pool-counters.h R be/src/runtime/bufferpool/buffer-pool-test.cc R be/src/runtime/bufferpool/buffer-pool.cc R be/src/runtime/bufferpool/buffer-pool.h R be/src/runtime/bufferpool/reservation-tracker-counters.h R be/src/runtime/bufferpool/reservation-tracker-test.cc R be/src/runtime/bufferpool/reservation-tracker.cc R be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/mem-tracker.cc 15 files changed, 25 insertions(+), 27 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. Patch Set 2: (19 comments) http://gerrit.cloudera.org:8080/#/c/5136/2/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: Line 185: 4: optional string location > restore original numbers Oops, sorry about that. Done http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java: Line 88: throw new AnalysisException("Key-value partition specs are not supported for " + > how about "ALTER TABLE ADD PARTITION is not supported for Kudu tables." or Done http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java: Line 50: if (ifNotExists_) { > single line Done Line 79: return; > remove Done http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java: Line 123: if (c.hasEncoding() || c.hasCompression() || c.hasBlockSize()) { > Should we let Kudu handle this case and the one below? Does Kudu behave san Unfortunately we can't do that in this case. There is no Kudu API call that accepts all these options, so we have to reject them here. http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java: Line 103: if (newColDef_.hasKuduOptions()) { > we cannot change the default value? Currently not supported. They are working on changing this behavior but it's not there yet. http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java: Line 79: throw new AnalysisException("Key-value partition specs are not supported for " + > ALTER TABLE DROP PARTITION is not supported for Kudu tables Done http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java: Line 32: public class AlterTableDropRangePartitionStmt extends AlterTableStmt { > code seems very similar to AlterTableAddRangePartitionStmt, should we just Done Line 77: return; > remove Done http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: Line 553: case UPDATE_STATS: > I think we should be able to avoid the copy+paste by adding an additional c Good idea. Done Line 2231: KuduCatalogOpExecutor.renameTable((KuduTable) tbl, > I don't think we should rename the Kudu table for external tables. An exter Done http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java: Line 1831: String[] addDrop = {"add if not exists", "drop if exists"}; > what about just "add" and "drop" Added. Done Line 1868: AnalysisError("alter table functional_kudu.testtbl add columns (a1 int)", > How do we know a1 is going to be non-nullable at this point (in analysis)? Done Line 1875: AnalysisError("alter table functional_kudu.testtbl add columns " + > use a loop to cover the cases separately (if you decide to keep these check As I mentioned in a previous comment, we can't defer these to Kudu (not supported API), so they have to be checked during the analysis. Added the tests here. http://gerrit.cloudera.org:8080/#/c/5136/2/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test: Line 52: distribute by range (id) (partition 1 < values <= 10) stored as kudu > Test dropping all partitions and scanning/inserting into the empty table. Done Line 156: # Insert a row that doesn't have values for the new columns; defaults are used > Sorry this doesn't make sense. I was thinking of ALTERing the default value No worries, we will have to add tests like that when we have the ability to change default values. Line 157: insert into tbl_to_alter (id,name,vali) values (3, 'test', 200) > Can you try this same test but with NULLs for the new columns? The insert s Done Line 257: create table copy_of_tbl (a int primary key) distribute by hash (a) into 3 buckets > new test for the external table "poi
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Dimitris Tsirogiannis has uploaded a new patch set (#4). Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. IMPALA-2890: Support ALTER TABLE statements for Kudu tables With this commit, we add support for additional ALTER TABLE statements against Kudu tables. The new supported ALTER TABLE operations for Kudu are: - ADD/DROP range partitions. Syntax: ALTER TABLE ADD [IF NOT EXISTS] RANGE ALTER TABLE DROP [IF EXISTS] RANGE - ADD/DROP/RENAME column. Syntax: ALTER TABLE ADD COLUMNS (col_spec, [col_spec, ...]) ALTER TABLE DROP COLUMN ALTER TABLE CHANGE COLUMN - Rename Kudu table using the 'kudu.table_name' table property. Example: ALTER TABLE SET TBLPROPERTY ('kudu.tbl_name'=''), will change the underlying Kudu table name to . - Renaming the HMS/Catalog table entry of a Kudu table is supported using the existing ALTER TABLE RENAME TO syntax. Not supported: - ALTER TABLE REPLACE COLUMNS Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java M fe/src/main/java/org/apache/impala/analysis/DistributeParam.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test M tests/query_test/test_kudu.py 20 files changed, 964 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/5136/4 -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] IMPALA-4511: Add missing total time counter() to PFE::Exec()
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4511: Add missing total_time_counter() to PFE::Exec() .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifdd6c059bdf022303920720291ebb319a69d60ec Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4511: Add missing total time counter() to PFE::Exec()
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4511: Add missing total_time_counter() to PFE::Exec() .. IMPALA-4511: Add missing total_time_counter() to PFE::Exec() Also add sanity check in debug build to catch this kind of bug in the future. Change-Id: Ifdd6c059bdf022303920720291ebb319a69d60ec Reviewed-on: http://gerrit.cloudera.org:8080/5149 Reviewed-by: Dan Hecht Tested-by: Internal Jenkins --- M be/src/runtime/plan-fragment-executor.cc 1 file changed, 11 insertions(+), 0 deletions(-) Approvals: Internal Jenkins: Verified Dan Hecht: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ifdd6c059bdf022303920720291ebb319a69d60ec Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins
[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly
Dan Hecht has posted comments on this change. Change subject: IMPALA-4432: Handle internal codegen disabling properly .. Patch Set 7: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/5105/7/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS7, Line 168: The This? PS7, Line 195: or there are expressions in the fragment : ///which cannot be interpreted. on second thought, maybe #2 makes more sense if worded like you had it: "it's not disabled by the internal hint or ..." given how the second clause works. up to you if you want to revert. -- To view, visit http://gerrit.cloudera.org:8080/5105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3342: Add thread counters to monitor plan fragment execution .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4633 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Alex Behm has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 2: (18 comments) High-level comments. Let's get through them first and then I can dig in deeper. http://gerrit.cloudera.org:8080/#/c/5148/2/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java: Line 27: * This class encapsulates the logic required for mapping HDFS Much fluff, simplify to: Maps HDFS storage-UUIDs to per-host 0-based, sequential disk ids required by Impala backends. Line 29: * by the backend. required by Impala backends Line 31: public class DiskIdMapper { Make this a singleton to clarify the intended use. Line 36: // This is intentionally implemented as a static object to share the storageIDs across move this part to the class comment Line 38: // consistent across all the tables, without which, we can potentially confuse the also mention that the static solution requires the least amount of memory Line 48: private static ConcurrentHashMap Don't think we need a ConcurrentHashMap or AtomicInteger here, because whenever we update this entry, we also need to protect storageIdToInt, so we need a coarser-grained lock. Line 53: * TODO (Bharath): This method is still prone to some race conditions. Fix it before Correct. I don't think the ConcurrentHashMaps will help us because we need to atomically update both maps sometimes. Line 71: storageIdGenerator.put(host, new AtomicInteger(diskId)); Couple of races here like you already pointed out. http://gerrit.cloudera.org:8080/#/c/5148/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: Line 294: // Keeps track of all the parition paths that are filtered to add block metadata. this comment style for classes and functions /** * */ Line 368: if (!pathMd.filteredParitionPaths.contains(fileStatus.getPath())) continue; Looks like this contains() and the indexOf() below can become very expensive. Instead of the 3 arrays in pathMd we should consider using a HashMap from path to a class that contains the relevant metadata (FD and format). Line 423: for (int i=0; i < fileMd.pathFds.size(); ++i) { i = 0 (add spaces) Line 458: // Only build the storage IDs fs instances that support the BlockLocation API. ... for fs instances ... Line 720: private void loadAllPartitions( After reading through this a few times I'm still confused about the high-level code flow. It would be great to summarize the high-level steps in the function comment. My understanding is that we do something like this: 1. createPartition() for every partition; that populates pathMd; internally createPartition() lists all files in the corresponding partition directory 2. load the block location metadata based on the contents of pathMd The process above seems a bit backwards to me. In the first step we list all files under the partition's directory... and then we do the same thing again in step 2. Intuitively, I was thinking it should look something like this: 1. collect all dirs to call DFS.listFiles() on 2. for each dir, call listFiles(), examine the contents and map them to an msPartition. Based on that complete info do some filtering/checks and finally createPartition(). Another chat would probably help me understand the ideas here better. Line 777: if (dirsToLoad.contains(partDir) || positive is more readable to me if (!dirsToLoad.contains() && isChildPath()) { // Partition with a non-default data directory. dirsToLoad.add(); } http://gerrit.cloudera.org:8080/#/c/5148/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: Line 275:* Returns true if the filesystem supports BlockLocation API. What API is it specifically? LocatedFileStatus.getBlockLocations()? Line 277: public static boolean supportsBlockLocationAPI(FileSystem fs) { supportsBlockLocationApi Line 406: public static boolean isPathChild(Path p1, Path p2) { isChildPath(Path p, Path parent) Line 408: Path parentDir = p1.getParent(); I think this can be improved by using the Path.getDepth(). You can do getParent() a fixed number of times and then do one equals() comparison. (Basically you walk up the path starting at p1 until the parentDir is at the same depth as p2) -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4494: Fix crash in SimpleScheduler
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4494: Fix crash in SimpleScheduler .. Patch Set 6: Code-Review+1 (1 comment) Thanks for your patience, I think this is looking good. http://gerrit.cloudera.org:8080/#/c/5127/6/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 250: << be_desc.address; Could use continue; here to be consistent with the above checks. -- To view, visit http://gerrit.cloudera.org:8080/5127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Preview: IMPALA-4467: Add support for CRUD operations in stress test
Taras Bobrovytsky has uploaded a new patch set (#2). Change subject: Preview: IMPALA-4467: Add support for CRUD operations in stress test .. Preview: IMPALA-4467: Add support for CRUD operations in stress test NOTE: This is a preview, I'm only looking for high level comments about the overall design of the feature. - Added support for upsert and delete statements. Update and insert still need to be done. - Added support for compute stats. - Added support for query options to queries. Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 --- M tests/stress/concurrent_select.py M tests/util/calculation_util.py 2 files changed, 254 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/2 -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4516: Don't hold process wide lock connection to sessions map lock while cancelling queries
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4516: Don't hold process wide lock connection_to_sessions_map_lock_ while cancelling queries .. Patch Set 1: Change looks safe to me. We should update the comment in impala-server.h to reflect that it's now a terminal lock -- To view, visit http://gerrit.cloudera.org:8080/5173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-3200: move bufferpool under runtime .. Patch Set 3: Build started: http://ec2-35-161-220-160.us-west-2.compute.amazonaws.com:8080/job/gerrit-verify-dryrun/38/ -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4516: Don't hold process wide lock connection to sessions map lock while cancelling queries
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4516: Don't hold process wide lock connection_to_sessions_map_lock_ while cancelling queries .. Patch Set 1: Stress test running successfully for 5 hours so far. -- To view, visit http://gerrit.cloudera.org:8080/5173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4516: Don't hold process wide lock connection to sessions map lock while cancelling queries
Sailesh Mukil has uploaded a new change for review. http://gerrit.cloudera.org:8080/5173 Change subject: IMPALA-4516: Don't hold process wide lock connection_to_sessions_map_lock_ while cancelling queries .. IMPALA-4516: Don't hold process wide lock connection_to_sessions_map_lock_ while cancelling queries We hold the connection_to_sessions_map_lock_ while closing multiple sessions, which could map to a large number of queries, which means an even larger number of fragments. We hold this process wide lock and a series of other locks while sending cancel RPCs to all the fragments that fall under the above mentioned category. This could slow down the responsiveness to the client by the daemon. Moreover, holding the lock is unnecessary and we can do without it. This patch fixes this path by dropping the lock before we call CloseSessionInternal(). Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71 --- M be/src/service/impala-server.cc 1 file changed, 18 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/5173/1 -- To view, visit http://gerrit.cloudera.org:8080/5173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4431: Add audit event log control mechanism to prevent disk overflow
Donghui Xu has posted comments on this change. Change subject: IMPALA-4431: Add audit event log control mechanism to prevent disk overflow .. Patch Set 11: (3 comments) I have modified the code according to your opinion. The minimum number of audit event log file should be 1 and 0 represents no limit. http://gerrit.cloudera.org:8080/#/c/4971/10/be/src/common/logging.cc File be/src/common/logging.cc: Line 181: if (max_log_files < 1) return; > Why do we not allow 1 here? Is there a particular reason why it wouldn't wo Done PS10, Line 182: Check audit event log files : // Build glob pattern for input e.g. /tmp/impala_audit_event_log_1.0-* : string fname = strings::Substitute( > Maybe some of this could go on fewer lines. Done Line 188: } > nit: only one newline here. Done -- To view, visit http://gerrit.cloudera.org:8080/4971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Lars Volker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4431: Add audit event log control mechanism to prevent disk overflow
Donghui Xu has uploaded a new patch set (#11). Change subject: IMPALA-4431: Add audit event log control mechanism to prevent disk overflow .. IMPALA-4431: Add audit event log control mechanism to prevent disk overflow There is no limit to the number of audit event log files. When audit event log is enabled the growing number of log files will fill up the disk space. This patch adds checking and rotation mechanism on audit event log files to prevent file number out of control which causes disk overflow. Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368 --- M be/src/common/init.cc M be/src/common/logging.cc M be/src/common/logging.h M be/src/service/impala-server.cc M be/src/service/impala-server.h 5 files changed, 36 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/4971/11 -- To view, visit http://gerrit.cloudera.org:8080/4971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input .. IMPALA-2523: Make HdfsTableSink aware of clustered input IMPALA-2521 introduced clustering for insert statements. This change makes the HdfsTableSink aware of clustered inputs, so that partitions are opened, written, and closed one by one. This change also adds/modifies tests in several ways: - clustered insert tests switch from selecting all rows from alltypessmall to alltypes. Together with varying settings for batch_size, this results in a larger number of row batches being written. - clustered insert tests select from alltypes instead of functional.alltypes to make sure we also select from various input formats. - clustered insert tests have been added to select from alltypestiny to create inserts with 1 and 2 rows per partition respectively. - exhaustive insert tests now use different values for batch_size: 1, 16, 0 (meaning default, 1024). This is limited to uncompressed parquet files, to maintain a reasonable runtime. On my machine execution of test.insert took 1778 seconds, compared to 1002 seconds with the just default row batch size. - There is additional testing in test_insert_behaviour.py to make sure that insertion over several row batches only creates one file per partition. - It renames the test_insert method to make it unique in the file and allow for effective filtering with -k. - It adds tests to the Analyzer test suite. Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0 Reviewed-on: http://gerrit.cloudera.org:8080/4863 Reviewed-by: Lars Volker Reviewed-by: Tim Armstrong Tested-by: Internal Jenkins --- M be/src/exec/hbase-table-sink.cc M be/src/exec/hbase-table-writer.cc M be/src/exec/hbase-table-writer.h M be/src/exec/hdfs-avro-table-writer.cc M be/src/exec/hdfs-avro-table-writer.h M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-sequence-table-writer.cc M be/src/exec/hdfs-sequence-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h M be/src/exec/hdfs-table-writer.cc M be/src/exec/hdfs-table-writer.h M be/src/exec/hdfs-text-table-writer.cc M be/src/exec/hdfs-text-table-writer.h M bin/impala-config.sh M common/thrift/DataSinks.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/main/java/org/apache/impala/planner/TableSink.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-query/queries/QueryTest/insert.test M tests/query_test/test_insert.py M tests/query_test/test_insert_behaviour.py 26 files changed, 457 insertions(+), 161 deletions(-) Approvals: Lars Volker: Looks good to me, but someone else must approve Internal Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4863 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0 Gerrit-PatchSet: 20 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Ivanfi
[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input
Internal Jenkins has posted comments on this change. Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input .. Patch Set 19: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4863 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0 Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 2: Moved disk-id-mapping logic to a new class as discussed with Alex. getDiskId() still is prone to some race conditions, so I added a TODO. I just sent out the code for review while I'm working on it. That particular change is siloed and shouldn't affect rest of the review. -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Bharath Vissapragada has uploaded a new patch set (#2). Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. IMPALA-4172/IMPALA-3653: Improvements to block metadata loading This patch improves the block metadata loading (locations and disk storage IDs) for partitioned and un-partitioned tables in the Catalog server. Without this patch: -- We loop throuh each and every file in the table/partition directories and call getFileBlockLocations() on it to obtain the block metadata. This results in large no. of RPC calls to namenode, especially with tables with large no. of files/partitions. With this patch: --- We move the block metadata querying to use listStatus() call which accepts a directory as input and fetches the 'BlockLocation' objects for every file recursively in that directory. This improves the metadata loading in the following ways. - For non-partitioned tables, we query all the BlockLocations in a single RPC call in the base table directory and load the corresponding disk IDs. - For partitioned tables, we query the BlockLocations for all the partitions residing under the base table directories in a single RPC and then load every partition with non-default partition directory separately. Also, this patch does away with VolumeIds returned by the HDFS NN and uses the new StorageIDs returned by the BlockLocation class. These StorageIDs are UUID strings and hence are mapped to a per-node 0-based index as expected by the backend. In the upcoming versions of Hadoop APIs, getFileBlockStorageLocations() is deprecated and instead the listStatus() returns BlockLocations with storage IDs embedded. This patch makes use to improvement to reduce an addition RPC to NN to fetch the storage locations. Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 --- A fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java 3 files changed, 304 insertions(+), 181 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5148/2 -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm
[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly
Michael Ho has posted comments on this change. Change subject: IMPALA-4432: Handle internal codegen disabling properly .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/5105/6/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS6, Line 169: HasScalarFnToCodegen > Maybe ScalarFnRequiresCodegen() or similar? i.e. to make it clear that not Done Line 171: /// Returns true if there is hint to disable codegen. This can be true for single node > "a" Done Line 180: /// of the hint and all expressions in the fragment can be interpreted. > given that this depends on the scalar_fns_to_codegen_ state which isn't que Done. The exact point in which scalar_fn_to_codegen_ is usable may change in an upcoming change. PS6, Line 191: followings > following. Done Line 194: ///which cannot be interpreted. > I think this would be clearer without the double negative. eg.: Done -- To view, visit http://gerrit.cloudera.org:8080/5105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5105 to look at the new patch set (#7). Change subject: IMPALA-4432: Handle internal codegen disabling properly .. IMPALA-4432: Handle internal codegen disabling properly There are some conditions in which codegen is disabled internally even if it's enabled in the query option. For instance, the single node optimization or the expression evaluation requests sent from the FE to the BE. These internal disabling of codegen are advisory as their purposes are to reduce the latency for tables with no or very few rows. The internal disabling of codegen doesn't interact well with UDFs which cannot be interpreted (e.g. IR UDF) as it conflates the 'disable_codegen' query option set by the user. As a result, it's hard to differentiate between when codegen is disabled explicitly by users and when it is disabled internally. This change fixes the problem above by adding an explicit flag in TQueryCtx to indicate that codegen is disabled internally. This flag is only advisory. For cases in which codegen is needed to function, this internal flag is ignored and if codegen is disabled via query option, an error is thrown. For this new flag to work with ScalarFnCall, codegen needs to happen after ScalarFnCall::Prepare() because it's hard to tell if a fragment contains any UDF that cannot be interpreted until after ScalarFnCall::Prepare() is called. However, Prepare() needs the codegen object to codegen so it needs to be created before Prepare(). We can either always create the codegen module or defer codegen to a point after ScalarFnCall::Prepare(). The former has the downside of introducing unnecessary latency for say single-node optimization so the latter is implemented. It is needed as part of IMPALA-4192 any way. After this change, ScalarFnCall expressions which need to be codegen'd are inserted into a vector in RuntimeState in ScalarFnCall::Prepare(). Later in the codegen phase, these expressions' GetCodegendComputeFn() will be called after codegen for operators is done. If any of these expressions are already codegen'd indirectly by the operators, GetCodegendComputeFn() will be a no-op. This preserves the behavior that ScalarFnCall will always be codegen'd even if the fragment doesn't contain any codegen enabled operators. Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397 --- M be/src/exec/aggregation-node.cc M be/src/exec/exchange-node.cc M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/sort-node.cc M be/src/exec/topn-node.cc M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/service/fe-support.cc M common/thrift/ImpalaInternalService.thrift M fe/src/main/java/org/apache/impala/planner/Planner.java M tests/query_test/test_udfs.py 24 files changed, 180 insertions(+), 117 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/5105/7 -- To view, visit http://gerrit.cloudera.org:8080/5105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430: enable codegen for native UDAs .. Patch Set 4: Rebased onto Michael's IMPALA-4432 commit -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs
Tim Armstrong has uploaded a new patch set (#4). Change subject: IMPALA-1430: enable codegen for native UDAs .. IMPALA-1430: enable codegen for native UDAs This uses the existing infrastructure for codegening builtin UDAs and for codegening calls to UDFs. GetUdf() is refactored to support both UDFs and UDAs. IR UDAs are still not allowed by the frontend. It's unclear if we want to enable them going forward because of the difficulties in testing and supporting IR UDFs/UDAs. Testing: test_udfs.py tests UDAs with codegen enabled and disabled Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/partitioned-aggregation-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/anyval-util.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr.cc M be/src/exprs/expr.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h 11 files changed, 158 insertions(+), 124 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/4 -- To view, visit http://gerrit.cloudera.org:8080/5161 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Alex Behm has posted comments on this change. Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java File fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java: Line 689: if (mentionedColumns.isEmpty()) { Can we reverse the logic as follows: if (tbl instanceof KuduTable) { List mentionedColumns = insertStmt.getMentionedColumns(); Preconditions.checkState(!mentionedColumns.isEmpty()); ... } else { ... } That way the Kudu-specific nature seems clearer. Once this logic is reversed, it almost seems simpler to inline this at the single caller (which already has a similar instanceof KuduTable check). http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: Line 795: for (int i = 0; i < resultExprs_.size(); ++i) { We don't really need resultExprs_ right? for (Integer i: mentionedColumns_) { result.add(columns.get(i).getName()); } http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: Line 186: if (ctx_.isUpdateOrDelete()) return fragments; Does this line do anything? I though we established that this is an INSERT or CTAS in L180. I think we need to move this check somewhere else. http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java File fe/src/test/java/org/apache/impala/analysis/AuditingTest.java: Line 374: public void TestKuduStatements() throws AuthorizationException, AnalysisException { nice! http://gerrit.cloudera.org:8080/#/c/5151/1/testdata/workloads/functional-planner/queries/PlannerTest/lineage.test File testdata/workloads/functional-planner/queries/PlannerTest/lineage.test: Line 4656: # Insert into a Kudu table with a column list specified Upsert Line 4661: "queryText":"insert into functional_kudu.alltypes (int_col, id) select int_col, id from\nfunctional.alltypes where id < 10", upsert? Line 4925: What happens for DELETE and UPDATE statements? Might be worth adding a test. -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4432: Handle internal codegen disabling properly .. Patch Set 6: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/5105/6/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS6, Line 191: followings following. -- To view, visit http://gerrit.cloudera.org:8080/5105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: move bufferpool under runtime .. Patch Set 3: Code-Review+2 The Cmake script didn't handle be tests living in a subdirectory - fixed it. -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Hello Marcel Kornacker, Internal Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5165 to look at the new patch set (#3). Change subject: IMPALA-3200: move bufferpool under runtime .. IMPALA-3200: move bufferpool under runtime It is arguably a subcomponent of the runtime system Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd --- M .clang-tidy M be/CMakeLists.txt M be/src/runtime/CMakeLists.txt R be/src/runtime/bufferpool/CMakeLists.txt R be/src/runtime/bufferpool/buffer-allocator.cc R be/src/runtime/bufferpool/buffer-allocator.h R be/src/runtime/bufferpool/buffer-pool-counters.h R be/src/runtime/bufferpool/buffer-pool-test.cc R be/src/runtime/bufferpool/buffer-pool.cc R be/src/runtime/bufferpool/buffer-pool.h R be/src/runtime/bufferpool/reservation-tracker-counters.h R be/src/runtime/bufferpool/reservation-tracker-test.cc R be/src/runtime/bufferpool/reservation-tracker.cc R be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/mem-tracker.cc 15 files changed, 25 insertions(+), 27 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/5165/3 -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems .. IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems This is because that test uses 'set cached' and 'set uncached' which are not supported on non-HDFS filesystems. This patch creates a separate test file for non-HDFS filesystems with only supported queries and invokes the right file based on the filesystem. Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c Reviewed-on: http://gerrit.cloudera.org:8080/5164 Reviewed-by: Sailesh Mukil Tested-by: Internal Jenkins --- R testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-all-fs.test A testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test M tests/metadata/test_ddl.py 3 files changed, 43 insertions(+), 23 deletions(-) Approvals: Internal Jenkins: Verified Sailesh Mukil: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3200: move bufferpool under runtime .. Patch Set 2: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/512/ -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5152 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink .. IMPALA-4478: Initial Kudu client mem tracking for sink The Kudu client allocates memory which is not tracked by Impala. There are several sources, but the most significant is the memory allocated by the KuduSession on the write path. This can be >100MB, so it is important to track to avoid OOM. Moving forward, we should have a better way to track Kudu client memory, but for now we must at least handle this potentially problematic case. This changes the KuduTableSink to consume 200MB which should be enough for the 100MB write mutation buffer as well as 100MB worth of errors buffered in the client before Impala takes ownership of them (and deletes them). This is left as a flag because it may turn out to be too high for some users and too low for others. When we have better support from Kudu (including KUDU-1752), we should simplify this. TODO: Handle DML w/ small or known resource requirements (e.g. VALUES specified or query has LIMIT) specially to avoid over-consumption. Testing: Have verified acceptable behavior in the stress test with a simple workload containing DML statements of moderate cardinality. Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a Reviewed-on: http://gerrit.cloudera.org:8080/5152 Reviewed-by: Matthew Jacobs Reviewed-by: Dan Hecht Tested-by: Internal Jenkins --- M be/src/exec/kudu-table-sink.cc 1 file changed, 37 insertions(+), 3 deletions(-) Approvals: Matthew Jacobs: Looks good to me, approved Internal Jenkins: Verified Dan Hecht: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5152 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Alex Behm has posted comments on this change. Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java: Line 103: if (newColDef_.hasKuduOptions()) { we cannot change the default value? http://gerrit.cloudera.org:8080/#/c/5136/2/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test: Line 156: # Insert a row that doesn't have values for the new columns; defaults are used > New test somewhere: Sorry this doesn't make sense. I was thinking of ALTERing the default value and then checking the old and new rows. Looks like that is currently not possible. -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Alex Behm has posted comments on this change. Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. Patch Set 2: (18 comments) http://gerrit.cloudera.org:8080/#/c/5136/2/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: Line 185: 4: optional string location restore original numbers http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java: Line 88: throw new AnalysisException("Key-value partition specs are not supported for " + how about "ALTER TABLE ADD PARTITION is not supported for Kudu tables." or something like that http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java: Line 50: if (ifNotExists_) { single line Line 79: return; remove http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java: Line 123: if (c.hasEncoding() || c.hasCompression() || c.hasBlockSize()) { Should we let Kudu handle this case and the one below? Does Kudu behave sanely? http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java: Line 79: throw new AnalysisException("Key-value partition specs are not supported for " + ALTER TABLE DROP PARTITION is not supported for Kudu tables http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java: Line 32: public class AlterTableDropRangePartitionStmt extends AlterTableStmt { code seems very similar to AlterTableAddRangePartitionStmt, should we just merge them into AlterTableAddDropRangePartitionStmt? Line 77: return; remove http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: Line 553: case UPDATE_STATS: I think we should be able to avoid the copy+paste by adding an additional check to L365 that will make us only go down this patch for ADD_REPLACE_COLUMNS, DROP_COLUMN, etc., but not for UPDATE_STATS or SET_TBL_PROPERTIES Maybe something like altersKuduTable(TAlterTableType type), you get the idea Line 2231: KuduCatalogOpExecutor.renameTable((KuduTable) tbl, I don't think we should rename the Kudu table for external tables. An external table is like a pointer, and changing the tblprpperty should just point it to another table imo. http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java: Line 1831: String[] addDrop = {"add if not exists", "drop if exists"}; what about just "add" and "drop" Line 1868: AnalysisError("alter table functional_kudu.testtbl add columns (a1 int)", How do we know a1 is going to be non-nullable at this point (in analysis)? That's up to Kudu's default right? Probably best to add an explicit NOT NULL in any case. Line 1875: AnalysisError("alter table functional_kudu.testtbl add columns " + use a loop to cover the cases separately (if you decide to keep these checks part of analysis and not defer to Kudu) http://gerrit.cloudera.org:8080/#/c/5136/2/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test: Line 52: distribute by range (id) (partition 1 < values <= 10) stored as kudu Test dropping all partitions and scanning/inserting into the empty table. Line 156: # Insert a row that doesn't have values for the new columns; defaults are used New test somewhere: * drop new_col2 * re-add new_col2 but with a different default value * insert a new row, triggering the new default value * check that old rows have the old default value and new rows have the new default value Line 157: insert into tbl_to_alter (id,name,vali) values (3, 'test', 200) Can you try this same test but with NULLs for the new columns? The insert should give errors. Line 257: create table copy_of_tbl (a int primary key) distribute by hash (a) into 3 buckets new test for the external table "pointer" behavior http://gerrit.cloudera.org:8080/#/c/5136/2/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: Line 99: assert kudu_client.table_exists(new_kudu_tbl_name) also check that the old
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
Michael Brown has posted comments on this change. Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. Patch Set 5: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5135 to look at the new patch set (#5). Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. IMPALA-4510: Selectively filter args for metric verification tests run-tests.py is a wrapper around impala-py.test. It abstracts away the need to invoke separate runs for serial tests, parallel tests, and metric verification tests. Because it's possible for a user to specify certain test suites, or even specific tests, on the command line when calling run-tests.py, it had been necessary to override the command line args when it came time to run the metric verification tests -- otherwise those other tests/suites would be rerun. Before this patch, we had simply been stripping away all command line args. However, that blanket approach causes problems when running tests against a remote cluster, because we need to retain those command line args that pertain to the remote cluster. This patch selectively prunes unwanted command line args for the last metric verification test stage, keeping the ones that we need, and also adds extensive documentation for explaining why we have to go through this fairly odd and elaborate step. This patch was tested by running a sample test suite locally, and against a remote cluster. Previously, the metric verification stage had been failing for remote cluster tests (since they were defaulting to localhost for services that were only available remotely.) With the patch, the remote verfification tests were passing. Also, while I'm here, add a small change that exits immediately if the user calls for --help. Before this, we actually still ran the tests. Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e --- M tests/run-tests.py 1 file changed, 77 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/5135/5 -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
David Knupp has posted comments on this change. Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. Patch Set 4: While I'm in this file, make a quick fix to not run tests if the user just passes in --help. -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly
Dan Hecht has posted comments on this change. Change subject: IMPALA-4432: Handle internal codegen disabling properly .. Patch Set 6: (4 comments) Thanks, this is much easier to follow now. It'd be good for Tim to take another pass now to make sure it still makes sense to him as well. http://gerrit.cloudera.org:8080/#/c/5105/6/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS6, Line 169: HasScalarFnToCodegen Maybe ScalarFnRequiresCodegen() or similar? i.e. to make it clear that not only do we codegen these, but that they require codegen. Line 171: /// Returns true if there is hint to disable codegen. This can be true for single node "a" Line 180: /// of the hint and all expressions in the fragment can be interpreted. given that this depends on the scalar_fns_to_codegen_ state which isn't query-static state, let's document at what point this routine can be used (e.g. after Prepare() phase or whatever). Line 194: ///which cannot be interpreted. I think this would be clearer without the double negative. eg.: 1. It's enabled by query option. 2. Enabled by the internal hint and all expressions can be interpreted. -- To view, visit http://gerrit.cloudera.org:8080/5105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
Michael Brown has posted comments on this change. Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model .. Patch Set 4: Code-Review+1 I've simply chosen different names for the objects, and maintained the name Query for SELECT statements. We don't need that special pickling anymore. carry +1 -- To view, visit http://gerrit.cloudera.org:8080/5162 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
Michael Brown has posted comments on this change. Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/leopard/custom_pickle.py File tests/comparison/leopard/custom_pickle.py: PS1, Line 8: : Thanks to https://wiki.python.org/moin/UsingPickle/RenamingModules > I'm testing another patch set that lets us just not use this code at all. I Done -- To view, visit http://gerrit.cloudera.org:8080/5162 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
Hello Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5162 to look at the new patch set (#4). Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model .. IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model This patch adds support to the random query generator infrastructure to model and write SQL INSERTs. It does not actually randomly generate INSERTs at this time (tracked in IMPALA-4353 and umbrella task IMPALA-3740) but does provide necessary building blocks to do so. First, it's necessary to model the INSERTs as part of our data model. This was done by taking the current notion of a Query and making it a SelectQuery. We also then create an abstract Query containing some of the more common methods and attributes. We then model an INSERT query, INSERT clause, and VALUES clause (IMPALA-4343). Second, it's necessary to test the basics of this data model. It made sense to go ahead and implement the necessary SqlWriter methods to write the SQL for these clauses (IMPALA-4354). I could then use this writer with some existing and new tests that take a query written into our data model and write the SQL, verifying they're correct. For INSERT into Kudu tables, the equivalent PostgreSQL queries need to use "ON CONFLICT DO NOTHING", so all existing and new query tests verify they can be written as PostgreSQL as well. Testing: - all the query generator tests pass - I can run Leopard front_end.py and load older query generator reports, browse them, and re-run failed queries - I can run Leopard controller.py to actually do a query generator run - discrepancy_searcher.py --explain-only ran for hundreds of queries. There were no problems writing the SELECT queries Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba --- M tests/comparison/model_translator.py M tests/comparison/query.py M tests/comparison/tests/query_object_testdata.py M tests/comparison/tests/test_query_objects.py 4 files changed, 623 insertions(+), 68 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/5162/4 -- To view, visit http://gerrit.cloudera.org:8080/5162 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
Hello Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5162 to look at the new patch set (#3). Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model .. IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model This patch adds support to the random query generator infrastructure to model and write SQL INSERTs. It does not actually randomly generate INSERTs at this time (tracked in IMPALA-4353 and umbrella task IMPALA-3740) but does provide necessary building blocks to do so. First, it's necessary to model the INSERTs as part of our data model. This was done by taking the current notion of a Query and making it a SelectQuery. We also then create an abstract Query containing some of the more common methods and attributes. We then model an INSERT query, INSERT clause, and VALUES clause (IMPALA-4343). Second, it's necessary to test the basics of this data model. It made sense to go ahead and implement the necessary SqlWriter methods to write the SQL for these clauses (IMPALA-4354). I could then use this writer with some existing and new tests that take a query written into our data model and write the SQL, verifying they're correct. For INSERT into Kudu tables, the equivalent PostgreSQL queries need to use "ON CONFLICT DO NOTHING", so all existing and new query tests verify they can be written as PostgreSQL as well. Testing: - all the query generator tests pass - I can run Leopard front_end.py and load older query generator reports, browse them, and re-run failed queries - I can run Leopard controller.py to actually do a query generator run - discrepancy_searcher.py --explain-only ran for hundreds of queries. There were no problems writing the SELECT queries Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba --- M tests/comparison/model_translator.py M tests/comparison/query.py M tests/comparison/query_flattener.py M tests/comparison/query_generator.py M tests/comparison/tests/fake_query.py M tests/comparison/tests/query_object_testdata.py M tests/comparison/tests/test_query_objects.py 7 files changed, 627 insertions(+), 69 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/5162/3 -- To view, visit http://gerrit.cloudera.org:8080/5162 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Dimitris Tsirogiannis has uploaded a new patch set (#3). Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. IMPALA-2890: Support ALTER TABLE statements for Kudu tables With this commit, we add support for additional ALTER TABLE statements against Kudu tables. The new supported ALTER TABLE operations for Kudu are: - ADD/DROP range partitions. Syntax: ALTER TABLE ADD [IF NOT EXISTS] RANGE ALTER TABLE DROP [IF EXISTS] RANGE - ADD/DROP/RENAME column. Syntax: ALTER TABLE ADD COLUMNS (col_spec, [col_spec, ...]) ALTER TABLE DROP COLUMN ALTER TABLE CHANGE COLUMN - Rename Kudu table using the 'kudu.table_name' table property. Example: ALTER TABLE SET TBLPROPERTY ('kudu.tbl_name'=''), will change the underlying Kudu table name to . - Renaming the HMS/Catalog table entry of a Kudu table is supported using the existing ALTER TABLE RENAME TO syntax. Not supported: - ALTER TABLE REPLACE COLUMNS Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java A fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java A fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java M fe/src/main/java/org/apache/impala/analysis/DistributeParam.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test M tests/query_test/test_kudu.py 21 files changed, 998 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/5136/3 -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. Patch Set 1: (1 comment) Forgotten comment and some minor test fixes. http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java: Line 142: throw new AnalysisException("Unsupported column options: " + c.toString()); > would be nice to say why they are unsupported (only supported on Kudu table Done -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4514: Fix broken exhaustive builds caused by non-nullable columns
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4514: Fix broken exhaustive builds caused by non-nullable columns .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib6f4b4c88ef18b1731b7c9789aad602880e18035 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4514: Fix broken exhaustive builds caused by non-nullable columns
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4514: Fix broken exhaustive builds caused by non-nullable columns .. IMPALA-4514: Fix broken exhaustive builds caused by non-nullable columns This commit fixes the broken exhaustive Impala builds. The issue was caused by Kudu table that didn't properly specify nullability constraints. Hence, some rows were rejected during data loading causing some tests to fail. Change-Id: Ib6f4b4c88ef18b1731b7c9789aad602880e18035 Reviewed-on: http://gerrit.cloudera.org:8080/5157 Reviewed-by: Alex Behm Tested-by: Internal Jenkins --- M testdata/datasets/functional/functional_schema_template.sql 1 file changed, 2 insertions(+), 1 deletion(-) Approvals: Internal Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ib6f4b4c88ef18b1731b7c9789aad602880e18035 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins
[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. .. Patch Set 3: Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/519/ -- To view, visit http://gerrit.cloudera.org:8080/5125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input
Internal Jenkins has posted comments on this change. Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input .. Patch Set 19: Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/518/ -- To view, visit http://gerrit.cloudera.org:8080/4863 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0 Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4440: lineage timestamps can go backwards across daylight savings transitions
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4440: lineage timestamps can go backwards across daylight savings transitions .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5129 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I34e435fc3511e65bc62906205cb558f2c116a8a9 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4440: lineage timestamps can go backwards across daylight savings transitions
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4440: lineage timestamps can go backwards across daylight savings transitions .. IMPALA-4440: lineage timestamps can go backwards across daylight savings transitions Using TimestampValue (or equivalent string representation) for timestamps that require a point in time doesn't work because the same time can represent multiple point in times. For example, the timestamp: '2016-11-13 01:01 AM' occurred twice last weekend. Instead, we should use unix time directly rather than trying to derive unix time from a (timezone-less) timestamp. Note that there are other questionable uses of TimestampValue for internal Impala service stuff, but I want to fix them separately as they are not as important and fixing does add some risk. While I'm here, remove a template TimestampValue constructor that was unused and is confusing. We don't have any end-to-end tests that exercise column lineage, so add a simple custom cluster test that enables lineage and verifes the start and end unix times are within appropriate bounds. The other column lineage graph fields are at least tested via planner tests. Automated regression testing for the specifc daylight savings issue is difficult as we'd have to cross the daylight savings boundary at just the right time during query execution in order to reproduce reliably. But open to ideas. Testing: - loop the new test overnight without any failures. - exhaustive run. Change-Id: I34e435fc3511e65bc62906205cb558f2c116a8a9 Reviewed-on: http://gerrit.cloudera.org:8080/5129 Reviewed-by: Dan Hecht Tested-by: Internal Jenkins --- M be/src/runtime/timestamp-value.h M be/src/service/impala-server.cc M common/thrift/ImpalaInternalService.thrift M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java M fe/src/test/java/org/apache/impala/testutil/TestUtils.java A tests/custom_cluster/test_lineage.py 6 files changed, 86 insertions(+), 30 deletions(-) Approvals: Internal Jenkins: Verified Dan Hecht: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5129 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I34e435fc3511e65bc62906205cb558f2c116a8a9 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins
[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. .. Patch Set 3: Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/517/ -- To view, visit http://gerrit.cloudera.org:8080/5125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. .. Patch Set 3: Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/516/ -- To view, visit http://gerrit.cloudera.org:8080/5125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
Michael Brown has posted comments on this change. Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
David Knupp has uploaded a new patch set (#4). Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. IMPALA-4510: Selectively filter args for metric verification tests run-tests.py is a wrapper around impala-py.test. It abstracts away the need to invoke separate runs for serial tests, parallel tests, and metric verification tests. Because it's possible for a user to specify certain test suites, or even specific tests, on the command line when calling run-tests.py, it had been necessary to override the command line args when it came time to run the metric verification tests -- otherwise those other tests/suites would be rerun. Before this patch, we had simply been stripping away all command line args. However, that blanket approach causes problems when running tests against a remote cluster, because we need to retain those command line args that pertain to the remote cluster. This patch selectively prunes unwanted command line args for the last metric verification test stage, keeping the ones that we need, and also adds extensive documentation for explaining why we have to go through this fairly odd and elaborate step. This patch was tested by running a sample test suite locally, and against a remote cluster. Previously, the metric verification stage had been failing for remote cluster tests (since they were defaulting to localhost for services that were only available remotely.) With the patch, the remote verfification tests were passing. Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e --- M tests/run-tests.py 1 file changed, 69 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/5135/4 -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input
Internal Jenkins has posted comments on this change. Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input .. Patch Set 19: Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/515/ -- To view, visit http://gerrit.cloudera.org:8080/4863 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0 Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
David Knupp has posted comments on this change. Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. Patch Set 2: (7 comments) Thanks for the review. http://gerrit.cloudera.org:8080/#/c/5135/2//COMMIT_MSG Commit Message: PS2, Line 9: impala-pytest > impala-py.test Done PS2, Line 10: separate individual > You can just use one of these two words. :) Done Line 28: > Please mention the testing you performed for this patch. Done http://gerrit.cloudera.org:8080/#/c/5135/2/tests/run-tests.py File tests/run-tests.py: Line 73: Modify and return the command line arguments that will be passed to pytest. > Can you also talk about what log_base_name and valid_dirs need to be for a Done PS2, Line 107: # We need to account for the fact that '--foo bar' and '--foo=bar' might : # be supplied by the user. : raw_args = itertools.chain(*[arg.split('=') for arg in sys.argv[1:]]) > Crafty. It would help a reader to show a simple before/after example of wha Done PS2, Line 110: filtered_args = [] > The name here is strange to me. These args aren't being filtered, but in fa Done PS2, Line 113: try: : pytest.config.getvalue(arg.strip('-')) # Raises ValueError if invalid arg : filtered_args += [arg, str(raw_args.next())] : except ValueError: : continue > This is related to the L75 docstring comment above: it might make more sens Done -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
David Knupp has uploaded a new patch set (#3). Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. IMPALA-4510: Selectively filter args for metric verification tests run-tests.py is a wrapper around impala-py.test. It abstracts away the need to invoke separate runs for serial tests, parallel tests, and metric verification tests. Because it's possible for a user to specify certain test suites, or even specific tests, on the command line when calling run-tests.py, it had been necessary to override the command line args when it came time to run the metric verification tests -- otherwise those other tests/suites would be rerun. Before this patch, we had simply been stripping away all command line args. However, that blanket approach causes problems when running tests against a remote cluster, because we need to retain those command line args that pertain to the remote cluster. This patch selectively prunes unwanted command line args for the last metric verification test stage, keeping the ones that we need, and also adds extensive documentation for explaining why we have to go through this fairly odd and elaborate step. This patch was tested by running a sample test suite locally, and against a remote cluster. Previously, the metric verification stage had been failing for remote cluster tests (since they were defaulting to localhost for services that were only available remotely.) With the patch, the remote verfification tests were passing. Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e --- M tests/run-tests.py 1 file changed, 69 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/5135/3 -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. .. Patch Set 3: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/514/ -- To view, visit http://gerrit.cloudera.org:8080/5125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
Michael Brown has posted comments on this change. Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/leopard/custom_pickle.py File tests/comparison/leopard/custom_pickle.py: PS1, Line 8: : Thanks to https://wiki.python.org/moin/UsingPickle/RenamingModules > If you can't find the license available you'll probably need to check with I'm testing another patch set that lets us just not use this code at all. I'll mark this done if my testing passes and I update this review to use that patch set. -- To view, visit http://gerrit.cloudera.org:8080/5162 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4494: Fix crash in SimpleScheduler
Lars Volker has posted comments on this change. Change subject: IMPALA-4494: Fix crash in SimpleScheduler .. Patch Set 4: (2 comments) Thanks for the review. Please see PS 6. http://gerrit.cloudera.org:8080/#/c/5127/4/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 261: if (new_backend_config->NumBackends() > 0 > Hmm, yeah if it's possible to get a heartbeat before the initial statestore Good point. Line 264: new_backend_config->AddBackend(local_backend_descriptor_); > We could just combine this check and the current_membership_ check below, r Done -- To view, visit http://gerrit.cloudera.org:8080/5127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4494: Fix crash in SimpleScheduler
Lars Volker has uploaded a new patch set (#6). Change subject: IMPALA-4494: Fix crash in SimpleScheduler .. IMPALA-4494: Fix crash in SimpleScheduler The scheduler maintains a local list of active backends, which is updated through messages from the statestore. Even the local backend enters this list by registering with the statestore and being included in a statestore update message. Thus, during restarts it can happen that a query gets scheduled with exec_at_coord set to true, while the local backend has not been registered with the scheduler. In this case the IP address lookup in the internal BackendConfig fails and an empty IP address is returned, leading to a nullptr dereference down the line. This change adds an additional check when handling updates from the statestore to make sure that the backend config always contains the local backend. It also changes scheduling when exec_at_coord is true to always use the local backend, irrespective of whether it is present in the backend config. Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f --- M be/src/scheduling/simple-scheduler-test.cc M be/src/scheduling/simple-scheduler.cc 2 files changed, 91 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/5127/6 -- To view, visit http://gerrit.cloudera.org:8080/5127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution
Henry Robinson has posted comments on this change. Change subject: IMPALA-3342: Add thread counters to monitor plan fragment execution .. Patch Set 9: Code-Review+2 Looks good to me. -- To view, visit http://gerrit.cloudera.org:8080/4633 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
Hello Dimitris Tsirogiannis, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5125 to look at the new patch set (#3). Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. .. IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. TODO: - Corresponding changes to DESCRIBE EXTENDED/FORMATTED. Testing: A private core/hdfs run passed. Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e --- M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java A testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test M tests/query_test/test_kudu.py 9 files changed, 213 insertions(+), 92 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/5125/3 -- To view, visit http://gerrit.cloudera.org:8080/5125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
Alex Behm has posted comments on this change. Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. .. Patch Set 2: Code-Review+2 (4 comments) http://gerrit.cloudera.org:8080/#/c/5125/2//COMMIT_MSG Commit Message: PS2, Line 10: Show column encoding and compression once available > No longer relevant, no? Done http://gerrit.cloudera.org:8080/#/c/5125/2/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java File fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java: PS2, Line 39: import org.apache.kudu.client.shaded.com.google.common.base.Strings; > :) looks as if two import statements got merged into one. Anyways, replace Wow, really weird. Fixed. PS2, Line 278: else { : defaultValCol.setString_val(kuduColumn.isNullable() ? "NULL" : ""); : } > The Kudu team is working on removing this constraint (i.e. either default Leaving the default value blank now. http://gerrit.cloudera.org:8080/#/c/5125/2/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: PS2, Line 459: [("b", "string", "", "true", "false", "", "AUTO_ENCODING", "DEFAULT_COMPRESSION", "0"), : ("c", "string", "", "false", "false", "", "AUTO_ENCODING", "DEFAULT_COMPRESSION", "0")] > long lines Done -- To view, visit http://gerrit.cloudera.org:8080/5125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5105 to look at the new patch set (#6). Change subject: IMPALA-4432: Handle internal codegen disabling properly .. IMPALA-4432: Handle internal codegen disabling properly There are some conditions in which codegen is disabled internally even if it's enabled in the query option. For instance, the single node optimization or the expression evaluation requests sent from the FE to the BE. These internal disabling of codegen are advisory as their purposes are to reduce the latency for tables with no or very few rows. The internal disabling of codegen doesn't interact well with UDFs which cannot be interpreted (e.g. IR UDF) as it conflates the 'disable_codegen' query option set by the user. As a result, it's hard to differentiate between when codegen is disabled explicitly by users and when it is disabled internally. This change fixes the problem above by adding an explicit flag in TQueryCtx to indicate that codegen is disabled internally. This flag is only advisory. For cases in which codegen is needed to function, this internal flag is ignored and if codegen is disabled via query option, an error is thrown. For this new flag to work with ScalarFnCall, codegen needs to happen after ScalarFnCall::Prepare() because it's hard to tell if a fragment contains any UDF that cannot be interpreted until after ScalarFnCall::Prepare() is called. However, Prepare() needs the codegen object to codegen so it needs to be created before Prepare(). We can either always create the codegen module or defer codegen to a point after ScalarFnCall::Prepare(). The former has the downside of introducing unnecessary latency for say single-node optimization so the latter is implemented. It is needed as part of IMPALA-4192 any way. After this change, ScalarFnCall expressions which need to be codegen'd are inserted into a vector in RuntimeState in ScalarFnCall::Prepare(). Later in the codegen phase, these expressions' GetCodegendComputeFn() will be called after codegen for operators is done. If any of these expressions are already codegen'd indirectly by the operators, GetCodegendComputeFn() will be a no-op. This preserves the behavior that ScalarFnCall will always be codegen'd even if the fragment doesn't contain any codegen enabled operators. Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397 --- M be/src/exec/aggregation-node.cc M be/src/exec/exchange-node.cc M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/sort-node.cc M be/src/exec/topn-node.cc M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/service/fe-support.cc M common/thrift/ImpalaInternalService.thrift M fe/src/main/java/org/apache/impala/planner/Planner.java M tests/query_test/test_udfs.py 24 files changed, 178 insertions(+), 117 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/5105/6 -- To view, visit http://gerrit.cloudera.org:8080/5105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. Patch Set 1: (19 comments) http://gerrit.cloudera.org:8080/#/c/5136/1/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: Line 179: 2: optional CatalogObjects.TRangePartition range_partition_spec > Seems to make more sense to completely separate the HDFS partition case fro Done http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java: Line 56: public AlterTableAddPartitionStmt(TableName tableName, > Seems to make more sense to move this into a new AlterTableAddRangePartitio Done http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java: Line 85: throw new AnalysisException("ALTER TABLE REPLACE COLUMNS not currently " + > is not supported on Kudu tables. Done http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java: Line 109: if (col.getType() != newColDef_.getType()) { > use equals() Done http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java: Line 52: public AlterTableDropPartitionStmt(TableName tableName, > Separate this into a new AlterTableDropRangePartitionStmt? Done http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java: Line 99: throw new AnalysisException(String.format( > Let's move this check into the respective alter statements that are not sup Done http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java File fe/src/main/java/org/apache/impala/analysis/ColumnDef.java: Line 162: public boolean hasEncoding() { return encoding_ != null; } > use these in hasKuduOptions() Done Line 322: return new ColumnDef(col.getName(), new TypeDef(col.getType())); > What about all the other column properties like comment/primary_key, etc,? Done http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java: Line 232: // We shouldn't output the columns for external tables > good catch Done http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/catalog/KuduTable.java File fe/src/main/java/org/apache/impala/catalog/KuduTable.java: Line 174: return ImmutableList.copyOf(rangeDistribution.getColumnNames()); > getColumnNames() already returns an ImmutableList Thanks. Done Line 267: ColumnSchema col = > add helper getColumnById()? Done Line 268: tableSchema.getColumnByIndex(tableSchema.getColumnIndex(colId)); > indent 4 Done http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: Line 366: if (tbl instanceof KuduTable) { > Consider adding a separate switch() only for Kudu tables above, or some oth Done Line 2168: if (properties.containsKey(KuduTable.KEY_TABLE_NAME) && > This seems to only allow renaming the Kudu table if the old Kudu table alre The Kudu table name property is always set even if the user does explicitly specify it, so I believe the case you mention is handled correctly here. Let me know if I misunderstood the comment. Line 2172: KuduCatalogOpExecutor.renameTable((KuduTable) tbl, > will this succeed if the old table doesn't exist? It will fail but the tbl properties haven't been updated at that point, so the HMS entry will not change. http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java: Line 300: if (!ifNotExists) { > we really need finer grained error handling here :( Yeah, I don't like that either. KuduException is too generic and doesn't tell us much in this case. Line 337:* Returns the bounds of a range partition in two PartialRow, RangePartitionBound pairs > enclose the pair members in <> Done Line 393: public static void dropColumn(KuduTable tbl, String colName) > I assume we can't drop PK columns. Let's check for that in analysis. Shouldn't we let Kudu handle this check? Line 409: public static void rename
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Dimitris Tsirogiannis has uploaded a new patch set (#2). Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. IMPALA-2890: Support ALTER TABLE statements for Kudu tables With this commit, we add support for additional ALTER TABLE statements against Kudu tables. The new supported ALTER TABLE operations for Kudu are: - ADD/DROP range partitions. Syntax: ALTER TABLE ADD [IF NOT EXISTS] RANGE ALTER TABLE DROP [IF EXISTS] RANGE - ADD/DROP/RENAME column. Syntax: ALTER TABLE ADD COLUMNS (col_spec, [col_spec, ...]) ALTER TABLE DROP COLUMN ALTER TABLE CHANGE COLUMN - Rename Kudu table using the 'kudu.table_name' table property. Example: ALTER TABLE SET TBLPROPERTY ('kudu.tbl_name'=''), will change the underlying Kudu table name to . - Renaming the HMS/Catalog table entry of a Kudu table is supported using the existing ALTER TABLE RENAME TO syntax. Not supported: - ALTER TABLE REPLACE COLUMNS Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java A fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java A fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java M fe/src/main/java/org/apache/impala/analysis/DistributeParam.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test M tests/query_test/test_kudu.py 21 files changed, 997 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/5136/2 -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly
Michael Ho has posted comments on this change. Change subject: IMPALA-4432: Handle internal codegen disabling properly .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/5105/5/be/src/exec/exec-node.cc File be/src/exec/exec-node.cc: PS5, Line 463: CodegenDisabledByHint( > is this correct? we might still codegen if there's an IR udf, right? See al Good point. Fixed in the new patch. http://gerrit.cloudera.org:8080/#/c/5105/5/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS5, Line 171: codegen is disabled due to hints > is this true? doesn't this just indicate whether the hint would like to di Rephrased to "if there is hint to disable codegen." PS5, Line 175: CodegenDisabledByHint > if my above question is right, then this should be called CodegenDisableHin The new patch introduces both CodegenDisableHint() and CodegenDisabledByHint(). The former is needed in ScalarFnCall::Prepare(). Ideally, if there is hint to disable codegen, we should only codegen expressions which cannot be interpreted. If we use CodegenDisabledByHint() (with the suggested change) in ScalarFnCall::Prepare() and once an expression is added to scalar_fn_calls_to_codegen_, all subsequent ScalarFnCall expressions will be codegen'd. This makes it a bit non-deterministic which expressions will be codegen'd in the presence of the disabling hints as it depends on the order of when Prepare() is called for the non-interpretable expression. That said, the expressions may still be codegen'd when the operator is codegen'd but this logic may affect fe-support.cc and the above may matter in that case. -- To view, visit http://gerrit.cloudera.org:8080/5105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Henry Robinson has posted comments on this change. Change subject: IMPALA-4014: Introduce query-wide execution state. .. Patch Set 3: (18 comments) Looked over the ref count logic so far. http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS4, Line 502: QueryState* qs = ExecEnv::GetInstance()->query_exec_mgr()->GetQueryState(query_id_); ScopedQueryState ? http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: PS4, Line 59: boost:: remove Line 66: } else { Why is there no refcnt increment here? PS4, Line 69: remove here and below. PS4, Line 86: ImpaladMetrics::IMPALA_SERVER_NUM_FRAGMENTS_IN_FLIGHT->Increment(1L); : ImpaladMetrics::IMPALA_SERVER_NUM_FRAGMENTS->Increment(1L); : return Status::OK(); Why is this not in the QueryState? It seems like it should be part of RegisterFInstance(). PS4, Line 95: stance completed. in Call this ExecFInstance() to make it clear that it blocks until instance completion. PS4, Line 115: VLOG_FILE << "GetQueryState(): query_i You have a bug here - if a fragment instance finishes quickly (perhaps it didn't prepare() successfully) the query state will get GC'ed before another fragment instance can be started. I realise this will be addressed by batching fragment start-up, but until then I suggest a workaround like adding the number of expected fragment instances to the fragment instance params, so that when the QEM is created it can take the right number of references. PS4, Line 124: id QueryExecMgr::ReleaseQueryState(QueryState* qs) { : VLOG_FILE << " Prefer atomics rather than heavyweight mutexes for the ref count. Fewer locks -> fewer locking bugs. PS4, Line 126: << " refcnt=" << What are you checking for here - overflow or that the refcnt_ didn't start below 0? If it's the latter, move this to line 125 and check before increment that its GE(..., 0). PS4, Line 147: How? Only one caller to ReleaseQueryState() can set refcnt == 0. http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.h File be/src/runtime/query-exec-mgr.h: PS4, Line 47: In both cases it brackets the instance execution with an increment/decrement : /// of the refcount. This needs clarification - why is this relevant to callers? I think you mean that the QEM retains a reference to the QueryState until this FInstance has completed. PS4, Line 58: Otherwise returns nullptr AFAICT, GetQueryState() will return a QueryState even if its refcount == 0. It should not, because then you can avoid the complex logic in ReleaseQueryState() that deals with the possibility that some other caller took a refcount between setting the count to 0 and trying to GC the query state. PS4, Line 66: /// TODO: isn't this available in std:: now? remove, once we switch to std::mutex we'll get this one as well. PS4, Line 72: clean up. Comment on how it's allocated. PS4, Line 75: Clean up what? Does this block until instance has finished? http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: PS4, Line 42: released_r set this to 0. Callers should always manage the refcount with a balanced number of increment / decrement operations, otherwise bugs will be harder to understand. http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-state.h File be/src/runtime/query-state.h: PS4, Line 79: /// don't access query_ctx().desc_tbl, it's most likely for a different fragment instance I don't understand this comment. Is this going to change after batching? Otherwise let's think about ways to avoid having to tell a caller not to touch a certain part of this data structure. PS4, Line 107: Does access to this need to be synchronised? -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input
Internal Jenkins has posted comments on this change. Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input .. Patch Set 19: Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/510/ -- To view, visit http://gerrit.cloudera.org:8080/4863 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0 Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Ivanfi Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4494: Fix crash in SimpleScheduler
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4494: Fix crash in SimpleScheduler .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/5127/4/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 261: if (new_backend_config->NumBackends() > 0 > The reasoning behind the check was that it prevents empty messages from the Hmm, yeah if it's possible to get a heartbeat before the initial statestore message that could cause other problems. E.g. if you have a delta with a single subscriber in the first heartbeat, then you could end up scheduling everything on that node. Line 264: new_backend_config->AddBackend(local_backend_descriptor_); > So far current_membership_ has mirrored the statestore's view. Adding the l We could just combine this check and the current_membership_ check below, right? It seems ok to preemptively add the local backend here at the same time as we add it to be sent to the statestore. -- To view, visit http://gerrit.cloudera.org:8080/5127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems
Hello Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5164 to look at the new patch set (#3). Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems .. IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems This is because that test uses 'set cached' and 'set uncached' which are not supported on non-HDFS filesystems. This patch creates a separate test file for non-HDFS filesystems with only supported queries and invokes the right file based on the filesystem. Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c --- R testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-all-fs.test A testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test M tests/metadata/test_ddl.py 3 files changed, 43 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/5164/3 -- To view, visit http://gerrit.cloudera.org:8080/5164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems .. Patch Set 3: Code-Review+2 (1 comment) Carry +2 http://gerrit.cloudera.org:8080/#/c/5164/2/testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test File testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test: Line 19: alter table p1_hdfs partition (j<2, k in ("b", "c")) set cached in 'testPool' > I think we can get rid of the "show" tests in this file. Done -- To view, visit http://gerrit.cloudera.org:8080/5164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: move bufferpool under runtime .. Patch Set 2: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3200: move bufferpool under runtime .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5165/1/be/src/runtime/bufferpool/buffer-allocator.h File be/src/runtime/bufferpool/buffer-allocator.h: Line 18: #ifndef IMPALA_BUFFER_ALLOCATOR_H > let's also add a _RUNTIME to the include guards. Done -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Hello Marcel Kornacker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5165 to look at the new patch set (#2). Change subject: IMPALA-3200: move bufferpool under runtime .. IMPALA-3200: move bufferpool under runtime It is arguably a subcomponent of the runtime system Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd --- M .clang-tidy M be/CMakeLists.txt M be/src/runtime/CMakeLists.txt R be/src/runtime/bufferpool/CMakeLists.txt R be/src/runtime/bufferpool/buffer-allocator.cc R be/src/runtime/bufferpool/buffer-allocator.h R be/src/runtime/bufferpool/buffer-pool-counters.h R be/src/runtime/bufferpool/buffer-pool-test.cc R be/src/runtime/bufferpool/buffer-pool.cc R be/src/runtime/bufferpool/buffer-pool.h R be/src/runtime/bufferpool/reservation-tracker-counters.h R be/src/runtime/bufferpool/reservation-tracker-test.cc R be/src/runtime/bufferpool/reservation-tracker.cc R be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/mem-tracker.cc 15 files changed, 24 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/5165/2 -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems
Alex Behm has posted comments on this change. Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems .. Patch Set 2: Code-Review+2 (1 comment) Thanks for fixing this! http://gerrit.cloudera.org:8080/#/c/5164/2/testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test File testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test: Line 19: show partitions p1_hdfs I think we can get rid of the "show" tests in this file. -- To view, visit http://gerrit.cloudera.org:8080/5164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2494: Support for byte array-encoded decimals in Parquet scanner
Tim Armstrong has posted comments on this change. Change subject: IMPALA-2494: Support for byte array-encoded decimals in Parquet scanner .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/5115/3/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: Line 1126: switch (slot_desc->type().GetByteSize()) { I think the code might work out simpler if we added parquet::Type as a template parameter to ScalarColumnReader and Encode/Decode. I think that keeps things more consistent since then type parameters are always handled in the same way (rather than handling this one case with a runtime check). I think it makes sense conceptually to have both the source parquet type and destination Impala types be template parameters (it's mostly coincidental that each Impala type mapped to a single Parquet type up until now). It seems like the right framework for future changes of this nature where we want to have multiple parquet types mapping to an Impala type - instead of having to convince ourselves that the performance regression of the runtime check is tolerable each time, we could just use the template parameter. http://gerrit.cloudera.org:8080/#/c/5115/1/be/src/exec/parquet-common.h File be/src/exec/parquet-common.h: PS1, Line 328: sizeof(T))) re > Hm, I think it's accurate - either it's set (> 0) and fixed, or it's not se It's not that bad but it did take a bit of thought to understand how "fixed length" makes sense when decoding from a variable-length byte array. PS1, Line 338: cimal Done. Actually, looking at DecodeFromFixedLenByteArray, it expects this to be >= 1, so we need to check that too. -- To view, visit http://gerrit.cloudera.org:8080/5115 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If95171e65aa48f08b08b8e87f4555dc75e867977 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution
Dan Hecht has posted comments on this change. Change subject: IMPALA-3342: Add thread counters to monitor plan fragment execution .. Patch Set 9: Code-Review+1 Looks okay to me once Henry signs off. -- To view, visit http://gerrit.cloudera.org:8080/4633 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1788: Fold constant expressions.
Alex Behm has uploaded a new patch set (#2). Change subject: IMPALA-1788: Fold constant expressions. .. IMPALA-1788: Fold constant expressions. Adds a new ExprRewriteRule for replacing constant expressions with their literal equivalent via BE evaluation. Applies the new rule together with the existing ones on the parse tree, after analysis. Limitations - Constant folding is applied on the unresolved expressions. As a result, it only works for expressions that are constant within a single query block, as opposed to expressions that may become constant after fully substituting inline-view exprs. - Exprs are not normalized, so some opportunities for constant folding are missed for certain expr-tree shapes. This patch includes the following interesting changes: - Introduces a timestamp literal that can only be produced by constant folding (not expressible directly via SQL). - To make sure that rewrites have no user-visible effect, the original result types and column labels of the top-level statement are restored after the rewrites are performed. - Does not fold exprs if their evaluation resulted in a warning or error, or if the resulting value is not representable by corresponding FE LiteralExpr. - Fixes an existing issue with converting strings between the FE/BE. String produced in the BE that have characters with a value > 127 are not correctly deserialized into a Java String via thrift. We detect this case during constant folding and abandon folding of such exprs. - Fixes several issues with detecting/reporting errors in NativeEvalConstExprs(). - Cleans up ExprContext::GetValue() into ExprContext::GetConstantValue() which clarifies its only use of evaluating exprs from the FE. Testing: - Modifies expr-test.cc to run all tests through the constant folding path. - Adds basic planner and rewrite rule tests. Change-Id: If672b703db1ba0bfc26e5b9130161798b40a69e9 --- M be/src/exprs/expr-context.cc M be/src/exprs/expr-context.h M be/src/exprs/expr-test.cc M be/src/exprs/expr.cc M be/src/exprs/literal.cc M be/src/exprs/literal.h M be/src/runtime/runtime-state.h M be/src/service/fe-support.cc M common/thrift/Exprs.thrift M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java M fe/src/main/java/org/apache/impala/analysis/PartitionSet.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java A fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java M fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java A fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/testutil/TestUtils.java M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test M testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test A testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/empty.test M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test M testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/values.test M testdata/workloads/functional-query/queries/QueryTest/udf.test 47 files changed, 887 insertions(+), 285 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/cha
[Impala-ASF-CR] PREVIEW: IMPALA-1788: Fold constant expressions during plan generation.
Alex Behm has posted comments on this change. Change subject: PREVIEW: IMPALA-1788: Fold constant expressions during plan generation. .. Patch Set 1: (20 comments) http://gerrit.cloudera.org:8080/#/c/5109/1/be/src/exprs/expr-context.cc File be/src/exprs/expr-context.cc: Line 155: void* value = GetValue(NULL); > how does this handle/bail out of cases where exprctx is accidentally not a It the non-constant expr has a SlotRef it will hit a SIGSEGV because we are passing a NULL TupleRow. We can make this more defensive if you prefer, but it might hide bugs. We should really never call this on non-constant exprs. http://gerrit.cloudera.org:8080/#/c/5109/1/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 1541: // The rewrite rule for extracting conjuncts simplifies these to NULL. > rather than special-casing it here, it would make more sense to special-cas Not necessary anymore. Types are preserved. http://gerrit.cloudera.org:8080/#/c/5109/1/be/src/exprs/literal.h File be/src/exprs/literal.h: Line 26: #include "runtime/string-value.h" > duplicated Done http://gerrit.cloudera.org:8080/#/c/5109/1/common/thrift/Exprs.thrift File common/thrift/Exprs.thrift: Line 41: TIMESTAMP_LITERAL > please group all the literals together, so we can easily see what's still m Done. Also got rid of LITERAL_PRED which isn't used. http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: Line 242: fnName = path.get(path.size() - 1); > why is this necessary? To make isConstant() work even when a FunctionCallExpr has not yet been analyzed. Added comment. http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java File fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java: Line 204: if (Double.isNaN(val.double_val) || Double.isInfinite(val.double_val)) return null; > nice catch! Done. These changes went in as a separate JIRA/commit. Rebased this patch on it. Line 226: for (byte b: bytes) { > single line Done Line 233: result = new StringLiteral(strVal, constExpr.getType(), false); > did you add a new test that blows up with the previous default unescaping? These cases are covered by expr-test.cc. I also added a few interesting cases to ExprRewriteRulesTest for convenience, but they don't give any new coverage in addition to expr-test.cc http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java File fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java: Line 34: * produce, so it better to defer to a single source of truth (the BE implementation). > it -> it's Done http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java File fe/src/main/java/org/apache/impala/planner/AggregationNode.java: Line 164: if (analyzer.getQueryOptions().enable_expr_rewrites && rewriter != null) { > pass in null for the rewriter if it's disabled, so you don't have to check Not relevant anymore in the new approach. http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java File fe/src/main/java/org/apache/impala/planner/JoinNode.java: Line 523: // constant into nested-loop joins. > refer to cross products, to avoid confusion with future index nlj. Not relevant anymore in new approach. http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: Line 373 > do we still need this function for anything? Added a constant folder to Analyzer.GlobalState and removed Expr.foldConstantChildren() http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: Line 434: public void initNoRewrite(Analyzer analyzer) throws ImpalaException { > why not just init(Analyzer)? that would also make it clear that you're not Not relevant anymore in new approach. http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: Line 577: emptySetNode.init(ctx_.getRewriter(), analyzer); > instead of passing this in, maybe each PlanNode should have access to the c Not relevant anymore in new approach (but we should probably still do that in the future) http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java File fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java: Line 93: public void rewriteList(List exprs, Analyzer analyzer) throws AnalysisException { > why rewrite as opposed to appl
[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink
Dan Hecht has posted comments on this change. Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink .. Patch Set 2: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/5152/2/be/src/exec/kudu-table-sink.cc File be/src/exec/kudu-table-sink.cc: PS2, Line 34: define oops, I guess this should really be "static const int". -- To view, visit http://gerrit.cloudera.org:8080/5152 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/leopard/custom_pickle.py File tests/comparison/leopard/custom_pickle.py: PS1, Line 8: : Thanks to https://wiki.python.org/moin/UsingPickle/RenamingModules > How do I go about finding this out? I had previously looked for copyrights If you can't find the license available you'll probably need to check with lawyers. Jim or Henry can probably point you in the right direction. -- To view, visit http://gerrit.cloudera.org:8080/5162 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
Michael Brown has posted comments on this change. Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5162 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3200: move bufferpool under runtime .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/5165/1/be/src/runtime/bufferpool/buffer-allocator.h File be/src/runtime/bufferpool/buffer-allocator.h: Line 18: #ifndef IMPALA_BUFFER_ALLOCATOR_H let's also add a _RUNTIME to the include guards. -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5152 to look at the new patch set (#2). Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink .. IMPALA-4478: Initial Kudu client mem tracking for sink The Kudu client allocates memory which is not tracked by Impala. There are several sources, but the most significant is the memory allocated by the KuduSession on the write path. This can be >100MB, so it is important to track to avoid OOM. Moving forward, we should have a better way to track Kudu client memory, but for now we must at least handle this potentially problematic case. This changes the KuduTableSink to consume 200MB which should be enough for the 100MB write mutation buffer as well as 100MB worth of errors buffered in the client before Impala takes ownership of them (and deletes them). This is left as a flag because it may turn out to be too high for some users and too low for others. When we have better support from Kudu (including KUDU-1752), we should simplify this. TODO: Handle DML w/ small or known resource requirements (e.g. VALUES specified or query has LIMIT) specially to avoid over-consumption. Testing: Have verified acceptable behavior in the stress test with a simple workload containing DML statements of moderate cardinality. Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a --- M be/src/exec/kudu-table-sink.cc 1 file changed, 37 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/5152/2 -- To view, visit http://gerrit.cloudera.org:8080/5152 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink .. Patch Set 2: Code-Review+2 carrying +2 -- To view, visit http://gerrit.cloudera.org:8080/5152 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
Michael Brown has posted comments on this change. Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model .. Patch Set 1: Code-Review+1 (3 comments) Thanks for the review. Carry +1 http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/leopard/custom_pickle.py File tests/comparison/leopard/custom_pickle.py: > std file header? Done PS1, Line 8: : Thanks to https://wiki.python.org/moin/UsingPickle/RenamingModules > are we sure licensing is OK? How do I go about finding this out? I had previously looked for copyrights on this page, or on code snippets from the wiki generally, and found none. (The only copyright I found was for MoinMoin wiki itself, which isn't what I need.) http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/query.py File tests/comparison/query.py: PS1, Line 43: Since all supported : queries may have a WITH clause, getting table expressions from the WITH clause is : supported here. > FYI, other DML statements won't support WITH clauses. Done. We talked offline about this and it makes sense to leave this as-is, the reason being that we could eventually support WITH in DELETE and UPDATE, just not now. Since UPSERT is "mostly" INSERT, it currently supports WITH already. -- To view, visit http://gerrit.cloudera.org:8080/5162 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
Hello Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5162 to look at the new patch set (#2). Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model .. IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model This patch adds support to the random query generator infrastructure to model and write SQL INSERTs. It does not actually randomly generate INSERTs at this time (tracked in IMPALA-4353 and umbrella task IMPALA-3740) but does provide necessary building blocks to do so. First, it's necessary to model the INSERTs as part of our data model. This was done by taking the current notion of a Query and making it a SelectQuery. We also then create an abstract Query containing some of the more common methods and attributes. We then model an INSERT query, INSERT clause, and VALUES clause (IMPALA-4343). Second, it's necessary to test the basics of this data model. It made sense to go ahead and implement the necessary SqlWriter methods to write the SQL for these clauses (IMPALA-4354). I could then use this writer with some existing and new tests that take a query written into our data model and write the SQL, verifying they're correct. For INSERT into Kudu tables, the equivalent PostgreSQL queries need to use "ON CONFLICT DO NOTHING", so all existing and new query tests verify they can be written as PostgreSQL as well. When last doing an end to end test of the changes, I encounterd a problem I hadn't anticipated (but should have): the Leopard framework no longer had the ability to unpickle query objects, because the name had changed. I found a solution on the Python wiki here https://wiki.python.org/moin/UsingPickle/RenamingModules and adapted it to my needs. Last, I made some flake8 adjustments in a few files where I also changed the pickling. Testing: - all the query generator tests pass - I can run Leopard front_end.py and load older query generator reports, browse them, and re-run failed queries - I can run Leopard controller.py to actually do a query generator run - discrepancy_searcher.py --explain-only ran for hundreds of queries. There were no problems writing the SELECT queries Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba --- M tests/comparison/leopard/controller.py A tests/comparison/leopard/custom_pickle.py M tests/comparison/leopard/front_end.py M tests/comparison/leopard/job.py M tests/comparison/leopard/report.py M tests/comparison/leopard/schedule_item.py M tests/comparison/model_translator.py M tests/comparison/query.py M tests/comparison/query_flattener.py M tests/comparison/query_generator.py M tests/comparison/tests/fake_query.py M tests/comparison/tests/query_object_testdata.py M tests/comparison/tests/test_query_objects.py 13 files changed, 728 insertions(+), 112 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/5162/2 -- To view, visit http://gerrit.cloudera.org:8080/5162 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4511: Add missing total time counter() to PFE::Exec()
Dan Hecht has posted comments on this change. Change subject: IMPALA-4511: Add missing total_time_counter() to PFE::Exec() .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifdd6c059bdf022303920720291ebb319a69d60ec Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Dan Hecht Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/5152/1/be/src/exec/kudu-table-sink.cc File be/src/exec/kudu-table-sink.cc: PS1, Line 34: 100 * 1024 * 1024 > how about pulling this into a #define and using it here and below, to make Done Line 49: " required (bytes) for a KuduTableSink. This flag is subject to change or removal."); > Can you add something here that indicates that it is typically 2 * kudu_mut Done -- To view, visit http://gerrit.cloudera.org:8080/5152 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation
Dan Hecht has posted comments on this change. Change subject: IMPALA-4363: Add Parquet timestamp validation .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/4968/7/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: Line 473: ErrorMsg msg; this has a non-trivial destructor, so we should probably only construct it after checking NeedsValidation(). Also, why are NeedsValidation() and NeedsConversion() mutually exclusive? Actually, doesn't this break --convert_legacy_hive_parquet_utc_timestamps since we'll never take the conversion path for timestamp now? (Do we not have tests for that?) That is, I think both NeedsConversion() and NeedsValidation() can be true, and the code needs to be adjusted for that. Also consider tucking lines 475 to 480 inside ValidateSlot() and that way the ErrorMsg doesn't have to be passed back and forth, and it'll naturally keep the construction after the NeedsValidation() check. PS7, Line 474: val_ptr you'll also have to be careful that we pass the right pointer here, depending on how you address the comment above. http://gerrit.cloudera.org:8080/#/c/4968/7/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: Line 147: /// Verifies that the timestamp date falls into a valid range (years 1400..). update this comment to be clear that we also validate time. PS7, Line 156: l doesn't this have to be "ll" (long long)? PS7, Line 160: time_.ticks() < NUMBER_OF_NANOSECONDS_IN_24H I'm slightly worried about how this interacts with leap seconds. Are you sure this is always illegal? If it doesn't crash, what behavior did you see (i.e. how do you know this is the correct bounds)? -- To view, visit http://gerrit.cloudera.org:8080/4968 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5164/1/tests/metadata/test_ddl.py File tests/metadata/test_ddl.py: Line 396: self.run_test_case('QueryTest/partition-ddl-predicates-all-fs', vector, > Good points. I'm more worried about having to copy+paste tests in two diffe Makes sense. I went with option 1 and made the hdfs-specific file not have a lot of repeat queries. It takes ~33 seconds to run overall now on HDFS, so not too bad. -- To view, visit http://gerrit.cloudera.org:8080/5164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems
Sailesh Mukil has uploaded a new patch set (#2). Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems .. IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems This is because that test uses 'set cached' and 'set uncached' which are not supported on non-HDFS filesystems. This patch creates a separate test file for non-HDFS filesystems with only supported queries and invokes the right file based on the filesystem. Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c --- R testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-all-fs.test A testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test M tests/metadata/test_ddl.py 3 files changed, 66 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/5164/2 -- To view, visit http://gerrit.cloudera.org:8080/5164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/5165 Change subject: IMPALA-3200: move bufferpool under runtime .. IMPALA-3200: move bufferpool under runtime It is arguably a subcomponent of the runtime system Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd --- M .clang-tidy M be/CMakeLists.txt M be/src/runtime/CMakeLists.txt R be/src/runtime/bufferpool/CMakeLists.txt R be/src/runtime/bufferpool/buffer-allocator.cc R be/src/runtime/bufferpool/buffer-allocator.h R be/src/runtime/bufferpool/buffer-pool-counters.h R be/src/runtime/bufferpool/buffer-pool-test.cc R be/src/runtime/bufferpool/buffer-pool.cc R be/src/runtime/bufferpool/buffer-pool.h R be/src/runtime/bufferpool/reservation-tracker-counters.h R be/src/runtime/bufferpool/reservation-tracker-test.cc R be/src/runtime/bufferpool/reservation-tracker.cc R be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/mem-tracker.cc 15 files changed, 14 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/5165/1 -- To view, visit http://gerrit.cloudera.org:8080/5165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink .. Patch Set 1: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/5152/1/be/src/exec/kudu-table-sink.cc File be/src/exec/kudu-table-sink.cc: Line 49: " required (bytes) for a KuduTableSink. This flag is subject to change or removal."); Can you add something here that indicates that it is typically 2 * kudu_mutation_buffer_size. Otherwise there's no info in the help text that the two options are correlated. -- To view, visit http://gerrit.cloudera.org:8080/5152 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model .. Patch Set 1: Code-Review+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/leopard/custom_pickle.py File tests/comparison/leopard/custom_pickle.py: std file header? PS1, Line 8: : Thanks to https://wiki.python.org/moin/UsingPickle/RenamingModules are we sure licensing is OK? http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/query.py File tests/comparison/query.py: PS1, Line 43: Since all supported : queries may have a WITH clause, getting table expressions from the WITH clause is : supported here. FYI, other DML statements won't support WITH clauses. -- To view, visit http://gerrit.cloudera.org:8080/5162 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
Michael Brown has posted comments on this change. Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. Patch Set 2: (8 comments) I understand the problem, and I think I see what you're doing to fix it, but it might make things a tad clearer if you address my comments below. Thanks! http://gerrit.cloudera.org:8080/#/c/5135/2//COMMIT_MSG Commit Message: PS2, Line 9: impala-pytest impala-py.test PS2, Line 10: separate individual You can just use one of these two words. :) Line 28: Please mention the testing you performed for this patch. http://gerrit.cloudera.org:8080/#/c/5135/2/tests/run-tests.py File tests/run-tests.py: Line 73: Modify and return the command line arguments that will be passed to pytest. Can you also talk about what log_base_name and valid_dirs need to be for a caller? Line 75: The raw command line arguments need to be modified because of the way our Should all of this be in the docstring? It seems like you could instead inline some of this information in relevant places below. PS2, Line 107: # We need to account for the fact that '--foo bar' and '--foo=bar' might : # be supplied by the user. : raw_args = itertools.chain(*[arg.split('=') for arg in sys.argv[1:]]) Crafty. It would help a reader to show a simple before/after example of what sys.argv[1:] starts as, and then what the resulting raw_args is. PS2, Line 110: filtered_args = [] The name here is strange to me. These args aren't being filtered, but in fact they are getting through your filter, right? What's being filtered actually the options that raise ValueError below, right? PS2, Line 113: try: : pytest.config.getvalue(arg.strip('-')) # Raises ValueError if invalid arg : filtered_args += [arg, str(raw_args.next())] : except ValueError: : continue This is related to the L75 docstring comment above: it might make more sense for a reader to see here that you're saying "If this is a pytest config argh, it should be carried along; otherwise it should be filtered" or something similar. -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes