[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. Patch Set 5: (7 comments) Thanks for your reviews. submit patch 6 http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@7 PS3, Line 7: IMPALA-8438: Store WriteId and valid writeId list for table and partitio > nit: missing n Done http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@10 PS3, Line 10: fucntions > nit, spell-check Done http://gerrit.cloudera.org:8080/#/c/13215/3/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/13215/3/common/thrift/CatalogObjects.thrift@483 PS3, Line 483: 4: optional string valid_write_ids > Instead of storing this in string format, do you think it would be better t Keep the formatted String we can easily use hive provided functions to convert and use it. http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@237 PS3, Line 237: getHighestWriteId > Do you think its better to throw UnsupportedOperationException in these met One of the reason of adding these functions is to avoid run time exceptions for Hive 2 client, it will be convert to these values when catch UnSupportedException anyway. http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@258 PS3, Line 258: public static long getMajorVersion() { > +1 Done http://gerrit.cloudera.org:8080/#/c/13215/4/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/13215/4/fe/src/main/java/org/apache/impala/catalog/Table.java@281 PS4, Line 281: String tblFullName = getFullName(); > tab used for whitespace Done http://gerrit.cloudera.org:8080/#/c/13215/5/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java File fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java: http://gerrit.cloudera.org:8080/#/c/13215/5/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java@233 PS5, Line 233: // ToDo this assumes the acid tables have been created. : // They will be available after IMPALA-8439 is checked in. : testLoadAcidTables("select * from acid.insert_only_no_partitions"); : testLoadAcidTables("select * from acid.insert_only_with_partitions"); > These tables should have deterministic writeIds, so the test could except a The major purpose of the test is to test we can get writeIds from hive. We do not test the correctness of hive code. And I think the writeid list could be very dynamic which can cause flaky tests. -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 03 May 2019 21:35:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@7 PS3, Line 7: IMPALA-8438: Store WriteId and valid writeId list for table and partitio nit: missing n http://gerrit.cloudera.org:8080/#/c/13215/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13215/5//COMMIT_MSG@7 PS5, Line 7: IMPALA-8438: Store WriteId and valid writeId list for table and partitio nit: missing n http://gerrit.cloudera.org:8080/#/c/13215/5/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java File fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java: http://gerrit.cloudera.org:8080/#/c/13215/5/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java@233 PS5, Line 233: // ToDo this assumes the acid tables have been created. : // They will be available after IMPALA-8439 is checked in. : testLoadAcidTables("select * from acid.insert_only_no_partitions"); : testLoadAcidTables("select * from acid.insert_only_with_partitions"); These tables should have deterministic writeIds, so the test could except a given validWriteIds string. -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 03 May 2019 21:11:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@7 PS3, Line 7: partitio nit, spell-check http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@10 PS3, Line 10: fucntions nit, spell-check http://gerrit.cloudera.org:8080/#/c/13215/3/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/13215/3/common/thrift/CatalogObjects.thrift@483 PS3, Line 483: 4: optional string valid_write_ids Instead of storing this in string format, do you think it would be better to parse the string and store these as individual fields? If we do that, can we get rid of storing the table_name since it is already available in THdfsTable? http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@237 PS3, Line 237: getHighestWriteId Do you think its better to throw UnsupportedOperationException in these methods? http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@258 PS3, Line 258: public static long getMajorVersion() { > can we rename this to getMajorVersion or something since it's not the full +1 -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 03 May 2019 20:43:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3064/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 03 May 2019 20:09:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3063/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 4 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 03 May 2019 20:01:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Hello Vihang Karajgaonkar, Sudhanshu Arora, Todd Lipcon, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13215 to look at the new patch set (#5). Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. IMPALA-8438: Store WriteId and valid writeId list for table and partitio This happens when tables load metadata from HMS. Add MetastoreShim fucntions to support HMS3 only functions. Add validwriteIdlists to query profile through timeline. Tests: Manually tests HMS2 and HMS3, using log files to check Unit tests against HMS3 ToDo: WriteId and valid writeIds can be fetched in other time, need more study on that. Profile example: Query Compilation: 5s057ms - Metadata load started: 63.006ms (63.006ms) - Metadata load finished. loaded-tables=2/2...: 4s801ms (4s738ms) - Loaded ValidWriteIdLists: acid.insert_only_no_partitions:6:9223372036854775807:: acid.insert_only_with_partitions:3:9223372036854775807:: : 4s921ms (120.580ms) - Analysis finished: 4s929ms (8.013ms) Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 --- M common/thrift/CatalogObjects.thrift M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/FeTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java 14 files changed, 262 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/13215/5 -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. Patch Set 4: (16 comments) http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@12 PS3, Line 12: Tests: > I think if we add some of this info to the profile or to EXPLAIN output we Done http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@238 PS3, Line 238: return -1L; > nit: please use capital L so this doesn't look like -11. (same below) Done http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@258 PS3, Line 258: public static long getMajorVersion() { > can we rename this to getMajorVersion or something since it's not the full Done http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@374 PS3, Line 374: * @return the list of valid write IDs for the table in a string > I had to trace my way through Hive code to understand what this meant. My u We do not have use case, will remove it. http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@378 PS3, Line 378: lidWriteIdLi > can we use Optional here? writeId is removed http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@236 PS3, Line 236: StringBuilder validIdsBuf = new StringBuilder("Loaded ValidWriteIdLists: "); > How about putting this into the query timeline event marked above? This way Done http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@285 PS3, Line 285:* Returns the set of tables whose metadata needs to be loaded for the analysis of the > this won't work with LocalCatalog mode enabled. Probably best to just remov Removed the function http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@617 PS3, Line 617: // The last committed write ID which modified this partition. > can you be more clear here? eg "The last committed write ID which modified Done http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@618 PS3, Line 618: //-1 means writeId_ is irrelevant(not supported). > -1L Done http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@120 PS3, Line 120: // Valid write id list for this table > The way this got implemented, this is just a high watermark write id, which The high water mark can help us get a list of valid writeIDs When apply on a ValidWriteIdList /** * Find out if a range of write ids are valid. Note that valid may have different meanings * for different implementations, as some will only want to see committed transactions and some * both committed and aborted. * @param minWriteId minimum write ID to look for, inclusive * @param maxWriteId maximum write ID to look for, inclusive * @return Indicate whether none, some, or all of these transactions are valid. */ RangeResponse isWriteIdRangeValid(long minWriteId, long maxWriteId); But the maxWriteId may comes from transaction manager? I will remove the writeId for table by following your suggestion as we can easily get from stored ValidWriteIdList http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@285 PS3, Line 285: writeIds = MetastoreShim.fetchValidWriteIds(client, tblFullName); > per note elsewhere, I don't think we have a good use for this Done http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@289 PS3, Line 289: } > getFullName() Done
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/13215/4/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/13215/4/fe/src/main/java/org/apache/impala/catalog/Table.java@281 PS4, Line 281: String tblFullName = getFullName(); tab used for whitespace -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 4 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 03 May 2019 19:20:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Hello Vihang Karajgaonkar, Sudhanshu Arora, Todd Lipcon, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13215 to look at the new patch set (#4). Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. IMPALA-8438: Store WriteId and valid writeId list for table and partitio This happens when tables load metadata from HMS. Add MetastoreShim fucntions to support HMS3 only functions. Add validwriteIdlists to query profile through timeline. Tests: Manually tests HMS2 and HMS3, using log files to check Unit tests against HMS3 ToDo: WriteId and valid writeIds can be fetched in other time, need more study on that. Profile example: Query Compilation: 5s057ms - Metadata load started: 63.006ms (63.006ms) - Metadata load finished. loaded-tables=2/2...: 4s801ms (4s738ms) - Loaded ValidWriteIdLists: acid.insert_only_no_partitions:6:9223372036854775807:: acid.insert_only_with_partitions:3:9223372036854775807:: : 4s921ms (120.580ms) - Analysis finished: 4s929ms (8.013ms) Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 --- M common/thrift/CatalogObjects.thrift M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/FeTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java 14 files changed, 262 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/13215/4 -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 4 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. Patch Set 3: (17 comments) http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@12 PS3, Line 12: Tests: I think if we add some of this info to the profile or to EXPLAIN output we could write some better end-to-end tests that actually modify a table and see that we pick up the new write ids on REFRESH http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@238 PS3, Line 238: return -1l; nit: please use capital L so this doesn't look like -11. (same below) http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@258 PS3, Line 258: public static long getShimVersion() { can we rename this to getMajorVersion or something since it's not the full version number? http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@374 PS3, Line 374: * @param wrtId the write id to get the corresponding txn I had to trace my way through Hive code to understand what this meant. My understanding is the following: - if we pass no writeId, we'll get the current *latest* state of writes for a table - if we pass a write ID, that's assuming that we are actually a writer on this table that has already been allocated an ID, and it'll translate our writeID back to a transaction list, and see which prior writes to the table were committed when our write started? Do we have any use case for the latter yet? If not, maybe we should not expose it. http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@378 PS3, Line 378: long writeId can we use Optional here? http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@236 PS3, Line 236: //Keep the log for HMS 3 related dev. How about putting this into the query timeline event marked above? This way it will show up in profiles, which is moderately useful. http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@285 PS3, Line 285: HdfsPartition hPartition = (HdfsPartition)part; this won't work with LocalCatalog mode enabled. Probably best to just remove this, and instead add the relevant metadata to teh output of 'SHOW PARTITIONS' if you think it's useful for debugging. http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@617 PS3, Line 617: // For Acid table. can you be more clear here? eg "The last committed write ID which modified this partition." http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@618 PS3, Line 618: private long writeId_ = -1l; -1L http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@120 PS3, Line 120: // Last commited write id for this table. The way this got implemented, this is just a high watermark write id, which isn't necessarily the most recently committed. Not sure we have a strong reason to cache this. http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@127 PS3, Line 127: validWrtIds_ rename to 'validWriteIds_' (no need to abbreviate things) http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@285 PS3, Line 285:* @param wrtId the write id to get the corresponding txn per note elsewhere, I don't think we have a good use for this http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@289 PS3, Line 289: if (LOG.isTraceEnabled()) LOG.trace("Get valid writeIds for table:
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3048/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 03 May 2019 00:00:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. Patch Set 3: (20 comments) Patch 3 addressed review issues. http://gerrit.cloudera.org:8080/#/c/13215/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13215/2//COMMIT_MSG@7 PS2, Line 7: > nit: we generally don't add a . to the commit message title Done http://gerrit.cloudera.org:8080/#/c/13215/2//COMMIT_MSG@8 PS2, Line 8: > nit: leave an empty line after the title in the commit message Done http://gerrit.cloudera.org:8080/#/c/13215/2//COMMIT_MSG@9 PS2, Line 9: > nit: functions Done http://gerrit.cloudera.org:8080/#/c/13215/2//COMMIT_MSG@10 PS2, Line 10: Add MetastoreShim fucntions to support HMS3 only functions. > nit: extra line before Tests Done http://gerrit.cloudera.org:8080/#/c/13215/2/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/13215/2/common/thrift/CatalogObjects.thrift@307 PS2, Line 307: // For acid table, store last committed write id > nit: capital F + ending . Done http://gerrit.cloudera.org:8080/#/c/13215/2/common/thrift/CatalogObjects.thrift@479 PS2, Line 479: // Set iff this is an acid table. The valid write ids list > Can you add more information about the format, e.g. an example? Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@371 PS2, Line 371: * Get valid write ids for the acid table : * @param client the client to access HMS : * @param tableFullName the name for the table : * @param wrtId the write id to get the corresponding txn : * @return the list of valid write IDs for the table in a string > nit: try to be consistent with ending . Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@388 PS2, Line 388: /** :* Get highest writeId in a given validwriteidlist :* @param validWriteIds ValidWriteIdList object in String :* @return highest writeId :*/ > nit: try to be consistent with ending . Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235 PS2, Line 235: for (FeTable iTbl : loadedTbls_.values()) { > optional: the logging could be moved to a separate function like LogWriteId Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@240 PS2, Line 240: } > nit: missing space? Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@245 PS2, Line 245: t > nit: extra space? Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java File fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java: http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java@172 PS2, Line 172: > nit: . not needed Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@617 PS2, Line 617: For Acid table > nit: consistency: "For acid tables." Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1023 PS2, Line 1023: if (thriftPartition.isSetWrite_id()) { > nit: we generally add braces if the "if" doesn't fit to one line Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1025 PS2, Line 1025: } > nit: braces or one line Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1069 PS2, Line 1069: > @Override Done http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@924 PS2, Line 924: //TODO put the writeID load here for now. : loadValidWriteIdList(client); > Can you
[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio
Hello Vihang Karajgaonkar, Sudhanshu Arora, Todd Lipcon, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13215 to look at the new patch set (#3). Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and partitio .. IMPALA-8438: Store WriteId and valid writeId list for table and partitio This happens when tables load metadata from HMS. Add MetastoreShim fucntions to support HMS3 only functions. Tests: Manually tests HMS2 and HMS3, using log files to check Unit tests against HMS3 ToDo: WriteId and valid writeIds can be fetched in other time, need more study on that. Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 --- M common/thrift/CatalogObjects.thrift M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/FeTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java 14 files changed, 292 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/13215/3 -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar