Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/13545 )
Change subject: IMPALA-8630: Include partition id when calculating consistent remote placement
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13545/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13545/1//COMMIT_MSG@17
PS1, Line 17: This adds the partition_id in the hash, so files with the same
> How are the partition IDs generated? I'm wondering if there might be a less
They are just sequential numbers that we make up, but it turns out that they are unique, even across tables:
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java#L572

But that creates a different problem, which is that they are not stable over time or across different coordinators. If I invalidate metadata and rerun a query, I get different partition ids. I'll explore ways to make something stable to use.


--
To view, visit http://gerrit.cloudera.org:8080/13545

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46c739fc31af539af2b3509e2a161f4e29f44d7b
Gerrit-Change-Number: 13545
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Mon, 10 Jun 2019 18:31:16 +0000
Gerrit-HasComments: Yes
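
The stability problem described in the comment above can be illustrated with a small sketch. This is not Impala's scheduler code: the counter, `load_partitions`, `placement_hash`, the warehouse paths, and the file name are all hypothetical, and SHA-256 stands in for whatever hash the scheduler actually uses. Hashing on the partition path at the end is only one assumed candidate for a stable key, not the approach the change settled on.

```python
import hashlib
from itertools import count

# Hypothetical stand-in for a catalog-wide counter that hands out
# sequential partition IDs (unique even across tables).
_next_id = count()


def load_partitions(paths):
    """Assign each partition the next sequential ID, as a fresh metadata load might."""
    return {path: next(_next_id) for path in paths}


def placement_hash(partition_key, filename, offset):
    """Hash the inputs that decide where a scan range is placed.

    partition_key can be a numeric partition ID or any other identifier.
    """
    data = f"{partition_key}\0{filename}\0{offset}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


paths = ["/warehouse/t/day=1", "/warehouse/t/day=2"]  # illustrative paths
filename = "000000_0"  # the same file name appears in both partitions

first_load = load_partitions(paths)   # IDs handed out: 0 and 1
second_load = load_partitions(paths)  # reload (e.g. after invalidate metadata): 2 and 3

# Including the partition ID separates same-named files in different partitions.
h_day1 = placement_hash(first_load[paths[0]], filename, 0)
h_day2 = placement_hash(first_load[paths[1]], filename, 0)

# After a reload the same partition has a new ID, so its hash changes too.
h_day1_reloaded = placement_hash(second_load[paths[0]], filename, 0)

# Assumption: a key derived from something that survives reloads, such as
# the partition path, gives the same hash before and after.
h_day1_by_path = placement_hash(paths[0], filename, 0)
```

In this sketch `h_day1` and `h_day2` differ, which is the goal of the change, but `h_day1` and `h_day1_reloaded` also differ, which is the instability described above.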
