Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16534 )

Change subject: IMPALA-10205: Replace MD5 with Murmur3 for generating datafile path hash
......................................................................


Patch Set 1:

I think we should just figure out how to get rid of the MD5 use here. I took a look and I'm really not seeing a benefit to the current approach compared to using the path directly and choosing a better Thrift structure.

I think if we use the path as the key in the Java map, there should be no space overhead - it looks like DataFile.path() will just return a reference to the path String in DataFile - there's no copy or anything.

Then for TIcebergTable we don't need to represent it as a map, we can just use a list<THdfsFileDesc> and construct the Java map in loadFileDescFromThrift.

--
To view, visit http://gerrit.cloudera.org:8080/16534
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If7c805f2fdf0cf5a69738579c7e55f4bd047ed59
Gerrit-Change-Number: 16534
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: wangsheng <[email protected]>
Gerrit-Comment-Date: Sat, 03 Oct 2020 00:27:04 +0000
Gerrit-HasComments: No
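
For illustration only, the approach suggested in the review comment (send the file descriptors as a flat list and rebuild a path-keyed map on the receiving side) might look roughly like the sketch below. It is not the Impala code: `PathKeyedFileDescs`, `FileDesc`, `toThriftList`, and `fromThriftList` are hypothetical stand-ins, and the sketch assumes each descriptor can report its own path, which the real THdfsFileDesc may expose differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PathKeyedFileDescs {
  // Hypothetical stand-in for a file descriptor; the real THdfsFileDesc differs.
  static final class FileDesc {
    final String path;   // assumed: the descriptor knows its own path
    final long length;
    FileDesc(String path, long length) {
      this.path = path;
      this.length = length;
    }
  }

  // Sender side: ship only the values as a list, no hashed keys on the wire.
  static List<FileDesc> toThriftList(Map<String, FileDesc> byPath) {
    return new ArrayList<>(byPath.values());
  }

  // Receiver side (the role loadFileDescFromThrift would play): rebuild the
  // map keyed by path. The key is the same String object the descriptor
  // already holds, so no extra copy of the path is made.
  static Map<String, FileDesc> fromThriftList(List<FileDesc> descs) {
    Map<String, FileDesc> byPath = new HashMap<>();
    for (FileDesc fd : descs) {
      byPath.put(fd.path, fd);
    }
    return byPath;
  }

  public static void main(String[] args) {
    Map<String, FileDesc> original = new HashMap<>();
    original.put("/t/a.parquet", new FileDesc("/t/a.parquet", 10L));
    original.put("/t/b.parquet", new FileDesc("/t/b.parquet", 20L));
    Map<String, FileDesc> rebuilt = fromThriftList(toThriftList(original));
    System.out.println(rebuilt.keySet().equals(original.keySet()));
  }
}
```

Keying by the path itself avoids any hash-collision concern and removes the hashing step entirely; the trade-off is that the receiver pays one pass over the list to rebuild the map.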
