Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14804 )
Change subject: IMPALA-9174: Speedup allocations of ORC Scanner ...................................................................... Patch Set 4: Code-Review+2 (2 comments) I am ok with the implementation, but I think hdfs-orc-scanner.cc is not the ideal place for it, see my comment. http://gerrit.cloudera.org:8080/#/c/14804/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14804/4//COMMIT_MSG@19 PS4, Line 19: malloc already stores the size of the allocated bytes in front of the : allocated bytes. That's why 'free()' doesn't take a 'size' parameter, : it just searches for the size information next to the pointed bytes. : TC Malloc provides a function called 'tc_malloc_size()' that reveals : that information programmatically, so instead of the hash table we : can just use this function to retrieve the size. This is not exactly how tc malloc stores sizes. There is an explanation in http://goog-perftools.sourceforge.net/doc/tcmalloc.html , mainly in "Small Object Allocation", "Spans", and Deallocation sections. I would only mention that tc malloc has an API to retrieve the allocated size and provide link to the doc with details. The main problem with always storing the size for every object is that in case of very small objects it leads to large overhead, so a common optimization in heap implementations is to store objects of similar size (class) in a contiguous area and only store the size once for that area. It also wouldn't be ideal for big allocations, where it makes sense to work only with K*4096 byte chunks. http://gerrit.cloudera.org:8080/#/c/14804/4/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/14804/4/be/src/exec/hdfs-orc-scanner.cc@63 PS4, Line 63: malloc I think it would be better to move most of this functionality into MemTracker, e.g. to functions like TrackedMalloc + TrackedFree. This function could just call that and throw a suitable exception based on the result. It would be also nice to create some tests that exercise most paths. I am ok with creating a follow up jira for this and do it later. -- To view, visit http://gerrit.cloudera.org:8080/14804 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia09e746883176d6f955c1718267bf55e2abb239b Gerrit-Change-Number: 14804 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 10 Dec 2019 13:17:16 +0000 Gerrit-HasComments: Yes
