Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/11694 )
Change subject: IMPALA-7644: Hide Parquet page index writing with feature flag ...................................................................... Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/11694/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11694/2//COMMIT_MSG@18 PS2, Line 18: custom_cluset nit: typo http://gerrit.cloudera.org:8080/#/c/11694/2/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/11694/2/be/src/exec/hdfs-parquet-table-writer.cc@705 PS2, Line 705: num_data_pages_ > Maybe we could just drop 'num_data_pages_' and call std::vector::resize(0) I agree, that would simplify things substantially. When you do this, can you either do a perf run or inspect the assembly to make sure that the additional dtor ctor calls for the pages are constant time, i.e. the DataPage structs are all PODs? It might be better to do all of that in a subsequent change, but given we can into this issue, we should improve the current code. If you don't have time to push the additional change soon, please add a comment here. The SO discussion linked in the cppreference reads as if clear() should be safe. If you're unsure, you can add a DCHECK on the capacity right after calling clear(). -- To view, visit http://gerrit.cloudera.org:8080/11694 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib4a9098a2085a385351477c715ae245d83bf1c72 Gerrit-Change-Number: 11694 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 16 Oct 2018 22:16:38 +0000 Gerrit-HasComments: Yes
