Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/11694 )
Change subject: IMPALA-7644: Hide Parquet page index writing with feature flag
......................................................................


Patch Set 2:

(3 comments)

Thanks for the comments!

I'm currently running ASAN tests just in case, so please postpone GVO until they have finished.

http://gerrit.cloudera.org:8080/#/c/11694/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11694/2//COMMIT_MSG@8
PS2, Line 8:
> Can you mention somehow that the original was reverted, and this commits ad
Added a sentence about it.


http://gerrit.cloudera.org:8080/#/c/11694/2//COMMIT_MSG@18
PS2, Line 18: custom_cluset
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/11694/2/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/11694/2/be/src/exec/hdfs-parquet-table-writer.cc@705
PS2, Line 705: num_data_pages_
> I agree, that would simplify things substantially. When you do this, can yo
DataPage is not POD, but since we never set parquet::Statistics for the DataPageHeader, its ctor and dtor shouldn't do much.

I did some perf measurements locally. I did a release build and ran the following statement on a single-node cluster:

create table lineitem stored as parquet as (select * from tpch.lineitem);

Then I compared the profiles with and without 'num_data_pages_'. I couldn't find any significant difference: the effect of this change appears to be smaller than the variance of the measurements. In fact, the version without 'num_data_pages_' was even a bit faster.

I'm calling clear() since it is more straightforward, and I'm now reasonably sure that it won't affect the capacity.
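
For reference, the capacity point can be checked with a small standalone sketch. FakeDataPage and FakePageList are hypothetical stand-ins invented for this illustration, not the actual types in hdfs-parquet-table-writer.cc. The sketch assumes the behavior of the common standard library implementations, where std::vector::clear() destroys the elements but keeps the allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-page bookkeeping struct (not Impala's DataPage).
struct FakeDataPage {
  int num_values = 0;
  bool finalized = false;
};

// Hypothetical page list that tracks pages only through the vector,
// i.e. without a separate 'num_data_pages_' style counter.
class FakePageList {
 public:
  FakeDataPage& AddPage() {
    pages_.emplace_back();
    return pages_.back();
  }

  // Drops all pages; size() becomes 0, the allocation is expected to stay.
  void Reset() { pages_.clear(); }

  std::size_t num_pages() const { return pages_.size(); }
  std::size_t capacity() const { return pages_.capacity(); }

 private:
  std::vector<FakeDataPage> pages_;
};

// Returns true if Reset() emptied the list and left the capacity unchanged.
bool ResetKeepsCapacity(int pages_to_add) {
  assert(pages_to_add >= 0);
  FakePageList list;
  for (int i = 0; i < pages_to_add; ++i) list.AddPage().num_values = i;
  const std::size_t before = list.capacity();
  list.Reset();
  return list.num_pages() == 0 && list.capacity() == before;
}
```

Calling ResetKeepsCapacity(100) from a test or a small main() is enough to confirm the behavior for a given toolchain. With the capacity retained, the pages added after a reset reuse the existing buffer instead of triggering new allocations.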
-- 
To view, visit http://gerrit.cloudera.org:8080/11694
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4a9098a2085a385351477c715ae245d83bf1c72
Gerrit-Change-Number: 11694
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 17 Oct 2018 15:33:36 +0000
Gerrit-HasComments: Yes
