-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71707/#review218479
-----------------------------------------------------------

Looked at the code; it looks good to me.

- Slim Bouguerra


On Oct. 31, 2019, 11:16 a.m., Attila Magyar wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71707/
> -----------------------------------------------------------
>
> (Updated Oct. 31, 2019, 11:16 a.m.)
>
>
> Review request for hive, Ashutosh Chauhan, Peter Vary, and Slim Bouguerra.
>
>
> Bugs: HIVE-22411
>     https://issues.apache.org/jira/browse/HIVE-22411
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Executing single insert statements on a transactional table affects write
> performance on an S3 file system. Each insert creates a new delta directory.
> After each insert, Hive calculates statistics such as the number of files in
> the table and the total size of the table. To calculate these, it traverses
> the directory recursively. During the recursion, a separate listStatus call
> is executed for each path. As a result, the more delta directories you have,
> the more time it takes to calculate the statistics.
>
> Therefore insertion time grows linearly with the number of delta directories.
>
>
> Diffs
> -----
>
>   standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java 38e843aeacf
>   standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java 155ecb18bf5
>
>
> Diff: https://reviews.apache.org/r/71707/diff/1/
>
>
> Testing
> -------
>
> Measured and plotted insertion time.
>
>
> Thanks,
>
> Attila Magyar
>
>
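
For reference, the traversal cost described in the quoted description can be illustrated with a minimal sketch. It is not the code in Warehouse.java or FileUtils.java: it uses java.nio.file on a local temp directory as a stand-in for Hadoop's listStatus, and the class name, method names, and the delta_N / bucket_00000 names are hypothetical. It only shows that one listing per directory means the number of listing calls grows with the number of delta directories.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeltaStatsSketch {
    // Number of directory listings issued; stands in for remote listStatus round trips.
    static int listCalls = 0;

    // Recursively computes {fileCount, totalBytes}, issuing one listing per directory.
    static long[] collectStats(Path dir) throws IOException {
        long files = 0;
        long bytes = 0;
        listCalls++; // one listing call for this directory
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    long[] sub = collectStats(entry);
                    files += sub[0];
                    bytes += sub[1];
                } else {
                    files++;
                    bytes += Files.size(entry);
                }
            }
        }
        return new long[] {files, bytes};
    }

    // Builds a table directory holding n delta directories (one 3-byte file each),
    // then reports how many listings a single statistics pass needs.
    // The temp directory is left behind; this is only a sketch.
    static int listCallsAfterInserts(int n) throws IOException {
        Path table = Files.createTempDirectory("table");
        for (int i = 1; i <= n; i++) {
            Path delta = Files.createDirectory(table.resolve("delta_" + i));
            Files.write(delta.resolve("bucket_00000"), new byte[] {1, 2, 3});
        }
        listCalls = 0;
        collectStats(table);
        return listCalls;
    }

    public static void main(String[] args) throws IOException {
        for (int n : new int[] {1, 10, 100}) {
            System.out.println(n + " deltas -> " + listCallsAfterInserts(n) + " listing calls");
        }
    }
}
```

With n delta directories the pass issues n + 1 listings (one for the table root plus one per delta), so on a store where each listing is a remote request, the per-insert statistics cost rises with every earlier insert.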