[ https://issues.apache.org/jira/browse/IMPALA-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
     [ https://issues.apache.org/jira/browse/IMPALA-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved IMPALA-7047.
---------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 3.2.0

Yep, thanks for catching this.

> REFRESH on unpartitioned tables calls getBlockLocations on every file
> ---------------------------------------------------------------------
>
>                 Key: IMPALA-7047
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7047
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 2.13.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Major
>              Labels: metadata
>             Fix For: Impala 3.2.0
>
>
> In HdfsTable.updateUnpartitionedTableFileMd() the existing default Partition
> object is reset, and a new empty one is created. It then calls
> refreshPartitionFileMetadata with this new partition which has an empty list
> of file descriptors. This ends up listing the directory, and for each file,
> since it doesn't find it in the empty descriptor list, will make a separate
> RPC to HDFS to get the locations.
> This is quite wasteful vs just using the API that returns the located
> statuses for the directory.
> Alternatively, it seems like it should probably keep around the old file
> descriptor list in the new Partition object so that the incremental refresh
> path can work.
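
The RPC cost described in the issue can be illustrated with a small model. This is a sketch in Python, not Impala's Java code: FakeHdfs, refresh_incremental, refresh_bulk and make_fs are hypothetical names, each call on the fake client is counted as one RPC (batching of very large listings is ignored), and a file is assumed to be unchanged when its modification time matches the stored descriptor.

```python
from dataclasses import dataclass


@dataclass
class FakeHdfs:
    """Hypothetical in-memory stand-in for an HDFS client that counts RPCs."""
    files: dict  # path -> (mtime, list of block-location hosts)
    rpc_count: int = 0

    def list_status(self):
        """One RPC: (path, mtime) pairs, without block locations."""
        self.rpc_count += 1
        return [(p, m) for p, (m, _) in sorted(self.files.items())]

    def get_block_locations(self, path):
        """One RPC per file: block locations for a single path."""
        self.rpc_count += 1
        return self.files[path][1]

    def list_located_status(self):
        """One RPC: statuses with block locations already attached."""
        self.rpc_count += 1
        return [(p, m, locs) for p, (m, locs) in sorted(self.files.items())]


def refresh_incremental(fs, old_descriptors):
    """List the directory and fetch locations only for new or changed files.

    old_descriptors maps path -> (mtime, locations). Passing an empty dict
    reproduces the behavior reported in the issue: every file looks new.
    """
    new_descriptors = {}
    for path, mtime in fs.list_status():
        old = old_descriptors.get(path)
        if old is not None and old[0] == mtime:
            new_descriptors[path] = old  # unchanged: reuse, no extra RPC
        else:
            new_descriptors[path] = (mtime, fs.get_block_locations(path))
    return new_descriptors


def refresh_bulk(fs):
    """Load every descriptor from a single located listing."""
    return {p: (m, locs) for p, m, locs in fs.list_located_status()}


def make_fs(n):
    """Build a fake directory of n files, all with mtime 1."""
    return FakeHdfs({f"/t/f{i}": (1, [f"host{i % 3}"]) for i in range(n)})


if __name__ == "__main__":
    n = 100

    fs = make_fs(n)
    descriptors = refresh_incremental(fs, {})  # empty list, as in the bug
    print("empty descriptor list:", fs.rpc_count, "RPCs")

    fs = make_fs(n)
    refresh_bulk(fs)
    print("located listing:", fs.rpc_count, "RPCs")

    fs = make_fs(n)
    fs.files["/t/f7"] = (2, ["host9"])  # one file changed since last load
    refresh_incremental(fs, descriptors)  # old list kept around
    print("kept descriptor list:", fs.rpc_count, "RPCs")
```

In this model a refresh with an empty descriptor list costs one listing plus one call per file, while either alternative mentioned in the issue (the located listing, or carrying the old descriptor list into the new partition) keeps the cost independent of the number of unchanged files.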