[ https://issues.apache.org/jira/browse/HIVE-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755453#comment-16755453 ]
Eugene Koifman edited comment on HIVE-21177 at 1/29/19 10:54 PM:
-----------------------------------------------------------------

I added checks so that we don't look for the side file if we don't have to.

We have another issue. Operations like Load Data/Add Partition create base/delta directories and place 'raw' (aka 'original' schema) files there. Split generation and the read path need to know what schema to expect in a given file/split. Nothing in the file path indicates this, so the code opens one of the data files in base/delta to determine it: {{AcidUtils.isRawFormat()}}. This should be less of an issue, since it does a listing first to choose the file, so it should never look for a file that is not actually there.

I optimized isRawFormat() somewhat, but it will still perform the checks much of the time. It could be changed to rely on the file name instead, but that's rather fragile.
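
For illustration only, here is a minimal sketch of the kind of directory-name check described in the issue: a side file can only exist in delta_x_y with x != y, never in delta_x_x or base_x. The class {{SideFileCheck}} and the method {{mayHaveSideFile()}} are hypothetical names, not part of Hive and not the code in the attached patch. The regular expression assumes a directory layout of delta_<min>_<max> with an optional numeric suffix. Any name the sketch does not recognize returns true, so a caller would fall back to the file system lookup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Hive. */
final class SideFileCheck {
    // Assumed layout: delta_<min>_<max>, optionally followed by _<number>.
    private static final Pattern DELTA =
        Pattern.compile("^delta_(\\d{1,18})_(\\d{1,18})(?:_\\d+)?$");

    private SideFileCheck() {}

    /**
     * Returns false only when the directory name alone rules out a side file
     * (base_x or delta_x_x). Returns true otherwise, meaning the caller still
     * has to check the file system.
     */
    static boolean mayHaveSideFile(String dirName) {
        if (dirName.startsWith("base_")) {
            return false; // base_x: side file cannot be there
        }
        Matcher m = DELTA.matcher(dirName);
        if (!m.matches()) {
            return true; // unrecognized name: be conservative
        }
        long min = Long.parseLong(m.group(1));
        long max = Long.parseLong(m.group(2));
        return min != max; // only delta_x_y with x != y can hold a side file
    }
}
```

With a check like this, a caller would only probe the file system when {{mayHaveSideFile(dir.getName())}} returns true, which skips the lookup for every base_x and delta_x_x directory.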

> Optimize AcidUtils.getLogicalLength()
> -------------------------------------
>
>                 Key: HIVE-21177
>                 URL: https://issues.apache.org/jira/browse/HIVE-21177
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 3.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Major
>         Attachments: HIVE-21177.01.patch
>
>
> {{AcidUtils.getLogicalLength()}} tries to look for the side file
> {{OrcAcidUtils.getSideFile()}} on the file system even when the file couldn't
> possibly be there, e.g. when the path is delta_x_x or base_x. It could only
> be there in delta_x_y, x != y.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)