[ https://issues.apache.org/jira/browse/HIVE-21177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman updated HIVE-21177:
----------------------------------
Status: Patch Available (was: Open)
I added checks so that we don't look for the side file if we don't have to.
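For illustration only, here is a minimal sketch of that kind of check, not the code in the attached patch. It follows the rule from the issue description: a side file can only exist in delta_x_y with x != y. The class name, method name, and regular expression are hypothetical, and the pattern assumes plain numeric ids with an optional numeric suffix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of AcidUtils: decides from the directory name
// alone whether a side file could exist, so the file system call can be skipped.
final class SideFileCheck {
    // Assumed naming: delta_<min>_<max> with an optional numeric suffix.
    private static final Pattern DELTA =
        Pattern.compile("^delta_(\\d+)_(\\d+)(?:_\\d+)?$");

    static boolean mayHaveSideFile(String dirName) {
        Matcher m = DELTA.matcher(dirName);
        if (!m.matches()) {
            // base_x, or any name this sketch does not recognise as a delta
            return false;
        }
        long min = Long.parseLong(m.group(1));
        long max = Long.parseLong(m.group(2));
        // delta_x_x cannot contain a side file; only delta_x_y with x != y can
        return min != max;
    }

    public static void main(String[] args) {
        for (String dir : new String[] {"base_5", "delta_7_7", "delta_7_9"}) {
            System.out.println(dir + " -> " + mayHaveSideFile(dir));
        }
    }
}
```

Only when such a check returns true would the caller go on to look for {{OrcAcidUtils.getSideFile()}} on the file system.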
We have another issue. Operations like Load Data/Add Partition create
base/delta directories and place 'raw' (aka 'original' schema) files there.
Split generation and the read path need to know what schema to expect in a
given file/split. Nothing in the file path indicates which it is, so the code
opens one of the data files in base/delta to determine that:
{{AcidUtils.isRawFormat()}}.
This should be less of an issue, since it does a listing first to choose the
file, so it should never look for a file that is not actually there. I
optimized isRawFormat() somewhat, but it will still do the checks much of the
time. It could be changed to rely on the file name instead, but that's rather
fragile.
> Optimize AcidUtils.getLogicalLength()
> -------------------------------------
>
> Key: HIVE-21177
> URL: https://issues.apache.org/jira/browse/HIVE-21177
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 3.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Major
> Attachments: HIVE-21177.01.patch
>
>
> {{AcidUtils.getLogicalLength()}} tries to look for the side file
> {{OrcAcidUtils.getSideFile()}} on the file system even when the file couldn't
> possibly be there, e.g. when the path is delta_x_x or base_x. It could only
> be there in delta_x_y, x != y.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)