[
https://issues.apache.org/jira/browse/HIVE-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608971#comment-14608971
]
Eugene Koifman commented on HIVE-11030:
---------------------------------------
I'll fix the serializeDeltas() issues.
bq. I don't understand this code. Why get a ParsedDelta and turn around and
create a new one?
In some cases ParsedDelta needs to retain a reference to the FileStatus that it
was created from, but in other cases it's created from a Path and doesn't
have/need a FileStatus. So the new object is constructed here to keep
ParsedDelta immutable.
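To illustrate the immutability point, here is a minimal sketch. It is not the actual ParsedDelta code: {{Delta}}, {{Status}}, {{fromPath}} and {{withStatus}} are hypothetical names, and {{Status}} is only a stand-in for Hadoop's FileStatus. The idea is that attaching a status yields a new object rather than mutating the existing one.

```java
public class ImmutableDeltaSketch {
    /** Hypothetical stand-in for a file status; NOT Hadoop's FileStatus. */
    static final class Status {
        final long length;
        Status(long length) { this.length = length; }
    }

    /** Hypothetical immutable delta descriptor; all fields are final. */
    static final class Delta {
        private final String path;
        private final long minTxn;
        private final long maxTxn;
        private final Status status; // null when built from a path only

        private Delta(String path, long minTxn, long maxTxn, Status status) {
            this.path = path;
            this.minTxn = minTxn;
            this.maxTxn = maxTxn;
            this.status = status;
        }

        /** Builds a delta that carries no status (the "created from Path" case). */
        static Delta fromPath(String path, long minTxn, long maxTxn) {
            return new Delta(path, minTxn, maxTxn, null);
        }

        /** Returns a NEW delta carrying the status; this instance is unchanged. */
        Delta withStatus(Status s) {
            return new Delta(path, minTxn, maxTxn, s);
        }

        boolean hasStatus() { return status != null; }
        String path() { return path; }
        long minTxn() { return minTxn; }
        long maxTxn() { return maxTxn; }
    }

    public static void main(String[] args) {
        Delta fromPath = Delta.fromPath("delta_0000100_0000101", 100L, 101L);
        Delta withStatus = fromPath.withStatus(new Status(42L));
        System.out.println(fromPath.hasStatus() + " " + withStatus.hasStatus());
    }
}
```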
bq. In parseDelta, would it be better to split the string on '_' rather than
call indexOf twice?
I'm not sure what difference it makes - either way you do 1 linear scan of the
string.
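For comparison, a rough sketch of both approaches. This is not the AcidUtils.parseDelta() implementation; the class and method names are made up, and the two name shapes (delta_min_max and delta_min_max_stmt) are taken from the issue description. A statement id of -1 here just means "absent".

```java
import java.util.Arrays;

public class DeltaNameSketch {
    static final String PREFIX = "delta_";

    /** indexOf variant: returns {min, max, statementId}; statementId is -1 if absent. */
    static long[] parseWithIndexOf(String name) {
        if (!name.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a delta name: " + name);
        }
        String rest = name.substring(PREFIX.length());
        int first = rest.indexOf('_');
        if (first < 0) {
            throw new IllegalArgumentException("Missing max txn id: " + name);
        }
        int second = rest.indexOf('_', first + 1); // -1 when there is no statement id
        long min = Long.parseLong(rest.substring(0, first));
        long max = Long.parseLong(
            second < 0 ? rest.substring(first + 1) : rest.substring(first + 1, second));
        long stmt = second < 0 ? -1L : Long.parseLong(rest.substring(second + 1));
        return new long[] {min, max, stmt};
    }

    /** split variant: same result, using String.split on '_'. */
    static long[] parseWithSplit(String name) {
        String[] parts = name.split("_");
        if (parts.length < 3 || parts.length > 4 || !"delta".equals(parts[0])) {
            throw new IllegalArgumentException("Not a delta name: " + name);
        }
        long stmt = parts.length == 4 ? Long.parseLong(parts[3]) : -1L;
        return new long[] {Long.parseLong(parts[1]), Long.parseLong(parts[2]), stmt};
    }

    public static void main(String[] args) {
        String[] names = {"delta_0000100_0000101", "delta_0000100_0000101_0001"};
        for (String name : names) {
            System.out.println(name + " -> " + Arrays.toString(parseWithIndexOf(name))
                + " / " + Arrays.toString(parseWithSplit(name)));
        }
    }
}
```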
bq. In OrcRawRecordMerger, in the constructor (line 489 i...
This calls AcidUtils.parseDelta(), not deserializeDeltas(). It doesn't stat
the file, it just parses the file name. The {{deltaDirectory}} argument to
this c'tor is not new. This seems ok.
bq. OrcRecordUpdate, end of the constructor (line 265 in your patch)...
This does stat the file, but if this check were to fail unnoticed it would lead
to data loss, which seems really bad. I could wrap this in LOG.isInfoEnabled()
for the most perf-sensitive cases...
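A rough sketch of what such a guard could look like. Everything here is an assumption for illustration: it uses java.util.logging ({{isLoggable(Level.INFO)}}) as a stand-in for LOG.isInfoEnabled(), {{Files.exists}} as a stand-in for the stat call, and a hypothetical "must not already exist" check, since the actual check in OrcRecordUpdater is not shown in this comment.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedCheckSketch {
    static final Logger LOG = Logger.getLogger(GuardedCheckSketch.class.getName());

    /**
     * Hypothetical sanity check: fail loudly if the target already exists.
     * The file system is only touched when INFO logging is enabled.
     * Returns true if the check ran, false if it was skipped.
     */
    static boolean verifyAbsent(Path target) {
        if (!LOG.isLoggable(Level.INFO)) {
            return false; // perf-sensitive setup: skip the stat entirely
        }
        if (Files.exists(target)) { // stand-in for the stat call
            throw new IllegalStateException(target + " already exists");
        }
        return true;
    }

    public static void main(String[] args) {
        Path missing = Paths.get("no-such-delta-dir-for-sketch");
        LOG.setLevel(Level.INFO);
        System.out.println("check ran: " + verifyAbsent(missing));
        LOG.setLevel(Level.WARNING);
        System.out.println("check ran: " + verifyAbsent(missing));
    }
}
```

The trade-off is that the protection against data loss then depends on the configured log level.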
> Enhance storage layer to create one delta file per write
> --------------------------------------------------------
>
> Key: HIVE-11030
> URL: https://issues.apache.org/jira/browse/HIVE-11030
> Project: Hive
> Issue Type: Sub-task
> Components: Transactions
> Affects Versions: 1.2.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Attachments: HIVE-11030.2.patch, HIVE-11030.3.patch
>
>
> Currently each txn using ACID insert/update/delete will generate a delta
> directory like delta_0000100_0000101. In order to support multi-statement
> transactions we must generate one delta per operation within the transaction
> so the deltas would be named like delta_0000100_0000101_0001, etc.
> Support for MERGE (HIVE-10924) would need the same.