[ 
https://issues.apache.org/jira/browse/HIVE-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608971#comment-14608971
 ] 

Eugene Koifman commented on HIVE-11030:
---------------------------------------

I'll fix the serializeDeltas() issues.

bq. I don't understand this code. Why get a ParsedDelta and turn around and 
create a new one?

In some cases ParsedDelta needs to retain a reference to the FileStatus it was 
created from, but in other cases it's created from a Path and doesn't have/need 
a FileStatus.  So a new object is constructed here to keep ParsedDelta immutable.
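
Roughly, the pattern is the one sketched below. This is only an illustration, not the code in the patch: the class name, the factory method and withFileStatus() are made up, and plain Strings stand in for Hadoop's Path and FileStatus so the snippet compiles on its own.

```java
// Hypothetical stand-in for ParsedDelta; String replaces Path and FileStatus.
final class ParsedDeltaSketch {
    private final String path;        // always present
    private final String fileStatus;  // null when built from a path alone

    private ParsedDeltaSketch(String path, String fileStatus) {
        this.path = path;
        this.fileStatus = fileStatus;
    }

    // Used when only the directory name is known.
    static ParsedDeltaSketch fromPath(String path) {
        return new ParsedDeltaSketch(path, null);
    }

    // Returns a new instance instead of setting a field on this one,
    // so every instance stays immutable after construction.
    ParsedDeltaSketch withFileStatus(String fileStatus) {
        return new ParsedDeltaSketch(path, fileStatus);
    }

    String getPath() { return path; }
    String getFileStatus() { return fileStatus; }
}
```

The cost is one extra small allocation per delta; in exchange no caller can observe an instance whose fields change after construction.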

bq. In parseDelta, would it be better to split the string on '_' rather than 
call indexOf twice?
I'm not sure what difference it makes: either way you do one linear scan of the 
string.  
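
For comparison, here are the two approaches side by side. This is a sketch, not the actual AcidUtils.parseDelta(): the class and method names are invented, it assumes names of the form delta_<min>_<max> with an optional _<statementId> suffix as described in this issue, and it returns -1 when there is no statement id.

```java
// Hypothetical sketch; not the actual AcidUtils.parseDelta() code.
final class DeltaNameSketch {
    static final String PREFIX = "delta_";

    // Variant 1: two indexOf calls on the part after the prefix.
    static long[] parseWithIndexOf(String name) {
        String rest = name.substring(PREFIX.length());
        int first = rest.indexOf('_');
        int second = rest.indexOf('_', first + 1); // -1 if no statement id
        long min = Long.parseLong(rest.substring(0, first));
        long max = Long.parseLong(second < 0
            ? rest.substring(first + 1)
            : rest.substring(first + 1, second));
        long stmt = second < 0 ? -1L : Long.parseLong(rest.substring(second + 1));
        return new long[] {min, max, stmt};
    }

    // Variant 2: split on '_' and index into the pieces.
    static long[] parseWithSplit(String name) {
        String[] parts = name.split("_"); // parts[0] is "delta"
        long min = Long.parseLong(parts[1]);
        long max = Long.parseLong(parts[2]);
        long stmt = parts.length > 3 ? Long.parseLong(parts[3]) : -1L;
        return new long[] {min, max, stmt};
    }
}
```

Both variants walk the name once and give the same result for delta_0000100_0000101 and delta_0000100_0000101_0001; split() just allocates an extra array.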

bq. In OrcRawRecordMerger, in the constructor (line 489 i...
This calls AcidUtils.parseDelta(), not deserializeDeltas().  It doesn't stat 
the file, it just parses the file name.  The {{deltaDirectory}} argument to 
this c'tor is not new.  This seems ok.

bq. OrcRecordUpdate, end of the constructor (line 265 in your patch)...
This does stat the file, but if this check were to fail unnoticed it would lead 
to data loss, which seems really bad.  I could wrap this in LOG.isInfoEnabled() 
for the most perf-sensitive cases...

> Enhance storage layer to create one delta file per write
> --------------------------------------------------------
>
>                 Key: HIVE-11030
>                 URL: https://issues.apache.org/jira/browse/HIVE-11030
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>    Affects Versions: 1.2.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>         Attachments: HIVE-11030.2.patch, HIVE-11030.3.patch
>
>
> Currently each txn using ACID insert/update/delete will generate a delta 
> directory like delta_0000100_0000101.  In order to support multi-statement 
> transactions we must generate one delta per operation within the transaction 
> so the deltas would be named like delta_0000100_0000101_0001, etc.
> Support for MERGE (HIVE-10924) would need the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
