[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143569#comment-15143569
 ] 

Sergey Shelukhin commented on HIVE-4897:
----------------------------------------

I think the simplest solution is actually a 2PC-like protocol, in which the 
operation first gets a token from the metastore. Tokens can be stored in memory 
and are unique (an incrementing number); they are GCed after a long period (an 
hour?). Each operation issued with the token would then carry a 
client-maintained sequence number that is not incremented on retries (for 
generality; a single expected operation per token would be good enough for this 
case), and the metastore would record success for that token once the operation 
completes.
If a retry comes in with the same token (+ seq num in case of multiple ops), it 
would be a no-op.
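
A minimal sketch of the token scheme described above, assuming an in-memory 
registry on the metastore side. The class and method names (RetryTokenRegistry, 
issueToken, runOnce, expire) are hypothetical and are not part of any existing 
Hive API; time is passed in explicitly only to keep the example deterministic.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-memory token registry; an illustration, not Hive code. */
class RetryTokenRegistry {
    private static final class Entry {
        final long issuedAtMillis;
        // Sequence numbers whose operation has already succeeded.
        final Set<Long> completedSeqNums = new HashSet<>();
        Entry(long issuedAtMillis) { this.issuedAtMillis = issuedAtMillis; }
    }

    private final AtomicLong nextToken = new AtomicLong();
    private final Map<Long, Entry> tokens = new ConcurrentHashMap<>();
    private final long ttlMillis;

    RetryTokenRegistry(long ttlMillis) { this.ttlMillis = ttlMillis; }

    /** Step 1: the client obtains a unique token before starting the operation. */
    long issueToken(long nowMillis) {
        long token = nextToken.incrementAndGet();
        tokens.put(token, new Entry(nowMillis));
        return token;
    }

    /**
     * Step 2: run the operation unless (token, seqNum) already succeeded.
     * Returns true if the operation ran, false if this call was a no-op retry.
     */
    boolean runOnce(long token, long seqNum, Runnable op) {
        Entry entry = tokens.get(token);
        if (entry == null) {
            throw new IllegalStateException("Unknown or expired token: " + token);
        }
        synchronized (entry) {
            if (entry.completedSeqNums.contains(seqNum)) {
                return false; // retry of an operation that already succeeded
            }
            op.run(); // if this throws, success is not recorded and a retry re-runs it
            entry.completedSeqNums.add(seqNum);
            return true;
        }
    }

    /** GC: drop tokens older than the configured lifetime (e.g. one hour). */
    void expire(long nowMillis) {
        tokens.values().removeIf(e -> nowMillis - e.issuedAtMillis > ttlMillis);
    }
}
```

In this sketch a retried create call that reuses its token and sequence number 
returns false instead of re-running the create, so the client never sees 
AlreadyExistsException for its own earlier success. It does not cover failover 
or operations that must return a result.
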
Comparing objects as originally suggested is both difficult and error-prone, 
and also not bullet-proof if someone alters the objects in the interim.
This approach is hard to use for ops that return a result, because it's not 
clear what result should be returned on a retry unless the result of the 
original operation is saved, which is a PITA. We can either throw a special 
exception (succeeded, but cannot return the result) or return the latest state 
of the object for some ops; create* calls do not return any result, so we are 
good here.
Tokens can be stored externally for failover, but I don't think this is really 
necessary for the first draft.

For the first draft, in-memory, single-operation tokens with no result option 
would be easy to implement. [~thejas] should we do that? Opinions? This is 
related to the timeout issue we were discussing.

> Hive should handle AlreadyExists on retries when creating tables/partitions
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-4897
>                 URL: https://issues.apache.org/jira/browse/HIVE-4897
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Aihua Xu
>         Attachments: HIVE-4897.patch, hive-snippet.log
>
>
> Creating new tables/partitions may fail with an AlreadyExistsException if 
> there is an error part way through the creation and the HMS tries again 
> without properly cleaning up or checking if this is a retry.
> While partitioning a new table via a script on distributed hive (MetaStore on 
> the same machine) there was a long timeout and then:
> {code}
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. 
> AlreadyExistsException(message:Partition already exists:Partition( ...
> {code}
> I am assuming this is due to retry. Perhaps already-exists on retry could be 
> handled better.
> A similar error occurred while creating a table through Impala, which issued 
> a single createTable call that failed with an AlreadyExistsException. See the 
> logs related to table tmp_proc_8_d2b7b0f133be455ca95615818b8a5879_7 in the 
> attached hive-snippet.log



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
