[jira] [Commented] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2020-04-20 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087746#comment-17087746
 ] 

David Mollitor commented on HIVE-4897:
--

I think this is often achieved with an 'updated' timestamp value in the 
database schema.  When a request is generated, the client attaches a timestamp.  
If the operation fails and is resubmitted, HMS compares timestamps: if the 
stored 'updated' timestamp matches the request's, success is returned; 
otherwise, if the timestamp in HMS is older, the operation is applied again.
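
A minimal sketch of that timestamp check, under stated assumptions: FakeStore 
and its apply() method are hypothetical stand-ins and not part of the HMS API, 
and the "stale" outcome for an older request is an assumption that the comment 
above does not cover.

```python
class FakeStore:
    """Hypothetical in-memory stand-in for the metastore's backing schema."""

    def __init__(self):
        # object name -> 'updated' timestamp of the last request applied to it
        self.updated = {}
        # how many times an operation was really executed
        self.applied = 0

    def apply(self, name, request_ts):
        """Apply a request identified by the client-chosen timestamp."""
        stored = self.updated.get(name)
        if stored == request_ts:
            # An earlier attempt of this same request already committed,
            # so the retry reports success without doing the work again.
            return "already-applied"
        if stored is None or stored < request_ts:
            # Nothing stored yet, or the stored state is older: do the work.
            self.updated[name] = request_ts
            self.applied += 1
            return "applied"
        # Assumption: a request older than the stored state is rejected.
        return "stale"


store = FakeStore()
first = store.apply("tbl", 100)   # original request
retry = store.apply("tbl", 100)   # client retry carrying the same timestamp
print(first, retry, store.applied)
```

The retry is recognized only because the client resends the original 
timestamp; generating a fresh timestamp per attempt would defeat the check.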

> Hive should handle AlreadyExists on retries when creating tables/partitions
> ---
>
> Key: HIVE-4897
> URL: https://issues.apache.org/jira/browse/HIVE-4897
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Aihua Xu
>Priority: Major
> Attachments: hive-snippet.log
>
>
> Creating new tables/partitions may fail with an AlreadyExistsException if 
> there is an error part way through the creation and the HMS tries again 
> without properly cleaning up or checking if this is a retry.
> While partitioning a new table via a script on distributed hive (MetaStore on 
> the same machine) there was a long timeout and then:
> {code}
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. 
> AlreadyExistsException(message:Partition already exists:Partition( ...
> {code}
> I am assuming this is due to retry. Perhaps already-exists on retry could be 
> handled better.
> A similar error occurred while creating a table through Impala, which issued 
> a single createTable call that failed with an AlreadyExistsException. See the 
> logs related to table tmp_proc_8_d2b7b0f133be455ca95615818b8a5879_7 in the 
> attached hive-snippet.log



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2016-03-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178474#comment-15178474
 ] 

Sergey Shelukhin commented on HIVE-4897:


I think the simplest path of the approach outlined above will work. I've done 
similar work in HBase to make increment operation retries idempotent (so the 
requirements were more stringent and tokens actually needed to survive restarts 
and failover), and it was pretty manageable. With relaxed requirements like no 
persistence it should be simpler still.



[jira] [Commented] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2016-03-03 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177849#comment-15177849
 ] 

Aihua Xu commented on HIVE-4897:


Yeah. That scenario will definitely cause the issue, but it should be rare? 
What we have seen seems to have been caused by unsafe concurrent HMS access, 
which seems to be fixed. 

Let me investigate further how to completely fix this issue including the cases 
you mentioned. 



[jira] [Commented] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2016-03-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176759#comment-15176759
 ] 

Sergey Shelukhin commented on HIVE-4897:


I've seen this error fairly recently. It happens if the response was not 
delivered to the client due to a network problem. Theoretically it can also 
happen due to a timing issue with the retry: if the retry is sent after the 
client timeout but before the corresponding timeout on the server, and the 
original request finishes before the retry is processed.
I've also seen it happen when the connection to the underlying DB was lost in 
commitTxn, but the commit still happened (that one time was due to a BoneCP 
connection-closing bug, but it could presumably also happen because of a 
connection issue). commitTxn fails, but the table is already created. 



[jira] [Commented] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2016-03-02 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176132#comment-15176132
 ] 

Aihua Xu commented on HIVE-4897:


[~sershe] Do you still see this issue? I suspect it no longer occurs, since 
previously we had unsafe concurrent access to the HMS clients, which could lead 
to this error. 

I went through the code: if an error happens while creating a table or 
partitions, it should roll back properly. Let me know your thoughts. 



[jira] [Commented] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2016-02-12 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144740#comment-15144740
 ] 

Aihua Xu commented on HIVE-4897:


[~sershe] I haven't been able to work on that yet. Feel free to work on it if 
you have any ideas. 



[jira] [Commented] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2016-02-11 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143569#comment-15143569
 ] 

Sergey Shelukhin commented on HIVE-4897:


I think the simplest solution is actually to have a 2pc-like protocol, where 
the operation first gets a token from the metastore. The tokens can be stored 
in memory and are unique (an incrementing number); they are GCed after a long 
period (an hour?). Each operation sent with the token would then carry a 
client-maintained sequence number that is not incremented on retries (for 
generality; allowing just one expected operation per token would be good 
enough to start with for this case), and on completion the metastore would 
record success for the token.
If a retry comes with the same token (+ seq num in case of multiple ops), it 
would be a no-op.
Comparing objects as originally suggested is both difficult and error-prone, 
and also not bullet-proof if someone alters the objects in the interim.
This approach is hard to use for ops that return a result, because it's not 
clear what result should be returned unless the result of the original 
operation is saved, which is a PITA. We can either throw a special exception 
(succeeded, but cannot return the result), or return the latest state of the 
object for some ops; create* do not return any result, so we are good here.
Tokens can be stored externally for failover, but I don't think this is really 
necessary for the first draft.

For the first draft in-memory, single-operation tokens with no result option 
would be easy to implement. [~thejas] should we do that? Opinions? Related to 
the timeout issue we were discussing.
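
A rough sketch of the in-memory variant described above, assuming a single 
metastore process and no result values. TokenRegistry, new_token, and run_once 
are hypothetical names chosen for illustration, not existing HMS APIs; locking 
and failover are deliberately left out.

```python
import itertools
import time


class TokenRegistry:
    """Hypothetical in-memory registry of idempotency tokens."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._next = itertools.count(1)  # unique, incrementing token ids
        self._issued = {}                # token -> time it was issued
        self._done = set()               # (token, seq) pairs that succeeded
        self._ttl = ttl_seconds          # GC period ("an hour?")
        self._clock = clock

    def new_token(self):
        """Phase one: the client asks for a token before the real call."""
        token = next(self._next)
        self._issued[token] = self._clock()
        return token

    def run_once(self, token, seq, operation):
        """Run operation unless (token, seq) already succeeded.

        Returns True if the operation ran, False if this was a retry of an
        operation that already succeeded (treated as a no-op).
        """
        self._gc()
        if token not in self._issued:
            raise KeyError(f"unknown or expired token: {token}")
        if (token, seq) in self._done:
            return False
        operation()  # if this raises, nothing is recorded and a retry reruns it
        self._done.add((token, seq))
        return True

    def _gc(self):
        """Drop tokens older than the TTL, plus their recorded successes."""
        cutoff = self._clock() - self._ttl
        for token in [t for t, ts in self._issued.items() if ts < cutoff]:
            del self._issued[token]
        self._done = {(t, s) for (t, s) in self._done if t in self._issued}


registry = TokenRegistry()
created = []
token = registry.new_token()
ran_first = registry.run_once(token, 0, lambda: created.append("tbl"))
ran_retry = registry.run_once(token, 0, lambda: created.append("tbl"))
print(ran_first, ran_retry, created)
```

The sequence number stays at 0 across the retry, which is what lets the 
registry recognize the second call as a duplicate instead of a new operation.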



[jira] [Commented] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2015-08-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662705#comment-14662705
 ] 

Sergey Shelukhin commented on HIVE-4897:


Hmm... wouldn't it just retry after the first exception and then ignore the 
repeated exception on retry?
I think that, on retry, it may need to check for each API that the object in 
question exists, and maybe that it matches the request.
I am not sure there's a good way to add a repeatable test for this...
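
One way the exists-and-matches check could look, as a sketch only: 
create_with_retry, FlakyStore, and AlreadyExistsError below are hypothetical 
stand-ins written for this illustration, not the actual patch or the HMS 
client API. FlakyStore commits the table but loses the first response, which 
also hints at how such a case might be made repeatable in a test.

```python
class AlreadyExistsError(Exception):
    """Stand-in for the metastore's AlreadyExistsException."""


def create_with_retry(create, fetch, spec, max_attempts=3):
    """Create an object, tolerating already-exists only on a retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            create(spec)
            return
        except AlreadyExistsError:
            # Only a retry may swallow the error, and only if the stored
            # object matches what this request asked for.
            if attempt > 1 and fetch(spec["name"]) == spec:
                return
            raise
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # otherwise fall through and retry


class FlakyStore:
    """Fake store that commits the create but drops the first response."""

    def __init__(self):
        self.tables = {}
        self.drop_next_response = True

    def create(self, spec):
        if spec["name"] in self.tables:
            raise AlreadyExistsError(spec["name"])
        self.tables[spec["name"]] = dict(spec)
        if self.drop_next_response:
            self.drop_next_response = False
            raise ConnectionError("response lost after commit")

    def fetch(self, name):
        return self.tables.get(name)


store = FlakyStore()
spec = {"name": "tbl", "cols": ["a", "b"]}
create_with_retry(store.create, store.fetch, spec)  # succeeds via the retry
print(store.tables)
```

A first-attempt AlreadyExistsError is still raised to the caller, so a 
genuine name collision is not silently ignored.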



[jira] [Commented] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2015-08-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661883#comment-14661883
 ] 

Hive QA commented on HIVE-4897:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12749104/HIVE-4897.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4859/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4859/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4859/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: 
java.io.IOException: Error writing to 
/data/hive-ptest/working/scratch/hiveptest-TestErrorMsg.sh
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12749104 - PreCommit-HIVE-TRUNK-Build
