[
https://issues.apache.org/jira/browse/HADOOP-17015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sneha Vijayarajan updated HADOOP-17015:
---------------------------------------
Description:
Changes were initially made as part of this PR to handle idempotency, including
for the rename operation, on the understanding that the last modified time gets
updated. That assumption was wrong, and the rename idempotency handling has
since evolved.
For job cleanup, if the Manifest Committer from the Jira below is used,
rename idempotency works using the previously fetched etag:
[HADOOP-18163] hadoop-azure support for the Manifest Committer of
MAPREDUCE-7341 - ASF JIRA (apache.org)
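A minimal sketch of the etag-based recovery idea, not the actual hadoop-azure
code: the class, the in-memory map standing in for the store, and every method
name are hypothetical. It also assumes that a rename carries the source etag
over to the destination. When a retried rename reports that the source is
missing, the caller compares the destination etag with the etag fetched from
the source before the first attempt.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: all names here are hypothetical, not ABFS driver APIs. */
final class EtagRenameRecovery {
    /** Stand-in for the remote store: path -> etag. */
    static final Map<String, String> store = new HashMap<>();

    /** Simulated server-side rename; assumes the etag moves with the file. */
    static boolean serverRename(String src, String dst) {
        String etag = store.remove(src);
        if (etag == null) {
            return false; // source not found
        }
        store.put(dst, etag);
        return true;
    }

    /** True if the destination holds the etag previously fetched from the source. */
    static boolean renameSucceededEarlier(String dst, String sourceEtag) {
        return sourceEtag != null && sourceEtag.equals(store.get(dst));
    }

    /**
     * Renames src to dst. If firstResponseLost is true, the first response is
     * treated as timed out and the request is retried once.
     */
    static boolean renameWithRecovery(String src, String dst, boolean firstResponseLost) {
        String sourceEtag = store.get(src); // fetched before the first attempt
        boolean applied = serverRename(src, dst);
        if (!firstResponseLost) {
            return applied;
        }
        // The client never saw the first response, so it retries.
        if (serverRename(src, dst)) {
            return true;
        }
        // Retry reported "source not found": recover via the etag comparison.
        return renameSucceededEarlier(dst, sourceEtag);
    }

    public static void main(String[] args) {
        store.put("/job/tmp/part-0", "etag-1");
        boolean ok = renameWithRecovery("/job/tmp/part-0", "/job/out/part-0", true);
        System.out.println("rename treated as successful: " + ok);
    }
}
```

Comparing etags rather than only checking that the destination exists avoids
reporting success when an unrelated file already sits at the destination.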
The part of the commit tracked under this Jira that handles DELETE idempotency
is still relevant.
A way to handle idempotency inherently between driver and backend is being
worked on.
-- Older notes
Currently, when a PUT or POST operation times out but the server has already
executed it successfully, the driver does not check whether the operation
succeeded and simply retries it. This can cause the driver to throw invalid
user errors.
Sample scenario:
# The rename request times out, though the server has successfully executed the
operation.
# The driver retries the rename and gets a source-not-found error.
In this scenario, the driver needs to check whether the rename is being retried
and treat it as a success if the source is not found but the destination is
present.
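The check described in the scenario above can be sketched roughly as follows.
This is an illustration under stated assumptions, not the driver's code: the
class name, the in-memory set standing in for the store, and the
lostResponses parameter used to simulate timeouts are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustration only: all names here are hypothetical, not ABFS driver APIs. */
final class RetriedRenameCheck {
    /** Stand-in for the remote store: the set of existing paths. */
    static final Set<String> paths = new HashSet<>();

    enum Result { OK, SOURCE_NOT_FOUND }

    /** Simulated server-side rename. */
    static Result serverRename(String src, String dst) {
        if (!paths.remove(src)) {
            return Result.SOURCE_NOT_FOUND;
        }
        paths.add(dst);
        return Result.OK;
    }

    /**
     * Renames src to dst. The first lostResponses responses are treated as
     * timed out, so the client retries without seeing them.
     */
    static boolean rename(String src, String dst, int lostResponses) {
        int attempt = 0;
        while (true) {
            Result r = serverRename(src, dst);
            boolean isRetry = attempt > 0;
            attempt++;
            if (attempt <= lostResponses) {
                continue; // response lost to a timeout: retry
            }
            if (r == Result.OK) {
                return true;
            }
            // Source not found: success only on a retry with destination present.
            return isRetry && paths.contains(dst);
        }
    }

    public static void main(String[] args) {
        paths.add("/src");
        System.out.println("rename treated as successful: " + rename("/src", "/dst", 1));
    }
}
```

A source-not-found error on the first attempt is still reported as a failure.
This check alone cannot tell whether the destination was created by the
earlier attempt or already existed before it.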
was:
Currently, when a PUT or POST operation times out but the server has already
executed it successfully, the driver does not check whether the operation
succeeded and simply retries it. This can cause the driver to throw invalid
user errors.
Sample scenario:
# The rename request times out, though the server has successfully executed the
operation.
# The driver retries the rename and gets a source-not-found error.
In this scenario, the driver needs to check whether the rename is being retried
and treat it as a success if the source is not found but the destination is
present.
> ABFS: Make PUT and POST operations idempotent
> ---------------------------------------------
>
> Key: HADOOP-17015
> URL: https://issues.apache.org/jira/browse/HADOOP-17015
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.2.1
> Reporter: Sneha Vijayarajan
> Assignee: Sneha Vijayarajan
> Priority: Major
> Fix For: 3.3.1, 3.4.0
>
>
> Changes were initially made as part of this PR to handle idempotency,
> including for the rename operation, on the understanding that the last
> modified time gets updated. That assumption was wrong, and the rename
> idempotency handling has since evolved.
> For job cleanup, if the Manifest Committer from the Jira below is used,
> rename idempotency works using the previously fetched etag:
> [HADOOP-18163] hadoop-azure support for the Manifest Committer of
> MAPREDUCE-7341 - ASF JIRA (apache.org)
>
> The part of the commit tracked under this Jira that handles DELETE
> idempotency is still relevant.
> A way to handle idempotency inherently between driver and backend is being
> worked on.
> -- Older notes
> Currently, when a PUT or POST operation times out but the server has already
> executed it successfully, the driver does not check whether the operation
> succeeded and simply retries it. This can cause the driver to throw invalid
> user errors.
>
> Sample scenario:
> # The rename request times out, though the server has successfully executed
> the operation.
> # The driver retries the rename and gets a source-not-found error.
> In this scenario, the driver needs to check whether the rename is being
> retried and treat it as a success if the source is not found but the
> destination is present.
>