[
https://issues.apache.org/jira/browse/HADOOP-12079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581028#comment-14581028
]
Steve Loughran commented on HADOOP-12079:
-----------------------------------------
The reason for the X-Newest header is that we were seeing inconsistent
read-after-write behaviour even in some of the unit tests.
I think it was {{TestSwiftFileSystemBasicOps.testOverwrite()}} where things
were playing up with this sequence of operations:
# write small file
# overwrite with larger file
# reread new file
# observe that the file read back contained a mixture of the two. That is:
all of the original file, followed by the remaining bytes of the new one.
After seeing that more than once, I put the X-Newest header in as the only way
to achieve some form of reliability in operations. That is: it wasn't just fear
or pessimism, it was based on evidence. Note also how all our tests now use
different filenames, again driven by this overwrite problem.
We can certainly put the switch in, with:
# documentation which quite clearly states "there are no guarantees about what
you get, especially after overwrites", and advises against using it for any
reads of data that may change. That is: it should only be used for reading
static datasets, not for reading any intermediate output of a workflow.
# tests for requests with that option set. Specifically, repeated operations
of the type described: create-update-read with small->large, then the same for
create-delete-read, and create-update-read with large->small but with a seek()
past the end of the small file thrown in. Set these up to run a (configurable)
number of times.
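A minimal sketch of the repeated create-update-read check described above, to show the shape such a test could take. It is not the hadoop-openstack test code: the {{Store}} interface, both in-memory stores and {{countStaleReads()}} are hypothetical names invented for illustration, and a real test would run the same loop against the Swift filesystem client. {{MixingStore}} only imitates the symptom reported above (old bytes first, then the tail of the new data) so that the check has something to catch. The create-delete-read case is not covered here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Sketch of a repeated create-overwrite-read check; all names are hypothetical. */
final class OverwriteReadCheck {

    /** Minimal stand-in for the filesystem under test. */
    interface Store {
        void write(String path, byte[] data);   // create or overwrite
        byte[] read(String path);               // read the whole object back
    }

    /** Strongly consistent in-memory store, present only to make the sketch runnable. */
    static class InMemoryStore implements Store {
        private final Map<String, byte[]> objects = new HashMap<>();
        public void write(String path, byte[] data) { objects.put(path, data.clone()); }
        public byte[] read(String path) { return objects.get(path).clone(); }
    }

    /** Faulty store imitating the reported symptom: old bytes first, then the new tail. */
    static class MixingStore implements Store {
        private final Map<String, byte[]> objects = new HashMap<>();
        public void write(String path, byte[] data) {
            byte[] merged = data.clone();
            byte[] old = objects.get(path);
            if (old != null) {
                // Keep the start of the previous version in place of the new bytes.
                System.arraycopy(old, 0, merged, 0, Math.min(old.length, merged.length));
            }
            objects.put(path, merged);
        }
        public byte[] read(String path) { return objects.get(path).clone(); }
    }

    /** Builds a buffer of the given length filled with one marker byte. */
    static byte[] filled(int length, char marker) {
        byte[] data = new byte[length];
        Arrays.fill(data, (byte) marker);
        return data;
    }

    /**
     * Writes firstLen bytes, overwrites them with secondLen bytes, rereads, and
     * returns how many iterations read back anything other than exactly the
     * second version (wrong length or wrong content).
     */
    static int countStaleReads(Store store, int firstLen, int secondLen, int iterations) {
        int stale = 0;
        String path = "/test/overwrite";   // deliberately the same name every time
        for (int i = 0; i < iterations; i++) {
            byte[] second = filled(secondLen, 'b');
            store.write(path, filled(firstLen, 'a'));
            store.write(path, second);
            if (!Arrays.equals(second, store.read(path))) {
                stale++;
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // The iteration count is configurable from the command line.
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 10;
        Store store = new InMemoryStore();
        System.out.println("small->large stale reads: "
                + countStaleReads(store, 16, 1024, iterations));
        System.out.println("large->small stale reads: "
                + countStaleReads(store, 1024, 16, iterations));
    }
}
```

In the large->small case the whole-object comparison also fails if any bytes survive beyond the end of the small file, which is a simpler stand-in for the seek() past the small file suggested above.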
> Make 'X-Newest' header a configurable
> -------------------------------------
>
> Key: HADOOP-12079
> URL: https://issues.apache.org/jira/browse/HADOOP-12079
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/swift
> Affects Versions: 3.0.0, 2.6.0
> Reporter: Gil Vernik
> Assignee: Gil Vernik
> Fix For: 3.0.0, 2.6.1
>
> Attachments: x-newest-optional0001.patch,
> x-newest-optional0002.patch, x-newest-optional0003.patch
>
>
> Current code always sends the X-Newest header to Swift. While it's true that
> Swift is eventually consistent and X-Newest will always get the newest version
> from Swift, in practice this header makes Swift responses very slow.
> This header should be made optional, so that it is possible
> to access Swift without it and get much better performance.
> This patch doesn't modify current behavior. All is working as is, but there
> is an option to provide fs.swift.service.useXNewest = false.
> Some background on Swift and X-Newest:
> When a GET or HEAD request is made to an object, the default behavior is to
> get the data from one of the replicas (could be any of them). The downside to
> this is that if there are older versions of the object (due to eventual
> consistency) it is possible to get an older version of the object. The upside
> is that for the majority of use cases, this isn't an issue. For the small
> subset of use cases that need to make sure that they get the latest version
> of the object, they can set the "X-Newest" header to "True". If this is set,
> the proxy server will check all replicas of the object and only return the
> newest object. The downside to this is that the request can take longer,
> since it has to contact all the replicas. It is also more expensive for the
> backend, so only recommended when it is absolutely needed.