[
https://issues.apache.org/jira/browse/HADOOP-19654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18030630#comment-18030630
]
ASF GitHub Bot commented on HADOOP-19654:
-----------------------------------------
ahmarsuhail commented on PR #7882:
URL: https://github.com/apache/hadoop/pull/7882#issuecomment-3415388890
This is happening because those readVectored() tests create a new
`vectored-read.txt` file in setup() before each test. Since the tests are
parameterized, they run twice: once with `direct-buffer` and then with
`array-buffer`.
On the first run, for `direct-buffer`, a HEAD for the metadata is made and
cached, and the data for `vectored-read.txt` is also cached. Then the stream is
closed, and since the file ends in `.txt`, AAL clears the data cache (it's a
sequential format, so the chances of a backward seek re-reading the same data
are low, and it's better to drop the cached data). The metadata cache is not
cleared here (it should be, and I will make that fix).
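To make that close-time behaviour concrete, here is a rough, purely
illustrative sketch of the policy described above; the class and method names
are made up and this is not AAL's actual code:
```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: models "clear data blocks for sequential formats on
// close, but leave the HEAD metadata cached", which is the behaviour that
// lets a stale eTag survive the file being rewritten between tests.
class ClosePolicySketch {
  private final Map<String, String> metadataCache = new ConcurrentHashMap<>(); // uri -> eTag
  private final Map<String, byte[]> dataCache = new ConcurrentHashMap<>();     // uri + eTag -> bytes

  private static boolean isSequentialFormat(String uri) {
    // e.g. .txt: unlikely to be re-read with backward seeks.
    return uri.endsWith(".txt");
  }

  void onStreamClose(String uri, String eTag) {
    if (isSequentialFormat(uri)) {
      dataCache.remove(uri + " + " + eTag);
      // Note: nothing removes metadataCache entries here; evicting the
      // metadata on close is the fix mentioned above.
    }
  }
}
```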
On the second run, for `array-buffer`, `vectored-read.txt` is written again.
AAL gets the metadata from the metadata cache and uses that eTag when making
the GETs for the block data. Since on S3 Express the eTag is no longer the MD5
of the object content, the eTag has changed even though the content is the
same. Hence the 412s on the GETs.
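For reference, this is what the failing request boils down to when expressed
directly against the AWS SDK v2; the bucket, key, and eTag values are made up
for illustration (AAL issues the equivalent conditional GET internally):
```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.S3Exception;

public class StaleEtagGet {
  public static void main(String[] args) {
    String cachedEtag = "\"1234\""; // eTag cached before the file was rewritten
    try (S3Client s3 = S3Client.create()) {
      GetObjectRequest request = GetObjectRequest.builder()
          .bucket("example-bucket")          // illustrative bucket name
          .key("vectored-read.txt")
          .ifMatch(cachedEtag)               // pin the GET to the cached eTag
          .build();
      s3.getObjectAsBytes(request);
    } catch (S3Exception e) {
      // The rewrite changed the eTag (on S3 Express even for identical bytes),
      // so the If-Match condition fails and S3 returns 412 Precondition Failed.
      System.out.println("HTTP status: " + e.statusCode());
    }
  }
}
```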
On consistency with caching in general:
* AAL provides a `metadatastore.ttl` config; set it to 0 and HEAD
responses are never cached. This solves the caching issues we previously had
when overwriting files, since with a TTL of 0 we always get the latest version
of the file.
* Data blocks are removed once memory usage exceeds the defined memory
threshold (2GB), and cleanup happens every 5s by default. The edge case is:
what if data usage always stays below 2GB and data blocks are never evicted?
This is why `metadatastore.ttl` was introduced.
* Our `BlockKey`, the key under which file data is stored, is a
combination of the S3 URI + eTag. If the eTag changes, we get a different
BlockKey, which means we have no data stored for it. For example (a minimal
sketch of the key follows the walkthrough below):
```
* Data is written to A.parquet, eTag is "1234".
* A.parquet is read fully into the cache, with key "A.parquet + 1234".
* A.parquet is overwritten, eTag is now "6789".
* A.parquet is opened for reading again:
  - If the metadata TTL has not yet expired, the metadata cache still has eTag
    "1234", so AAL will serve data from the data cache using key
    "A.parquet + 1234". If the requested data is not in the data cache, we make
    a GET with the outdated eTag "1234", and this fails with a 412.
  - If the metadata TTL has expired, a new HEAD request is made and we now have
    eTag "6789". This creates a new BlockKey "A.parquet + 6789", and since
    there is no data stored for it, AAL will make GETs for the data.
```
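As a minimal sketch of that composite key (the class name and shape are
simplified for illustration; this is not the actual AAL class), the following
shows why an eTag change results in a cache miss rather than stale data being
served:
```java
import java.util.Objects;

// Simplified composite cache key: (S3 URI, eTag). An eTag change yields a key
// that is not equal to the old one, so the data cache simply misses instead of
// serving bytes from the previous object version.
final class BlockKeySketch {
  private final String s3Uri;
  private final String eTag;

  BlockKeySketch(String s3Uri, String eTag) {
    this.s3Uri = s3Uri;
    this.eTag = eTag;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof BlockKeySketch)) {
      return false;
    }
    BlockKeySketch other = (BlockKeySketch) o;
    return s3Uri.equals(other.s3Uri) && eTag.equals(other.eTag);
  }

  @Override
  public int hashCode() {
    return Objects.hash(s3Uri, eTag);
  }
}
```
A cache keyed this way treats ("s3://bucket/A.parquet", "1234") and
("s3://bucket/A.parquet", "6789") as unrelated entries, which is the
miss-then-GET behaviour described in the walkthrough above.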
With this we ensure two things:
1/ Once a stream is opened, it will always serve bytes from the same object
version, or fail.
2/ Data will be stale for at most `metadatastore.ttl` milliseconds, except
within a single stream's lifetime (a stream keeps serving the version it
opened with).
Basically, if your data changes often, set the metadata TTL to 0 and AAL
will always fetch the latest data. Otherwise you get eventual consistency.
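A hedged sketch of how that TTL might be set from a Hadoop `Configuration`;
the exact property name is an assumption here (the comment only names the AAL
option `metadatastore.ttl`), so check the AAL/S3A docs for the real key:
```java
import org.apache.hadoop.conf.Configuration;

public class DisableMetadataCaching {
  public static Configuration withNoMetadataCaching() {
    Configuration conf = new Configuration();
    // Assumed key: AAL options are typically passed through via an S3A
    // prefix; "fs.s3a.analytics.accelerator." is illustrative, not confirmed.
    conf.setLong("fs.s3a.analytics.accelerator.metadatastore.ttl", 0L);
    return conf;
  }
}
```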
> Upgrade AWS SDK to 2.33.x
> -------------------------
>
> Key: HADOOP-19654
> URL: https://issues.apache.org/jira/browse/HADOOP-19654
> Project: Hadoop Common
> Issue Type: Improvement
> Components: build, fs/s3
> Affects Versions: 3.5.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Labels: pull-request-available
>
> Upgrade to a recent version of 2.33.x or later while off the critical path of
> things.
> HADOOP-19485 froze the sdk at a version which worked with third party stores.
> Apparently the new version works; early tests show that Bulk Delete calls
> with third party stores complain about lack of md5 headers, so some tuning is
> clearly going to be needed.