[ https://issues.apache.org/jira/browse/HADOOP-19654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18030630#comment-18030630 ]

ASF GitHub Bot commented on HADOOP-19654:
-----------------------------------------

ahmarsuhail commented on PR #7882:
URL: https://github.com/apache/hadoop/pull/7882#issuecomment-3415388890

   This is happening because those readVectored() tests create a new `vectored-read.txt` file in `setup()` before each test. Since the tests are parameterized, they run twice: once for `direct-buffer` and then for `array-buffer`.
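
   A minimal sketch of how such a parameterized setup behaves (illustrative only; the class and helper here are hypothetical stand-ins, not the actual Hadoop contract test):

   ```java
   import java.io.IOException;
   import java.nio.charset.StandardCharsets;
   import java.nio.file.Files;
   import java.nio.file.Paths;
   import java.util.Arrays;
   import java.util.List;

   import org.junit.Before;
   import org.junit.Test;
   import org.junit.runner.RunWith;
   import org.junit.runners.Parameterized;

   // Hypothetical sketch, not the real Hadoop test: a parameterized suite that
   // re-creates the same file in setup(), so it runs once per buffer type and
   // overwrites vectored-read.txt between the two runs.
   @RunWith(Parameterized.class)
   public class VectoredReadSetupSketch {

     @Parameterized.Parameters(name = "{0}")
     public static List<Object[]> bufferTypes() {
       return Arrays.asList(new Object[][] {{"direct-buffer"}, {"array-buffer"}});
     }

     private final String bufferType;

     public VectoredReadSetupSketch(String bufferType) {
       this.bufferType = bufferType;
     }

     @Before
     public void setup() throws IOException {
       // On S3 Express, rewriting the object produces a new eTag even though
       // the bytes are identical (a local file stands in for the object here).
       Files.write(Paths.get("vectored-read.txt"),
           "test data".getBytes(StandardCharsets.UTF_8));
     }

     @Test
     public void testReadVectored() {
       // read the file back using the configured buffer type ...
     }
   }
   ```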
    
   On the first run for `direct-buffer`, a HEAD request for the metadata is made and cached, and the data for `vectored-read.txt` is also cached. Then the stream is closed, and since the file ends in `.txt`, AAL clears the data cache (for a sequential format the chances of a backward seek re-reading the same data are low, so it is better to clear the data cache). The metadata cache is not cleared here (it should be, and I will make that fix).
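
   A rough sketch of that close-time behaviour (illustrative only; `DataCache` and `MetadataCache` here are hypothetical types, not AAL's actual classes):

   ```java
   // Illustrative sketch of the close() behaviour described above, not AAL's
   // real code: data blocks for sequential formats are evicted on close, but
   // the cached HEAD/eTag entry is (currently) left behind.
   class SequentialReadStreamSketch implements AutoCloseable {

     interface DataCache { void evict(String key); }      // hypothetical
     interface MetadataCache { void evict(String key); }  // hypothetical

     private final String key;
     private final DataCache dataCache;
     private final MetadataCache metadataCache;

     SequentialReadStreamSketch(String key, DataCache dataCache, MetadataCache metadataCache) {
       this.key = key;
       this.dataCache = dataCache;
       this.metadataCache = metadataCache;
     }

     @Override
     public void close() {
       if (key.endsWith(".txt")) {
         // Sequential format: a backward seek over the same data is unlikely,
         // so the cached data blocks are dropped on close.
         dataCache.evict(key);
         // The metadata (HEAD/eTag) entry is NOT evicted here -- the gap noted
         // above, which lets a later overwrite be read with a stale eTag.
         // metadataCache.evict(key);  // the proposed fix
       }
     }
   }
   ```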
   
   On the second run for `array-buffer`, the `vectored-read.txt` file is written again. AAL gets the metadata from the metadata cache and uses that eTag when making the GETs for the block data. On S3 Express the eTag is no longer the MD5 of the object content, so even though the content is the same, the eTag has changed; hence the 412s on the GETs.
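
   To make the failure mode concrete, a minimal sketch using the AWS SDK v2 (the bucket and eTag values are placeholders): a GET conditioned on a stale eTag via `If-Match` is rejected by S3 with HTTP 412.

   ```java
   import software.amazon.awssdk.services.s3.S3Client;
   import software.amazon.awssdk.services.s3.model.GetObjectRequest;
   import software.amazon.awssdk.services.s3.model.S3Exception;

   public class StaleETagGetSketch {
     public static void main(String[] args) {
       try (S3Client s3 = S3Client.create()) {
         GetObjectRequest get = GetObjectRequest.builder()
             .bucket("example-bucket")                // placeholder
             .key("vectored-read.txt")
             .ifMatch("\"stale-etag-from-cache\"")    // eTag cached before the overwrite
             .range("bytes=0-1023")
             .build();
         s3.getObjectAsBytes(get);
       } catch (S3Exception e) {
         // The object was overwritten, so the eTag no longer matches and S3
         // rejects the conditional GET with 412 Precondition Failed.
         System.out.println("status = " + e.statusCode()); // 412
       }
     }
   }
   ```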
   
   On consistency with caching in general:
   
   * AAL provides a `metadatastore.ttl` config; set it to 0 and HEAD responses are never cached. This solves the caching issues we previously had when overwriting files, since with a TTL of 0 we always get the latest version of the file.
   
   * Data blocks are removed once memory usage exceeds the defined memory threshold (2GB), and clean-up happens every 5s by default. The edge case: what if data usage is always below 2GB and data blocks never get evicted? This is why `metadatastore.ttl` was introduced.
   
   * Our `BlockKey`, which is the key under which file data is stored, is a combination of the S3 URI + eTag. If the eTag changes, we'll have a different BlockKey, which means we have no data stored for it (see the sketch after the example below). For example:
   
   ```
   * Data is written to A.parquet, eTag is "1234".
   * A.parquet is read fully into the cache, with key "A.parquet + 1234".
   * A.parquet is overwritten, eTag is "6789".
   * A.parquet is opened for reading again:
   
   If the metadata TTL has not yet expired, the metadata cache still holds eTag `1234`, so AAL will return data from the data cache using key "A.parquet + 1234". If the requested data is not in the data cache, we'll make a GET with the outdated eTag `1234`, which fails with a 412.
   
   If the metadata TTL has expired, a new HEAD request is made and we now have eTag `6789`. This creates a new BlockKey "A.parquet + 6789", and since no data is stored under it, AAL makes GETs for the data.
   ```
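
   For illustration, a minimal sketch of that keying scheme (`BlockKeySketch` is a hypothetical stand-in, not AAL's actual class): because the eTag is part of the key, blocks cached under the old eTag can never be served for the overwritten object.

   ```java
   import java.util.Objects;
   import java.util.concurrent.ConcurrentHashMap;

   // Hypothetical stand-in for the BlockKey idea: cache entries are keyed by
   // (S3 URI, eTag), so a changed eTag always misses and forces fresh GETs.
   final class BlockKeySketch {
     private final String s3Uri;
     private final String eTag;

     BlockKeySketch(String s3Uri, String eTag) {
       this.s3Uri = s3Uri;
       this.eTag = eTag;
     }

     @Override
     public boolean equals(Object o) {
       if (!(o instanceof BlockKeySketch)) {
         return false;
       }
       BlockKeySketch other = (BlockKeySketch) o;
       return s3Uri.equals(other.s3Uri) && eTag.equals(other.eTag);
     }

     @Override
     public int hashCode() {
       return Objects.hash(s3Uri, eTag);
     }

     public static void main(String[] args) {
       ConcurrentHashMap<BlockKeySketch, byte[]> dataCache = new ConcurrentHashMap<>();
       dataCache.put(new BlockKeySketch("s3://bucket/A.parquet", "1234"), new byte[1024]);

       // After the overwrite the eTag is "6789": the lookup misses, so the
       // reader issues new GETs rather than serving blocks from the old version.
       byte[] hit = dataCache.get(new BlockKeySketch("s3://bucket/A.parquet", "6789"));
       System.out.println(hit == null ? "cache miss -> fresh GETs" : "cache hit");
     }
   }
   ```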
   With this we ensure two things:
   
   1/ Once a stream is opened, it will always serve bytes from the same object version, or fail.
   
   2/ Data will be stale for at most `metadatastore.ttl` milliseconds, with the exception of a stream's lifetime.
   
   Basically, if your data changes often, set the metadata TTL to 0 and AAL will always get the latest data. Otherwise we have eventual consistency.
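
   As a sketch of that tuning (the full property name below is an assumption; only the `metadatastore.ttl` setting itself comes from the discussion above):

   ```java
   import org.apache.hadoop.conf.Configuration;

   public class MetadataTtlSketch {
     public static void main(String[] args) {
       Configuration conf = new Configuration();
       // Assumed key: an S3A analytics-accelerator prefix plus the AAL
       // "metadatastore.ttl" setting mentioned above; the exact name may differ.
       conf.setLong("fs.s3a.analytics.accelerator.metadatastore.ttl", 0L);
       // With a TTL of 0, HEAD responses are never cached, so each open() sees
       // the latest object version (at the cost of an extra HEAD per open).
     }
   }
   ```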




> Upgrade AWS SDK to 2.33.x
> -------------------------
>
>                 Key: HADOOP-19654
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19654
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: build, fs/s3
>    Affects Versions: 3.5.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> Upgrade to a recent version of 2.33.x or later while off the critical path of 
> things.
> HADOOP-19485 froze the sdk at a version which worked with third party stores. 
> Apparently the new version works; early tests show that Bulk Delete calls 
> with third party stores complain about lack of md5 headers, so some tuning is 
> clearly going to be needed.


