We're using S3 to store checkpoints, which are taken every minute. I'm
seeing a large number of 404 responses from S3, triggered by requests
from the job manager. The order of the entries in the debug log suggests
that each one is the result of a HEAD request to a key. For example, all
the incidents look like this:


2022-05-11 23:29:00,804 DEBUG com.amazonaws.request [] - Sending
Request: HEAD
https://[MY-BUCKET].s3.amazonaws.com/[MY_JOB]/checkpoints/5f4d6923883a1702b206f978fa3637a3/
Headers:
(amz-sdk-invocation-id: XXXXX, Content-Type: application/octet-stream,
User-Agent: Hadoop 3.1.0, aws-sdk-java/1.11.788
Linux/5.4.181-99.354.amzn2.x86_64 OpenJDK_64-Bit_Server_VM/11.0.13+8
java/11.0.13 scala/2.12.7 vendor/Oracle_Corporation, )

2022-05-11 23:29:00,815 DEBUG com.amazonaws.request [] - Received
error response: com.amazonaws.services.s3.model.AmazonS3Exception: Not
Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not
Found; ......)

The key does in fact exist. How can I go about resolving this?
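
One thing that may be worth ruling out: the HEAD in the log targets a key
ending in "/". S3 keys are a flat namespace, so a HEAD on ".../" returns
200 only if an object with exactly that key exists; objects stored under
that prefix do not count. The sketch below is a toy in-memory model in
plain Python (no AWS calls are made; the helper names and key names are
hypothetical) that illustrates the distinction. It is not a statement
about what the job manager actually does.

```python
# Toy in-memory model of S3's flat key namespace (no AWS calls are made).
# All helper names and key names below are hypothetical.

def head_object(keys, key):
    """Mimic HEAD Object: 200 only if an object with exactly this key exists."""
    return 200 if key in keys else 404

def list_prefix(keys, prefix):
    """Mimic a prefix listing: every stored key that starts with the prefix."""
    return sorted(k for k in keys if k.startswith(prefix))

# Two objects live "under" the checkpoint directory, but no object is
# stored at the directory key itself.
bucket = {
    "job/checkpoints/abc/chk-1/_metadata",
    "job/checkpoints/abc/chk-2/_metadata",
}

prefix = "job/checkpoints/abc/"
status = head_object(bucket, prefix)   # exact-key probe
listing = list_prefix(bucket, prefix)  # prefix probe

print(status, len(listing))  # prints "404 2"
```

If the real bucket behaves like this model, the 404 would mean only that
no object is stored at the exact ".../" key, even though a listing of
that prefix shows the checkpoint files.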

-- 
Cheers,
Aeden

GitHub: https://github.com/aedenj
