[ https://issues.apache.org/jira/browse/IMPALA-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732496#comment-16732496 ]
Steve Loughran commented on IMPALA-6544:
----------------------------------------
Yes: S3A create file does a check to see if a file is there before creation
* if it's a directory: fail fast
* if it's a file and overwrite=false, fail
It's something we've discussed killing in the past, as when we know
overwrite=true, all we care about is whether it's a directory or not: no need to
HEAD the file.
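
To make the saving concrete, here is a toy model of those checks. It is only a sketch: `ToyStore`, `create_current` and `create_optimized` are hypothetical names, the probes are counted in memory, and none of this is the actual S3A code path or its real request count.

```python
from dataclasses import dataclass, field


class FileAlreadyExists(Exception):
    """Raised when the path is a directory or may not be overwritten."""


@dataclass
class ToyStore:
    """Hypothetical in-memory stand-in for an object store; records each probe."""
    files: set = field(default_factory=set)
    dirs: set = field(default_factory=set)
    probes: list = field(default_factory=list)

    def head_file(self, path):
        # Models a HEAD request on the object itself.
        self.probes.append(("HEAD", path))
        return path in self.files

    def probe_dir(self, path):
        # Models whatever request is needed to detect a directory at the path.
        self.probes.append(("DIRCHECK", path))
        return path in self.dirs


def create_current(store, path, overwrite):
    """Checks as described above: look for a file and for a directory."""
    is_file = store.head_file(path)
    if store.probe_dir(path):
        raise FileAlreadyExists(f"{path} is a directory")
    if is_file and not overwrite:
        raise FileAlreadyExists(f"{path} exists and overwrite=false")
    store.files.add(path)


def create_optimized(store, path, overwrite):
    """With overwrite=true, only the directory check is needed."""
    if not overwrite:
        # Without overwrite we still have to know whether a file is there.
        return create_current(store, path, overwrite)
    if store.probe_dir(path):
        raise FileAlreadyExists(f"{path} is a directory")
    store.files.add(path)
```

In this model, overwriting an existing file costs two probes with `create_current` and one with `create_optimized`; the directory case still fails fast in both.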
The other thing is that with the newer createFile() API call, we can add an
S3-specific option to say "skip all the existence checks". A bit dangerous, but
very fast. You had better know what you are doing. The Flink team have asked for
it already.
* If you switch to using S3Guard, DynamoDB gives the consistency
* If you aren't using it, you have other consistency issues lurking
Looking @ the rest of the stack (traces are always interesting), put is doing
an upload to one path, then kicking off a rename; the rename needs its own src
and data checks. Eliminate that temp file (remember, PUT to an object store is
the atomic operation you need), and that'll strip out most of that IO.
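
A rough way to see the difference is to count requests in a toy model. This is a sketch under assumptions: `CountingStore`, the `.tmp` suffix and both upload functions are made up for illustration, and the rename is modelled as two existence checks plus a copy and a delete (S3 has no native rename). The real request counts in S3A will differ.

```python
class CountingStore:
    """Hypothetical in-memory object store that logs every request."""

    def __init__(self):
        self.objects = {}
        self.requests = []

    def put(self, key, data):
        self.requests.append(("PUT", key))
        self.objects[key] = data

    def head(self, key):
        self.requests.append(("HEAD", key))
        return key in self.objects

    def copy(self, src, dest):
        self.requests.append(("COPY", src, dest))
        self.objects[dest] = self.objects[src]

    def delete(self, key):
        self.requests.append(("DELETE", key))
        self.objects.pop(key, None)


def upload_via_temp(store, dest, data):
    """Upload to a temporary key, then 'rename' it onto the destination."""
    tmp = dest + ".tmp"  # hypothetical temp-file suffix
    store.put(tmp, data)
    # The rename does its own checks on both ends before moving anything.
    if not store.head(tmp):
        raise FileNotFoundError(tmp)
    store.head(dest)
    # No rename in an object store: copy the bytes, then delete the source.
    store.copy(tmp, dest)
    store.delete(tmp)


def upload_direct(store, dest, data):
    """A single PUT straight to the destination key."""
    store.put(dest, data)
```

Both functions leave the same single object behind; in this model the temp-file route issues five requests where the direct PUT issues one.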
> Lack of S3 consistency leads to rare test failures
> --------------------------------------------------
>
> Key: IMPALA-6544
> URL: https://issues.apache.org/jira/browse/IMPALA-6544
> Project: IMPALA
> Issue Type: Task
> Components: Frontend
> Affects Versions: Impala 2.8.0
> Reporter: Sailesh Mukil
> Priority: Major
> Labels: S3, broken-build, consistency, flaky, test-framework
>
> Every now and then, we hit a flaky test on S3 runs due to files missing when
> they should be present, and vice versa. We could consider running our tests
> (or a subset of our tests) with S3Guard to avoid these problems, however rare
> they are.