[ https://issues.apache.org/jira/browse/IMPALA-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729196#comment-16729196 ]
Philip Zeyliger commented on IMPALA-6544:
-----------------------------------------
Perhaps depressingly, {{hadoop fs -put}} triggers 24 HTTP requests to S3 to
upload a small (in this case, 29-byte) file:
{code}
[root@philip-bb-3 ~]# HADOOP_ROOT_LOGGER=TRACE,console hadoop fs -put z s3a://..../test$(date +%s) |& grep 'http.wire.*>>.*HTTP/1.1' | grep -n .
1:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-0 >> "HEAD / HTTP/1.1[\r][\n]"
2:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD / HTTP/1.1[\r][\n]"
3:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824 HTTP/1.1[\r][\n]"
4:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824/ HTTP/1.1[\r][\n]"
5:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824%2F&fetch-owner=false HTTP/1.1[\r][\n]"
6:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=&fetch-owner=false HTTP/1.1[\r][\n]"
7:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
8:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_/ HTTP/1.1[\r][\n]"
9:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824._COPYING_%2F&fetch-owner=false HTTP/1.1[\r][\n]"
10:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
11:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_/ HTTP/1.1[\r][\n]"
12:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824._COPYING_%2F&fetch-owner=false HTTP/1.1[\r][\n]"
13:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
14:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_/ HTTP/1.1[\r][\n]"
15:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824._COPYING_%2F&fetch-owner=false HTTP/1.1[\r][\n]"
16:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "PUT /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
17:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
18:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824 HTTP/1.1[\r][\n]"
19:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824/ HTTP/1.1[\r][\n]"
20:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824%2F&fetch-owner=false HTTP/1.1[\r][\n]"
21:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
22:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
23:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "PUT /test1545858824 HTTP/1.1[\r][\n]"
24:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "DELETE /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
{code}
This includes the "HEAD before PUT" pattern that defeats S3's read-after-write
consistency: S3 only guarantees that a newly created object is immediately
readable if no HEAD or GET was issued for that key before the PUT. We definitely
have some tests that use {{hadoop fs -put}}. Even more of them write through
Impala, though, and it's less clear whether the same thing is going on there.
(I've had some trouble getting these HTTP wire logs out.)
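
To make a trace like the one above easier to reason about, here is a minimal Python sketch that tallies the requests by HTTP method and lists the keys that were probed with HEAD before their first PUT. It is an illustration only: the function names ({{parse_requests}}, {{count_methods}}, {{head_before_put}}) and the {{SAMPLE}} excerpt are hypothetical helpers, not part of Hadoop or Impala, and the regular expression assumes the {{http.wire}} line format shown in the trace.

```python
import re
from collections import Counter

# Matches the request line of a wire-log entry such as
#   ... DEBUG http.wire: http-outgoing-1 >> "HEAD /key HTTP/1.1[\r][\n]"
# \s+ also tolerates entries that a mail client wrapped across two lines.
REQUEST_RE = re.compile(r'http\.wire: \S+ >> "([A-Z]+)\s+(\S+)\s+HTTP/1\.1')


def parse_requests(log_text):
    """Return (method, path) pairs in the order they appear in the log."""
    return REQUEST_RE.findall(log_text)


def count_methods(requests):
    """Count requests per HTTP method."""
    return Counter(method for method, _ in requests)


def head_before_put(requests):
    """Return paths that received a HEAD before their first PUT."""
    probed = set()
    flagged = []
    for method, path in requests:
        if method == "HEAD":
            probed.add(path)
        elif method == "PUT" and path in probed and path not in flagged:
            flagged.append(path)
    return flagged


# Hypothetical excerpt: five of the 24 lines from the trace, for illustration.
SAMPLE = r'''
3:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824 HTTP/1.1[\r][\n]"
7:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
16:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "PUT /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
23:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "PUT /test1545858824 HTTP/1.1[\r][\n]"
24:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "DELETE /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
'''

if __name__ == "__main__":
    requests = parse_requests(SAMPLE)
    print(count_methods(requests))
    print(head_before_put(requests))
```

Counting the full 24-line trace by hand gives 15 HEAD, 6 GET, 2 PUT and 1 DELETE, and both PUT targets (the {{._COPYING_}} temporary and the final key) are preceded by a HEAD on the same key.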
> Lack of S3 consistency leads to rare test failures
> --------------------------------------------------
>
> Key: IMPALA-6544
> URL: https://issues.apache.org/jira/browse/IMPALA-6544
> Project: IMPALA
> Issue Type: Task
> Components: Frontend
> Affects Versions: Impala 2.8.0
> Reporter: Sailesh Mukil
> Priority: Major
> Labels: S3, broken-build, consistency, flaky, test-framework
>
> Every now and then, we hit a flaky test on S3 runs due to files missing when
> they should be present, and vice versa. We could consider running our tests
> (or a subset of our tests) with S3Guard to avoid these problems, however rare
> they are.