[ 
https://issues.apache.org/jira/browse/IMPALA-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16729196#comment-16729196
 ] 

Philip Zeyliger commented on IMPALA-6544:
-----------------------------------------

Perhaps depressingly, {{hadoop fs -put}} triggers 24 HTTP requests to S3 to 
upload a small (in this case, 29-byte) file:
{code}
[root@philip-bb-3 ~]# HADOOP_ROOT_LOGGER=TRACE,console hadoop fs -put z s3a://..../test$(date +%s) |& grep 'http.wire.*>>.*HTTP/1.1' | grep -n .
1:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-0 >> "HEAD / HTTP/1.1[\r][\n]"
2:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD / HTTP/1.1[\r][\n]"
3:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824 HTTP/1.1[\r][\n]"
4:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824/ HTTP/1.1[\r][\n]"
5:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824%2F&fetch-owner=false HTTP/1.1[\r][\n]"
6:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=&fetch-owner=false HTTP/1.1[\r][\n]"
7:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
8:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_/ HTTP/1.1[\r][\n]"
9:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824._COPYING_%2F&fetch-owner=false HTTP/1.1[\r][\n]"
10:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
11:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_/ HTTP/1.1[\r][\n]"
12:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824._COPYING_%2F&fetch-owner=false HTTP/1.1[\r][\n]"
13:18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
14:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_/ HTTP/1.1[\r][\n]"
15:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824._COPYING_%2F&fetch-owner=false HTTP/1.1[\r][\n]"
16:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "PUT /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
17:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
18:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824 HTTP/1.1[\r][\n]"
19:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824/ HTTP/1.1[\r][\n]"
20:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "GET /?list-type=2&delimiter=%2F&max-keys=1&prefix=test1545858824%2F&fetch-owner=false HTTP/1.1[\r][\n]"
21:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
22:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
23:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "PUT /test1545858824 HTTP/1.1[\r][\n]"
24:18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "DELETE /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
{code}

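This includes the "HEAD before PUT" behavior that blows away read-after-write 
consistency: S3 only promises read-after-write consistency for a new key if no 
HEAD or GET was issued against that key before the PUT, and here requests 3 
and 7 probe exactly the keys that requests 23 and 16 later write. We definitely 
have some tests that use {{hdfs dfs -put}}. We have even more that use Impala 
to write, though, and it's less clear whether the same thing happens there. 
(I've had some trouble getting these HTTP wire logs out.)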
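
For checking other captured logs for the same pattern, here is a minimal sketch, assuming the log contains {{http.wire}} request lines in the format shown in the listing. The function names ({{parse_requests}}, {{probed_before_put}}) are hypothetical and not part of Hadoop or Impala; the sample lines are copied from requests 3, 7, 16, 17, 23 and 24 of that listing.

```python
import re
from collections import Counter

# Matches the request line inside an HTTP wire-log entry, e.g.
#   ... http.wire: http-outgoing-1 >> "HEAD /key HTTP/1.1[\r][\n]"
# \s+ also tolerates entries that a mail client wrapped across lines.
REQUEST_RE = re.compile(r'>> "([A-Z]+)\s+(\S+)\s+HTTP/1\.1')


def parse_requests(log_text):
    """Return (method, path) tuples in the order they appear in the log."""
    return REQUEST_RE.findall(log_text)


def probed_before_put(requests):
    """Return paths that saw a HEAD or GET before their first PUT."""
    probed, written, flagged = set(), set(), []
    for method, path in requests:
        if method in ("HEAD", "GET"):
            probed.add(path)
        elif method == "PUT":
            if path not in written and path in probed:
                flagged.append(path)
            written.add(path)
    return flagged


# Sample input: six request lines taken from the listing.
SAMPLE = r'''
18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824 HTTP/1.1[\r][\n]"
18/12/26 13:13:46 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "PUT /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "HEAD /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "PUT /test1545858824 HTTP/1.1[\r][\n]"
18/12/26 13:13:47 DEBUG http.wire: http-outgoing-1 >> "DELETE /test1545858824._COPYING_ HTTP/1.1[\r][\n]"
'''

if __name__ == "__main__":
    requests = parse_requests(SAMPLE)
    print(Counter(method for method, _ in requests))
    print(probed_before_put(requests))
```

On the sample, both the {{._COPYING_}} temporary key and the final key are flagged. Run over the full listing, the tally by method comes to 15 HEAD, 6 GET, 2 PUT and 1 DELETE. The list-type GETs never match a PUT path, so this sketch only catches direct probes of the key itself.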

> Lack of S3 consistency leads to rare test failures
> --------------------------------------------------
>
>                 Key: IMPALA-6544
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6544
>             Project: IMPALA
>          Issue Type: Task
>          Components: Frontend
>    Affects Versions: Impala 2.8.0
>            Reporter: Sailesh Mukil
>            Priority: Major
>              Labels: S3, broken-build, consistency, flaky, test-framework
>
> Every now and then, we hit a flaky test on S3 runs due to files missing when 
> they should be present, and vice versa. We could consider running our tests 
> (or a subset of our tests) with S3Guard to avoid these problems, however rare 
> they are.
