[ https://issues.apache.org/jira/browse/HADOOP-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151302#comment-16151302 ]

Aaron Fabbri commented on HADOOP-13421:
---------------------------------------

Whoever added the forced list response paging to ITestS3AContractGetFileStatus, 
thank you.  I was going to add that, and I see it is already there.

It also explains why that test was timing out with the v2 list: it was not 
just my slow home internet. I needed to change this bit:

{noformat}
diff --git a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
index e8b739432d1..eb80d37a12f 100644
--- a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
+++ b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
@@ -1113,7 +1113,7 @@ protected ListObjectsV2Result continueListObjects(ListObjectsV2Request req,
       ListObjectsV2Result objects) {
     incrementStatistic(OBJECT_CONTINUE_LIST_REQUESTS);
     incrementReadOperations();
-    req.setContinuationToken(objects.getContinuationToken());
+    req.setContinuationToken(objects.getNextContinuationToken());
     return s3.listObjectsV2(req);
   }
{noformat}

So, the v2 response has two continuation token fields, {{ContinuationToken}} 
and {{NextContinuationToken}}.  Turns out I was using the former, which only 
echoes the token sent with the request, so I was retrieving the same 2 results 
over and over.  Gave me a giggle, had to share.  V2 patch coming soon.
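
For anyone who wants to see the failure mode in isolation, here is a minimal sketch that does not use the AWS SDK at all. {{PagingSketch}}, {{FakeS3}}, {{Page}} and {{listAll}} are hypothetical names invented for this illustration, and the tokens are simply stringified offsets. The only assumption carried over from the real API is that the response's {{ContinuationToken}} echoes the token sent with the request, while {{NextContinuationToken}} points at the following page. The {{maxRequests}} cap exists only so that the buggy variant terminates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a paged v2 listing; this is NOT the AWS SDK. */
class PagingSketch {

    /** Mimics the two token fields carried by a v2 list response. */
    static final class Page {
        final List<String> keys;
        final String continuationToken;      // echo of the token sent in the request
        final String nextContinuationToken;  // token for the following page; null on the last page

        Page(List<String> keys, String continuationToken, String nextContinuationToken) {
            this.keys = keys;
            this.continuationToken = continuationToken;
            this.nextContinuationToken = nextContinuationToken;
        }

        boolean isTruncated() {
            return nextContinuationToken != null;
        }
    }

    /** Serves allKeys in pages of pageSize; a token is just the start offset as a string. */
    static final class FakeS3 {
        private final List<String> allKeys;
        private final int pageSize;
        int requests = 0;  // number of list calls made so far

        FakeS3(List<String> allKeys, int pageSize) {
            this.allKeys = allKeys;
            this.pageSize = pageSize;
        }

        Page list(String token) {
            requests++;
            int start = (token == null) ? 0 : Integer.parseInt(token);
            int end = Math.min(start + pageSize, allKeys.size());
            String next = (end < allKeys.size()) ? Integer.toString(end) : null;
            return new Page(new ArrayList<>(allKeys.subList(start, end)), token, next);
        }
    }

    /**
     * Collects keys page by page. With useNextToken == false the loop resends
     * the echoed token and keeps fetching the first page until maxRequests is hit.
     */
    static List<String> listAll(FakeS3 s3, boolean useNextToken, int maxRequests) {
        List<String> out = new ArrayList<>();
        Page page = s3.list(null);
        out.addAll(page.keys);
        while (page.isTruncated() && s3.requests < maxRequests) {
            String token = useNextToken ? page.nextContinuationToken : page.continuationToken;
            page = s3.list(token);
            out.addAll(page.keys);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e");
        // Correct: follows the next-page token and sees every key once.
        System.out.println(listAll(new FakeS3(keys, 2), true, 10));
        // Buggy: resends the echoed token and sees the first page repeatedly.
        System.out.println(listAll(new FakeS3(keys, 2), false, 10));
    }
}
```

With a page size of 2, the buggy variant keeps returning the first two keys until the request cap is reached, which matches the symptom described above.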

> Switch to v2 of the S3 List Objects API in S3A
> ----------------------------------------------
>
>                 Key: HADOOP-13421
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13421
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steven K. Wong
>            Assignee: Aaron Fabbri
>            Priority: Minor
>         Attachments: HADOOP-13421-HADOOP-13345.001.patch
>
>
> Unlike [version 
> 1|http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html] of the 
> S3 List Objects API, [version 
> 2|http://docs.aws.amazon.com/AmazonS3/latest/API/v2-RESTBucketGET.html] by 
> default does not fetch object owner information, which S3A doesn't need 
> anyway. By switching to v2, there will be less data to transfer/process. 
> Also, it should be more robust when listing a versioned bucket with "a large 
> number of delete markers" ([according to 
> AWS|https://aws.amazon.com/releasenotes/Java/0735652458007581]).
> Methods in S3AFileSystem that use this API include:
> * getFileStatus(Path)
> * innerDelete(Path, boolean)
> * innerListStatus(Path)
> * innerRename(Path, Path)
> Requires AWS SDK 1.10.75 or later.



