[
https://issues.apache.org/jira/browse/HADOOP-17281?focusedWorklogId=495327&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-495327
]
ASF GitHub Bot logged work on HADOOP-17281:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 05/Oct/20 13:41
Start Date: 05/Oct/20 13:41
Worklog Time Spent: 10m
Work Description: mukund-thakur commented on pull request #2354:
URL: https://github.com/apache/hadoop/pull/2354#issuecomment-703639827
> Looks good. Annoying about the return types which force you to do that
> wrapping/casting. Can't you just forcibly cast the return type of the inner
> iterator? After all, type erasure means all type info will be lost in the
> actual compiled binary. I'd prefer that as it will give you automatic
> passthrough of the IOStatistics stuff.
This is not possible sadly.
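To illustrate the wrapping being discussed, here is a minimal sketch, not the code from the pull request. It assumes a local stand-in interface with the same shape as org.apache.hadoop.fs.RemoteIterator, plus hypothetical FileStatus/S3AFileStatus stand-ins; WideningRemoteIterator and WideningDemo are invented names. A direct cast from RemoteIterator<S3AFileStatus> to RemoteIterator<FileStatus> is rejected by javac as inconvertible types, so the sketch delegates instead.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Local stand-in mirroring the shape of org.apache.hadoop.fs.RemoteIterator.
interface RemoteIterator<E> {
  boolean hasNext() throws IOException;
  E next() throws IOException;
}

// Hypothetical stand-ins for the status types; only the subtype relation matters.
class FileStatus {
  final String path;
  FileStatus(String path) { this.path = path; }
}

class S3AFileStatus extends FileStatus {
  S3AFileStatus(String path) { super(path); }
}

// Exposes an iterator over a subtype as an iterator over the supertype
// by delegating every call to the wrapped iterator.
final class WideningRemoteIterator<T> implements RemoteIterator<T> {
  private final RemoteIterator<? extends T> source;

  WideningRemoteIterator(RemoteIterator<? extends T> source) {
    this.source = source;
  }

  @Override public boolean hasNext() throws IOException { return source.hasNext(); }
  @Override public T next() throws IOException { return source.next(); }
}

final class WideningDemo {
  // Adapts an in-memory list to the stand-in interface, for the demo only.
  static <E> RemoteIterator<E> fromList(List<E> items) {
    Iterator<E> it = items.iterator();
    return new RemoteIterator<E>() {
      @Override public boolean hasNext() { return it.hasNext(); }
      @Override public E next() { return it.next(); }
    };
  }

  static List<String> listPaths() throws IOException {
    RemoteIterator<S3AFileStatus> inner = fromList(Arrays.asList(
        new S3AFileStatus("s3a://bucket/a"),
        new S3AFileStatus("s3a://bucket/b")));
    // A direct cast, (RemoteIterator<FileStatus>) inner, does not compile.
    RemoteIterator<FileStatus> widened = new WideningRemoteIterator<>(inner);
    List<String> paths = new ArrayList<>();
    while (widened.hasNext()) {
      paths.add(widened.next().path);
    }
    return paths;
  }
}
```

One cost of a wrapper like this is that callers only see the wrapper's type, so any extra interface implemented by the inner iterator has to be forwarded explicitly. That is presumably why the reviewer wanted a cast for the IOStatistics passthrough.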
>
> Add text to filesystem.md, something which:
>
> * specifies the result is exactly the same as listStatus, provided no other
>   caller updates the directory during the list
> * declares that it's not atomic and performant implementations will page
Done
> * and that if a path isn't there, that fact may not surface until
>   next/hasNext...that is, we do lazy eval for all file IO
>
Actually this is not correct: we do throw FNFE if a path is not there.
Remember, if the listing returns zero results when treating the path as a
directory, we fall back to file checks.
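The fallback described in this reply can be sketched roughly as follows. This is an illustration under assumptions, not the S3A code: the object store is modelled as a hypothetical in-memory sorted set of keys, FallbackListing is an invented name, and empty-directory markers are ignored.

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for an object store: a sorted set of keys.
final class FallbackListing {
  private final TreeSet<String> keys;

  FallbackListing(String... keys) {
    this.keys = new TreeSet<>(Arrays.asList(keys));
  }

  List<String> list(String path) throws FileNotFoundException {
    // Step 1: assume the path is a directory and list everything under it.
    String prefix = path.endsWith("/") ? path : path + "/";
    List<String> children = new ArrayList<>();
    for (String key : keys.tailSet(prefix)) {
      if (!key.startsWith(prefix)) {
        break; // sorted order: no later key can match the prefix
      }
      children.add(key);
    }
    if (!children.isEmpty()) {
      return children;
    }
    // Step 2: zero results, so fall back to checking whether it is a file.
    if (keys.contains(path)) {
      return Collections.singletonList(path);
    }
    // Step 3: neither a directory nor a file.
    throw new FileNotFoundException("No such file or directory: " + path);
  }
}
```

With keys "data/a.txt", "data/b.txt" and "readme.md", listing "data" returns the two children, listing "readme.md" returns the file itself, and listing "missing" throws FileNotFoundException.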
> We need to add similar new contract tests to AbstractContractGetFileStatusTest
> for all to use
>
> * that in a dir with files and subdirectories, you get both returned in
>   the listing
> * that you can iterate through with next() to failure as well as
>   hasNext/next, and get the same results
> * listStatusIterator(file) returns the file
> * listStatusIterator("/") gives you a listing of root (put that in
>   AbstractContractRootDirectoryTest)
>
Done
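The "next() to failure versus hasNext/next" check requested above could look something like the sketch below. It is not the contract test from the pull request: it uses a local stand-in for org.apache.hadoop.fs.RemoteIterator and an in-memory list, and IterationContract and its methods are hypothetical names.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Local stand-in mirroring the shape of org.apache.hadoop.fs.RemoteIterator.
interface RemoteIterator<E> {
  boolean hasNext() throws IOException;
  E next() throws IOException;
}

final class IterationContract {
  // Adapts an in-memory list to the stand-in interface, for the demo only.
  static <E> RemoteIterator<E> fromList(List<E> items) {
    Iterator<E> it = items.iterator();
    return new RemoteIterator<E>() {
      @Override public boolean hasNext() { return it.hasNext(); }
      @Override public E next() { return it.next(); }
    };
  }

  // Style 1: guard every next() call with hasNext().
  static <E> List<E> drainWithHasNext(RemoteIterator<E> it) throws IOException {
    List<E> out = new ArrayList<>();
    while (it.hasNext()) {
      out.add(it.next());
    }
    return out;
  }

  // Style 2: call next() repeatedly until it signals exhaustion.
  static <E> List<E> drainToFailure(RemoteIterator<E> it) throws IOException {
    List<E> out = new ArrayList<>();
    try {
      while (true) {
        out.add(it.next());
      }
    } catch (NoSuchElementException expected) {
      // Reached the end of the listing.
    }
    return out;
  }
}
```

A test would then list the same directory twice, drain one iterator with each style, and assert that the two result lists are equal.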
> And two for changes partway through the iteration
>
> * change the directory during a list to add/delete files
> * deletes the actual path.
>
> These tests can't assert on what will happen, and with paged IO aren't
> likely to pick up on changes...they're there just to show it can be done and
> to pick up on any major issues with implementations.
As discussed not adding these tests.
Issue Time Tracking
-------------------
Worklog Id: (was: 495327)
Time Spent: 40m (was: 0.5h)
> Implement FileSystem.listStatusIterator() in S3AFileSystem
> ----------------------------------------------------------
>
> Key: HADOOP-17281
> URL: https://issues.apache.org/jira/browse/HADOOP-17281
> Project: Hadoop Common
> Issue Type: Task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Mukund Thakur
> Assignee: Mukund Thakur
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Currently S3AFileSystem implements only the listStatus() API, which returns
> an array. Once listStatusIterator() is implemented, clients can benefit from
> the recently added async listing
> (https://issues.apache.org/jira/browse/HADOOP-17074) by performing tasks on
> files while iterating over them.
>
> CC [~stevel]