[
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491002#comment-16491002
]
Andrew Wang commented on HDFS-13616:
------------------------------------
Thanks for taking a look, Xiao and Aaron!
bq. We currently FNFE on the first error. Is it possible a partition is deleted
while another thread is listing halfway for Hive/Impala? What's the expected
behavior from them if so? (I'm lacking the knowledge of this so no strong
preference either way, but curious...)
This case is partly covered by the unit test listSomeDoNotExist: the get()
method throws if listing that path hit an exception, but you can still get
results from the other listing batches returned by the iterator.
If you're talking about listing a single large directory and the directory gets
deleted during the listing, then yes, this API will throw an FNFE, just like the
existing RemoteIterator<FileStatus> API. Paged listings aren't atomic.
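To make the per-path failure behavior concrete, here is a minimal,
self-contained sketch. It does not use the Hadoop classes from the patch: the
PartialListing class, the batchedList helper, and the in-memory "namespace" map
are hypothetical stand-ins, assumed only for illustration. The point is that an
error is stored per path and surfaces only when get() is called for that path,
so results for the other paths remain readable.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchedListingSketch {

  /** Hypothetical per-path result: holds either the entries or a stored error. */
  static final class PartialListing {
    private final String listedPath;
    private final List<String> entries;
    private final IOException error;

    PartialListing(String listedPath, List<String> entries, IOException error) {
      this.listedPath = listedPath;
      this.entries = entries;
      this.error = error;
    }

    /** The path exactly as the caller passed it in. */
    String getListedPath() {
      return listedPath;
    }

    /** Throws the stored error for this path only; other paths are unaffected. */
    List<String> get() throws IOException {
      if (error != null) {
        throw error;
      }
      return entries;
    }
  }

  /** Stand-in for the batched call: one result per requested path, in order. */
  static List<PartialListing> batchedList(Map<String, List<String>> namespace,
                                          List<String> paths) {
    List<PartialListing> results = new ArrayList<>();
    for (String path : paths) {
      List<String> children = namespace.get(path);
      if (children == null) {
        // Record the failure instead of aborting the whole batch.
        results.add(new PartialListing(path, null,
            new FileNotFoundException("Path does not exist: " + path)));
      } else {
        results.add(new PartialListing(path, children, null));
      }
    }
    return results;
  }

  /** Client-side loop: handle each path's outcome independently. */
  static List<String> summarize(List<PartialListing> listings) {
    List<String> lines = new ArrayList<>();
    for (PartialListing listing : listings) {
      try {
        lines.add(listing.getListedPath() + " -> " + listing.get());
      } catch (IOException e) {
        lines.add(listing.getListedPath() + " -> FAILED ("
            + e.getClass().getSimpleName() + ")");
      }
    }
    return lines;
  }

  public static void main(String[] args) {
    Map<String, List<String>> namespace = new HashMap<>();
    namespace.put("/t/p=1", Arrays.asList("a", "b"));
    namespace.put("/t/p=3", Collections.<String>emptyList()); // empty directory
    List<String> paths = Arrays.asList("/t/p=1", "/t/p=2", "/t/p=3");
    for (String line : summarize(batchedList(namespace, paths))) {
      System.out.println(line);
    }
  }
}
```

In this sketch the missing partition /t/p=2 is reported as a failure while
/t/p=1 and the empty /t/p=3 are still returned, each tagged with the path that
was requested.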
bq. If caller added some subdirs to srcs, should we list the subdir twice, or
throw, or 'smartly' list everything at most once?
This is covered by the unit test listSamePaths: the path is listed multiple
times. I didn't see it as the filesystem's role to coalesce these paths;
semantically, I wanted it to behave like the existing
RemoteIterator<FileStatus> API called in a for loop.
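As a rough illustration of the "for loop" semantics, the sketch below uses
hypothetical helpers (listOne, listBatch, dedupe) over an in-memory map rather
than the real filesystem API. The batch performs no de-duplication, so a path
passed twice is listed twice; a caller who wants at-most-once listing can
de-duplicate the inputs first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

public class DuplicatePathsSketch {

  /**
   * Hypothetical single-directory listing. Unknown paths are treated as empty
   * here purely to keep the sketch short.
   */
  static List<String> listOne(Map<String, List<String>> namespace, String path) {
    List<String> out = new ArrayList<>();
    for (String child : namespace.getOrDefault(path, Collections.<String>emptyList())) {
      out.add(path + "/" + child);
    }
    return out;
  }

  /** Batch semantics assumed here: the same as calling listOne in a for loop. */
  static List<String> listBatch(Map<String, List<String>> namespace,
                                List<String> paths) {
    List<String> out = new ArrayList<>();
    for (String path : paths) {
      out.addAll(listOne(namespace, path)); // no coalescing of repeated paths
    }
    return out;
  }

  /** Optional caller-side step: drop repeated paths, keeping first-seen order. */
  static List<String> dedupe(List<String> paths) {
    return new ArrayList<>(new LinkedHashSet<>(paths));
  }

  public static void main(String[] args) {
    Map<String, List<String>> namespace = new HashMap<>();
    namespace.put("/t/p=1", Arrays.asList("f1"));
    List<String> paths = Arrays.asList("/t/p=1", "/t/p=1");
    System.out.println(listBatch(namespace, paths));         // listed twice
    System.out.println(listBatch(namespace, dedupe(paths))); // listed once
  }
}
```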
Aaron, I'll hit your review comments in a new patch rev. Precommit is getting
pretty close, so I'm hoping to coalesce review comments from others before
posting the next one.
bq. Why not just RemoteIterator<FileStatus>?
We need an entry point that can throw an exception for a single path without
killing the entire listing. From a client POV, it's also nice to get back the
same path that was passed in, since HDFS returns absolute, qualified paths. It
also makes the empty directory case easier to understand.
I attached the benchmark I ran for further examination. I think you correctly
answered the use-case question yourself, but to confirm: the Hive/Impala client
already has a list of leaf directories to list, so it'd require some
contortions to use a recursive API like listFiles instead. I imagine a
server-side listFiles (like what S3 has) would be a nice speedup though.
> Batch listing of multiple directories
> -------------------------------------
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
> Issue Type: New Feature
> Affects Versions: 3.2.0
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Priority: Major
> Attachments: BenchmarkListFiles.java, HDFS-13616.001.patch,
> HDFS-13616.002.patch
>
>
> One of the dominant workloads for external metadata services is listing of
> partition directories. This can end up being bottlenecked on RTT when
> partition directories contain a small number of files. This is fairly common,
> since fine-grained partitioning is used for partition pruning by the query
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost.
> Initial benchmarks show a 10-20x improvement in metadata loading performance.