[
https://issues.apache.org/jira/browse/HDFS-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012716#comment-16012716
]
James Clampffer commented on HDFS-10785:
----------------------------------------
Thanks for the improvements [~anatoli.shein]! There's just a few more things I
noticed in my last pass.
1) FsInfo::str takes a URI object. Can we have the str method just format the
contents of the FsInfo struct? If a tool wants to prefix those contents with a
URI it can do it in a more application specific way elsewhere. The main issue
here is that because you include common/uri.h you're pulling implementation
details into the public headers, so those headers alone wouldn't be enough to
link against the library.
2) In the headers for FsInfo, StatInfo, and ContentSummary you include iomanip
and sstream. You can push these includes down to the implementation since they
only need std::string. content_summary.h also includes stdint.h, but c+\+11
should have all the (u)int<width>_t types built in, was there some other reason
you needed to include this? Maybe older compiler with partial c++11 support?
3) in hdfs_ls your handler takes an std::promise by reference. This gets into
race condition issues described in HDFS-11556 which you might be less likely to
hit if you're using a single worker thread. For now just wrap the promise in a
shared_ptr and capture the shared_ptr by copy.
That issue is amplified here (from same block of code):
{code}
if (!has_more_results) {
promise.set_value(); //set promise
return false; //request stop sending results
}
return true; //request more results
{code}
If you have something blocked on that promise it's possible that it unblocks
right away and you have a race for assigning values. It looks like you can get
away with it here but I'd really try to avoid that by having the lambda take a
bool&. Tracing if it's possible to hit that race condition for every usage is
very time consuming and easy to get wrong.
Other:
This is just personal preference so I'm not going to push 1 way or the other,
but IMHO sprintf can be a lot more concise than iostream+iomanip for places
you're doing fixed width output. If you prefer iomanip that's cool too.
Some of the handlers for the recursive async calls may be better off as structs
with an overloaded operator() (lambdas are just shorthand for that anyway).
Lambdas are good for tiny chunks of code but can become confusing once there's
a mix of local variables along with stuff being captured particularly when you
have them inside main(). Doesn't need to be fixed in this patch since the
other tools do it but if you can think of a nice way to encapsulate that
functionality (at least pull if from main) it'd make it easier to new
contributors (and reviewers) to understand and debug.
The recursive functions like chmod/chgrp/chown/du could be simplified a lot by
generalizing FileSystem::Find into a file tree walk
(http://man7.org/linux/man-pages/man3/ftw.3.html) and just plugging in a lambda
that did the modification to change permissions or whatever else. Find does
most of this work already but it seems like things that use it have a lot of
glue code in common.
Another thing that's not specific to your patch but might be worth considering
in the future is when you have long function or method signatures it's hard to
tell what bool flags are doing; especially once that are just at the end to
toggle something like printing extra info. A 2 value enum e.g.
\{INCLUDE_QUOTA, OMIT_QUOTA} in the case of ContentSummary::str(<flag_type>)
makes this more clear at the call site and machine code the compiler emits will
generally look the same.
> libhdfs++: Implement the rest of the tools
> ------------------------------------------
>
> Key: HDFS-10785
> URL: https://issues.apache.org/jira/browse/HDFS-10785
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: Anatoli Shein
> Assignee: Anatoli Shein
> Attachments: HDFS-10785.HDFS-8707.000.patch,
> HDFS-10785.HDFS-8707.001.patch, HDFS-10785.HDFS-8707.002.patch,
> HDFS-10785.HDFS-8707.003.patch, HDFS-10785.HDFS-8707.004.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]