[ 
https://issues.apache.org/jira/browse/HDFS-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012716#comment-16012716
 ] 

James Clampffer commented on HDFS-10785:
----------------------------------------

Thanks for the improvements [~anatoli.shein]!  There's just a few more things I 
noticed in my last pass.

1) FsInfo::str takes a URI object.  Can we have the str method just format the 
contents of the FsInfo struct?  If a tool wants to prefix those contents with a 
URI it can do it in a more application specific way elsewhere.  The main issue 
here is that because you include common/uri.h you're pulling implementation 
details into the public headers, so those headers alone wouldn't be enough to 
link against the library. 

2) In the headers for FsInfo, StatInfo, and ContentSummary you include iomanip 
and sstream.  You can push these includes down to the implementation since they 
only need std::string.  content_summary.h also includes stdint.h, but c+\+11 
should have all the (u)int<width>_t types built in, was there some other reason 
you needed to include this?  Maybe older compiler with partial c++11 support?

3) in hdfs_ls your handler takes an std::promise by reference.  This gets into 
race condition issues described in HDFS-11556 which you might be less likely to 
hit if you're using a single worker thread.  For now just wrap the promise in a 
shared_ptr and capture the shared_ptr by copy.

That issue is amplified here (from same block of code):
{code}
      if (!has_more_results) {
        promise.set_value();  //set promise
        return false;         //request stop sending results
      }
      return true;  //request more results
{code}

If you have something blocked on that promise it's possible that it unblocks 
right away and you have a race for assigning values.  It looks like you can get 
away with it here but I'd really try to avoid that by having the lambda take a 
bool&.  Tracing if it's possible to hit that race condition for every usage is 
very time consuming and easy to get wrong.

Other:
This is just personal preference so I'm not going to push 1 way or the other, 
but IMHO sprintf can be a lot more concise than iostream+iomanip for places 
you're doing fixed width output.  If you prefer iomanip that's cool too.

Some of the handlers for the recursive async calls may be better off as structs 
with an overloaded operator() (lambdas are just shorthand for that anyway).  
Lambdas are good for tiny chunks of code but can become confusing once there's 
a mix of local variables along with stuff being captured particularly when you 
have them inside main().  Doesn't need to be fixed in this patch since the 
other tools do it but if you can think of a nice way to encapsulate that 
functionality (at least pull if from main) it'd make it easier to new 
contributors (and reviewers) to understand and debug.

The recursive functions like chmod/chgrp/chown/du could be simplified a lot by 
generalizing FileSystem::Find into a file tree walk 
(http://man7.org/linux/man-pages/man3/ftw.3.html) and just plugging in a lambda 
that did the modification to change permissions or whatever else.  Find does 
most of this work already but it seems like things that use it have a lot of 
glue code in common.

Another thing that's not specific to your patch but might be worth considering 
in the future is when you have long function or method signatures it's hard to 
tell what bool flags are doing; especially once that are just at the end to 
toggle something like printing extra info.  A 2 value enum e.g. 
\{INCLUDE_QUOTA, OMIT_QUOTA} in the case of ContentSummary::str(<flag_type>) 
makes this more clear at the call site and machine code the compiler emits will 
generally look the same.

> libhdfs++: Implement the rest of the tools
> ------------------------------------------
>
>                 Key: HDFS-10785
>                 URL: https://issues.apache.org/jira/browse/HDFS-10785
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Anatoli Shein
>            Assignee: Anatoli Shein
>         Attachments: HDFS-10785.HDFS-8707.000.patch, 
> HDFS-10785.HDFS-8707.001.patch, HDFS-10785.HDFS-8707.002.patch, 
> HDFS-10785.HDFS-8707.003.patch, HDFS-10785.HDFS-8707.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to