[ 
https://issues.apache.org/jira/browse/HDFS-10672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404346#comment-15404346
 ] 

James Clampffer commented on HDFS-10672:
----------------------------------------

Hey Anatoli, it looks like this no longer applies cleanly to the current head 
of HDFS-8707; could you take a look?  I let my turnaround time for finishing 
reviews get longer than it should be, which could have caused this; sorry about 
that.

{code}
  virtual Status PositionRead(void *buf, size_t buf_size, off_t offset,
                              size_t *bytes_read) = 0;
  virtual Status Read(void *buf, size_t buf_size, size_t *bytes_read) = 0;
{code}
Thanks for changing this and the bits of code the change touched.  The old API, 
which used nbyte as an in/out parameter, was a bad design choice in every 
real-world situation I tried using it in.
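
To illustrate why separate size and result parameters are easier to work with, 
here is a minimal sketch of a read loop against the new signature.  MemoryReader 
and the pared-down Status are hypothetical stand-ins written for this example, 
not libhdfs++ classes; only the Read parameter list is taken from the patch.  
Signalling the end of data with zero bytes read is also just an assumption of 
the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical stand-in for the library's Status type.
struct Status {
  bool ok_ = true;
  bool ok() const { return ok_; }
};

// Hypothetical in-memory reader that mimics the new Read signature.
class MemoryReader {
 public:
  explicit MemoryReader(std::string data) : data_(std::move(data)) {}

  // buf_size is input only; bytes_read is output only.
  // Assumption of this sketch: end of data is reported as zero bytes read.
  Status Read(void *buf, size_t buf_size, size_t *bytes_read) {
    size_t n = std::min(buf_size, data_.size() - pos_);
    std::memcpy(buf, data_.data() + pos_, n);
    pos_ += n;
    *bytes_read = n;
    return Status{};
  }

 private:
  std::string data_;
  size_t pos_ = 0;
};

// Drains the reader.  The buffer size is passed unchanged on every call;
// with an in/out nbyte the caller would have to reset it each iteration.
std::string ReadAll(MemoryReader &reader) {
  char buf[4];
  std::string out;
  size_t bytes_read = 0;
  do {
    Status s = reader.Read(buf, sizeof(buf), &bytes_read);
    assert(s.ok());
    out.append(buf, bytes_read);
  } while (bytes_read > 0);
  return out;
}
```

The caller never has to remember to restore the buffer size after a short read, 
which is the kind of mistake the in/out parameter invited.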

Overall the code looks good.  I think there are a few things worth doing to 
make it a bit better before committing.

1) In cat.c you have some nice URI parsing code.  It looks like you are working 
on several other file system tools that are going to be using something similar 
if they have C versions.  I think it'd be worth pulling out that parsing code 
into its own file so it can be shared between your tools and used by others 
who need a nice C API for that.  You could also write some C wrappers around 
the C++ URI parsing code, because it's more functionally complete, if you want 
to go that route.  You can defer this work until some of your other utility 
tools start landing; I just want to make sure we don't have several tools with 
the same copy/pasted block of code.

2) In cat.cpp you use the same C-style URI code.  Since part of the value of 
this work is to provide good examples of libhdfs API usage, I think you should 
reuse our URI parsing class from libhdfspp/lib/common/uri.h here.

3) Like Bob mentioned earlier, it's worth updating this to use the 
ConfigurationLoader to give you an HdfsConfiguration object that can give you 
an Options object that reflects what people have in /etc/hadoop/conf or 
$HADOOP_CONF_DIR.  I can give you a hand here if you need some references on 
how to do this (at least the way I'd do it).

4) In some of the unit tests:
{code}
file_info->file_length_ = 1; //To avoid running into EOF
{code}
Best to add at least one test that makes sure we do get the EOF status when 
expected.  I've tried to get away with similar "small" changes to other tests 
without new tests and it always ended up being more pain later on than writing 
the test would have.

5) In the changes to status:
{code}
bool invalid_offset() const { return code_ == kInvalidOffset; }
{code}
Not really a blocker, but I try to follow the pattern of is_<some predicate on 
object's current state> to make it really clear that what you're doing is 
strictly a test even when the object isn't const qualified.
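
Points 4 and 5 could be sketched together roughly as follows.  Everything here 
is a hypothetical stand-in: the pared-down Status, FakeFile, and the choice to 
report a read at or past the end of the file through kInvalidOffset are 
assumptions made for illustration, not the behavior of the real classes.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/types.h>  // off_t

// Hypothetical, pared-down Status; the real class has more codes.
class Status {
 public:
  enum Code { kOk, kInvalidOffset };
  explicit Status(Code code) : code_(code) {}
  bool ok() const { return code_ == kOk; }
  // The is_ prefix makes it obvious that this only inspects state.
  bool is_invalid_offset() const { return code_ == kInvalidOffset; }

 private:
  Code code_;
};

// Hypothetical file stand-in that only tracks a length.
struct FakeFile {
  off_t file_length_;

  // Assumption: offsets outside [0, file_length_) yield kInvalidOffset.
  Status PositionRead(void * /*buf*/, size_t buf_size, off_t offset,
                      size_t *bytes_read) const {
    *bytes_read = 0;
    if (offset < 0 || offset >= file_length_) {
      return Status(Status::kInvalidOffset);
    }
    size_t remaining = static_cast<size_t>(file_length_ - offset);
    *bytes_read = buf_size < remaining ? buf_size : remaining;
    return Status(Status::kOk);
  }
};

// Test-style check: an in-range read succeeds, a read past the end does not.
void TestReadPastEnd() {
  FakeFile file{1};
  char buf[8];
  size_t bytes_read = 0;

  Status in_range = file.PositionRead(buf, sizeof(buf), 0, &bytes_read);
  assert(in_range.ok());
  assert(bytes_read == 1);

  Status past_end = file.PositionRead(buf, sizeof(buf), 1, &bytes_read);
  assert(past_end.is_invalid_offset());
  assert(bytes_read == 0);
}
```

A test along these lines keeps the end-of-file path covered even when other 
tests set file_length_ to sidestep it.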

General comment not related to your changes but could be worth discussing 
somewhere:
Factory functions that return a Status and assign to a user-supplied pointer 
lead to weird-looking code (not at all your fault; this is what the API forces):
{code}
FileHandle *file_raw = nullptr;
stat = fs->Open(uri.path, &file_raw);
if (!stat.ok()) {
  cerr << "Could not open file " << uri.path << endl;
  return 1;
}
//wrapping file_raw into a unique pointer to guarantee deletion
unique_ptr<FileHandle> file(file_raw);
{code}

It'd sure be nice if FileSystem::Open could just return a unique_ptr or raw 
pointer IMO.  Google's coding standards pretend/assert that exceptions don't 
exist.  For their code that might be true, but I think this is going to be a 
boilerplate pattern that shows up in any code that uses libhdfs++ but is 
written to use RAII in idiomatic C++.  Do we want to make life easier for those 
users with a more idiomatic constructor?  One thing you might want to do in the 
error-handling block is assert that the pointer is still null (or just delete 
it).
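
One way the boilerplate could be hidden today is a small helper on the caller's 
side.  The sketch below is only an illustration: OpenUnique is a hypothetical 
name, and Status, FileHandle and FileSystem are simplified stand-ins rather than 
the real libhdfs++ types.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the library types.
struct Status {
  bool ok_;
  bool ok() const { return ok_; }
};

struct FileHandle {
  std::string path;
};

struct FileSystem {
  // Mimics the Status-plus-out-pointer factory discussed above.
  Status Open(const std::string &path, FileHandle **handle) {
    if (path.empty()) {
      return Status{false};  // leaves *handle untouched on failure
    }
    *handle = new FileHandle{path};
    return Status{true};
  }
};

// Hypothetical helper: returns the Status together with an owning pointer,
// so callers never hold the raw pointer themselves.
std::pair<Status, std::unique_ptr<FileHandle>> OpenUnique(
    FileSystem &fs, const std::string &path) {
  FileHandle *raw = nullptr;
  Status stat = fs.Open(path, &raw);
  // Take ownership immediately, so nothing leaks even if a failed Open
  // did hand back a handle.
  std::unique_ptr<FileHandle> file(raw);
  if (!stat.ok()) {
    assert(file == nullptr);  // a failed Open should not produce a handle
  }
  return std::make_pair(stat, std::move(file));
}
```

A caller would then check the first element of the pair and use the second 
directly, with no raw pointer in sight.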




> libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++ 
> version of cat tool
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10672
>                 URL: https://issues.apache.org/jira/browse/HDFS-10672
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Anatoli Shein
>            Assignee: Anatoli Shein
>         Attachments: HDFS-10672.HDFS-8707.000.patch, 
> HDFS-10672.HDFS-8707.001.patch, HDFS-10672.HDFS-8707.002.patch, 
> HDFS-10672.HDFS-8707.003.patch
>
>
> src/main/libhdfspp/examples should be structured like 
> examples/language/utility instead of examples/utility/language for easier 
> access by different developers.
> Additionally implementing C++ version of cat tool.


