[
https://issues.apache.org/jira/browse/HDFS-10672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404346#comment-15404346
]
James Clampffer commented on HDFS-10672:
----------------------------------------
Hey Anatoli, it looks like this no longer applies cleanly to the current head
of HDFS-8707, could you take a look? I let my turnaround time for finishing
reviews get longer than it should be which could have caused this; sorry about
that.
{code}
virtual Status PositionRead(void *buf, size_t buf_size, off_t offset, size_t
*bytes_read) = 0;
virtual Status Read(void *buf, size_t buf_size, size_t *bytes_read) = 0;
{code}
Thanks for changing this and the bits of code the change touched. The old API
that used nbyte as an in/out parameter was a bad design choice for every
real-world situation I tried using it on.
Overall the code looks good. I think there's a few things worth doing make it
a bit better before committing.
1) In cat.c you have some nice uri parsing code. It looks like you are working
on several other file system tools that are going to be using something similar
if they have C versions. I think it'd be worth pulling out that parsing code
into it's own file so it can be shared between your tools and used by others
who need a nice C API for that. You could also write some C wrappers around
the C++ URI parsing code because it's more functionally complete if you want to
go that route. You can defer this work until some of your other utility tools
start landing; I just want to make sure we don't have several tools with the
same copy/pasted block of code.
2) In cat.cpp you use the same C-style URI code. Since part of the value of
this work is to provide good examples of libhdfs API usage I think you should
reuse our URI parsing class from libhdfspp/lib/common/uri.h here.
3) Like Bob mentioned earlier it's worth updating this to use the
ConfigurationLoader to give you an HdfsConfiguration object that can give you
an Options object that reflects what people have in /etc/hadoop/conf or
$HADOOP_CONF_DIR. I can give you a hand here if you need some references on
how to do this (at least the way I'd do it).
4) In some of the unit tests:
{code}
file_info->file_length_ = 1; //To avoid running into EOF
{code}
Best to add at least one test that makes sure we do get the EOF status when
expected. I've tried to get away with similar "small" changes to other tests
without new tests and it always ended up being more pain later on than writing
the test would have.
5) In the changes to status:
{code}
bool invalid_offset() const { return code_ == kInvalidOffset; }
{code}
Not really a blocker, but I try to follow the pattern of is_<some predicate on
object's current state> to make it really clear that what you're doing is
strictly a test even when the object isn't const qualified.
General comment not related to your changes but could be worth discussing
somewhere:
Factory functions that return a Status and assign to a user supplied pointer
leads to weird looking code (not at all your fault, this is what the API forces)
{code}
FileHandle *file_raw = nullptr;
stat = fs->Open(uri.path, &file_raw);
if (!stat.ok()) {
cerr << "Could not open file " << uri.path << endl;
return 1;
}
//wrapping file_raw into a unique pointer to guarantee deletion
unique_ptr<FileHandle> file(file_raw);
{code}
It'd sure be nice if FileSystem::Open could just return a unique_ptr or raw
pointer IMO. Google's coding standards pretend/assert that exceptions don't
exist. For their code that might be true, but I think this is going to be a
boilerplate pattern that shows up in any code that uses libhdfs++ but is
written to use RAII in idiomatic C++. Do we want to make life easier for those
users with a more idiomatic constructor? One thing you might want to do in the
error catching block is assert that the pointer is still null (or just call
free on it).
> libhdfs++: reorder directories in src/main/libhdfspp/examples, and add C++
> version of cat tool
> ----------------------------------------------------------------------------------------------
>
> Key: HDFS-10672
> URL: https://issues.apache.org/jira/browse/HDFS-10672
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: Anatoli Shein
> Assignee: Anatoli Shein
> Attachments: HDFS-10672.HDFS-8707.000.patch,
> HDFS-10672.HDFS-8707.001.patch, HDFS-10672.HDFS-8707.002.patch,
> HDFS-10672.HDFS-8707.003.patch
>
>
> src/main/libhdfspp/examples should be structured like
> examples/language/utility instead of examples/utility/language for easier
> access by different developers.
> Additionally implementing C++ version of cat tool.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]