[ https://issues.apache.org/jira/browse/HAWQ-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736565#comment-15736565 ]
William Forson commented on HAWQ-1210:
--------------------------------------
btw, to make that question _a bit_ more specific, I am particularly interested
in knowing how {{hdfsFS}} handles returned by
[this|https://github.com/apache/incubator-hawq/blob/master/depends/libhdfs3/src/client/hdfs.h#L151]
function should be used. For instance:
a) is {{hdfsFS}} construction expensive or cheap?
b) can a single {{hdfsFS}} handle be safely used for concurrent {{hdfsRead}}
operations?
c) can distinct {{hdfsFS}} handles be safely used for concurrent {{hdfsRead}}
operations (i.e. if each handle is only being used for a single read operation
at any given time)?
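To make scenario (c) concrete, here is a minimal sketch of the access pattern in question: each thread creates its own handle, performs one read with it, and releases it. The sketch does not call libhdfs3. {{fake_fs}}, {{fake_connect}}, {{fake_read}}, {{fake_disconnect}} and {{run_per_thread_handles}} are hypothetical stand-ins so that it compiles on its own; in real code they would be an {{hdfsFS}} handle, the connect function linked above, {{hdfsRead}}, and whichever call releases the handle. It therefore says nothing about whether the real library is safe when used this way, which is exactly the open question.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins so the sketch compiles without libhdfs3.
 * They are NOT part of the libhdfs3 API. */
typedef struct { int id; } fake_fs;

static fake_fs *fake_connect(int id) {
    fake_fs *fs = malloc(sizeof *fs);
    if (fs) fs->id = id;
    return fs;
}

/* Pretends to read len bytes; returns the number of bytes "read". */
static int fake_read(fake_fs *fs, char *buf, int len) {
    (void)fs;
    memset(buf, 'x', (size_t)len);
    return len;
}

static void fake_disconnect(fake_fs *fs) { free(fs); }

typedef struct { int id; int ok; } task;

/* Scenario (c): the handle is created, used and released by one thread
 * only, so no handle is ever shared between threads. */
static void *worker(void *arg) {
    task *t = arg;
    char buf[64];
    fake_fs *fs = fake_connect(t->id);
    if (!fs) return NULL;
    t->ok = (fake_read(fs, buf, (int)sizeof buf) == (int)sizeof buf);
    fake_disconnect(fs);
    return NULL;
}

/* Starts up to 16 workers and returns how many completed their read. */
int run_per_thread_handles(int n) {
    pthread_t threads[16];
    task tasks[16];
    int started = 0, ok = 0;
    if (n > 16) n = 16;
    for (int i = 0; i < n; i++) {
        tasks[i].id = i;
        tasks[i].ok = 0;
        if (pthread_create(&threads[i], NULL, worker, &tasks[i]) != 0) break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        ok += tasks[i].ok;
    }
    return ok;
}
```

Calling {{run_per_thread_handles(4)}} from {{main}} starts four workers, each with a private handle. Scenario (b) would differ only in that a single handle is created once and passed to every worker.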
> Documentation regarding usage of libhdfs3 in concurrent environment
> -------------------------------------------------------------------
>
> Key: HAWQ-1210
> URL: https://issues.apache.org/jira/browse/HAWQ-1210
> Project: Apache HAWQ
> Issue Type: Bug
> Components: libhdfs
> Reporter: William Forson
> Assignee: Lei Chang
>
> Hi,
> I've been using libhdfs3 in a single-threaded environment for several months
> now, without any problems. However, as soon as I tried using the library
> concurrently from multiple threads: hello, segfaults.
> Although the source of these segfaults is annoyingly subtle, I've managed to
> isolate it to a relatively small block of my code that does nothing
> interesting aside from using libhdfs3 to download a single hdfs file.
> To be clear: I assume that the mistake here is mine -- that is, that I am
> using your library incorrectly. However, I have been unable to find any
> documentation as to how the libhdfs3 API _should_ be used in a multi-threaded
> environment. I initially interpreted this to mean, "go to town, it's all more
> or less thread-safe", but I am now questioning that interpretation.
> So, I have a question, and a request.
> Question: Are there any known, non-obvious concurrency gotchas regarding the
> usage of libhdfs3 (or whatever it's currently called)?
> Request: Could you please add some documentation, to the README and/or
> hdfs.h, regarding usage in a concurrent environment? (ideally, such notes
> would annotate individual components of the API in hdfs.h, but if the answer
> to my question above is, "No", then this could perhaps be a single sentence
> in the README which affirmatively states that the library is generally safe
> for concurrent usage without additional/explicit synchronization -- anything
> would be better than nothing :))