William Forson created HAWQ-1210:
------------------------------------
Summary: Documentation regarding usage of libhdfs3 in concurrent environment
Key: HAWQ-1210
URL: https://issues.apache.org/jira/browse/HAWQ-1210
Project: Apache HAWQ
Issue Type: Bug
Components: libhdfs
Reporter: William Forson
Assignee: Lei Chang
Hi,
I've been using libhdfs3 in a single-threaded environment for several months
now, without any problems. However, as soon as I tried using the library
concurrently from multiple threads: hello, segfaults.
Although the source of these segfaults is annoyingly subtle, I've managed to
isolate it to a relatively small block of my code that does nothing interesting
aside from using libhdfs3 to download a single hdfs file.
To be clear: I assume that the mistake here is mine -- that is, that I am using
your library incorrectly. However, I have been unable to find any documentation
as to how the libhdfs3 API _should_ be used in a multi-threaded environment. I
initially interpreted this to mean, "go to town, it's all more or less
threadsafe", but I am now questioning that interpretation.
So, I have a question and a request.
Question: Are there any known, non-obvious concurrency gotchas regarding the
usage of libhdfs3 (or whatever it's currently called)?
Request: Could you please add some documentation to the README and/or hdfs.h
regarding usage in a concurrent environment? Ideally, such notes would
annotate individual components of the API in hdfs.h. But if the answer to my
question above is "No", then this could perhaps be a single sentence in the
README which affirmatively states that the library is generally safe for
concurrent usage without additional/explicit synchronization. Anything would
be better than nothing :)