[ https://issues.apache.org/jira/browse/HDFS-573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Nauroth updated HDFS-573:
-------------------------------
Attachment: HDFS-573.1.patch
This patch gets the current trunk/branch-2 libhdfs source code compiling and
working on Windows. Linux-specific code has been either eliminated in favor of
something platform-agnostic, or ported to use the corresponding Windows system
calls. It's a large patch, but unfortunately, I don't see a logical way to
break it into smaller pieces.
Instead of using a lot of conditional compilation like we do in
libhadoop.so/hadoop.dll, the approach is to split platform-specific code into
platform-specific files. CMake selects the correct files for the platform at
build time. I think this yields more legible code. I modeled the source tree
structure after what OpenJDK uses (/os/<platform>).
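As a rough illustration of that layout (these names are simplified stand-ins, not
necessarily the exact headers in the patch), a shared header declares the portable
interface once, and CMake compiles exactly one per-platform implementation file:
{code}
/* os/mutexes.h -- shared, platform-neutral declarations (illustrative names) */
struct hdfsMutex;                       /* opaque; the concrete type differs per OS */
int hdfsMutexLock(struct hdfsMutex *m);
int hdfsMutexUnlock(struct hdfsMutex *m);

/* os/posix/mutexes.c -- compiled only on POSIX platforms:
 * implements the two functions with pthread_mutex_lock/pthread_mutex_unlock. */

/* os/windows/mutexes.c -- compiled only on Windows:
 * implements the same two functions with
 * EnterCriticalSection/LeaveCriticalSection. */
{code}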
All automated tests pass on both Linux and Windows, except for the zero-copy test,
which isn't yet supported on Windows. In addition to the automated tests, I manually
ran test_libhdfs_ops against live clusters running on both Linux and Windows.
Here are details on a number of specific points:
* BUILDING.txt
** I used this opportunity as a testbed for CMake on Windows, and it worked out
great. We might consider doing the same for hadoop-common later instead of
checking in .vcproj files with logic that duplicates the CMake logic. I've
updated the build instructions to indicate that CMake is a requirement on
Windows now.
* pom.xml
** Added steps to trigger the CMake build on Windows.
** Refactored the native_tests logic to use an Ant macro.
** I noticed that test_libhdfs_zerocopy wasn't actually being run, and none of
the tests were running with libhadoop.so/hadoop.dll, so I took the opportunity
to fix that. test_libhdfs_zerocopy only runs on Linux, because Windows doesn't
yet support short-circuit reads, and therefore cannot support zero-copy.
* CMakeLists.txt
** Parameterized various build steps for POSIX vs. Windows platform differences.
** Stopped compiling posix_util.c into libhdfs. Instead, it is compiled in
fuse-dfs, which was the only thing actually using it. This way, we don't need to
port code that isn't really used in libhdfs.
* htable.c/htable.h
** libhdfs keeps a very small hash table mapping class names to class
references. This had been implemented using the POSIX {{hcreate}} and
{{hsearch}} functions, which aren't available on Windows. The simplest solution
was to take the hash table code from the HADOOP-10388 branch. These files are
identical to the code on the feature branch, where it's already been code
reviewed and +1'd once. (A rough sketch of the replacement interface appears
after this list.)
* jni_helper.c
** Removed the {{hdfsTls}} struct. We can store the {{JNIEnv}} pointer
directly in thread-local storage, so we don't need this container struct. (See
the sketch after this list.)
* mutexes.c
** The Windows version needs some linker trickery to guarantee that each
{{CRITICAL_SECTION}} is initialized. The comments explain this in detail, and a
rough sketch of the technique appears after this list.
* thread_local_storage.c
** In the Windows version, it was pretty challenging to recreate the logic of
using a pthreads thread-local storage key destructor to detach the thread from
the JVM on exit. Windows doesn't offer a simple API for hooking into thread
shutdown, but the portable executable format does define a place for
thread-local storage callbacks. This involves more linker trickery; details are
in the comments, and a rough sketch appears after this list.
* Stopped using C99 constructs and stuck to C89 in various files (a short
example appears after this list).
** Declared local variables at the top of the function.
** Stopped using designated initializers on structs.
** Stopped using variable-length arrays.
* Cleaned up warnings in various files:
** implicit conversions
** losses of precision
** assignments from conditionals
* In several files, renamed internal constants that clashed with names in the
Windows headers.
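A few illustrative sketches for the points above; all names in these snippets are
made up for illustration and aren't necessarily what the patch uses.
For htable.c/htable.h: {{hcreate}}/{{hsearch}} manage a single process-wide table
and don't exist on Windows, so the class cache now uses a small heap-allocated
table instead. Roughly:
{code}
#include <search.h>  /* hcreate/hsearch: POSIX (XSI), unavailable on Windows */

/* Old approach (roughly): one process-wide table keyed by class name. */
static void oldStyleClassCache(const char *className, void *classRef)
{
    ENTRY e, *found;
    hcreate(16);                 /* creates the single global table */
    e.key = (char *)className;
    e.data = classRef;
    hsearch(e, ENTER);           /* insert */
    found = hsearch(e, FIND);    /* lookup */
    (void)found;
}

/* New approach (sketch only; see htable.h from the HADOOP-10388 branch for
 * the real interface): an ordinary allocated table that works anywhere.
 *
 *   struct htable *cache = htable_alloc(...);
 *   htable_put(cache, key, classRef);
 *   void *ref = htable_get(cache, key);
 */
{code}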
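For jni_helper.c: the value stored under the thread-local key can be the
{{JNIEnv}} pointer itself rather than a one-member wrapper struct. A minimal
POSIX-flavored sketch (the key name is illustrative):
{code}
#include <pthread.h>
#include <jni.h>

static pthread_key_t gTlsKey;   /* illustrative; created once elsewhere */

/* Before: struct hdfsTls { JNIEnv *env; } was allocated just to hold env.
 * After: store the JNIEnv pointer directly; no allocation needed. */
static void rememberEnv(JNIEnv *env)
{
    pthread_setspecific(gTlsKey, env);
}

static JNIEnv *recallEnv(void)
{
    return (JNIEnv *)pthread_getspecific(gTlsKey);
}
{code}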
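For mutexes.c on Windows: the general shape of the linker trick is the standard
MSVC pattern for running an initializer during CRT startup, before {{main}}.
This is only a sketch of that pattern, not a copy of the patch:
{code}
#include <windows.h>

static CRITICAL_SECTION gExampleMutex;   /* illustrative */

/* Runs during CRT startup, so the CRITICAL_SECTION is guaranteed to be
 * initialized before any thread can try to lock it. */
static int __cdecl initMutexes(void)
{
    InitializeCriticalSection(&gExampleMutex);
    return 0;
}

/* Place a pointer to the initializer in the CRT's user-initializer section;
 * the startup code walks this section and calls each function it finds. */
#pragma section(".CRT$XCU", read)
__declspec(allocate(".CRT$XCU"))
static int (__cdecl *pInitMutexes)(void) = initMutexes;
{code}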
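For thread_local_storage.c on Windows: the PE thread-local storage callback
mechanism looks roughly like this (64-bit MSVC shown; 32-bit builds use
underscore-decorated symbol names, and the real file's comments have the full
details):
{code}
#include <windows.h>

/* Invoked by the loader for this image on thread attach/detach. On
 * DLL_THREAD_DETACH this is where the exiting thread would be detached from
 * the JVM, mirroring the pthreads key destructor used on POSIX. */
static void NTAPI tlsCallback(PVOID dllHandle, DWORD reason, PVOID reserved)
{
    if (reason == DLL_THREAD_DETACH) {
        /* e.g. (*vm)->DetachCurrentThread(vm) for the cached JavaVM */
    }
}

/* Keep the PE TLS directory and our callback pointer from being discarded,
 * and place the pointer in .CRT$XLB, which the TLS directory points at. */
#pragma comment(linker, "/INCLUDE:_tls_used")
#pragma comment(linker, "/INCLUDE:pTlsCallback")
#pragma const_seg(".CRT$XLB")
extern const PIMAGE_TLS_CALLBACK pTlsCallback;
const PIMAGE_TLS_CALLBACK pTlsCallback = tlsCallback;
#pragma const_seg()
{code}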
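And to make the C89 point concrete, a small made-up example of the kinds of
rewrites involved:
{code}
#include <stdlib.h>

struct config { int port; int timeout; };

void c99_style(int n)
{
    struct config c = { .port = 8020, .timeout = 30 };  /* designated initializer */
    int buf[n];                                         /* variable-length array  */
    for (int i = 0; i < n; i++) {                       /* mid-block declaration  */
        buf[i] = c.port;
    }
}

void c89_style(int n)
{
    struct config c;
    int *buf;
    int i;                            /* all declarations at the top of the block */

    c.port = 8020;                    /* field-by-field initialization            */
    c.timeout = 30;
    buf = malloc(n * sizeof(int));    /* heap allocation instead of a VLA         */
    if (buf == NULL) {
        return;
    }
    for (i = 0; i < n; i++) {
        buf[i] = c.port;
    }
    free(buf);
}
{code}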
libwebhdfs is not covered in this patch. That would need to be handled
separately.
Similarly, vecsum is not covered in this patch. We'd need to port the
sys/mman.h functions to get that working.
fuse-dfs is unchanged; I believe FUSE isn't supported on Windows.
> Porting libhdfs to Windows
> --------------------------
>
> Key: HDFS-573
> URL: https://issues.apache.org/jira/browse/HDFS-573
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Environment: Windows, Visual Studio 2008
> Reporter: Ziliang Guo
> Assignee: Chris Nauroth
> Attachments: HDFS-573.1.patch
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> The current C code in libhdfs is written using C99 conventions and also uses
> a few POSIX specific functions such as hcreate, hsearch, and pthread mutex
> locks. To compile it using Visual Studio would require a conversion of the
> code in hdfsJniHelper.c and hdfs.c to C89 and replacement/reimplementation of
> the POSIX functions. The code also uses the stdint.h header, which is not
> part of the original C89, but there appears to be a BSD-licensed
> reimplementation floating around that was written to be compatible with MSVC. I have
> already done the other necessary conversions, as well as created a simplistic
> hash bucket for use with hcreate and hsearch and successfully built a DLL of
> libhdfs. Further testing is needed to see if it is usable by other programs
> to actually access hdfs, which will likely happen in the next few weeks as
> the Condor Project continues with its file transfer work.
> In the process, I've removed a few consts that I believe are extraneous and
> also fixed an incorrect array initialization where someone was attempting to
> initialize with something like this: JavaVMOption options[noArgs]; where
> noArgs was being incremented in the code above. This was in the
> hdfsJniHelper.c file, in the getJNIEnv function.
--
This message was sent by Atlassian JIRA
(v6.2#6252)