[ 
https://issues.apache.org/jira/browse/HDFS-573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-573:
-------------------------------

    Attachment: HDFS-573.1.patch

This patch gets the current trunk/branch-2 libhdfs source code compiling and 
working on Windows.  Linux-specific code has been either eliminated in favor of 
something platform-agnostic, or ported to use the corresponding Windows system 
calls.  It's a large patch, but unfortunately, I don't see a logical way to 
break it into smaller pieces.

Instead of using a lot of conditional compilation like we do in 
libhadoop.so/hadoop.dll, the approach is to split platform-specific code into 
platform-specific files.  CMake selects the correct files for the platform at 
build time.  I think this yields more legible code.  I modeled the source tree 
structure after what OpenJDK uses (/os/<platform>).
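
To illustrate the split, here is a minimal sketch.  It is not code from the patch: the file names in the comments and the names {{os_mutex}}, {{os_mutex_lock}}, {{os_mutex_unlock}}, {{counter_mutex}} and {{increment_counter}} are hypothetical, and the three "files" are shown in one listing only so that it compiles as a single unit on a POSIX system.  A Windows build would compile a different platform header and implementation file behind the same shared interface.

```c
#include <pthread.h>

/* ---- os/posix/platform.h (hypothetical): picked by the build on POSIX ---- */
typedef pthread_mutex_t os_mutex;

/* ---- os/mutexes.h (hypothetical): shared, platform-agnostic interface ---- */
extern os_mutex counter_mutex;
int os_mutex_lock(os_mutex *m);
int os_mutex_unlock(os_mutex *m);

/* ---- os/posix/mutexes.c (hypothetical): POSIX implementation ---- */
os_mutex counter_mutex = PTHREAD_MUTEX_INITIALIZER;

int os_mutex_lock(os_mutex *m)
{
    return pthread_mutex_lock(m);
}

int os_mutex_unlock(os_mutex *m)
{
    return pthread_mutex_unlock(m);
}

/* ---- caller.c (hypothetical): shared code, no #ifdef required ---- */
static int counter;

/* Returns the incremented counter value, or -1 if the lock fails. */
int increment_counter(void)
{
    int value;

    if (os_mutex_lock(&counter_mutex) != 0) {
        return -1;
    }
    value = ++counter;
    os_mutex_unlock(&counter_mutex);
    return value;
}
```

The calling code only ever sees the shared interface, so the platform choice stays in the build configuration rather than in preprocessor branches.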

All automated tests pass on both Linux and Windows, except for zero-copy, which 
isn't yet supported on Windows.  In addition to the automated tests, I manually 
ran test_libhdfs_ops against live clusters running on both Linux and Windows.

Here are details on specific points:

* BUILDING.txt
** I used this opportunity as a testbed for CMake on Windows, and it worked out 
great.  We might consider doing the same for hadoop-common later instead of 
checking in .vcproj files with logic that duplicates the CMake logic.  I've 
updated the build instructions to indicate that CMake is a requirement on 
Windows now.
* pom.xml
** Added steps to trigger the CMake build on Windows.
** Refactored logic of native_tests to use an Ant macro.
** I noticed that test_libhdfs_zerocopy wasn't actually being run, and none of 
the tests were running with libhadoop.so/hadoop.dll, so I took the opportunity 
to fix that.  test_libhdfs_zerocopy only runs on Linux, because Windows doesn't 
yet support short-circuit reads, and therefore cannot support zero-copy.
* CMakeLists.txt
** Parameterized various build steps for POSIX vs. Windows platform differences.
** Don't compile posix_util.c.  Instead, compile it in fuse-dfs, which was the 
only thing actually using it.  This way, we don't need to port code that isn't 
really used in libhdfs.
* htable.c/htable.h
** libhdfs keeps a very small hash table mapping class names to class 
references.  This had been implemented using the POSIX {{hcreate}} and 
{{hsearch}} functions, which Windows doesn't provide.  The simplest solution 
was to take the hash table code from the HADOOP-10388 branch.  These files are 
identical to the code on the feature branch, where it's already been code 
reviewed and +1'd once.
* jni_helper.c
** Removed the {{hdfsTls}} struct.  We can store the {{JNIEnv}} pointer 
directly into thread-local storage, so we don't need this container struct.
* mutexes.c
** The Windows version needs to do some linker trickery to guarantee 
initialization of each {{CRITICAL_SECTION}}.  The comments explain this in 
detail.
* thread_local_storage.c
** In the Windows version, it was pretty challenging to recreate the logic of 
using a pthreads thread-local storage key destructor to detach the thread from 
the JVM on exit.  Windows doesn't offer a simple API for hooking onto a thread 
shutdown event, but the portable executable format does define a place for 
thread-local storage callbacks.  This involves more linker trickery.  Details 
are in the comments.
* Stop using C99 constructs and stick to C89 in various files.
** Declare local variables at the top of the function.
** Don't use designated initializers on structs.
** Don't use variable-length arrays.
* Clean up warnings in various files.
** implicit conversions
** losses of precision
** assignments from conditionals
* Several files needed to rename internal constants that clashed with names in 
Windows headers.
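
For the C89 points above, here is a small sketch of the kind of rewrite involved.  It is an illustration only, not code from the patch: {{struct vm_option}} and {{make_options}} are hypothetical stand-ins for a JNI-style option array.  The comment shows the C99 forms being avoided, and the function shows the C89 equivalents.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a JNI-style option struct. */
struct vm_option {
    const char *text;
    void *extra;
};

/*
 * A C99 version might have used, roughly:
 *     struct vm_option opts[count];              (variable-length array)
 *     struct vm_option first = { .text = "a" };  (designated initializer)
 *     for (size_t i = 0; i < count; i++)         (declaration inside for)
 * The C89 version below heap-allocates the array, declares all locals at
 * the top of the function, and assigns struct fields one by one.
 */
struct vm_option *make_options(const char **texts, size_t count)
{
    struct vm_option *opts;
    size_t i;

    if (count == 0) {
        return NULL;
    }
    opts = malloc(count * sizeof(*opts));
    if (opts == NULL) {
        return NULL;
    }
    for (i = 0; i < count; i++) {
        opts[i].text = texts[i];
        opts[i].extra = NULL;
    }
    return opts;  /* caller releases the array with free() */
}
```

The trade-off is that the caller now owns a heap allocation and has to free it, which a variable-length array would not have required.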
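
For the hash table point above, this is roughly the shape of a small, portable table that can stand in for {{hcreate}}/{{hsearch}}.  It is a minimal sketch and not the HADOOP-10388 code: the names {{table_alloc}}, {{table_put}}, {{table_get}} and {{table_free}} are hypothetical, the capacity is fixed, keys are borrowed rather than copied, and there is no removal.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct entry {
    const char *key;   /* borrowed pointer; NULL marks an empty slot */
    void *value;
};

struct table {
    struct entry *slots;
    size_t capacity;
    size_t used;
};

/* djb2-style string hash reduced to a slot index. */
static size_t hash_string(const char *s, size_t capacity)
{
    unsigned long h = 5381;

    while (*s != '\0') {
        h = h * 33 + (unsigned char)*s;
        s++;
    }
    return (size_t)(h % capacity);
}

struct table *table_alloc(size_t capacity)
{
    struct table *t;
    size_t i;

    if (capacity == 0) {
        return NULL;
    }
    t = malloc(sizeof(*t));
    if (t == NULL) {
        return NULL;
    }
    t->slots = malloc(capacity * sizeof(*t->slots));
    if (t->slots == NULL) {
        free(t);
        return NULL;
    }
    for (i = 0; i < capacity; i++) {
        t->slots[i].key = NULL;
        t->slots[i].value = NULL;
    }
    t->capacity = capacity;
    t->used = 0;
    return t;
}

/* Inserts or overwrites; returns 0 on success, -1 if the table is full. */
int table_put(struct table *t, const char *key, void *value)
{
    size_t i;
    size_t start = hash_string(key, t->capacity);

    for (i = 0; i < t->capacity; i++) {
        struct entry *e = &t->slots[(start + i) % t->capacity];

        if (e->key == NULL) {
            e->key = key;
            e->value = value;
            t->used++;
            return 0;
        }
        if (strcmp(e->key, key) == 0) {
            e->value = value;
            return 0;
        }
    }
    return -1;
}

/* Linear probing lookup; returns NULL when the key is absent. */
void *table_get(const struct table *t, const char *key)
{
    size_t i;
    size_t start = hash_string(key, t->capacity);

    for (i = 0; i < t->capacity; i++) {
        const struct entry *e = &t->slots[(start + i) % t->capacity];

        if (e->key == NULL) {
            return NULL;
        }
        if (strcmp(e->key, key) == 0) {
            return e->value;
        }
    }
    return NULL;
}

void table_free(struct table *t)
{
    if (t != NULL) {
        free(t->slots);
        free(t);
    }
}
```

Because entries are never removed, a lookup can stop at the first empty slot.  The sketch itself does no locking, so concurrent callers would need to serialize access to the table.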

libwebhdfs is not covered in this patch.  That would need to be handled 
separately.

Similarly, vecsum is not covered in this patch.  We'd need to port the 
sys/mman.h functions to get that working.

fuse-dfs is unchanged.  I believe fuse isn't supported on Windows.


> Porting libhdfs to Windows
> --------------------------
>
>                 Key: HDFS-573
>                 URL: https://issues.apache.org/jira/browse/HDFS-573
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>         Environment: Windows, Visual Studio 2008
>            Reporter: Ziliang Guo
>            Assignee: Chris Nauroth
>         Attachments: HDFS-573.1.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> The current C code in libhdfs is written using C99 conventions and also uses 
> a few POSIX specific functions such as hcreate, hsearch, and pthread mutex 
> locks.  To compile it using Visual Studio would require a conversion of the 
> code in hdfsJniHelper.c and hdfs.c to C89 and replacement/reimplementation of 
> the POSIX functions.  The code also uses the stdint.h header, which is not 
> part of the original C89, but there exists what appears to be a BSD licensed 
> reimplementation written to be compatible with MSVC floating around.  I have 
> already done the other necessary conversions, as well as created a simplistic 
> hash bucket for use with hcreate and hsearch and successfully built a DLL of 
> libhdfs.  Further testing is needed to see if it is usable by other programs 
> to actually access hdfs, which will likely happen in the next few weeks as 
> the Condor Project continues with its file transfer work.
> In the process, I've removed a few consts that I believe are extraneous and 
> also fixed an incorrect array initialization where someone was attempting to 
> initialize with something like this: JavaVMOption options[noArgs]; where 
> noArgs was being incremented in the code above.  This was in the 
> hdfsJniHelper.c file, in the getJNIEnv function.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
