[
https://issues.apache.org/jira/browse/HADOOP-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217316#comment-13217316
]
Steve Loughran commented on HADOOP-8079:
----------------------------------------
Overall, it's a good start, though it could be made a bit more elegant
and easier to test.
Testing is what worries me here: even if the release process & Jenkins test
on Windows, there's no guarantee anyone else will, which increases the
likelihood of a regression sneaking in. The less platform-specific code, the
better.
* The ASF guidelines are not fully followed; all if() clauses should be curly
braced for better long-term maintenance, esp. w/ patches.
* Some of the changes seem IDE-triggered, not OS-related; these should be
removed as they complicate other patches and versions.
* I'm not sure about the "temp hack to copy file" comment above a method in
{{FileUtil}}; it's a bit worrying.
* Even when exceptions are swallowed, a log at debug level is wise. Just in
case something really, really unexpected happens.
* The patches imply that cygwin will never be used again. Is this something
everyone is happy with? I don't personally have any...
* I'm curious why the SymLink code opts to copy a file instead of using
{{::CreateSymbolicLink()}}; I assume that an extended
{{org.apache.hadoop.fs.HardLink}} class will also avoid {{::CreateHardLink()}}.
I know these aren't exported via the Java runtime, but is there no way they
could be invoked by executing something? If that's not possible, then this is a
good time to add {{ln}} to the windows command line.
* {{stop-slave.cmd}} and its siblings use the phrase "Microsoft Hadoop
Distribution". This should not be in the ASF source, and will fall foul of the
ASF trademark rules were it to be used in products not released by the ASF.
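To illustrate the point above about swallowed exceptions, here is a minimal sketch of a close helper that ignores the failure but still records it at debug level. The class and method names ({{QuietCloser}}, {{closeQuietly}}) are hypothetical and not taken from the patch, and it uses {{java.util.logging}} (where FINE is roughly "debug") only so that it compiles on its own; real Hadoop code would use the project's own logging setup.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper for illustration; not part of the patch under review. */
final class QuietCloser {
  static final Logger LOG = Logger.getLogger(QuietCloser.class.getName());

  private QuietCloser() {
  }

  /**
   * Closes the resource without propagating failures.
   * Returns false if close() threw, true otherwise (including for null).
   */
  static boolean closeQuietly(Closeable resource) {
    if (resource == null) {
      return true;
    }
    try {
      resource.close();
      return true;
    } catch (IOException e) {
      // Swallowed on purpose, but still logged at debug (FINE) level so that
      // a really unexpected failure leaves a trace.
      LOG.log(Level.FINE, "Ignoring failure while closing " + resource, e);
      return false;
    }
  }
}
```

The boolean return value is only there to make the behaviour easy to assert in a unit test; a void method would do equally well.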
This is a good opportunity to do better abstraction and so make it possible to
test a lot of the abstracted behaviour (e.g. the file copying), even on Linux,
so ensuring that test coverage is higher across all platforms. For example,
there are a lot of snippets like
{code}
String[] shellCmd = {(Path.WINDOWS)?"cmd":"bash",
(Path.WINDOWS)?"/c":"-c", untarCommand.toString() };
{code}
And
{code}
return (WINDOWS)? new String[]{"cmd", "/c", "df -k " + dirPath + " 2>nul"}:
new String[] {"bash","-c","exec 'df' '-k' '" + dirPath + "' 2>/dev/null"};
}
{code}
I could imagine something to generate a bash command or a wincommand that takes
a list of args
{code}
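// Prefix the arguments with the bash invocation
String[] bashCommand(String[] args) {
  String[] command = new String[args.length + 2];
  command[0] = "bash";
  command[1] = "-c";
  System.arraycopy(args, 0, command, 2, args.length);
  return command;
}
// Prefix the arguments with the Windows cmd invocation
String[] winCommand(String[] args) {
  String[] command = new String[args.length + 2];
  command[0] = "cmd";
  command[1] = "/c";
  System.arraycopy(args, 0, command, 2, args.length);
  return command;
}
// Pick the builder for the current platform
String[] command(String[] args) {
  return (!WINDOWS) ? bashCommand(args) : winCommand(args);
}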
{code}
Similarly, {{quietBashCommand()}} and {{quietWinCommand()}} would set up the null
output. You could test the bash/win command generation at the low level and
verify that what you got is what is expected; unit tests for all platforms.
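As a rough sketch of how that could look, the class below puts the builders in one place and takes the platform as a parameter, so both branches can be checked on any OS. The class name {{ShellCommands}}, the {{wrap}} helper, the {{quietCommand}} method and its {{windows}} flag are all assumptions made for illustration; only the {{bash -c}} / {{cmd /c}} prefixes and the {{2>/dev/null}} / {{2>nul}} redirections come from the snippets quoted above.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and signatures are assumptions. */
final class ShellCommands {
  private ShellCommands() {
  }

  /** Prefixes args with the shell and its "run this command" switch. */
  private static String[] wrap(String shell, String flag, String... args) {
    String[] command = new String[args.length + 2];
    command[0] = shell;
    command[1] = flag;
    System.arraycopy(args, 0, command, 2, args.length);
    return command;
  }

  static String[] bashCommand(String... args) {
    return wrap("bash", "-c", args);
  }

  static String[] winCommand(String... args) {
    return wrap("cmd", "/c", args);
  }

  /** bash variant with stderr sent to /dev/null. */
  static String[] quietBashCommand(String commandLine) {
    return bashCommand(commandLine + " 2>/dev/null");
  }

  /** cmd variant with stderr sent to nul. */
  static String[] quietWinCommand(String commandLine) {
    return winCommand(commandLine + " 2>nul");
  }

  /** The platform is passed in, so a test can exercise both branches. */
  static String[] quietCommand(boolean windows, String commandLine) {
    return windows ? quietWinCommand(commandLine) : quietBashCommand(commandLine);
  }

  public static void main(String[] unused) {
    // Prints the generated command arrays for both platforms.
    System.out.println(Arrays.toString(quietCommand(false, "df -k /tmp")));
    System.out.println(Arrays.toString(quietCommand(true, "df -k C:\\tmp")));
  }
}
```

A unit test can then compare the returned arrays against the expected ones without spawning a process, which is what lets the Windows branch be covered on a Linux build machine.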
> Proposal for enhancements to Hadoop for Windows Server and Windows Azure
> development and runtime environments
> -------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-8079
> URL: https://issues.apache.org/jira/browse/HADOOP-8079
> Project: Hadoop Common
> Issue Type: Improvement
> Components: native
> Affects Versions: 1.0.0
> Reporter: Alexander Stojanovic
> Labels: hadoop
> Attachments: azurenative.zip, general-utils-windows.patch,
> hadoop-8079.AzureBlobStore.patch, hadoop-8079.patch, hadoopcmdscripts.zip,
> mapred-tasks.patch, microsoft-windowsazure-api-0.1.2.jar, security.patch,
> windows-cmd-scripts.patch
>
> Original Estimate: 2,016h
> Remaining Estimate: 2,016h
>
> This JIRA is intended to capture discussion around proposed work to enhance
> Apache Hadoop to run well on Windows. Apache Hadoop has worked on Microsoft
> Windows since its inception, but Windows support has never been a priority.
> Currently Windows works as a development and testing platform for Hadoop, but
> Hadoop is not natively integrated, full-featured or performance and
> scalability tuned for Windows Server or Windows Azure. We would like to
> change this and engage in a dialog with the broader community on the
> architectural design points for making Windows (enterprise and cloud) an
> excellent runtime and deployment environment for Hadoop.
>
> The Isotope team at Microsoft (names below) has developed an Apache Hadoop
> 1.0 patch set that addresses these performance, integration and feature gaps,
> allowing Apache Hadoop to be used with Azure and Windows Server without
> recourse to virtualization technologies such as Cygwin. We have significant
> interest in the deployment of Hadoop across many multi-tenant, PaaS and IaaS
> environments - which bring their own unique requirements.
> Microsoft has recently completed a CCLA with Apache and would like to
> contribute these enhancements back to the Apache Hadoop community.
> In the interest of improving Apache Hadoop so that it runs more smoothly on
> all platforms, including Windows, we propose first contributing this work to
> the Apache community by attaching it to this JIRA. From there we would like
> to work with the community to refine the patch set until it is ready to be
> merged into the Apache trunk.
> Your feedback solicited,
>
> Alexander Stojanovic
> Min Wei
> David Lao
> Lengning Liu
> David Zhang
> Asad Khan