[ https://issues.apache.org/jira/browse/HADOOP-11929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567968#comment-14567968 ]

Colin Patrick McCabe commented on HADOOP-11929:
-----------------------------------------------

bq. Given your comments thus far, let me clue you in: it's happening now, even 
in the nightly build, because we don't have everything turned on and/or it's 
inappropriate for test-patch to do it and/or it's currently impossible to do so 
(e.g., missing dependencies, missing JVMs, etc)..... This is when I throw my 
hands up. It's very obvious at this point you have zero understanding of the 
issues and the purpose of this JIRA.

Allen, I don't appreciate your tone.  I implemented the current CMake native 
build in HADOOP-8368 and was a reviewer on libwebhdfs, fuse-dfs, and most of 
the other components we are discussing here.  You don't need to "clue me in" 
that our unit test coverage (and test coverage on Jenkins) is incomplete.  I am 
well aware of that.  If you look back at JIRAs like HDFS-4003 (test-patch 
should build the common native libs before running hdfs tests) and HDFS-3753, 
you will see that these issues have a long history.

Sometimes the unit test coverage is incomplete because nobody bothered to 
write a good test.  Sometimes it's incomplete because we never bothered to 
install the relevant dependencies on the Jenkins machines (libwebhdfs seems to 
fall into this category).  Openssl fell into this category too, until we fixed 
it recently.

The obvious fix for the libwebhdfs issue is *not* to keep libwebhdfs testing 
disabled, which is what this patch does.  It is simply to install the right 
binaries on the Jenkins machines.  In general, every native component we have 
should be built on every architecture, with three exceptions:
* fuse-dfs.  Fuse is inherently Linux-specific.  (Please do not write a reply 
informing me about MacFUSE.  It's a different project with a different set of 
APIs.  Perhaps it will one day be supported, but not today.)
* the cgroup stuff in YARN.  Again, this is inherently Linux-specific, so it's 
fine and expected to skip it on other architectures until we have a solution 
there.  (A sketch of this kind of skip logic follows the list.)
* the mapreduce native task stuff needs some work to be portable.  The work 
needed is actually small, but so, apparently, is the manpower available to do 
it, so... here we are.
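
To be concrete about the skip logic: here is a rough sketch of how a build 
script might guard the Linux-only bits.  The "linux-native" profile name is 
hypothetical, for illustration only; it is not one of our real profiles.

{code}
# Hedged sketch: only enable Linux-only native components on Linux.
# "-Plinux-native" is a hypothetical profile name, not a real Hadoop one.
PROFILES="-Pnative"
if [[ "$(uname -s)" = "Linux" ]]; then
  # fuse-dfs and the YARN cgroup code are inherently Linux-specific.
  PROFILES="${PROFILES} -Plinux-native"
fi
mvn test ${PROFILES}
{code}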

It sounds like there are a few issues here that you feel this patch solves or 
helps to solve:
1. setting the right combinations of profiles on relevant architectures
2. ensuring that the right combination of maven builds is triggered (i.e., the 
need to trigger mvn install first, and to compile hadoop-common whenever any 
other submodule is being compiled; see the sketch after this list)
3. re-using the Hadoop build scripts for NiFi, HBase and other non-Hadoop 
projects
4. fixing parallel test builds
5. fixing issues with ClientBaseWithFixes.java, TestSSLHttpServer.java, 
KeyStoreTestUtil.java, and some other tests (possibly concurrency related?)
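
For #2, a minimal sketch of the ordering.  The module paths and the -Pnative 
profile are real, but the exact flags test-patch would need may well differ:

{code}
# hadoop-hdfs tests need hadoop-common's native libraries installed first.
# Illustrative flags only; test-patch may need more than this.
mvn install -Pnative -DskipTests -pl hadoop-common-project/hadoop-common
mvn test -Pnative -pl hadoop-hdfs-project/hadoop-hdfs
{code}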

I would like to see #5 split out into a separate patch so that it can be more 
thoroughly reviewed.  Those look like fairly straightforward test fixes.

I am concerned about the complexity of this patch.  We have had huge 
regressions in the past in our testing of the native components, and nobody 
noticed for a while.  While this may well be a valuable change, it adds a lot 
of complexity.  I would like to see a design document that spells out what the 
goals are, how this patch achieves those goals, and what the plans are for the 
future.  It doesn't have to be long, but it does have to spell out what 
concepts like "specializations" are, how the dependency management is supposed 
to work, and what Hadoop developers will have to do to maintain this 
infrastructure in the future.

Until we get such a design document and get a chance to review it, I am -1 on 
this change.  I apologize for bringing out the -1, but I simply want to be 
clear.  I will most likely change my vote once we have time to review this and 
go through all the details.

> add test-patch plugin points for customizing build layout
> ---------------------------------------------------------
>
>                 Key: HADOOP-11929
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11929
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Sean Busbey
>            Assignee: Allen Wittenauer
>            Priority: Minor
>         Attachments: HADOOP-11929.00.patch, HADOOP-11929.01.patch, 
> HADOOP-11929.02.patch, HADOOP-11929.03.patch, HADOOP-11929.04.patch, hadoop.sh
>
>
> Sean Busbey and I had a chat about this at the Bug Bash. Here's the proposal:
>   * Introduce the concept of a 'personality module'.
>   * There can be only one personality.
>   * Personalities provide a single function that takes as input the name of 
> the test currently being processed
>   * This function uses two other built-in functions to define two queues: 
> maven module names, and the profiles to use against those module names
>   * If something needs to be compiled prior to this test (but not actually 
> tested), the personality will be responsible for doing that compilation.
> In Hadoop, the classic example is that hadoop-hdfs needs common compiled with 
> the native bits. So prior to the javac tests, the personality would check 
> CHANGED_MODULES, see hadoop-hdfs, and compile common w/ -Pnative before 
> letting test-patch.sh do the work in hadoop-hdfs. Another example is our lack 
> of test coverage of various native bits. Since these require profiles to be 
> defined prior to compilation, the personality could see that something 
> touches native code, set the appropriate profile, and let test-patch.sh be on 
> its way.
> One way to think of it is as some higher-order logic on top of the automated 
> 'figure out which modules and which tests to run' functions.
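
To make the quoted proposal concrete, here is a rough sketch of what such a 
personality might look like.  The helper names add_maven_module and 
add_maven_profile are hypothetical stand-ins for the two built-in queueing 
functions the proposal describes; they are not real test-patch API names.

{code}
# Hypothetical sketch of a personality function, per the proposal above.
# add_maven_module / add_maven_profile stand in for the two built-in
# queueing functions; the names are illustrative, not real test-patch API.
function hadoop_personality
{
  local testname=$1

  # hadoop-hdfs needs hadoop-common compiled (and installed) with the
  # native bits before its own tests run.
  if [[ "${CHANGED_MODULES}" =~ hadoop-hdfs ]]; then
    mvn install -Pnative -DskipTests -pl hadoop-common-project/hadoop-common
  fi

  # If the patch touches native code, set the appropriate profile.
  if [[ "${CHANGED_MODULES}" =~ native ]]; then
    add_maven_profile native
  fi

  # Queue up the changed modules for test-patch to process.
  add_maven_module "${CHANGED_MODULES}"
}
{code}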


