[ 
https://issues.apache.org/jira/browse/HADOOP-11984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548455#comment-14548455
 ] 

Allen Wittenauer commented on HADOOP-11984:
-------------------------------------------

bq. It seems to me that it is more of an apple vs orange comparison – more 
importantly, does the time parsing TEST-*xml (which takes seconds at maximum) 
actually matter, given the fact that in general Jenkins spends 15 mins to build 
the trunk, and ~2 hours to run the tests?

~2 hours only for HDFS.  The next closest (IIRC) is mapreduce-jobclient which 
comes in at 20 minutes.  Perhaps the HDFS folks should take a serious look at 
re-arranging the universe, not running integration tests in unit tests, start 
paying attention to the nightly build, etc.

bq. Popping up one level – it looks like you have some concerns on moving 
test-patch to other scripting languages that have more choices of libraries.

deadhorse.gif

Python, ruby, etc, all suffer from the same problem: which version do you 
target to get the maximum amount of coverage?  test-patch, like the user-client 
code, MUST be able to run in a variety of hostile environments. (No, Mac OS X 
and Linux are NOT good enough.)  python, frankly, sucks at that because the API 
is continually evolving in incompatible ways.(*)  ... and that's before we even 
get into the morass of add-ons.  And python 3.x.

FWIW, the *only* big portability problem with the current version of 
test-patch.sh that I'm aware of is one usage of GNU diff because I was too lazy 
to write more complex awk to work around it.  Otherwise, it's all POSIX+bash 
3.x and should run even on fairly ancient systems unchanged!  The outlook for 
*forward* compatibility, as a result, is extremely good. It's pretty much 
impossible to do that with most other language choices (including, ironically, 
Java).... except maybe one:

If I had my way, I'd have written this in perl 5.  It's a significantly better 
choice for the things we need to do here (text processing! OS manipulation!) 
and its compatibility across versions deployed with every relatively modern OS 
that I'm aware of is extremely high.  But we don't do perl, have a small 
tolerance for python, and the rest is in bash.  So given those choices, it was 
an easy one to make.

bq. I'm wondering whether there is anything that can be done to improve the 
maintainability and lower the bar to getting involved (e.g., reusing 
libraries from other scripting languages) in the longer term.

There are plenty of people who are fully competent to write decent bash.  We 
just don't invite them into the Hadoop tent.  The number of people contributing 
to the parts that I've rewritten has gone up SIGNIFICANTLY because people who 
have these skills realize that someone is paying attention.  As a side note, I 
personally think it's great if the Java folks feel uncomfortable that code that 
they don't understand is in the system. 

(*) - while working on releasedocmaker, I heard two conflicting things:  "that 
API is deprecated, you should use xyz" and "oh, make sure this works with python 
vx.x".  Guess what? I can't use the non-deprecated API in vx.x.  So deprecated 
APIs here we come, which now means I'm continually answering the question of 
"why does this code use method y?". 



> Enable parallel JUnit tests in pre-commit.
> ------------------------------------------
>
>                 Key: HADOOP-11984
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11984
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: scripts
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-11984.001.patch, HADOOP-11984.002.patch, 
> HADOOP-11984.003.patch, HADOOP-11984.004.patch
>
>
> HADOOP-9287 and related issues implemented the parallel-tests Maven profile 
> for running JUnit tests in multiple concurrent processes.  This issue 
> proposes to activate that profile during pre-commit to speed up execution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
