[ https://issues.apache.org/jira/browse/HADOOP-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754391#action_12754391 ]

Alex Loddengaard commented on HADOOP-6248:
------------------------------------------

Thanks for the feedback, Chris.

{quote}
The idea is interesting and could beget a useful tool, but the current version 
is principally a wrapper for default scripts and settings.
{quote}
As the proposal states, this is a framework, with enough context examples and 
tests to show how the framework is used.  I agree with you that it is currently 
a wrapper, but it will immediately cease to be a wrapper when more interesting 
contexts and tests are written.  Since you are a major contributor to Hadoop 
itself, I would love to hear how you think this tool could make your job 
easier, if at all.  Some of us here at Cloudera, along with at least a few of 
our customers 
and users, would value a framework like this.  Circus will let an organization 
write a context that uses a development cluster of some sort, along with tests 
that emulate their production jobs, to ensure that their jobs are running as 
expected on their development cluster.  Then, by simply switching contexts, the 
organization can run all of their jobs on a different version of Hadoop.  
Perhaps I should write a new, more interesting context to prove my point.

More responses:
{quote}
Don't cut and paste code such as examples.
{quote}
Agreed it's silly to copy-paste the word count example.  This test is a 
demonstration that users can compile Java MapReduce programs in their tests.  I 
find it useful in that regard, but I can write a new MapReduce job that isn't 
an example to demonstrate the compilation use case if you'd like.  I chose the 
word count example specifically so users interested in writing tests would have 
access to a very simple MapReduce program that is compiled on the fly.
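For readers unfamiliar with the compile-on-the-fly step, the shape of what such a test does is roughly the following. The paths and jar names are illustrative assumptions; a real test would execute these commands (via subprocess) against an actual Hadoop install rather than just constructing them:

```python
# Sketch of the "compile a Java MapReduce job on the fly" step a test might
# perform. Paths and jar names here are illustrative assumptions.

def compile_commands(src_file, hadoop_jar, out_dir="classes", job_jar="job.jar"):
    """Return the commands to compile a MapReduce source file and package
    it into a jar that bin/hadoop can subsequently run."""
    return [
        ["javac", "-classpath", hadoop_jar, "-d", out_dir, src_file],
        ["jar", "cf", job_jar, "-C", out_dir, "."],
    ]

cmds = compile_commands("WordCount.java", "/opt/hadoop/hadoop-core.jar")
# A real test would then run each command with subprocess.check_call(cmd).
```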

{quote}
Don't wrap the shell scripts with another level of indirection; they do enough 
of that on their own
{quote}
I assume you're referring to the bin/hadoop-daemon.sh and bin/hadoop scripts, 
right?  I argue that not using these scripts would greatly complicate creating 
new contexts and tests.  I want users of Circus to write contexts and tests in 
a way that they're familiar with; namely, command line tools.  Additionally, 
Circus is meant to test Hadoop end-to-end.  Using the shell scripts helps to 
achieve this goal, especially because Hadoop's unit tests do not test the shell 
scripts.  What are your specific objections to calling bin/hadoop-daemon.sh and 
bin/hadoop, except that doing so is one more level of indirection?
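To illustrate why I think driving the cluster through the stock scripts keeps the tests end-to-end: each step below is exactly what an operator would type at a shell, so the scripts themselves get exercised alongside the daemons. This is an illustrative sketch (the install path and jar name are assumptions, and it only builds the commands rather than running them):

```python
# Illustrative sketch, not the actual Circus code. HADOOP_HOME and the
# job jar are assumed; a real test would run each step via subprocess.

HADOOP_HOME = "/opt/hadoop"  # assumed install location

def daemon(action, daemon_name):
    # e.g. bin/hadoop-daemon.sh start namenode
    return [HADOOP_HOME + "/bin/hadoop-daemon.sh", action, daemon_name]

def hadoop(*args):
    # e.g. bin/hadoop fs -put ...
    return [HADOOP_HOME + "/bin/hadoop"] + list(args)

# A test's lifecycle, expressed as the commands it would run in order:
steps = [
    daemon("start", "namenode"),
    daemon("start", "datanode"),
    hadoop("fs", "-put", "input.txt", "/input"),
    hadoop("jar", "job.jar", "/input", "/output"),
    daemon("stop", "datanode"),
    daemon("stop", "namenode"),
]
```

Bypassing bin/hadoop-daemon.sh and bin/hadoop in favor of invoking the Java classes directly would leave exactly these operator-facing entry points untested.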

{quote}
We try not to include references to specific companies. Certainly Hadoop should 
not be fetched from anywhere but Apache in this distribution.
{quote}
Good catch here.  While scanning the Apache mirror page, I didn't notice a link 
to an apache.org site.  My mistake.

> Circus: Proposal and Preliminary Code for a Hadoop System Testing Framework
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-6248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6248
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: test
>         Environment: Python, bash
>            Reporter: Alex Loddengaard
>         Attachments: HADOOP-6248.diff, HADOOP-6248_v2.diff
>
>
> This issue contains a proposal and preliminary source code for Circus, a 
> Hadoop system testing framework.  At a high level, Circus will help Hadoop 
> users and QA engineers to run system tests on a configurable Hadoop cluster, 
> or distribution of Hadoop.  See the comment below for the proposal itself.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
