[ 
https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14991078#comment-14991078
 ] 

Allen Wittenauer commented on HADOOP-12547:
-------------------------------------------

bq.  I'm curious what the remaining cases are where hadoop-pipes is a better 
option than streaming. 

Why do people use streaming instead of the Java MR API? Or even, why do people 
use Java MR instead of streaming?

Meanwhile, it looks like large chunks of the pipes documentation got dropped 
somewhere between 1.x and 0.23.  It's definitely documented in 1.x:  

1.2.1: 
https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/mapred/pipes/package-summary.html

0.23 through 2.7.1:  
https://hadoop.apache.org/docs/r2.7.1/api/org/apache/hadoop/mapred/pipes/package-summary.html

Now I'm even less inclined to remove it: 

* we have people actually using it
* we have documentation that we can reinstate
* it actually does compile and has compiled for a very long time (albeit not in 
a very convenient way... see the other JIRA to fix that though)
* we haven't removed or deprecated MRv1 yet either, and these two seem fairly 
tied together given the history of why it exists
* missing unit tests... while a concern... well, if we removed everything that 
doesn't have unit tests, we'd be dropping large portions of the source base, 
including pretty much all of the compiled C/C++ code

So yeah, I'm definitely -1 at this point. 

> Deprecate hadoop-pipes
> ----------------------
>
>                 Key: HADOOP-12547
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12547
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>
> Development appears to have stopped on hadoop-pipes upstream for the last few 
> years, aside from very basic maintenance.  Hadoop streaming seems to be a 
> better alternative, since it supports more programming languages and is 
> better implemented.
> There were no responses to a message on the mailing list asking for users of 
> Hadoop pipes... and in my experience, I have never seen anyone use this.  We 
> should remove it to reduce our maintenance burden and build times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
