[
https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14991078#comment-14991078
]
Allen Wittenauer commented on HADOOP-12547:
-------------------------------------------
bq. I'm curious what the remaining cases are where hadoop-pipes is a better
option than streaming.
Why do people use streaming instead of the Java MR API? Or even, why do people
use Java MR instead of streaming?
Meanwhile, it looks like large chunks of the pipes documentation got dropped
somewhere between 1.x and 0.23. It's definitely documented in 1.x:
1.2.1:
https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/mapred/pipes/package-summary.html
0.23 through 2.7.1:
https://hadoop.apache.org/docs/r2.7.1/api/org/apache/hadoop/mapred/pipes/package-summary.html
Now I'm even less inclined to remove it:
* we have people actually using it
* we have documentation that we can re-instate
* it actually does compile and has compiled for a very long time (albeit not in
a very convenient way... see the other JIRA to fix that though)
* we haven't removed or deprecated MRv1 yet either, and these two seem fairly
tied together given the history of why it exists
* missing unit tests... while a concern... well, if we removed everything that
doesn't have unit tests, we'd be dropping large portions of the source base,
including pretty much all of the compiled C/C++ code
So yeah, I'm definitely -1 at this point.
> Deprecate hadoop-pipes
> ----------------------
>
> Key: HADOOP-12547
> URL: https://issues.apache.org/jira/browse/HADOOP-12547
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Priority: Minor
>
> Development appears to have stopped on hadoop-pipes upstream for the last few
> years, aside from very basic maintenance. Hadoop streaming seems to be a
> better alternative, since it supports more programming languages and is
> better implemented.
> There were no responses to a message on the mailing list asking for users of
> Hadoop pipes... and in my experience, I have never seen anyone use this. We
> should remove it to reduce our maintenance burden and build times.