[ 
https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14988229#comment-14988229
 ] 

Colin Patrick McCabe commented on HADOOP-12547:
-----------------------------------------------

Thank you for the perspective, [~aw].  It's true that you have been around for 
longer than me.  However, it's also true that in about 4 years of supporting 
customer Hadoop deployments I have never once seen anyone use or ask about 
Hadoop Pipes.  We've gotten requests for some pretty obscure things-- like 
adding a feature or fixing a bug in fuse_dfs, supporting the obsolete MR1 
framework, preparing native code patches for decades-old versions of AIX, and 
even running Hadoop on JVMs that I'm convinced most people have never heard 
of.  But __never__ for Pipes.

That Stack Overflow post looks like a newbie stumbling into Hadoop for the 
first time and trying to follow a tutorial from more than 5 years ago... and 
failing, because this stuff hasn't been maintained-- and won't be maintained in 
the future.  That's hardly a ringing endorsement of keeping this around.  
Anyway, nobody is proposing removing this from 2.6 or any branch-2 release... 
only from trunk.

bq. Pipes was written primarily for Yahoo!'s search team. It was provided as a 
way for C code to interface with MapReduce without requiring significant 
rewrites. It was definitely in use before I left Yahoo! but I haven't kept 
track of whether it is still being used. My guess is no, given most of that 
team has left/was shipped over to Microsoft.

[~daryn], [~kihwal], do you have any perspective on this?  Is there any reason 
to keep this around in trunk / branch-3.0?  If we are going to keep this, I 
would like to see some unit tests, documentation, and actual maintenance.

> Remove hadoop-pipes
> -------------------
>
>                 Key: HADOOP-12547
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12547
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>
> Development appears to have stopped on hadoop-pipes upstream for the last few 
> years, aside from very basic maintenance.  Hadoop streaming seems to be a 
> better alternative, since it supports more programming languages and is 
> better implemented.
> There were no responses to a message on the mailing list asking for users of 
> Hadoop pipes... and in my experience, I have never seen anyone use this.  We 
> should remove it to reduce our maintenance burden and build times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
