[ 
https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990266#comment-14990266
 ] 

Chris Nauroth commented on HADOOP-12547:
----------------------------------------

Some of this discussion has not been constructive.  I urge everyone to stick to 
the technical points of the debate.

I'm still weighing this, but I have a few other points to mention for 
consideration.

Part of the argument presented here for deprecation/removal is that development 
has halted.  It's worth noting that the flow of patches for MapReduce itself 
has slowed significantly since completion of YARN/MRv2.  By extension, a C++ 
wrapper over MapReduce is going to see even fewer contributions.  I don't think 
patch count alone is a sufficient measure to justify the elimination (or the 
existence) of a component.

I have no direct experience with my users using hadoop-pipes, but I also don't 
see it as a hindrance to maintain if someone like Yahoo does find it useful.  
Another part of the argument for removal was reduced build times.  I do not see 
this component causing a significant delay in build times though.  Granted, 
that's partly due to the lack of tests.

A more telling problem is the lack of tests.  Maybe I'm mistaken, but has the 
documentation vanished too?  These are gaps that don't speak well to the 
long-term viability of the component.  If we cannot come to consensus on 
removal, then we need to commit to filling those gaps.

As a matter of process, I disagree with adding libwebhdfs as a rider to this 
proposal.  I don't think the two are in a comparable state.  However, I do 
agree that libwebhdfs is a much more viable candidate for removal.  We have 
evidence that Pipes was at least used by someone at some time, worked 
correctly, and satisfied its design goals.  I don't believe we have any 
evidence that anyone has ever used libwebhdfs, it still doesn't build properly 
in recent releases, and it does not satisfy its design goal of providing a 
library with no JVM dependency.  (This can be viewed as just a bug, but there 
is also not overwhelming support for bothering to fix it.)

> Deprecate hadoop-pipes
> ----------------------
>
>                 Key: HADOOP-12547
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12547
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>
> Development appears to have stopped on hadoop-pipes upstream for the last few 
> years, aside from very basic maintenance.  Hadoop streaming seems to be a 
> better alternative, since it supports more programming languages and is 
> better implemented.
> There were no responses to a message on the mailing list asking for users of 
> Hadoop pipes... and in my experience, I have never seen anyone use this.  We 
> should remove it to reduce our maintenance burden and build times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
