Re: Experience using binary packages on various Hadoop distros

2015-04-06 Thread Dean Chen
This would be great for those of us running on HDP. At eBay we recently ran into a few problems using the generic Hadoop lib. Two off the top of my head: * Needed to include our custom Hadoop client due to custom Kerberos integration * Minor difference in HDFS protocol causing the following erro

Re: Experience using binary packages on various Hadoop distros

2015-03-25 Thread Marcelo Vanzin
Hey Patrick, The only issue I've seen so far has been the YARN container ID issue. That can technically be described as a breakage in forwards compatibility in YARN. The APIs didn't break, but the data transferred through YARN's protocol has, and the old library cannot understand the data sent

Re: Experience using binary packages on various Hadoop distros

2015-03-24 Thread Patrick Wendell
We can probably better explain that if you are not using HDFS or YARN, you can download any binary. However, my question was about whether the existing binaries fail to work well with newer Hadoop versions, which I've heard some people suggest, but I'm looking for more specific issues. On Tue, Mar 24, 2015

Re: Experience using binary packages on various Hadoop distros

2015-03-24 Thread Jey Kottalam
Could we gracefully fall back to an in-tree Hadoop binary (e.g. 1.0.4) in that case? I think many new Spark users are confused about why Spark has anything to do with Hadoop, e.g. I could see myself being confused when the download page asks me to select a "package type". I know that what I want is

Re: Experience using binary packages on various Hadoop distros

2015-03-24 Thread Matei Zaharia
Just a note, one challenge with the BYOH version might be that users who download that can't run in local mode without also having Hadoop. But if we describe it correctly then hopefully it's okay. Matei > On Mar 24, 2015, at 3:05 PM, Patrick Wendell wrote: > > Hey All, > > For a while we've

Experience using binary packages on various Hadoop distros

2015-03-24 Thread Patrick Wendell
Hey All, For a while we've published binary packages with different Hadoop clients pre-bundled. We currently have three interfaces to a Hadoop cluster: (a) the HDFS client, (b) the YARN client, and (c) the Hive client. Because (a) and (b) are supposed to be backwards compatible interfaces, my working a