Re: Utilize newer hadoop releases WAS: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-27 Thread Patrick Wendell
Hey Ted,
We always intend Spark to work with the newer Hadoop versions and encourage Spark users to use the newest Hadoop versions for best performance. We do try to be liberal in terms of supporting older versions as well. This is because many people run older HDFS versions and we want Spark to

Re: Utilize newer hadoop releases WAS: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-27 Thread Sean Owen
Good idea, although it gets difficult in the context of multiple distributions. Say change X is not present in version A, but present in version B. If you depend on X, what version can you look for to detect it? The distribution will return A or A+X or somesuch, but testing for A will give an
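One common way around the problem described above is to probe for the capability itself rather than for a version string, since a distribution that reports A but ships A+X still exposes X at runtime. The following is only a minimal sketch of that idea using plain Java reflection; the class name `FeatureProbe` and the method `hasMethod` are hypothetical names made up for this illustration and are not part of Spark or Hadoop.

```java
public final class FeatureProbe {
    private FeatureProbe() {}

    /**
     * Returns true if the named class can be loaded and exposes a public
     * method with the given name and parameter types, false otherwise.
     * Hypothetical helper for illustration only.
     */
    public static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class<?> cls = Class.forName(className);
            cls.getMethod(methodName, paramTypes); // throws if the method is absent
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class or method missing (or not loadable): treat the feature as absent.
            return false;
        }
    }
}
```

A caller would check something like `FeatureProbe.hasMethod("java.lang.String", "isEmpty")` once at startup and pick a code path accordingly. The trade-off is that this only detects API presence, not behavioural changes hidden behind an unchanged signature.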

Re: Utilize newer hadoop releases WAS: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-27 Thread Matei Zaharia
We could also do this, though it would be great if the Hadoop project provided this version number as at least a baseline. It's up to distributors to decide which version they report but I imagine they won't remove stuff that's in the reported version number. Matei On Jul 27, 2014, at 1:57
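As a rough illustration of treating the reported version as a baseline, the sketch below compares only the leading numeric part of a version string against a minimum and ignores any distributor suffix. Everything here is an assumption for the example: `VersionBaseline`, `parse`, and `atLeast` are hypothetical names, and the suffixed string in the usage note is just an illustrative shape. In a real deployment the reported string would presumably come from Hadoop itself (for example `org.apache.hadoop.util.VersionInfo.getVersion()`), which this self-contained sketch does not call.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class VersionBaseline {
    // Matches up to three leading numeric components, e.g. "2.3.0" in "2.3.0-suffix".
    private static final Pattern LEADING =
        Pattern.compile("^(\\d+)(?:\\.(\\d+))?(?:\\.(\\d+))?");

    private VersionBaseline() {}

    /** Parses the leading major.minor.patch numbers; missing components count as 0. */
    static int[] parse(String version) {
        Matcher m = LEADING.matcher(version);
        if (!m.find()) {
            throw new IllegalArgumentException("No numeric version prefix in: " + version);
        }
        int[] parts = new int[3];
        for (int i = 0; i < 3; i++) {
            String g = m.group(i + 1);
            parts[i] = (g == null) ? 0 : Integer.parseInt(g);
        }
        return parts;
    }

    /** Returns true if the reported version is at least the given minimum. */
    public static boolean atLeast(String reported, String minimum) {
        int[] a = parse(reported);
        int[] b = parse(minimum);
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return a[i] > b[i];
            }
        }
        return true; // all components equal
    }
}
```

With this sketch, `VersionBaseline.atLeast("2.3.0-cdh5.0.0", "2.3.0")` evaluates to true and `VersionBaseline.atLeast("1.0.4", "2.0.0")` to false. The check is only as reliable as the assumption in the message above, namely that a distribution does not remove anything present in the version it reports.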