Oh, never mind. I am used to Spark builds without Hadoop included. But I realize that if Hadoop is included, it matters whether it's 2.6 or 2.7...
On Thu, Feb 8, 2018 at 5:06 PM, Koert Kuipers <ko...@tresata.com> wrote:

> Wouldn't the hadoop 2.7 profile mean someone could introduce usage of some
> hadoop APIs that don't exist in hadoop 2.6?
>
> Why not keep 2.6 and ditch 2.7, given that hadoop 2.7 is backwards
> compatible with 2.6? What is the added value of having a 2.7 profile?
>
> On Thu, Feb 8, 2018 at 5:03 PM, Sean Owen <so...@cloudera.com> wrote:
>
>> That would still work with a Hadoop-2.7-based profile, as there isn't
>> actually any code difference in Spark that treats the two versions
>> differently (nor, really, much difference between 2.6 and 2.7 to begin
>> with). This practice of different profile builds was pretty unnecessary
>> after 2.2; it's mostly vestigial now.
>>
>> On Thu, Feb 8, 2018 at 3:57 PM Koert Kuipers <ko...@tresata.com> wrote:
>>
>>> CDH 5 is still based on hadoop 2.6.
>>>
>>> On Thu, Feb 8, 2018 at 2:03 PM, Sean Owen <so...@cloudera.com> wrote:
>>>
>>>> Mostly just shedding the extra build complexity, and builds. The
>>>> primary little annoyance is that it's 2x the number of flaky build
>>>> failures to examine.
>>>> I suppose it allows using a 2.7+-only feature, but outside of YARN,
>>>> I'm not sure there is anything compelling.
>>>>
>>>> It's something that probably gains us virtually nothing now, but
>>>> isn't too painful either.
>>>> I think it will not make sense to distinguish them once any Hadoop
>>>> 3-related support comes into the picture, and maybe that will start
>>>> soon; there were some more pings on related JIRAs this week. You
>>>> could view it as early setup for that move.
>>>>
>>>> On Thu, Feb 8, 2018 at 12:57 PM Reynold Xin <r...@databricks.com> wrote:
>>>>
>>>>> Does it gain us anything to drop 2.6?
>>>>>
>>>>> > On Feb 8, 2018, at 10:50 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>> >
>>>>> > At this point, with Hadoop 3 on deck, I think hadoop 2.6 is both
>>>>> > fairly old and, actually, not different from 2.7 with respect to
>>>>> > Spark. That is, I don't know if we are actually maintaining
>>>>> > anything here but a separate profile and 2x the number of test
>>>>> > builds.
>>>>> >
>>>>> > The cost is, by the same token, low. However, I'm floating the
>>>>> > idea of removing the 2.6 profile and just requiring 2.7+ as of
>>>>> > Spark 2.4?
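For context, the profile choice under discussion shows up at build time roughly as below. This is a sketch based on the Spark 2.x build documentation; the specific `hadoop.version` values are illustrative, and `hadoop-provided` is the profile behind the "without Hadoop included" builds mentioned at the top of the thread.

```shell
# Build Spark against the Hadoop 2.6 line (the profile proposed for removal).
./build/mvn -Phadoop-2.6 -Dhadoop.version=2.6.5 -DskipTests clean package

# Build against the Hadoop 2.7 line instead.
./build/mvn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package

# "Hadoop-free" build: compile against Hadoop APIs but leave the Hadoop
# jars out of the distribution, so they are supplied at runtime (e.g. by CDH).
./build/mvn -Phadoop-2.7 -Phadoop-provided -DskipTests clean package
```

Dropping the `hadoop-2.6` profile would remove the first variant (and its CI builds), not the ability to run on a 2.6 cluster via a `hadoop-provided` build.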