As far as I know, Hadoop 2.6 + Spark 1.2 + Mahout 0.10.0 works fine for everything except cooccurrence. There is a user-side workaround, but so far only I have verified it. If we can verify it on a real cluster, we could document the fix. No changes to code are required, just a new artifact?
Longer term, the cooccurrence problem needs to be fixed in Mahout, which I think means writing a custom serializer for hashmaps, something like our VectorWritable.

BTW Dmitriy pointed out an important consideration wrt target Spark and Hadoop versions. We need to pay close attention to what the popular distributions ship. I’d go as far as to say we *must* run on the latest CDH, or even target a near-future version, with every release. Many users have to live with whatever their distribution provides and are unable to downgrade Spark or Hadoop. This is going to be hard to meet since we already at least partially violate it. BigTop is then on the important distro list.

On Apr 24, 2015, at 9:43 PM, Andrew Musselman <[email protected]> wrote:

Yeah which we tested and which worked fine.

On Friday, April 24, 2015, Dmitriy Lyubimov <[email protected]> wrote:

> my guess is this mostly means that mahout-mr has to run on 2.6.0, because
> spark part would basically run anywhere.
>
> but mr... gosh.
>
> On Fri, Apr 24, 2015 at 9:26 PM, Andrew Musselman <[email protected]> wrote:
>
>> I'm not educated enough in what has to happen but we're happy to help.
>>
>> Are there things we need to do from the Mahout end or is it changing
>> recipes and doing regressions of BigTop builds, etc., what else?
>>
>> On Friday, April 24, 2015, Konstantin Boudnik <[email protected]> wrote:
>>
>>> I am trying to see if anyone is doing the accommodation of 0.10 into
>>> coming 1.0 release. That's pretty much a release blocker at this point.
>>> I am not very much concerned about Spark compat, but if we are to take
>>> 0.10 into 1.0 it needs to work and be tested against 2.6.0 Hadoop.
>>>
>>> So, does anyone work on the patch or this JIRA?
>>>
>>> Cos
>>>
>>> On Fri, Apr 24, 2015 at 05:48PM, Andrew Musselman wrote:
>>>> The spark 1.3 compat is in a near future release; what do you need from us
>>>> to make 1.1 and 1.2 compat work?
>>>>
>>>> On Thursday, April 23, 2015, Konstantin Boudnik (JIRA) <[email protected]> wrote:
>>>>
>>>>> [ https://issues.apache.org/jira/browse/BIGTOP-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510075#comment-14510075 ]
>>>>>
>>>>> Konstantin Boudnik commented on BIGTOP-1831:
>>>>> --------------------------------------------
>>>>>
>>>>> How is it going guys? Looks like this is one of the blockers for 1.0 as we
>>>>> can not use old 0.9 version. Appreciate the help! Thank you!
>>>>>
>>>>>> Upgrade Mahout to 0.10
>>>>>> ----------------------
>>>>>>
>>>>>>                 Key: BIGTOP-1831
>>>>>>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1831
>>>>>>             Project: Bigtop
>>>>>>          Issue Type: Task
>>>>>>          Components: general
>>>>>>    Affects Versions: 0.8.0
>>>>>>            Reporter: David Starina
>>>>>>            Priority: Blocker
>>>>>>              Labels: Mahout
>>>>>>             Fix For: 1.0.0
>>>>>>
>>>>>>
>>>>>> Need to upgrade Mahout to the latest 0.10 release (first Hadoop 2.x
>>>>>> compatible release)
>>>>>
>>>>>
>>>>> --
>>>>> This message was sent by Atlassian JIRA
>>>>> (v6.3.4#6332)
