Hi folks,

yesterday I invested several hours in cleaning up your build. I got quite far and am currently trying to get the tests to pass. They currently fail because of some Scala/Java major-version problems, but I'm working on fixing them.
However, some things will be different, especially the artifact IDs. I hope that's OK.

Just as a question: does the build currently work at all? Especially in the community block I had to fix quite a few API changes where the code in the blocks was written for older versions of the libraries.

I hope I get some more tests to pass today.

Chris

________________________________
From: Andrew Musselman <a...@apache.org>
Sent: Thursday, 16 April 2020 20:35
To: Mahout Dev List <dev@mahout.apache.org>
Subject: Re: Hi ... need some help?

Looking forward to working on this with you; thanks again!

On Thu, Apr 16, 2020 at 11:09 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:

> Hi Andrew,
>
> guess I'll start with the fork and contact you folks on slack.
>
> Chris
>
> On 16.04.20, 19:43, "Andrew Musselman" <a...@apache.org> wrote:
>
> Chris, thank you for your help.
>
> Yeah, if you fork what's in master you can see what state it's in; we are in the #mahout channel in the-asf slack and this is also a fine way to keep track of discussion.
>
> We could file a JIRA ticket as well, however you prefer to work.
>
> Best
> Andrew
>
> On Thu, Apr 16, 2020 at 06:59 Christofer Dutz <christofer.d...@c-ware.de> wrote:
>
> > Hi Trevor,
> >
> > ok ... first of all ... the Mahout PMC is defining a "community maintained" library which is not maintained by the mahout PMC?!?!
> > I thought at Apache everything is about Community over code. So is a company driving the non-community stuff?
> >
> > But back to your build issues:
> > I had a look and I too encountered these comments and remarks and sometimes patterns I recognized and could imagine why they were created.
> > Yes, quite a bit of the build could be cleaned up and simplified a lot.
> >
> > So how about I create a fork and try to do a cleanup of the build.
> > Usually I also leave comments about what I do, as I hope I'll not be the only one maintaining a build, and documenting things helps people feel more confident.
> >
> > However, in some cases I will have questions ... so would someone be available on Slack for quick questions?
> >
> > Usually switching to another build system does solve some problems ... mostly the reason to switch is that it solves the main problem that you are having with the old one.
> > However, you usually notice too late that you get yourself a lot of new problems. I remember doing some contract work for an insurance company and they were totally down Maven-road but then had to build something with SBT ... in the end I compiled the thing on my laptop, copied it to a USB stick and told the people what was on the stick and that I'll be having a coffee and will be back in 30 minutes. When I came back the stick wasn't at the same place and the build problem was "solved" ;-)
> >
> > So I think it's quite good to stick to maven ... that is very mature, you can do almost everything you want with it and it integrates perfectly into the Apache infrastructure.
> >
> > But that's just my opinion.
> >
> > So if you want me to help, I'll be happy to be of assistance.
> >
> > Chris
> >
> > On 16.04.20, 15:28, "Trevor Grant" <trevor.d.gr...@gmail.com> wrote:
> >
> > Hey Christopher,
> >
> > I would agree with what Stevo outlined but add some more context and a couple of related JIRA issues.
> >
> > For 0.14.0 we did a big refactor and finally moved the MapReduce based Mahout all into what we called "community/", that is, community maintained, which is to say, we're not maintaining it anymore (sunset began I think in 2015).
> >
> > But all of our POMs were so huge and fat because they'd been layered up over the years by people coming and going and dropping in code.
> > I wouldn't call these drive-bys, it's just been over 10 years and people come and go. Such is the life of Apache Projects. So we had a situation where a lot of the old Map Reduce stuff and the POMs were considered "old-magic": no one really knew how it was all tied together, but we didn't want to mess with it for fear of breaking something in the "new" Mahout (aka Samsara), which is the Scala/Spark based library that it is now* (to others in the community: I know it runs on other engines, but for simplicity, I'm just calling it "runs-on-spark").
> >
> > For 0.14.0 we decided to trim out as much of that as possible. We did some major liposuction on POMs, reorganized things, etc. This was done by commenting out a section, then seeing if it would still build. So the current release _does_ build. And aside from some CLI driver issues which are outlined in [1], the project runs fairly smoothly. (An SBT build would probably solve [1]; I believe Pat Ferrel has made his own SBT script to compile Mahout, which solved that problem for them.)
> >
> > The issue we ran into with the releases (and the reason I think you're here) is that we also somewhere along the line commented out something that was important to the release process. Hence 0.14.0 was released source only.
> >
> > Since 2008, there has been a lot of great work on generating plugins for doing Apache releases. Instead of the awkward hacks that made up the old poms (literally comments that said, "this is a hack, there's supposedly something better coming from ..." dated like 2012), we would like to do it the "right way" and incorporate the appropriate plugins.
> >
> > Refactoring to SBT was _one_ proposed solution. We're also OK continuing to use Maven, and I agree with what you said about the cross compiling.
> > We actually have a script that just changes the scala version. We tried using the classifiers but there were issues in SBT, but the way you're proposing sounds a lot more pro than the route we were trying for.
> >
> > That said, we'd be OK just releasing one scala/spark version at a time. But getting the convenience binaries to release/publish would be a major first step.
> >
> > Also, we really appreciate the help,
> >
> > tg
> >
> > [1] https://issues.apache.org/jira/projects/MAHOUT/issues/MAHOUT-2093?filter=allopenissues
> >
> > On Thu, Apr 16, 2020 at 4:50 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:
> >
> > > Hi Stevo,
> > >
> > > so let me summarize what I understood:
> > >
> > > - There are some modules in mahout that are built with Scala, some with java and some with both (at least that's what I see when checking out the project)
> > > - The current build uses Scala 2.11 to build the Scala code.
> > > - The resulting libraries are only compatible with Scala 2.11
> > >
> > > Now you want to also publish versions compatible with Scala 2.12?
> > >
> > > If that's the case I think Maven could easily add multiple executions where each compile compiles to different output directories:
> > > - Java --> target/classes
> > > - Scala 2.11 --> target/classes-2.11
> > > - Scala 2.12 --> target/classes-2.12
> > >
> > > Then the packaging would also need a second execution ... each of the executions bundling the classes and the corresponding scala output.
> > > Ideally I would probably use maven classifiers to distinguish the artifacts.
> > >
> > > <dependency>
> > >   <groupId>org.apache.mahout</groupId>
> > >   <artifactId>mahout-spark</artifactId>
> > >   <version>14.1-SNAPSHOT</version>
> > >   <classifier>2.11</classifier>
> > > </dependency>
> > >
> > > Then it should all work in a normal maven build.
> > > In the distributions you could also filter the versions according to their classifiers.
> > >
> > > So if this is the case, I could help you with this.
> > >
> > > Chris
> > >
> > > On 16.04.20, 09:39, "Stevo Slavić" <ssla...@gmail.com> wrote:
> > >
> > > Disclaimer: I haven't been an active Mahout maintainer for quite a while. I have some historical perspective, but take it with a grain of salt; it could be that I'm missing the whole point you were approached for by a wide margin.
> > >
> > > At some point Mahout, or rather some of its modules, turned into a scala library, and there was a need to cross publish those modules across different scala versions. Back then the Maven scala plugin didn't support cross publishing; it doesn't fit well with Maven's build lifecycle concept (multiple compile phases, one for each scala version, and what not would be needed). Switching to sbt could have solved the problem. The switch was deemed to be too big a task, even though ages have been spent on trying to apply Maven (profiles) + bash scripts and what not to solve the problem. Trying to apply the same approach over and over again and expecting different results is not smart; no expert can help there. Mahout maintainers and contributors should consider an alternative approach, one of them being switching to sbt: it's scala native, supports scala cross publishing, and supports publishing Maven compatible release metadata and binaries.
> > >
> > > Kind regards,
> > > Stevo Slavic.
> > >
> > > On Thu, Apr 16, 2020 at 9:15 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:
> > >
> > > > Hi folks,
> > > >
> > > > my name is Chris and I'm involved in quite a lot of Apache projects.
> > > > Justin approached me this morning, asking me if I could perhaps help you.
> > > > He told me you were having trouble with doing Maven releases.
> > > >
> > > > As Maven releases are my specialty, could you please summarize the issues you are having?
> > > >
> > > > Chris
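
For reference, the classifier approach Chris outlines in the thread (one jar per Scala version, distinguished by classifier) might be wired up roughly as follows. This is only a sketch under assumptions: the execution ids are made up, and it presumes that separate compile executions already write the complete class output for each Scala version to target/classes-2.11 and target/classes-2.12, which the thread proposes but does not show. The `classifier` and `classesDirectory` parameters are those of the standard maven-jar-plugin `jar` goal.

```xml
<!-- Sketch only: execution ids are hypothetical, and the classes-2.11 /
     classes-2.12 directories are assumed to be filled by earlier
     compile executions that are not shown here. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <executions>
    <!-- Packages the Scala 2.11 output as <artifactId>-<version>-2.11.jar -->
    <execution>
      <id>jar-scala-2.11</id>
      <phase>package</phase>
      <goals>
        <goal>jar</goal>
      </goals>
      <configuration>
        <classifier>2.11</classifier>
        <classesDirectory>${project.build.directory}/classes-2.11</classesDirectory>
      </configuration>
    </execution>
    <!-- Packages the Scala 2.12 output as <artifactId>-<version>-2.12.jar -->
    <execution>
      <id>jar-scala-2.12</id>
      <phase>package</phase>
      <goals>
        <goal>jar</goal>
      </goals>
      <configuration>
        <classifier>2.12</classifier>
        <classesDirectory>${project.build.directory}/classes-2.12</classesDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Since `classesDirectory` names a single directory, the Java classes from target/classes would also have to end up in each per-version directory for every jar to be complete. A consumer would then pick a variant with the `<classifier>` element, as in the dependency snippet quoted in the thread.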