I'll do some testing today.

On Thu, Apr 23, 2020 at 01:39 Christofer Dutz <christofer.d...@c-ware.de> wrote:
> Ok,
>
> so I just pushed some additional tweaks and now the handling of multiple Scala versions should work.
> Also the apache-release profile seems to be working nicely.
>
> All I currently see as missing is setting up the binary distribution.
>
> But before going into this too much, I would like to ask you folks to check my changes and check if Mahout still works ;-)
> We wouldn't want to release something that compiles, but doesn't work ;-)
>
> Chris
>
>
> On 23.04.20, 00:56, "Ted Dunning" <ted.dunn...@gmail.com> wrote:
>
> Chris,
>
> This is really nice work.
>
> On Wed, Apr 22, 2020 at 1:46 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:
>
> > Hi Andrew,
> >
> > thanks for your kind words ... they are sort of the fuel that makes me run ;-)
> >
> > So some general observations and suggestions:
> > - You seem to use test-jars quite a bit: These are generally considered an antipattern, as you possibly import problems from another module and will have no way of detecting them. If you need shared test code, it's better practice to create a dedicated test-utils module and include that wherever it's needed.
> > - Don't use variables for project dependencies: It makes things slightly more difficult to read, the release plugin takes care of updating versions for you, and some third-party plugins might have issues with it.
> > - I usually provide versions for all project dependencies and have all other dependencies managed in a dependencyManagement section of the root module. This avoids problems with version conflicts when constructing something using multiple parts of your projects (especially your lib directory thing).
> > - Accessing resources outside of the current module's scope is generally considered an antipattern ...
> >   regarding your lib thing, I would suggest an assembly that builds a directory (but I do understand that this version perhaps speeds up the development workflow ... we could move the clean plugin configuration and the antrun plugin config into a profile dedicated to development)
> > - I usually order the plugin configurations (as much as possible) the way they are usually executed in the build ... so: clean, process resources, compile, test, package, ... this makes it easier to understand the build in general.
> >
> > Today I'll go through the poms again, managing all versions and cleaning up the order of things. Then, if all still works, I would bump the dependency versions up as much as possible.
> >
> > Will report back as soon as I'm through or I've got something to report ... then I'll also go into details with your feedback (I haven't ignored it ;-) )
> >
> > Chris
> >
> >
> > On 22.04.20, 06:08, "Andrew Palumbo" <ap....@outlook.com> wrote:
> >
> > Fixing previous message..
> >
> > Quote from Chris Dutz:
> >
> > > Hi folks,
> > > so I was now able to build (including all tests) with Java 8 and 9 ... currently trying 10 ...
> > > Are there any objections to some Maven dependencies being updated to more recent versions? I mean ... the hbase-client you're using is more than 5 years old ...
> >
> > My answer:
> >
> > I personally have no problem with the updating of any dependencies. They may break some things and cause more work, but that is the kind of thing that we've been trying to get done in this build work: get everything up to speed.
> >
> > I'd say take Andrew, Trevor and Pat's word over mine though; I am a bit less active presently.
> >
> > Thanks.
> > > > Andy > > > > ________________________________ > > From: Andrew Palumbo <ap....@outlook.com> > > Sent: Tuesday, April 21, 2020 10:17 PM > > To: dev@mahout.apache.org <dev@mahout.apache.org> > > Subject: Re: Hi ... need some help? > > > > Hi folks, > > > > so I was now able to build (including all tests) with Java 8 > and 9 > > ... currently trying 10 ... > > > > Are there any objection that some maven dependencies get > updated > > to more recent versions? I mean ... the hbase-client you're using is > more > > than 5 years old ... > > Not by me, I believe that is being used by the MR module, which > is > > Deprecated. > > > > I personally have no problem with the updating of any > dependencies, > > they may break some things and caue more work, but that is the kind > of > > thing that we've been trying to get done in this build work, get > > everything up to speed. > > > > Id say take Andrew, Trevor and Pat's word over mine though i am > a bit > > less active presently. > > > > Thanks. > > > > Andy > > ________________________________ > > From: Andrew Palumbo <ap....@outlook.com> > > Sent: Tuesday, April 21, 2020 10:13 PM > > To: dev@mahout.apache.org <dev@mahout.apache.org> > > Subject: Re: Hi ... need some help? > > > > Chris, Thank you so much for what you are doing, This is Apache > at > > its best.. I've been down and out with a serious Illness, Injury and > other > > issues, which have seriously limited my Machine time. I was pretty > close > > to getting a good build, but it was hacky, and the method that you > use to > > name the modules for both Scala versions, looks great. > > > > We've always relied on Stevo to fix the builds for us, but as > he said > > is unable to contribute right now. The main issues (solved by > hacks), > > currently are > > > > > > 1. 
Dependencies and transitive dependencies are not being > picked > > and copied to the `./lib` directory, where `/bin/mahout` and parts > of the > > MahoutSparkContext look for them, to add to the class path. So > running > > either from the CLI or as a library, dependencies are not picked up. > > * We used to use the mahout-experimental-xx.jar as a fat > jar > > for this, though it was bloated with now deprecated MR stuff, and no > longer > > packed. > > 2. `./bin/mahout` (and `compute-classpath.sh`) need to be > revamped > > to ensure that they are picking up the correct classes. > > > > w.r.t. to Java 8/7 issues, We did mandate Java 8+, and this > required a > > few minor code changes to play nicely with Scala 2.11. Mainly one > class > > needed a JVM "Static" field, so i refactored that field out of the > Class > > and into a companion object. I wonder if this is what is giving you > issues > > with Java 7. > > > > I'd thought that Java 8 was mandated now, but may be thinking of > maven > > 3.3.x. > > > > Regardless Thank you very much for this. This board is doing > really > > doing well so far. and deserves accolades. > > > > > > > > <dependency> > > > <groupId>org.apache.mahout</groupId> > > > <artifactId>mahout-spark</artifactId> > > > <version>14.1-SNAPSHOT</version> > > > <classifier>2.11</classifier> > > > </dependency> > > This would be perfect IMO. > > > > > > I can send you the commits that I am talking about. > > > > As well, I saw that Trevor gave you a link to a filter.. I have > one > > here with a bit more limited scope, which is open issues fixversion > == 14.1. > > > > To answer one question yes this was recently building, and > releasing, > > with all of the tests passing (for a few modules, that we were > focusing > > on). after that i made some changes that broke it again.. 
> >
> > The board with limited scope:
> >
> > https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=348&view=detail&selectedIssue=MAHOUT-2093
> >
> > Thanks again for helping out. We are really bad with poms, not so much from the ground up as fixing some that are 10 years old, as Stevo mentioned, very quickly while working on several other things.
> >
> > Thank you again for this. It is a great help, and once we get a good build, we can get back to doing work on the library itself.
> >
> > I have some documents that I can provide if it will help explain the structure of the project, which is still kind of in flux. E.g. I'd like to get the ViennaCL-OMP branch out of experimental, but there is much to clean up first. As well, I am on medical leave and don't have much time on the computer these days; I have to budget my time.
> >
> > I'll send you some (closed) PRs with notes and changes, if it helps. lmk.
> >
> > Thanks again, this is huge.
> >
> > Andy
> > ________________________________
> > From: Christofer Dutz <christofer.d...@c-ware.de>
> > Sent: Thursday, April 16, 2020 9:50 AM
> > To: dev@mahout.apache.org <dev@mahout.apache.org>
> > Subject: Re: Hi ... need some help?
> >
> > Hi Trevor,
> >
> > ok ... first of all ... the Mahout PMC is defining a "community maintained" library which is not maintained by the Mahout PMC?!?!
> > I thought at Apache everything is about Community over Code. So is a company driving the non-community stuff?
> >
> > But back to your build issues:
> > I had a look and I too encountered these comments and remarks and sometimes patterns I recognized and could imagine why they were created.
> > Yes, quite a bit of the build could be cleaned up and simplified a lot.
> >
> > So how about I create a fork and try to do a cleanup of the build.
> > Usually I also leave comments about what I do, as I hope I'll not be the only one maintaining a build, and documenting things helps people feel more confident.
> >
> > However in some cases I will have questions ... so would someone be available on Slack for quick questions?
> >
> > Usually switching to another build system does solve some problems ... mostly the reason to switch is that it solves the main problem that you are having with the old one.
> > However you usually notice too late that you get yourself a lot of new problems. I remember doing some contract work for an insurance company and they were totally down Maven-road but then had to build something with SBT ... in the end I compiled the thing on my laptop, copied it to a USB stick and told the people what was on the stick and that I'll be having a coffee and will be back in 30 minutes. When I came back the stick wasn't at the same place and the build problem was "solved" ;-)
> >
> > So I think it's quite good to stick to Maven ... that is very mature, you can do almost everything you want with it and it integrates perfectly into the Apache infrastructure.
> >
> > But that's just my opinion.
> >
> > So if you want me to help, I'll be happy to be of assistance.
> >
> >
> > Chris
> >
> >
> > On 16.04.20, 15:28, "Trevor Grant" <trevor.d.gr...@gmail.com> wrote:
> >
> > Hey Christopher,
> >
> > I would agree with what Stevo outlined but add some more context and a couple of related JIRA issues.
> >
> > For 0.14.0 we did a big refactor and finally moved the MapReduce-based Mahout all into what we called "community/", that is, community maintained, which is to say, we're not maintaining it anymore (sunset began I think in 2015).
> >
> > But all of our POMs were so huge and fat because they'd been layered up over the years by people coming and going and dropping in code.
> > I wouldn't call these drive-bys, it's just been over 10 years and people come and go. Such is the life of Apache projects. So we had a situation where a lot of the old MapReduce stuff and the POMs were considered "old magic": no one really knew how it was all tied together, but we didn't want to mess with it for fear of breaking something in the "new" Mahout (aka Samsara), which is the Scala/Spark-based library that it is now* (to others in the community: I know it runs on other engines, but for simplicity, I'm just calling it "runs-on-spark").
> >
> > For 0.14.0 we decided to trim out as much of that as was possible. We did some major liposuction on POMs, reorganized things, etc. This was done by commenting out a section, then seeing if it would still build. So the current release _does_ build. And aside from some CLI driver issues which are outlined in [1], the project runs fairly smoothly. (An SBT build would probably solve [1]; I believe Pat Ferrel has made his own SBT script to compile Mahout, which solved that problem for them.)
> >
> > The issue we ran into with the releases (and the reason I think you're here) is that we also somewhere along the line commented out something that was important to the release process. Hence why 0.14.0 released source only.
> >
> > Since 2008, there has been a lot of great work on generating plugins for doing Apache releases. Instead of the awkward hacks that made up the old poms (literally comments that said, "this is a hack, there's supposedly something better coming from ..." dated like 2012), we would like to do it the "right way" and incorporate the appropriate plugins.
> >
> > Refactoring to SBT was _one_ proposed solution. We're also OK continuing to use Maven, and I agree with what you said about the cross compiling.
> > We actually have a script that just changes the Scala version. We tried using the classifiers but there were issues in SBT, but the way you're proposing sounds a lot more pro than the route we were trying for.
> >
> > That said, we'd be OK just releasing one Scala/Spark version at a time. But getting the convenience binaries to release/publish would be a major first step.
> >
> > Also, we really appreciate the help,
> >
> > tg
> >
> >
> > [1] https://issues.apache.org/jira/projects/MAHOUT/issues/MAHOUT-2093?filter=allopenissues
> >
> >
> > On Thu, Apr 16, 2020 at 4:50 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:
> >
> > > Hi Stevo,
> > >
> > > so let me summarize what I understood:
> > >
> > > - There are some modules in Mahout that are built with Scala, some with Java and some with both (at least that's what I see when checking out the project)
> > > - The current build uses Scala 2.11 to build the Scala code.
> > > - The resulting libraries are only compatible with Scala 2.11
> > >
> > > Now you want to also publish versions compatible with Scala 2.12?
> > >
> > > If that's the case I think Maven could easily add multiple executions where each compile compiles to different output directories:
> > > - Java --> target/classes
> > > - Scala 2.11 --> target/classes-2.11
> > > - Scala 2.12 --> target/classes-2.12
> > >
> > > Then the packaging would also need a second execution ... each of the executions bundling the classes and the corresponding Scala output.
> > > Ideally I would probably use Maven classifiers to distinguish the artifacts.
> > >
> > > <dependency>
> > >   <groupId>org.apache.mahout</groupId>
> > >   <artifactId>mahout-spark</artifactId>
> > >   <version>14.1-SNAPSHOT</version>
> > >   <classifier>2.11</classifier>
> > > </dependency>
> > >
> > > Then it should all work in a normal Maven build.
> > > In the distributions you could also filter the versions according to their classifiers.
> > >
> > > So if this is the case, I could help you with this.
> > >
> > > Chris
> > >
> > >
> > > On 16.04.20, 09:39, "Stevo Slavić" <ssla...@gmail.com> wrote:
> > >
> > > Disclaimer: I have not been an active Mahout maintainer for quite a while. I have some historical perspective, but take it with a grain of salt; it could be that I'm missing the whole point you were approached for by a wide margin of error.
> > >
> > > At some point Mahout (some of its modules) turned into a Scala library, and there was a need to cross-publish those modules across different Scala versions. Back then the Maven Scala plugin didn't support cross-publishing; it doesn't fit well with Maven's build lifecycle concept (multiple compile phases, one for each Scala version, and what not would be needed). Switching to sbt could have solved the problem. The switch was deemed to be too big a task, even though ages have been spent on trying to apply Maven (profiles) + bash scripts and what not to solve the problem. Trying to apply the same approach over and over again and expecting different results is not smart; no expert can help there. Mahout maintainers and contributors should consider an alternative approach, one of them being switching to sbt: it's Scala native, supports Scala cross-publishing, and supports publishing Maven-compatible release metadata and binaries.
> > >
> > > Kind regards,
> > > Stevo Slavic.
> > >
> > > On Thu, Apr 16, 2020 at 9:15 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:
> > >
> > > > Hi folks,
> > > >
> > > > my name is Chris and I’m involved in quite a lot of Apache projects.
> > > > Justin approached me this morning, asking me if I could perhaps help you.
> > > > He told me you were having trouble with doing Maven releases.
> > > >
> > > > As Maven releases are my specialty, could you please summarize the issues you are having?
> > > >
> > > > Chris