I'm +1 on that. Thanks again for all the help Chris!!
tg

On Wed, Apr 22, 2020 at 7:23 AM Christofer Dutz <christofer.d...@c-ware.de> wrote:

> So,
>
> I'm almost through with the cleaning up ... but haven't started bumping
> the versions of dependencies.
>
> One question I do have though:
> Regarding dependencies, there's a lot of stuff you can do wrong. Some are
> pretty difficult to track down.
> However there are some techniques that help reduce the risk quite
> dramatically:
> - Never rely on transitive dependencies (If you need it, declare it)
> - Don't have dependencies you don't really need
> - Don’t have multiple versions of one artifact in your reactor (Explicitly
>   manage the dependencies)
> And probably the trickiest one:
> - Don’t have the same classes included in multiple jars (Can happen easily
>   when directly or transitively including a fat jar)
>
> In PLC4X I put in place enforcer rules to track down all of these and to
> fail the build if you're having one of these issues.
>
> You want me to put them in place here too?
> It can sometimes be a little annoying cause even a temporary dependency
> issue when you're working on something will cause failures, but I think on
> the long-term the benefits way outweigh the inconveniences.
>
> Chris
>
>
>
> On 22.04.20, 10:46, "Christofer Dutz" <christofer.d...@c-ware.de> wrote:
>
> Hi Andrew,
>
> thanks for your kind words ... they are sort of the fuel that makes me
> run ;-)
>
> So some general observations and suggestions:
> - You seem to use test-jars quite a bit: These are generally considered
>   an antipattern as you possibly import problems from another module and
>   you will have no way of detecting them. If you need shared test-code
>   it's better practice to create a dedicated test-utils module and
>   include that wherever it's needed.
> - Don't use variables for project dependencies: It makes things slightly
>   more difficult to read, the release plugin takes care of updating the
>   version for you, and some third-party plugins might have issues with it.
> - I usually provide versions for all project dependencies and have all
>   other dependencies managed in a dependencyManagement section of the
>   root module; this avoids problems with version conflicts when
>   constructing something using multiple parts of your projects
>   (Especially your lib directory thing)
> - Accessing resources outside of the current module's scope is generally
>   considered an antipattern ... regarding your lib thing, I would suggest
>   an assembly that builds a directory (but I do understand that this
>   version perhaps speeds up the development workflow ... we could move
>   the clean plugin configuration and the antrun plugin config into a
>   profile dedicated for development)
> - I usually order the plugin configurations (as much as possible) the way
>   they are usually executed in the build ... so: clean, process
>   resources, compile, test, package, ... this makes it easier to
>   understand the build in general.
>
> Today I'll go through the poms again managing all versions and cleaning
> up the order of things. Then if all still works I would bump the
> dependencies' versions up as much as possible.
>
> Will report back as soon as I'm through or I've got something to report
> ... then I'll also go into details with your feedback (I haven't ignored
> it ;-) )
>
> Chris
>
>
>
> On 22.04.20, 06:08, "Andrew Palumbo" <ap....@outlook.com> wrote:
>
> Fixing previous message..
>
>
> Quote from Chris Dutz:
>
>
> Hi folks,
>
> so I was now able to build (including all tests) with Java 8 and 9 ...
> currently trying 10 ...
>
> Are there any objections that some maven dependencies get updated to more
> recent versions? I mean ... the hbase-client you're using is more than 5
> years old ...
>
> My answer:
>
> I personally have no problem with the updating of any dependencies, they
> may break some things and cause more work, but that is the kind of thing
> that we've been trying to get done in this build work, get everything up
> to speed.
>
> I'd say take Andrew, Trevor and Pat's word over mine though, I am a bit
> less active presently.
>
> Thanks.
>
> Andy
>
> ________________________________
> From: Andrew Palumbo <ap....@outlook.com>
> Sent: Tuesday, April 21, 2020 10:17 PM
> To: dev@mahout.apache.org <dev@mahout.apache.org>
> Subject: Re: Hi ... need some help?
>
> Hi folks,
>
> so I was now able to build (including all tests) with Java 8 and 9 ...
> currently trying 10 ...
>
> Are there any objections that some maven dependencies get updated to more
> recent versions? I mean ... the hbase-client you're using is more than 5
> years old ...
> Not by me, I believe that is being used by the MR module, which is
> deprecated.
>
> I personally have no problem with the updating of any dependencies, they
> may break some things and cause more work, but that is the kind of thing
> that we've been trying to get done in this build work, get everything up
> to speed.
>
> I'd say take Andrew, Trevor and Pat's word over mine though, I am a bit
> less active presently.
>
> Thanks.
>
> Andy
> ________________________________
> From: Andrew Palumbo <ap....@outlook.com>
> Sent: Tuesday, April 21, 2020 10:13 PM
> To: dev@mahout.apache.org <dev@mahout.apache.org>
> Subject: Re: Hi ... need some help?
>
> Chris, thank you so much for what you are doing. This is Apache at its
> best.. I've been down and out with a serious illness, injury and other
> issues, which have seriously limited my machine time. I was pretty close
> to getting a good build, but it was hacky, and the method that you use to
> name the modules for both Scala versions looks great.
>
> We've always relied on Stevo to fix the builds for us, but as he said, he
> is unable to contribute right now. The main issues (solved by hacks)
> currently are:
>
>
> 1. Dependencies and transitive dependencies are not being picked up and
>    copied to the `./lib` directory, where `/bin/mahout` and parts of the
>    MahoutSparkContext look for them, to add to the class path.
>    So running either from the CLI or as a library, dependencies are not
>    picked up.
>    * We used to use the mahout-experimental-xx.jar as a fat jar for this,
>      though it was bloated with now deprecated MR stuff, and is no longer
>      packed.
> 2. `./bin/mahout` (and `compute-classpath.sh`) need to be revamped to
>    ensure that they are picking up the correct classes.
>
> W.r.t. Java 8/7 issues, we did mandate Java 8+, and this required a few
> minor code changes to play nicely with Scala 2.11. Mainly one class
> needed a JVM "static" field, so I refactored that field out of the class
> and into a companion object. I wonder if this is what is giving you
> issues with Java 7.
>
> I'd thought that Java 8 was mandated now, but may be thinking of Maven
> 3.3.x.
>
> Regardless, thank you very much for this. This board is really doing well
> so far, and deserves accolades.
>
>
> > <dependency>
> >     <groupId>org.apache.mahout</groupId>
> >     <artifactId>mahout-spark</artifactId>
> >     <version>14.1-SNAPSHOT</version>
> >     <classifier>2.11</classifier>
> > </dependency>
> This would be perfect IMO.
>
>
> I can send you the commits that I am talking about.
>
> As well, I saw that Trevor gave you a link to a filter.. I have one here
> with a bit more limited scope, which is open issues with fixversion ==
> 14.1.
>
> To answer one question: yes, this was recently building and releasing,
> with all of the tests passing (for a few modules that we were focusing
> on). After that I made some changes that broke it again..
>
> The board with limited scope:
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=348&view=detail&selectedIssue=MAHOUT-2093
>
>
> Thanks again for helping out. We are really bad with poms, not so much
> from the ground up, as fixing some that are 10 years old, as Stevo
> mentioned, very quickly while working on several other things.
>
> Thank you again for this.
> It is a great help, and once we get a good build, we can get back to
> doing work on the library itself.
>
> I have some documents that I can provide if it will help explain the
> structure of the project, which is still kind of in flux. E.g. I'd like
> to get the ViennaCL-OMP branch out of experimental, but there is much to
> clean up first. As well, I am on medical leave, and don't have much time
> on the computer these days.. have to budget my time.
>
> I'll send you some (closed) PRs with notes and changes, if it helps. lmk.
>
> Thanks again, this is huge.
>
> Andy
> ________________________________
> From: Christofer Dutz <christofer.d...@c-ware.de>
> Sent: Thursday, April 16, 2020 9:50 AM
> To: dev@mahout.apache.org <dev@mahout.apache.org>
> Subject: Re: Hi ... need some help?
>
> Hi Trevor,
>
> ok ... first of all ... the Mahout PMC is defining a "community
> maintained" library which is not maintained by the Mahout PMC?!?!
> I thought at Apache everything is about Community over code. So is a
> company driving the non-community stuff?
>
> But back to your build issues:
> I had a look and I too encountered these comments and remarks and
> sometimes patterns I recognized and could imagine why they were created.
> Yes, quite a bit of the build could be cleaned up and simplified a lot.
>
> So how about I create a fork and try to do a cleanup of the build.
> Usually I also leave comments about what I do as I hope I'll not be the
> only one maintaining a build and documenting things helps people feel
> more confident.
>
> However in some cases I will have questions ... so would someone be
> available on Slack for quick questions?
>
> Usually switching to another build system does solve some problems ...
> mostly the reason to switch is that it solved the main problem that you
> are having with the old.
> However you usually notice too late that you get yourself a lot of new
> problems.
> I remember doing some contract work for an insurance company and they
> were totally down Maven-road but then had to build something with SBT ...
> in the end I compiled the thing on my laptop, copied it to a USB stick
> and told the people what was on the stick and that I'll be having a
> coffee and will be back in 30 minutes. When I came back the stick wasn't
> at the same place and the build problem was "solved" ;-)
>
> So I think it's quite good to stick to maven ... that is very mature, you
> can do almost everything you want with it and it integrates perfectly
> into the Apache infrastructure.
>
> But that's just my opinion.
>
> So if you want me to help, I'll be happy to be of assistance.
>
>
> Chris
>
>
>
> On 16.04.20, 15:28, "Trevor Grant" <trevor.d.gr...@gmail.com> wrote:
>
> Hey Christopher,
>
> I would agree with what Stevo outlined but add some more context and a
> couple related JIRA issues.
>
> For 0.14.0 we did a big refactor and finally moved the MapReduce based
> Mahout all into what we called "community/", that is community
> maintained, which is to say, we're not maintaining it anymore (sunset
> began I think in 2015).
>
> But all of our POMs were so huge and fat because they'd been layered up
> over the years by people coming and going and dropping in code. I
> wouldn't call these drive-bys, it's just been over 10 years and people
> come and go. Such is the life of Apache Projects. So we had a situation
> where a lot of the old MapReduce stuff and the POMs were considered
> "old-magic": no one really knew how it was all tied together, but we
> didn't want to mess with it for fear of breaking something in the "new"
> Mahout (aka Samsara) which is the Scala/Spark based library that it is
> now* (to others in the community: I know it runs on other engines, but
> for simplicity, I'm just calling it "runs-on-spark").
>
> For 0.14.0 we decided to trim out as much of that as was possible.
> We did some major liposuction on POMs, reorganized things, etc. This was
> done by commenting out a section, then seeing if it would still build.
> So the current release _does_ build. And aside from some CLI driver
> issues which are outlined in [1], the project runs fairly smoothly. (An
> SBT build would probably solve [1]; I believe Pat Ferrel has made his own
> SBT script to compile Mahout, which solved that problem for them).
>
> The issue we ran into with the releases (and the reason I think you're
> here), is that we also somewhere along the line commented out something
> that was important to the release process. Hence why 0.14.0 released
> source only.
>
> Since 2008, there has been a lot of great work on generating plugins for
> doing Apache releases. Instead of the awkward hacks that made up the old
> poms (literally comments that said, "this is a hack, there's supposedly
> something better coming from ..." dated like 2012), we would like to do
> it the "right way" and incorporate the appropriate plugins.
>
> Refactoring to SBT was _one_ proposed solution. We're also OK continuing
> to use Maven, and I agree with what you said about the cross compiling.
> We actually have a script that just changes the scala version. We tried
> using the classifiers but there were issues in SBT, but the way you're
> proposing sounds a lot more pro than the route we were trying for.
>
> That said, we'd be OK just releasing one scala/spark version at a time.
> But getting the convenience binaries to release/publish would be a major
> first step.
>
> Also, we really appreciate the help,
>
> tg
>
>
> [1]
> https://issues.apache.org/jira/projects/MAHOUT/issues/MAHOUT-2093?filter=allopenissues
>
>
>
> On Thu, Apr 16, 2020 at 4:50 AM Christofer Dutz <christofer.d...@c-ware.de>
> wrote:
>
> > Hi Stevo,
> >
> > so let me summarize what I understood:
> >
> > - There are some modules in mahout that are built with Scala, some with
> >   java and some with both (At least that's what I see when checking out
> >   the project)
> > - The current build uses Scala 2.11 to build the Scala code.
> > - The resulting libraries are only compatible with Scala 2.11
> >
> > Now you want to also publish versions compatible with Scala 2.12?
> >
> > If that's the case I think Maven could easily add multiple executions
> > where each compile compiles to different output directories:
> > - Java --> target/classes
> > - Scala 2.11 --> target/classes-2.11
> > - Scala 2.12 --> target/classes-2.12
> >
> > Then the packaging would also need a second execution ... each of the
> > executions bundling the classes and the corresponding scala output.
> > Ideally I would probably use maven classifiers to distinguish the
> > artifacts.
> >
> > <dependency>
> >     <groupId>org.apache.mahout</groupId>
> >     <artifactId>mahout-spark</artifactId>
> >     <version>14.1-SNAPSHOT</version>
> >     <classifier>2.11</classifier>
> > </dependency>
> >
> > Then it should all work in a normal maven build. In the distributions
> > you could also filter the versions according to their classifiers.
> >
> > So if this is the case, I could help you with this.
> >
> > Chris
> >
> >
> > On 16.04.20, 09:39, "Stevo Slavić" <ssla...@gmail.com> wrote:
> >
> > Disclaimer: I'm not active Mahout maintainer for quite a while, have
> > some historical perspective, take it with a grain of salt, could be I'm
> > missing the whole point you were approached for by a wide margin of
> > error.
> >
> > At a point Mahout, some of its modules, have turned into a scala
> > library, and there was need to cross publish those modules, across
> > different scala versions. Back then Maven scala plugin didn't support
> > cross publishing, it doesn't fit well with Maven's build lifecycle
> > concept (multiple compile phases - one for each scala version, and what
> > not would be needed). Switching to sbt could have solved the problem.
> > Switch was deemed to be too big task, even though ages have been spent
> > on trying to apply Maven (profiles) + bash scripts and what not to
> > solve the problem.
> > Trying to apply same approach over and over again and expecting
> > different results is not smart, no expert can help there. Mahout
> > maintainers and contributors should consider alternative approach, one
> > of them being switching to sbt - it's scala native, supports scala
> > cross publishing, supports publishing Maven compatible release metadata
> > and binaries.
> >
> > Kind regards,
> > Stevo Slavic.
> >
> > On Thu, Apr 16, 2020 at 9:15 AM Christofer Dutz
> > <christofer.d...@c-ware.de> wrote:
> >
> > > Hi folks,
> > >
> > > my name is Chris and I’m involved in quite a lot of Apache projects.
> > > Justin approached me this morning, asking me if I could perhaps help
> > > you.
> > > He told me you were having trouble with doing Maven releases.
> > >
> > > As Maven releases are my specialty, could you please summarize the
> > > issues you are having?
> > >
> > > Chris
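
For reference, a minimal sketch of how the four dependency checks listed in
the most recent message of this thread (no reliance on transitive
dependencies, no unused dependencies, no conflicting versions, no duplicate
classes) could be wired into a root pom.xml. This is an assumption about one
possible setup, not the actual PLC4X configuration, which the thread does not
show. The plugin versions and the execution ids are illustrative placeholders.

```xml
<build>
  <plugins>
    <!-- Fails the build on conflicting versions and on duplicate classes. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-enforcer-plugin</artifactId>
      <!-- Illustrative version, pick whatever the project standardizes on. -->
      <version>3.0.0-M3</version>
      <dependencies>
        <!-- Provides the banDuplicateClasses rule used below. -->
        <dependency>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>extra-enforcer-rules</artifactId>
          <version>1.2</version>
        </dependency>
      </dependencies>
      <executions>
        <execution>
          <!-- Hypothetical id, chosen for this example only. -->
          <id>enforce-dependency-hygiene</id>
          <goals>
            <goal>enforce</goal>
          </goals>
          <configuration>
            <rules>
              <!-- One version per artifact across the dependency tree. -->
              <dependencyConvergence/>
              <!-- No class may be provided by more than one jar. -->
              <banDuplicateClasses>
                <findAllDuplicates>true</findAllDuplicates>
              </banDuplicateClasses>
            </rules>
            <fail>true</fail>
          </configuration>
        </execution>
      </executions>
    </plugin>

    <!-- Fails the build on used-but-undeclared and declared-but-unused
         dependencies. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-dependency-plugin</artifactId>
      <executions>
        <execution>
          <!-- Hypothetical id, chosen for this example only. -->
          <id>analyze-dependency-usage</id>
          <goals>
            <goal>analyze-only</goal>
          </goals>
          <configuration>
            <failOnWarning>true</failOnWarning>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

As noted in the thread, checks like these also fail on temporary dependency
issues during development, so a project may want to allow skipping them
locally (for example with the enforcer plugin's `-Denforcer.skip=true`) while
keeping them mandatory in CI.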