So,

I'm almost through with the cleaning up ... but haven't started bumping the 
versions of dependencies.

One question I do have though:
Regarding dependencies, there's a lot of stuff you can do wrong. Some are 
pretty difficult to track down.
However there are some techniques that help reduce the risk quite dramatically:
- Never rely on transitive dependencies (If you need it, declare it)
- Don't have dependencies you don't really need
- Don’t have multiple versions of one artifact in your reactor (Explicitly 
manage the dependencies)
And probably the trickiest one:
- Don’t have the same classes included in multiple jars (Can happen easily when 
directly or transitively including a fat jar)

In PLC4X I put in place enforcer rules to track down all of these and to fail 
the build if you're having one of these issues.
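
A rough sketch of what such rules could look like (this is not the actual 
PLC4X configuration; the plugin versions, execution ids and phases are 
assumptions for illustration). dependencyConvergence is a standard enforcer 
rule, banDuplicateClasses comes from the extra-enforcer-rules artifact, and the 
dependency plugin's analyze-only goal reports used-but-undeclared as well as 
declared-but-unused dependencies:

```xml
<!-- Sketch only: versions, ids and phases are assumed, not taken from PLC4X or Mahout -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <version>3.0.0-M3</version> <!-- assumed version -->
  <dependencies>
    <!-- Provides the banDuplicateClasses rule -->
    <dependency>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>extra-enforcer-rules</artifactId>
      <version>1.2</version> <!-- assumed version -->
    </dependency>
  </dependencies>
  <executions>
    <execution>
      <id>enforce-dependency-hygiene</id> <!-- hypothetical id -->
      <phase>validate</phase>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <!-- Fails if transitive dependencies resolve to different versions -->
          <dependencyConvergence/>
          <!-- Fails if the same class is found in more than one jar -->
          <banDuplicateClasses/>
        </rules>
        <fail>true</fail>
      </configuration>
    </execution>
  </executions>
</plugin>
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>check-declared-dependencies</id> <!-- hypothetical id -->
      <phase>verify</phase>
      <goals>
        <goal>analyze-only</goal>
      </goals>
      <configuration>
        <!-- Turn "used undeclared" / "unused declared" warnings into build failures -->
        <failOnWarning>true</failOnWarning>
      </configuration>
    </execution>
  </executions>
</plugin>
```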

You want me to put them in place here too?
It can sometimes be a little annoying, because even a temporary dependency issue 
while you're working on something will fail the build, but I think in the long 
term the benefits far outweigh the inconveniences.

Chris



On 22.04.20, 10:46, "Christofer Dutz" <christofer.d...@c-ware.de> wrote:

    Hi Andrew,

    thanks for your kind words ... they are sort of the fuel that makes me run 
;-)

    So some general observations and suggestions:
    - You seem to use test-jars quite a bit: These are generally considered an 
antipattern as you possibly import problems from another module and you will 
have no way of detecting them. If you need shared test-code it's better 
practice to create a dedicated test-utils module and include that wherever it's 
needed.
    - Don't use variables for project dependencies: It makes things slightly 
more difficult to read, the release plugin takes care of updating the versions 
for you anyway, and some third-party plugins might have issues with it.
    - I usually provide versions for all project dependencies and have all 
other dependencies managed in a dependencyManagement section of the root 
module. This avoids problems with version conflicts when constructing something 
using multiple parts of your project (especially your lib directory thing).
    - Accessing resources outside of the current modules scope is generally 
considered an antipattern ... regarding your lib thing, I would suggest an 
assembly that builds a directory (but I do understand that this version perhaps 
speeds up the development workflow ... we could move the clean plugin 
configuration and the antrun plugin config into a profile dedicated for 
development)
    - I usually order the plugin configurations (as much as possible) the way 
they are usually executed in the build ... so: clean, process resources, 
compile, test, package, ... this makes it easier to understand the build in 
general.
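
    As a small illustration of the test-utils and dependencyManagement points 
    above (a sketch only: the mahout-test-utils module name and the junit 
    version are assumptions, not something that exists in the project today):

    ```xml
    <!-- Root pom.xml: third-party versions are managed in one place -->
    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
          <version>4.13</version> <!-- illustrative version -->
          <scope>test</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>

    <!-- Module pom.xml -->
    <dependencies>
      <!-- Project dependency: explicit version, no property;
           replaces a dependency on another module's test-jar -->
      <dependency>
        <groupId>org.apache.mahout</groupId>
        <artifactId>mahout-test-utils</artifactId> <!-- hypothetical module -->
        <version>14.1-SNAPSHOT</version>
        <scope>test</scope>
      </dependency>
      <!-- Third-party dependency: version comes from dependencyManagement -->
      <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <scope>test</scope>
      </dependency>
    </dependencies>
    ```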

    Today I'll go through the poms again managing all versions and cleaning up 
the order of things. Then if all still works I would bump the dependencies 
versions up as much as possible.

    Will report back as soon as I'm through or I've got something to report ... 
then I'll also go into details with your feedback (I haven't ignored it ;-) )

    Chris



    On 22.04.20, 06:08, "Andrew Palumbo" <ap....@outlook.com> wrote:

        Fixing previous message..


        Quote from Chris Dutz:

        > Hi folks,
        >    so I was now able to build (including all tests) with Java 8 and 9 
... currently trying 10 ...
        >    Are there any objections that some maven dependencies get updated 
to more recent versions? I mean ... the hbase-client you're using is more than 
5 years old ...

        My answer:

        I personally have no problem with updating any dependencies. They may 
break some things and cause more work, but that is the kind of thing that we've 
been trying to get done in this build work: getting everything up to speed.

        I'd say take Andrew, Trevor and Pat's word over mine though, as I am a 
bit less active presently.

        Thanks.

        Andy

        ________________________________
        From: Andrew Palumbo <ap....@outlook.com>
        Sent: Tuesday, April 21, 2020 10:17 PM
        To: dev@mahout.apache.org <dev@mahout.apache.org>
        Subject: Re: Hi ... need some help?

          Hi folks,

            so I was now able to build (including all tests) with Java 8 and 9 
... currently trying 10 ...

            Are there any objections that some maven dependencies get updated to 
more recent versions? I mean ... the hbase-client you're using is more than 5 
years old ...
        Not by me. I believe that is being used by the MR module, which is 
deprecated.

        I personally have no problem with updating any dependencies. They may 
break some things and cause more work, but that is the kind of thing that we've 
been trying to get done in this build work: getting everything up to speed.

        I'd say take Andrew, Trevor and Pat's word over mine though, as I am a 
bit less active presently.

        Thanks.

        Andy
        ________________________________
        From: Andrew Palumbo <ap....@outlook.com>
        Sent: Tuesday, April 21, 2020 10:13 PM
        To: dev@mahout.apache.org <dev@mahout.apache.org>
        Subject: Re: Hi ... need some help?

        Chris, thank you so much for what you are doing. This is Apache at its 
best. I've been down and out with a serious illness, injury and other issues, 
which have seriously limited my machine time. I was pretty close to getting a 
good build, but it was hacky, and the method that you use to name the modules 
for both Scala versions looks great.

        We've always relied on Stevo to fix the builds for us, but as he said, 
he is unable to contribute right now. The main issues (solved by hacks) 
currently are:


          1.  Dependencies and transitive dependencies are not being picked up 
and copied to the `./lib` directory, where `/bin/mahout` and parts of the 
MahoutSparkContext look for them to add to the classpath.  So whether running 
from the CLI or as a library, dependencies are not picked up.
             *   We used to use the mahout-experimental-xx.jar as a fat jar for 
this, though it was bloated with now-deprecated MR stuff and is no longer packaged.
          2.  `./bin/mahout` (and `compute-classpath.sh`) need to be revamped 
to ensure that they are picking up the correct classes.
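
        One possible way to get the runtime dependencies (including transitive 
        ones) into a lib directory, sketched here with the dependency plugin's 
        copy-dependencies goal. The execution id, phase and output directory 
        are assumptions, not the current Mahout setup:

        ```xml
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-dependency-plugin</artifactId>
          <executions>
            <execution>
              <id>copy-runtime-dependencies</id> <!-- hypothetical id -->
              <phase>package</phase>
              <goals>
                <goal>copy-dependencies</goal>
              </goals>
              <configuration>
                <!-- Transitive dependencies are included by default -->
                <includeScope>runtime</includeScope>
                <!-- Assumed location inside the module's own target directory -->
                <outputDirectory>${project.build.directory}/lib</outputDirectory>
              </configuration>
            </execution>
          </executions>
        </plugin>
        ```

        `./bin/mahout` and `compute-classpath.sh` would then need to look in 
        whatever directory is chosen here.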

        W.r.t. the Java 8/7 issues: we did mandate Java 8+, and this required a 
few minor code changes to play nicely with Scala 2.11.  Mainly, one class needed 
a JVM "static" field, so I refactored that field out of the class and into a 
companion object.  I wonder if this is what is giving you issues with Java 7.

        I'd thought that Java 8 was mandated now, but I may be thinking of Maven 
3.3.x.

        Regardless, thank you very much for this.  This board is really doing 
well so far, and deserves accolades.

        >
            > <dependency>
            >     <groupId>org.apache.mahout</groupId>
            >     <artifactId>mahout-spark</artifactId>
            >     <version>14.1-SNAPSHOT</version>
            >     <classifier>2.11</classifier>
            > </dependency>
        This would be perfect IMO.
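
        For illustration, producing such classified artifacts might look 
        roughly like the following jar plugin configuration. This is a sketch 
        under assumptions: the execution ids are made up, and it presumes each 
        classes-2.1x directory already contains everything that should go into 
        the respective jar (the Scala output plus the Java classes):

        ```xml
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-jar-plugin</artifactId>
          <executions>
            <execution>
              <id>jar-scala-2.11</id> <!-- hypothetical id -->
              <phase>package</phase>
              <goals>
                <goal>jar</goal>
              </goals>
              <configuration>
                <classesDirectory>${project.build.directory}/classes-2.11</classesDirectory>
                <classifier>2.11</classifier>
              </configuration>
            </execution>
            <execution>
              <id>jar-scala-2.12</id> <!-- hypothetical id -->
              <phase>package</phase>
              <goals>
                <goal>jar</goal>
              </goals>
              <configuration>
                <classesDirectory>${project.build.directory}/classes-2.12</classesDirectory>
                <classifier>2.12</classifier>
              </configuration>
            </execution>
          </executions>
        </plugin>
        ```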


        I can send you the commits that I am talking about.

        As well, I saw that Trevor gave you a link to a filter. I have one 
here with a bit more limited scope, which is open issues with fixVersion == 14.1.

        To answer one question: yes, this was recently building and releasing, 
with all of the tests passing (for the few modules that we were focusing on).  
After that I made some changes that broke it again.

        the board with limited scope: 
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=348&view=detail&selectedIssue=MAHOUT-2093


        Thanks again for helping out.  We are really bad with poms, not so much 
building them from the ground up as fixing ones that are 10 years old, as Stevo 
mentioned, very quickly while working on several other things.

        Thank you again for this. It is a great help, and once we get a good 
build, we can get back to doing work on the library itself.

        I have some documents that I can provide if it will help explain the 
structure of the project, which is still kind of in flux.  E.g. I'd like to get 
the ViennaCL-OMP branch out of experimental, but there is much to clean up 
first.  As well, I am on medical leave and don't have much time on the computer 
these days, so I have to budget my time.

        I'll send you some (closed) PRs with notes and changes, if it helps.  
lmk.

        Thanks again, This is Huge.

        Andy
        ________________________________
        From: Christofer Dutz <christofer.d...@c-ware.de>
        Sent: Thursday, April 16, 2020 9:50 AM
        To: dev@mahout.apache.org <dev@mahout.apache.org>
        Subject: Re: Hi ... need some help?

        Hi Trevor,

        ok ... first of all ... the Mahout PMC is defining a "community 
maintained" library which is not maintained by the mahout PMC?!?!
        I thought at Apache everything is about Community over code. So is a 
company driving the non-community stuff?

        But back to your build issues:
        I had a look and I too encountered these comments and remarks and 
sometimes patterns I recognized and could imagine why they were created.
        Yes quite a bit of the build could be cleaned up and simplified a lot.

        So how about I create a fork and try to do a cleanup of the build.
        Usually I also leave comments about what I do as I hope I'll not be the 
only one maintaining a build and documenting things helps people feel more 
confident.

        However in some cases I will have questions ... so would someone be 
available on Slack for quick questions?

        Usually switching to another build system does solve some problems ... 
mostly the reason to switch is that it solved the main problem that you are 
having with the old.
        However you usually notice too late that you get yourself a lot of new 
problems. I remember doing some contract work for an insurance company and they 
were totally down Maven-road but then had to build something with SBT ... in 
the end I compiled the thing on my laptop, copied it to a USB stick and told 
the people what was on the stick and that I'll be having a coffee and will be 
back in 30 minutes. When I came back the stick wasn't at the same place and the 
build problem was "solved" ;-)

        So I think it's quite good to stick to maven ... that is very mature, 
you can do almost everything you want with it and it integrates perfectly into 
the Apache infrastructure.

        But that's just my opinion.

        So if you want me to help, I'll be happy to be of assistance.


        Chris



        On 16.04.20, 15:28, "Trevor Grant" <trevor.d.gr...@gmail.com> wrote:

            Hey Christofer,

            I would agree with what Stevo outlined but add some more context and
            a couple of related JIRA issues.

            For 0.14.0 we did a big refactor and finally moved the
            MapReduce-based Mahout all into what we called "community/", that
            is, community maintained, which is to say, we're not maintaining it
            anymore (sunset began I think in 2015).

            But all of our POMs were so huge and fat because they'd been layered
            up over the years by people coming and going and dropping in code. I
            wouldn't call these drive-bys, it's just been over 10 years and
            people come and go. Such is the life of Apache projects. So we had a
            situation where a lot of the old MapReduce stuff and the POMs were
            considered "old magic": no one really knew how it was all tied
            together, but we didn't want to mess with it for fear of breaking
            something in the "new" Mahout (aka Samsara), which is the
            Scala/Spark-based library that it is now* (to others in the
            community: I know it runs on other engines, but for simplicity, I'm
            just calling it "runs-on-spark").

            For 0.14.0 we decided to trim out as much of that as was possible.
            We did some major liposuction on the POMs, reorganized things, etc.
            This was done by commenting out a section, then seeing if it would
            still build. So the current release _does_ build. And aside from
            some CLI driver issues which are outlined in [1], the project runs
            fairly smoothly. (An SBT build would probably solve [1]; I believe
            Pat Ferrel has made his own SBT script to compile Mahout, which
            solved that problem for them.)

            The issue we ran into with the releases (and the reason I think
            you're here) is that we also somewhere along the line commented out
            something that was important to the release process. Hence why
            0.14.0 released source only.

            Since 2008, there has been a lot of great work on generating plugins
            for doing Apache releases. Instead of the awkward hacks that made up
            the old poms (literally comments that said, "this is a hack, there's
            supposedly something better coming from ..." dated like 2012), we
            would like to do it the "right way" and incorporate the appropriate
            plugins.

            Refactoring to SBT was _one_ proposed solution. We're also OK
            continuing to use Maven, and I agree with what you said about the
            cross compiling. We actually have a script that just changes the
            scala version. We tried using the classifiers but there were issues
            in SBT, but the way you're proposing sounds a lot more pro than the
            route we were trying for.

            That said, we'd be OK just releasing one scala/spark version at a
            time. But getting the convenience binaries to release/publish would
            be a major first step.

            Also, we really appreciate the help,

            tg


            [1] https://issues.apache.org/jira/projects/MAHOUT/issues/MAHOUT-2093?filter=allopenissues



            On Thu, Apr 16, 2020 at 4:50 AM Christofer Dutz
            <christofer.d...@c-ware.de> wrote:

            > Hi Stevo,
            >
            > so let me summarize what I understood:
            >
            > - There are some modules in mahout that are built with Scala, some
            >   with Java and some with both (at least that's what I see when
            >   checking out the project)
            > - The current build uses Scala 2.11 to build the Scala code.
            > - The resulting libraries are only compatible with Scala 2.11
            >
            > Now you want to also publish versions compatible with Scala 2.12?
            >
            > If that's the case I think Maven could easily add multiple
            > executions where each compile compiles to different output
            > directories:
            > - Java --> target/classes
            > - Scala 2.11 --> target/classes-2.11
            > - Scala 2.12 --> target/classes-2.12
            >
            > Then the packaging would also need a second execution ... each of
            > the executions bundling the classes and the corresponding scala
            > output. Ideally I would probably use maven classifiers to
            > distinguish the artifacts.
            >
            > <dependency>
            >     <groupId>org.apache.mahout</groupId>
            >     <artifactId>mahout-spark</artifactId>
            >     <version>14.1-SNAPSHOT</version>
            >     <classifier>2.11</classifier>
            > </dependency>
            >
            > Then it should all work in a normal maven build. In the
            > distributions you could also filter the versions according to
            > their classifiers.
            >
            > So if this is the case, I could help you with this.
            >
            > Chris
            >
            >
            > On 16.04.20, 09:39, "Stevo Slavić" <ssla...@gmail.com> wrote:
            >
            >     Disclaimer: I haven't been an active Mahout maintainer for
            >     quite a while. I have some historical perspective, but take it
            >     with a grain of salt; I could be missing the whole point you
            >     were approached for by a wide margin.
            >
            >     At some point Mahout, or rather some of its modules, turned
            >     into a Scala library, and there was a need to cross-publish
            >     those modules across different Scala versions. Back then the
            >     Maven Scala plugin didn't support cross-publishing; it doesn't
            >     fit well with Maven's build lifecycle concept (multiple compile
            >     phases, one for each Scala version, and what not would be
            >     needed). Switching to sbt could have solved the problem. The
            >     switch was deemed too big a task, even though ages have been
            >     spent trying to apply Maven (profiles) + bash scripts and what
            >     not to solve the problem. Trying to apply the same approach
            >     over and over again and expecting different results is not
            >     smart; no expert can help there. Mahout maintainers and
            >     contributors should consider an alternative approach, one of
            >     them being switching to sbt: it's Scala-native, supports Scala
            >     cross-publishing, and supports publishing Maven-compatible
            >     release metadata and binaries.
            >
            >     Kind regards,
            >     Stevo Slavic.
            >
            >     On Thu, Apr 16, 2020 at 9:15 AM Christofer Dutz
            >     <christofer.d...@c-ware.de> wrote:
            >
            >     > Hi folks,
            >     >
            >     > my name is Chris and I’m involved in quite a lot of Apache
            >     > projects. Justin approached me this morning, asking me if I
            >     > could perhaps help you. He told me you were having trouble
            >     > with doing Maven releases.
            >     >
            >     > As Maven releases are my specialty, could you please summarize
            >     > the issues you are having?
            >     >
            >     > Chris
            >     >
            >
            >


