Jason van Zyl wrote:

On 4 Jul 06, at 1:45 PM, Steve Loughran wrote:

In a way, much of the stuff in M2 is experimental: a build tool that effectively encodes beliefs about how a project should be structured and delivered, focusing on component-based development instead of application dev. I also think it's time to look at how well some of the experiment is working.


You make it sound like we're some sort of cult :-)

I think you are exploring cutting-edge, loosely coupled software development processes. It's research. Interesting, fun research, but research nonetheless. Just as Gump is an experiment in whether a unified nightly build changes people's working processes.

I've been hanging around with semantic-web people recently, and have devolved into using the word "belief" where they use "fact", because of differences of opinion on what they and I think RDF triples are (they think they're facts in a graph; I think every triple is a belief published by an entity at a particular moment in time). The nice thing about a belief-centric model is that you get to accept that different entities have different beliefs, and that a single entity/agent can change its belief set, without ever having to worry about the fact that the global belief-set is inconsistent.

In real agent-oriented runtimes (still very much academic research, even more so than RDF engines), the resolver takes into account the metadata about which agent issued a belief statement, and when, during its resolution process. Newer statements by the same entity can override older ones; differences between entities are allowable, but result in ambiguities that may need to be dealt with further down the line.
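As a toy sketch of that resolution rule (illustrative names only; this isn't any real RDF or agent framework): beliefs are keyed by issuer and subject, a newer statement from the same issuer replaces the older one, and disagreement between issuers surfaces as an ambiguity rather than an error.

import java.util.HashMap;
import java.util.Map;

// Toy belief store: latest statement per (issuer, subject) wins;
// different issuers disagreeing is ambiguity, not inconsistency.
public class BeliefStore {

    public static final class Belief {
        final String issuer, subject, statement;
        final long timestamp;
        public Belief(String issuer, String subject, String statement, long timestamp) {
            this.issuer = issuer; this.subject = subject;
            this.statement = statement; this.timestamp = timestamp;
        }
    }

    private final Map<String, Belief> latest = new HashMap<String, Belief>();

    public void assertBelief(Belief b) {
        String key = b.issuer + "|" + b.subject;
        Belief old = latest.get(key);
        if (old == null || b.timestamp >= old.timestamp)
            latest.put(key, b);          // newer statement by the same issuer overrides
    }

    // True if two issuers currently make different claims about the subject.
    public boolean isAmbiguous(String subject) {
        String seen = null;
        for (Belief b : latest.values()) {
            if (!b.subject.equals(subject)) continue;
            if (seen != null && !seen.equals(b.statement)) return true;
            seen = b.statement;
        }
        return false;
    }
}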

When you apply the same agent-oriented view to POM metadata, you can say "a POM file represents the POM author's beliefs about the artifact's dependencies at the time they wrote the POM". It may be that those beliefs match what the artifact really needs; it may be that they turn out to be utterly wrong.

[interlude. I just grabbed the chair of the W3C RDF working group by the coffee machine. Apparently "a belief is a state of mind", "a fact is something that is believed". So all facts are beliefs, the only variable being the number of believers]

Because the ibiblio repository contains fact/belief metadata from so many sources, it's that much harder to reconcile than metadata from a single entity. The good news is that we do have a very nice way to test these assertions in Java: running the program and seeing what classes get loaded. So when someone is utterly wrong in their dependencies, it's pretty obvious. It's when they are slightly wrong, when they use some classes only in certain cases, often using reflection to bind at run time, that you can get caught out.
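As a rough illustration of that test: run the app with "java -verbose:class ...", capture the output, and summarise which jars actually supplied classes. This little parser assumes the classic HotSpot log format "[Loaded com.example.Foo from /path/foo.jar]", which varies between JVMs and versions. Jars that never show up in the report are candidates for the exclusion list.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Summarise a -verbose:class log: which jar supplied how many classes.
public class LoadedClassReport {
    // assumed log format: [Loaded com.example.Foo from /path/to/foo.jar]
    private static final Pattern LOADED =
            Pattern.compile("\\[Loaded (\\S+) from (\\S+)\\]");

    public static void main(String[] args) throws IOException {
        Map<String, Set<String>> byJar = new TreeMap<String, Set<String>>();
        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        String line;
        while ((line = in.readLine()) != null) {
            Matcher m = LOADED.matcher(line);
            if (!m.find()) continue;
            Set<String> classes = byJar.get(m.group(2));
            if (classes == null) {
                classes = new TreeSet<String>();
                byJar.put(m.group(2), classes);
            }
            classes.add(m.group(1));
        }
        in.close();
        for (Map.Entry<String, Set<String>> e : byJar.entrySet())
            System.out.println(e.getValue().size() + " classes loaded from " + e.getKey());
    }
}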



The phrase "encoding beliefs" is an inaccurate description. It's simply the pursuit of best practices for software development, and those practices are very much mutable, this thread being very good evidence of that. We're also not focused solely on component-oriented development: we develop applications ourselves, and we're trying to make that coherent as well.

OK, how about "encoding the team's ideas and experience of how to build applications as sets of components, using shared repositories to exchange components and their metadata"?


Personally, I always experience a bit of fear when adding a new dependency to a project. I fear the repository stuff, and estimate a couple of hours to get every addition stable, primarily by building up a good exclusion list.

This is the place to talk about that, as people shouldn't be fearful of adding dependencies. But people who have an ideal setup, where they completely control the repository they use internally, don't have many of the problems that people are experiencing in this thread. Having a public repository of high quality is not a trivial task.


Is it worse than before? Better? Or just, well, different? And if things are either worse or not as good as they could be, what can be changed?


The process is absolutely better. The process coupled with the public infrastructure we have now is problematic. Two very different things.

One underlying cause seems to be POM quality. Open source software dev is a vast collection of loosely coupled projects, and what we get in the repository in terms of metadata matches this model. Each project produces artifacts that match its immediate needs, with POM files that appear to work at the time of publishing. Maven then caches those and freezes that metadata forever, even if it turns out that the metadata was wrong. There's far better coherence within Gump, where the metadata is effectively maintained more by the Gump team themselves than by the individual projects.

There is absolutely no way this is scalable over time. You are saying that a small group of people can maintain metadata for projects that they are not intimately involved with? That's like saying that people who live outside your community have a better chance of describing your community. I really just don't think that's possible. How many problems has Gump had over the years trying to maintain the metadata? Huge problems: it is almost never in sync with projects. You basically find out when it breaks and backtrack, most of the time.

There is no doubt that the same process will happen with Maven, where users of Maven will eventually make their metadata better, but that will take time. Gump has been around for 5-6 years now; people are really only starting to use Maven 2.x, which is closing in on being out for a year. I am willing to bet that in another year a great number of the problems seen in this thread will be gone. I would argue that Gump will not work precisely because it is not the projects themselves maintaining the metadata. Projects using Maven will eventually get it right, because it provides some value to them to get it right.


Oh, I agree, handwritten custom-coded stuff doesn't scale. That is the price of that model, and it makes it hard to use the same tools within your own build process. But it does support the low-hanging fruit of things that depend on commons-logging yet don't want logkit on their classpath.

Gump's problem is not just that the metadata is written by the gumpers, and not the projects, but that the projects don't always care if the build is broken. Getting someone to care about what happens to their stuff downstream is the first step to fixing the problem. As more M2 take-up occurs, you should get a lot of that feedback in the system, moving from "please redist on the Maven repository" to "please have good metadata", before finally the joy of silence, as everything works.



The question is, what to do about it? And if the M2 repository was an attempt to leave the problems of the M1 repository behind, has it worked?


To a large extent I would say we have fixed many of the problems on a technical level. Correcting the metadata, and educating projects as to how best to maintain it, is a social problem and a matter of education. Couple that with some automated integrity checks that will be performed by the repository manager.

Yes, I think more rigour in accepting POMs would be good. People, even Apache projects, should not be able to submit an artifact to the repository without:

- everything you depend on being there; no unresolvable artifacts.
- no dependencies on -SNAPSHOT. I know, Apache projects aren't meant to release in that mode, but Apache Muse managed it, with very bad consequences downstream (see the checker sketch after this list).
- a (manual) review of your dependencies. You, the submitter, would get told your dependencies; the repository mailing list would somehow get a submission note that listed the complete dependency graph of that component.
- the repository analyzer having some (extensible) rules about generally "bad" dependencies, those that should be flagged with a warning, e.g. junit.jar in the runtime, any of the XML implementations in there (rather than just the stax/xml-apis API imports), use of commons-logging over commons-logging-api.
- flagging the appearance of strongly-deprecated versions of things, e.g. junit-3.7, anything else that is not in modern use and/or has security holes.
- a scan of the artifacts to see which packages they publish; store a list of all classes. Then scan their imports to see what they explicitly import. Warn when something they import isn't published by anything they even optionally depend upon (a rough sketch of this follows below).
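To make one of those checks concrete, here is a minimal sketch of the -SNAPSHOT rule. The class name and command-line interface are hypothetical, not part of any real repository manager; it just walks a submitted POM with the JDK's DOM parser and fails if any dependency version ends in -SNAPSHOT.

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical repository-side rule: reject POMs declaring -SNAPSHOT dependencies.
public class SnapshotRule {

    public static void main(String[] args) throws Exception {
        Document pom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new File(args[0]));
        NodeList deps = pom.getElementsByTagName("dependency");
        boolean ok = true;
        for (int i = 0; i < deps.getLength(); i++) {
            Element dep = (Element) deps.item(i);
            String version = text(dep, "version");
            if (version != null && version.endsWith("-SNAPSHOT")) {
                System.err.println("REJECT: dependency " + text(dep, "groupId")
                        + ":" + text(dep, "artifactId")
                        + " is on snapshot version " + version);
                ok = false;
            }
        }
        System.exit(ok ? 0 : 1);   // non-zero exit: bounce the submission
    }

    private static String text(Element dep, String tag) {
        NodeList list = dep.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent();
    }
}

Note it only checks literal <dependency> elements; version ranges and inherited versions would need the real Maven resolver behind them.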

We could have some fun there, given the appropriate amount of spare time. I quite like the idea of .class-level validation...
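Something like this, perhaps: a rough sketch of the class-level scan, using nothing but the JDK. It lists the classes a jar publishes and the external classes its bytecode references, by walking each class file's constant pool. It assumes the pre-Java-7 constant-pool tags; cross-checking the external references against what the declared dependencies publish would be the obvious next step.

import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// List the classes a jar publishes and the external classes it references,
// by reading the CONSTANT_Class entries of each class file's constant pool.
public class JarClassScanner {

    public static void main(String[] args) throws IOException {
        JarFile jar = new JarFile(args[0]);
        Set<String> published = new TreeSet<String>();
        Set<String> referenced = new TreeSet<String>();
        for (Enumeration<JarEntry> e = jar.entries(); e.hasMoreElements();) {
            JarEntry entry = e.nextElement();
            if (!entry.getName().endsWith(".class")) continue;
            String name = entry.getName();
            published.add(name.substring(0, name.length() - 6).replace('/', '.'));
            InputStream in = jar.getInputStream(entry);
            try {
                referenced.addAll(classRefs(new DataInputStream(in)));
            } finally {
                in.close();
            }
        }
        referenced.removeAll(published);     // keep only classes supplied by others
        System.out.println(published.size() + " classes published, "
                + referenced.size() + " external classes referenced:");
        for (String c : referenced) System.out.println("  " + c);
    }

    // Walk the constant pool (pre-Java-7 tags only) and return class names.
    private static Set<String> classRefs(DataInputStream in) throws IOException {
        if (in.readInt() != 0xCAFEBABE) throw new IOException("not a class file");
        in.readUnsignedShort();              // minor version
        in.readUnsignedShort();              // major version
        int count = in.readUnsignedShort();  // constant_pool_count
        String[] utf8 = new String[count];
        Set<Integer> classEntries = new TreeSet<Integer>();
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                    // Utf8
                case 7: classEntries.add(in.readUnsignedShort()); break;  // Class
                case 8: in.skipBytes(2); break;                           // String
                case 3: case 4: case 9: case 10: case 11: case 12:
                    in.skipBytes(4); break;  // int, float, refs, name-and-type
                case 5: case 6:              // long, double occupy two pool slots
                    in.skipBytes(8); i++; break;
                default: throw new IOException("unexpected constant-pool tag " + tag);
            }
        }
        Set<String> names = new TreeSet<String>();
        for (int idx : classEntries) {
            String name = utf8[idx];
            if (name != null && !name.startsWith("["))   // skip array descriptors
                names.add(name.replace('/', '.'));
        }
        return names;
    }
}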

-steve

