On 5 Feb 2004, at 09:22, Stefan Bodewig wrote:
On Thu, 5 Feb 2004, Stefano Mazzocchi <[EMAIL PROTECTED]> wrote:

On 5 Feb 2004, at 02:17, Adam Jack wrote:
I've long wished that Gump could nag the 'cause' of a problem, not the 'effect', but it is (AFAICT) pretty much impossible to guess which project is the cause from a compile failure.
Tell you what: there have been long discussions about this, and I've spent endless hours at the whiteboard trying to figure out *where* that data could emerge from the entire mass of data that Gump is either collecting or generating.

I was never able to find it: I could not come up with a general algorithm that would, if not identify the cause, at least discriminate between "causing trends" and "affected trends".
The way you'd do it manually is roughly as follows, I believe:
* start with the last good build
* replace the dependencies with their latest versions, one at a time, rebuilding after each swap - repeat until the build fails.
Gumpy should have enough data to do that, but the whole approach breaks down when Gump has been unable to build a project for weeks or even months, because the number of changes becomes too big.
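In pseudo-Python the search would look something like this. This is only a sketch of the manual procedure: last_good, latest and build_with are stand-ins for whatever version metadata and build machinery Gump actually has, none of this is real Gumpy code:

    def find_culprit(project, deps, last_good, latest, build_with):
        """deps: list of dependency names.
        last_good/latest: dicts mapping name -> version.
        build_with: callable(project, {name: version}) -> bool."""
        versions = dict(last_good)
        if not build_with(project, versions):
            return None          # even the last good set fails now; no signal
        for dep in deps:
            versions[dep] = latest[dep]
            if not build_with(project, versions):
                return dep       # this single update broke the build
        return None              # only a combination of updates breaks it

Note that this only catches a single offending update; if the breakage comes from the interaction of two updates, the linear walk returns nothing - which is exactly where the "too many accumulated changes" problem bites.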
It is clear to me that we should develop a system that works "a regime" (at steady state); the bootstrap process (getting enough critical mass to attract attention) is way too complex to be automated.
But my gut feeling is that with a system that is running and has a reasonable nag/FoG metric/heuristic, the amount of changes never becomes too big, because we stop them right at the beginning.
That's the beauty of continuous integration: it's a problem forecaster; it builds your project in a much wider scope than you could ever cover by yourself.
[in this sense, it's like what Google does when it lists backlinks to a page: that information is not available to you, the page author, because it's a property of the graph, not of the node]
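For Gump the analogue of a backlink is "who depends on me", which no single project descriptor states on its own; it only falls out of the whole graph. A toy sketch (made-up project names, not Gump's real data model):

    from collections import defaultdict

    def reverse_dependencies(depends_on):
        # depends_on: project -> set of projects it depends on.
        # The result is the "backlink" view: project -> projects
        # that depend on it, a property of the whole graph.
        dependents = defaultdict(set)
        for project, deps in depends_on.items():
            for dep in deps:
                dependents[dep].add(project)
        return dependents

    graph = {"cocoon": {"avalon", "xerces"}, "ant": {"xerces"}}
    print(sorted(reverse_dependencies(graph)["xerces"]))  # ['ant', 'cocoon']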
I think the key is that the Gump runs, as Gump or Gumpy perform them today, do *not* contain enough information. But if we had both:
1) the latest dependency run
2) the stable dependency run
and we had enough history of these (say a few months), I'm pretty sure the data *IS* there.
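To illustrate (and only to illustrate; the names are invented, not Gumpy's data model), with both runs recorded per project the discrimination could be as simple as:

    def classify(project, latest_ok, stable_ok):
        # latest_ok: did the run against the freshest dependencies pass?
        # stable_ok: did the run against stable dependencies pass?
        if latest_ok:
            return "healthy"
        if stable_ok:
            # builds fine with stable deps, breaks with fresh ones:
            # the breakage was imported, so this is an *effect*
            return "effect"
        # fails even with stable dependencies: the trouble starts
        # here, so this looks like a *cause*, and the failures of
        # its dependents can be treated as downstream noise
        return "cause"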
I think Gumpy already collects, or at least could collect, historical data for all dependency runs.
how? where?
If the dependency run has been successful at least once, the data is supposed to be there.
I realize that this is naive. 8-)
Oh, believe me, nothing is naive when it comes to data emergence. Even the slightest local trend that looks silly and obvious can turn out to be massively powerful when applied to a complex topology (much like Google does with hyperlinks, Agora does with replies, and Amazon does with shopping carts, for example).
-- Stefano.
