I collected some statistics based on CC notifications for the period
December 8-18.
For the classlib runs over IBMVM and DRLVM we have:



Regression detected, notifications: 6

Intermittent zero-length reports on IBMVM (usually, swing): 4

Intermittent failure of swing tests over IBMVM: 5

Intermittent failure of swing tests over DRLVM: 1

Intermittent failure of network tests: 9

Intermittent failure of nio tests: 4



Detailed data for classlib tests:



Failures of swing tests (IBMVM on Windows only):

javax.swing.JToggleButtonTest         3

javax.swing.text.AbstractDocument_SerializationTest           5

javax.swing.text.DefaultStyledDocumentTest            5

javax.swing.text.GapContent_SerializeTest               2



Failures of nio tests (IBMVM and DRLVM but Linux only):

org.apache.harmony.nio.tests.java.nio.channels.DatagramChannelTest    2

org.apache.harmony.nio.tests.java.nio.channels.ServerSocketChannelTest     1

org.apache.harmony.nio.tests.java.nio.channels.SocketChannelTest  1



Failures of net tests:

tests.api.java.net.ConnectExceptionTest      1

tests.api.java.net.ServerSocketTest              1

HttpURLConnectionTest                              7



Current status:

1) The HttpURLConnectionTest was excluded (thanks to Tim)

2) Issue 2438 updated with an exclude list covering the 4 swing tests.
Waiting for integration



So, to increase the stability of CC we need to investigate 5 tests (3 nio + 2
net) and do significant work to enable the intermittently failing GUI tests.
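For illustration, the exclude-list idea could look roughly like the fragment
below. This is a minimal, hypothetical Ant sketch (the property names
cc.excludes.file, tests.src.dir, and report.dir are my inventions, and the
actual Harmony exclude mechanism may differ): a CC-specific pattern file
keeps the known intermittent tests out of the batch run without touching the
main exclude lists.

```xml
<!-- Hypothetical Ant fragment: run the classlib tests while skipping the
     intermittently failing ones listed in a CC-specific exclude file. -->
<junit fork="yes" printsummary="on">
    <batchtest todir="${report.dir}">
        <fileset dir="${tests.src.dir}" includes="**/*Test.java">
            <!-- cc.excludes.file holds one pattern per line, e.g.
                 javax/swing/text/DefaultStyledDocumentTest.java -->
            <excludesfile name="${cc.excludes.file}"/>
        </fileset>
    </batchtest>
</junit>
```

Fixing an intermittent test would then just mean deleting its line from the
CC exclude file, with no build-file change.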



On 12/18/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:



Mikhail Loenko wrote:
> 2006/12/18, Geir Magnusson Jr. <[EMAIL PROTECTED] >:
>>
>>
>> Mikhail Loenko wrote:
>> > 2006/12/1, Geir Magnusson Jr. <[EMAIL PROTECTED]>:
>> >>
>> >>
>> >> Mikhail Loenko wrote:
>> >> > 4) We have cruise controls running classlibrary tests on DRLVM. We
>> >> > need to decide what will we do when DRLVM+Classlib cruise control
>> >> > reports failure.
>> >>
>> >> Stop and fix the problem.  Is there really a question here?  I agree
>> >
>> > Yes, there is a question here. "Stop and fix" includes "discuss". But
>> > as we now know discussion may take several days. And while some people
>> > discuss what the problem is, other people can't proceed with
>> > development and patch integration.
>> >
>> > To have better pace and better CC up-time we need something else, not
>> > just "stop and fix". I suggest "revert and continue"
>>
>> What's the difference, other than debating the semantics of "fix" and
>> "revert"?
>>
>> We all agree - but I still don't think you're clearly stating the
>> problem.  I think that the core problem is that we don't immediately
>> react to CC failure.
>>
>> Immediately reacting to CC failure should be the first order of the day
>> here.  Reacting to me is making the decision, quickly, about either
>> rolling back the change ("reverting") or doing something else.  The key
>> is being responsive.
>>
>> It seems that what happens is that we wait, and then sets of changes
>> pile up, and I think that doing mass rollbacks at that point will solve
>> it, but make a mess.
>>
>> The example of what I envision is when I broke the build in DRLVM,
>> Gregory told me immediately, and I fixed immediately - w/o a rollback.
>>
>>
>> All I'm saying is :
>>
>> 1) We need to be far better with reaction time
>
> I would say we need to be far better with fixing/reverting time.
> If we reacted immediately and then discussed for two weeks -- we would
> not be better than where we are now

Yes, fixing/reverting is included. It's what I meant.

>
>>
>> 2) We have intelligent people - we can be agile in this by making
>> decisions (quickly!) on a case by case basis what to do.
>>
>> I'll also suggest that we ask each committer to check the CC event
>> stream before committing, so you don't commit into a bad state of
>> things.
>>
>> One of my problems is that I don't trust the CC stream, and don't
>> clearly see it because it's mixed in with the other dreck of the commits@
>> list.
>
> The problem is intermittent failures. I suggest that we exclude graphics
> tests from CCs and probably have CC-specific exclude lists for networking
> tests (or fix all the known intermittent failures right now :)

good idea - works for me.

We need to drive into stability - we've made amazing progress in the
last two months, and now we're down to the really, really hard stuff.  I
think that excluding them to get rock-solid CC reporting is step 0,
and then step 1 is to try to grind out the intermittent failures.

geir
