Re: Stop nodes after test by default - IGNITE-6842

Dmitry Pavlov Tue, 06 Mar 2018 09:12:32 -0800

Hi Maxim,

are there any news on these test fails?


Is issue ready for review?

Sincerely,
Dmitiry Pavlov

вт, 27 февр. 2018 г. в 17:12, Dmitry Pavlov <[email protected]>:

> Hi, thank you!
>
> I've found several suspicious fails: such test fails have rate less than
> 1%, it is probably new failures.
>
> It would be great if we can fix it before merge. Could you address this
> fails?
>
> Sincerely,
> Dmitriy Pavlov
>
> IgniteCacheTestSuite5: IgniteCacheStoreCollectionTest.testStoreMap (fail
> rate 0,0%)
> IgniteCacheTestSuite5:
> CacheLateAffinityAssignmentTest.testDelayAssignmentClientJoin (fail rate
> 0,0%)
> IgniteCacheWithIndexingTestSuite:
> CacheRandomOperationsMultithreadedTest.testAtomicOffheapEviction (fail rate
> 0,0%)
> IgniteCacheWithIndexingTestSuite:
> CacheRandomOperationsMultithreadedTest.testAtomicOffheapEvictionIndexing
> (fail rate 0,0%)
> IgniteCacheWithIndexingTestSuite:
> CacheRandomOperationsMultithreadedTest.testTxOffheapEviction (fail rate
> 0,0%)
> IgniteCacheWithIndexingTestSuite:
> CacheRandomOperationsMultithreadedTest.testTxOffheapEvictionIndexing (fail
> rate 0,0%)
>
> IgniteBinarySimpleNameMapperCacheFullApiTestSuite:
> GridCachePartitionedNearDisabledMultiNodeWithGroupFullApiSelfTest.testWithSkipStoreTx
> (fail rate 0,0%)
>
> вт, 27 февр. 2018 г. в 17:04, Maxim Muzafarov <[email protected]>:
>
>> Yep, link may not be correct.
>>
>> Here is correct version:
>> TC: *
>> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&branch_IgniteTests24Java8=pull%2F3542%2Fhead
>> <
>> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8&branch_IgniteTests24Java8=pull%2F3542%2Fhead
>> >*
>>
>>
>> вт, 27 февр. 2018 г. в 16:41, Dmitry Pavlov <[email protected]>:
>>
>> > Hi Maxim,
>> >
>> > could you please provide link to TC run on your PR? It seems link
>> provided
>> > points to run of master. In changes field you may select pull/3542/head
>> > before starting RunAll.
>> >
>> > Igniters,
>> >
>> > this change is related to our test framework, so change may affect your
>> > tests. Please join to review
>> > https://reviews.ignite.apache.org/ignite/review/IGNT-CR-502
>> >
>> > Sincerely,
>> > Dmitriy Pavlov
>> >
>> > вт, 27 февр. 2018 г. в 16:14, Maxim Muzafarov <[email protected]>:
>> >
>> > > Hi all,
>> > >
>> > > I think, I've done with this issue, what should we do next?
>> > >
>> > > PR: https://github.com/apache/ignite/pull/3542
>> > > Upsource: https://reviews.ignite.apache.org/ignite/review/IGNT-CR-502
>> > > TC:
>> > >
>> > >
>> >
>> https://ci.ignite.apache.org/viewModification.html?modId=723895&personal=false&buildTypeId=&tab=vcsModificationTests
>> > > JIRA: https://issues.apache.org/jira/browse/IGNITE-6842
>> > >
>> > >
>> > > чт, 22 февр. 2018 г. в 14:12, Dmitry Pavlov <[email protected]>:
>> > >
>> > > > Hi Maxim,
>> > > >
>> > > > Thank you.
>> > > >
>> > > > To my mind stopAllGrids call should be kept in afterTestsStop(). If
>> > > > developer forgot to call super(), he will see exception from
>> > > > beforeTestsStart()
>> > > > added by you.
>> > > >
>> > > > If you think patch is ready to be reviewed, please change JIRA
>> status
>> > to
>> > > > "Patch Available".
>> > > >
>> > > > Optionally you can create Upsource review. It would be easier for
>> > > multiple
>> > > > reviewers to handle this patch. This test framework is used by all
>> > > Igniters
>> > > > so Upsource would be good option (
>> https://reviews.ignite.apache.org/
>> > ).
>> > > >
>> > > > Sincerely,
>> > > > Dmitriy Pavlov
>> > > >
>> > > > чт, 22 февр. 2018 г. в 13:44, Maxim Muzafarov <[email protected]>:
>> > > >
>> > > > > Hi all,
>> > > > >
>> > > > > I've made some changes based on our previous discusstions, please
>> see
>> > > PR
>> > > > > [1]:
>> > > > > 1) Remove duplicated code for stopGrid (by index and by name);
>> > > > > 2) Add new method that thows exception if not all grids haven't
>> > stopped
>> > > > > correctly;
>> > > > > 3)  Change tests that have been affected by this changes;
>> > > > >
>> > > > > Also, I have some thoughts for clarification:
>> > > > > 1) beforeTestsStart() - I expect here in subtests that grids are
>> not
>> > > > > started yet. Am I right?
>> > > > > 2) I think we should call stopAllGrids in tearDown method. So, if
>> in
>> > > some
>> > > > > cases we'll override afterTestsStop in subclasses and forgot to
>> call
>> > > > > super() it won't lead us to exception.
>> > > > >
>> > > > > [1] https://github.com/apache/ignite/pull/3542
>> > > > > [2]
>> https://ci.ignite.apache.org/viewModification.html?modId=717275
>> > > > > [3] https://issues.apache.org/jira/browse/IGNITE-6842
>> > > > >
>> > > > >
>> > > > > ср, 7 февр. 2018 г. в 18:28, Maxim Muzafarov <[email protected]
>> >:
>> > > > >
>> > > > > > Thank you all,
>> > > > > >
>> > > > > > I'll add this comment's for JIRA ticket, if you don't mind.
>> > > > > >
>> > > > > > ср, 7 февр. 2018 г. в 15:16, Dmitry Pavlov <
>> [email protected]
>> > >:
>> > > > > >
>> > > > > >> Yes, this solution allows to cover both cases:
>> > > > > >> a) not stopped node from previous test and
>> > > > > >> b) allows to remove useless code that stops Ignite nodes from
>> each
>> > > > test.
>> > > > > >>
>> > > > > >> ср, 7 февр. 2018 г. в 15:13, Anton Vinogradov <
>> > > > [email protected]
>> > > > > >:
>> > > > > >>
>> > > > > >> > Maxim,
>> > > > > >> >
>> > > > > >> > We discussed with Dima privately, and decided
>> > > > > >> >
>> > > > > >> > 1) We have to assert that there is no alive nodes at
>> > > > > GridAbstractTest's
>> > > > > >> > beforeTestsStarted
>> > > > > >> > 2) We have to kill all alive nodes (without force) at
>> > > > > GridAbstractTest's
>> > > > > >> > afterTestsStopped
>> > > > > >> > 3) In case of any exceptions at #2 we have to see test error
>> > > > > >> > 4) We can get rid of all useless stopAllGrids at
>> > > GridAbstractTest's
>> > > > > >> > subclasses.
>> > > > > >> >
>> > > > > >> >
>> > > > > >> >
>> > > > > >> > On Wed, Feb 7, 2018 at 2:32 PM, Dmitry Pavlov <
>> > > > [email protected]>
>> > > > > >> > wrote:
>> > > > > >> >
>> > > > > >> > > > Let's just add stopAllGrids(flase) to GridAbstractTest 's
>> > > > > >> > > afterTestsStopped
>> > > > > >> > > method body.
>> > > > > >> > >
>> > > > > >> > > Can't agree with it becase implicit silent shutdown of
>> nodes
>> > > from
>> > > > > test
>> > > > > >> > > framework may cause a lot of bugs introduced to Ignite.
>> > > > > >> > >
>> > > > > >> > > I think stop+test fail should occur.
>> > > > > >> > > In that case author of incorrect test or not consistent
>> Ignite
>> > > > > >> shutdown
>> > > > > >> > > will see problem.
>> > > > > >> > >
>> > > > > >> > > 'Fail fast' if always better than hidding real problem
>> under
>> > > > > automatic
>> > > > > >> > > framework feature.
>> > > > > >> > >
>> > > > > >> > > ср, 7 февр. 2018 г. в 14:05, Anton Vinogradov <
>> > > > > >> [email protected]
>> > > > > >> > >:
>> > > > > >> > >
>> > > > > >> > > > Hi all,
>> > > > > >> > > >
>> > > > > >> > > > > I've made some research about this problem and i think
>> > that
>> > > in
>> > > > > >> > general
>> > > > > >> > > we
>> > > > > >> > > > > should move stopAllGrids method in GridAbstractTest
>> class
>> > to
>> > > > > >> > > > > afterTestsStopped method with some changes. Am I right?
>> > > > > >> > > > Let's just add stopAllGrids(flase) to GridAbstractTest 's
>> > > > > >> > > > afterTestsStopped method
>> > > > > >> > > > body.
>> > > > > >> > > >
>> > > > > >> > > > > I have a question about stopAllGrids(boolean cancel)
>> this
>> > > > > "cancel"
>> > > > > >> > > > That's just a flag means "do not wait for any operations
>> > > finish"
>> > > > > >> > > > See no reason to set it true in that case.
>> > > > > >> > > >
>> > > > > >> > > > > If you delegate closing to afterTestsStopped this will
>> > > affect
>> > > > > only
>> > > > > >> > > > > last test (method).
>> > > > > >> > > > The idea is to stop all nodes at last test's method
>> finish.
>> > > > > >> > > >
>> > > > > >> > > > >  Nodes that survive between tests can affect successive
>> > > > > >> > > > tests.
>> > > > > >> > > > Thats ok. we have a lot tests where nodes reusable
>> between
>> > > > test's
>> > > > > >> > > methods.
>> > > > > >> > > >
>> > > > > >> > > >
>> > > > > >> > > > On Wed, Feb 7, 2018 at 1:20 PM, Dmitry Pavlov <
>> > > > > >> [email protected]>
>> > > > > >> > > > wrote:
>> > > > > >> > > >
>> > > > > >> > > > > Hi Igniters,
>> > > > > >> > > > >
>> > > > > >> > > > > IMO nodes that survive between tests is not general
>> > practice
>> > > > and
>> > > > > >> I'm
>> > > > > >> > > not
>> > > > > >> > > > > sure is a best practice. I suggest to mark such tests
>> with
>> > > > some
>> > > > > >> > method
>> > > > > >> > > > > overriden from AbstractTest.
>> > > > > >> > > > >
>> > > > > >> > > > > About cancel flag, please note stopAllGrids(boolean
>> > cancel)
>> > > > > >> > > cancel=false
>> > > > > >> > > > > allows to wait of checkpoint ends in case persistence
>> > > enabled.
>> > > > > >> > > > >
>> > > > > >> > > > > I still suggest to avoid stopping any nodes by test,
>> but
>> > > > > validate
>> > > > > >> not
>> > > > > >> > > > > stopped node exist and fail test instead of siltent
>> > implicit
>> > > > > >> actions.
>> > > > > >> > > > >
>> > > > > >> > > > > Sincerely,
>> > > > > >> > > > > Dmitriy Pavlov
>> > > > > >> > > > >
>> > > > > >> > > > > ср, 7 февр. 2018 г. в 13:04, Andrey Kuznetsov <
>> > > > > [email protected]
>> > > > > >> >:
>> > > > > >> > > > >
>> > > > > >> > > > > > Hi Maxim,
>> > > > > >> > > > > >
>> > > > > >> > > > > > Regarding your first question, the use of
>> > > afterTestsStopped
>> > > > is
>> > > > > >> not
>> > > > > >> > > > enough
>> > > > > >> > > > > > to stop all nodes, since each individual test
>> (method)
>> > can
>> > > > > start
>> > > > > >> > > custom
>> > > > > >> > > > > set
>> > > > > >> > > > > > of notes during its operation, and this very test
>> should
>> > > > stop
>> > > > > >> all
>> > > > > >> > > those
>> > > > > >> > > > > > nodes. If you delegate closing to afterTestsStopped
>> this
>> > > > will
>> > > > > >> > affect
>> > > > > >> > > > only
>> > > > > >> > > > > > last test (method). Nodes that survive between tests
>> can
>> > > > > affect
>> > > > > >> > > > > successive
>> > > > > >> > > > > > tests.
>> > > > > >> > > > > >
>> > > > > >> > > > > > 2018-02-07 1:10 GMT+03:00 Maxim Muzafarov <
>> > > > [email protected]
>> > > > > >:
>> > > > > >> > > > > >
>> > > > > >> > > > > > > Hello,
>> > > > > >> > > > > > >
>> > > > > >> > > > > > > I've made some research about this problem and i
>> think
>> > > > that
>> > > > > in
>> > > > > >> > > > general
>> > > > > >> > > > > we
>> > > > > >> > > > > > > should move stopAllGrids method in GridAbstractTest
>> > > class
>> > > > to
>> > > > > >> > > > > > > afterTestsStopped method with some changes. Am I
>> > right?
>> > > > > >> > > > > > >
>> > > > > >> > > > > > > Also, I have a question about stopAllGrids(boolean
>> > > cancel)
>> > > > > >> this
>> > > > > >> > > > > "cancel"
>> > > > > >> > > > > > > argument. Why in some cases we should interrupt
>> > > ComputeJob
>> > > > > >> and in
>> > > > > >> > > > some
>> > > > > >> > > > > > > cases shouldn't? For example here:
>> > > > > >> > > > > > >
>> IgniteBaselineAffinityTopologyActivationTest#afterTest
>> > > > > >> > > > > > > we call method stopAllGrids(false) this way. Why
>> not
>> > > > "true"
>> > > > > >> > > argument
>> > > > > >> > > > > > > instead?
>> > > > > >> > > > > > >
>> > > > > >> > > > > > >
>> > > > > >> > > > > > > --
>> > > > > >> > > > > > Best regards,
>> > > > > >> > > > > >   Andrey Kuznetsov.
>> > > > > >> > > > > >
>> > > > > >> > > > >
>> > > > > >> > > >
>> > > > > >> > >
>> > > > > >> >
>> > > > > >>
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>

Re: Stop nodes after test by default - IGNITE-6842

Reply via email to