Re: IgniteSemaphore and failoverSafe flag

Dmitry Karachentsev Fri, 14 Apr 2017 07:50:11 -0700

It's not 100% reproducible; to get it to fail locally I ran it many times in a loop (an IntelliJ IDEA feature).

N.B. This test was muted before the fix, so yes, the fix may not be the cause.


Thanks!

14.04.2017 17:23, Vladisav Jelisavcic wrote:

Hmm, I cannot reproduce this behavior locally,

my guess is that the interrupt flag is not always cleared properly in the #GridCacheSemaphore.acquire method (but it doesn't have anything to do with the latest fix)


Can you make it reproducible?
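
The interrupt-flag guess above can be illustrated with the plain java.util.concurrent.Semaphore. This is only a sketch of documented JDK behaviour, not of GridCacheSemaphore, whose internals are not shown in this thread; the class and method names are made up for illustration. It shows that acquire() clears the flag when it throws InterruptedException, and that a flag left set by earlier code makes a later, unrelated acquire() fail even though a permit is free.

```java
import java.util.concurrent.Semaphore;

// Hypothetical demo class; uses the JDK semaphore, not Ignite's GridCacheSemaphore.
class InterruptFlagDemo {
    /** Returns the interrupt flag observed right after acquire() throws. */
    static boolean flagAfterInterruptedAcquire() {
        Semaphore sem = new Semaphore(0);
        Thread.currentThread().interrupt();      // flag is set before the blocking call
        try {
            sem.acquire();                       // throws at once because the flag is set
            throw new IllegalStateException("acquire() returned without a permit");
        } catch (InterruptedException e) {
            // acquire() cleared the flag before throwing.
            return Thread.currentThread().isInterrupted();
        }
    }

    /** Returns true if a stale flag makes acquire() fail although a permit is free. */
    static boolean staleFlagBreaksLaterAcquire() {
        Semaphore sem = new Semaphore(1);
        Thread.currentThread().interrupt();      // simulates a flag that was never cleared
        try {
            sem.acquire();
            sem.release();
            Thread.interrupted();                // defensive clear, not expected to be reached
            return false;
        } catch (InterruptedException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("flag after interrupted acquire: " + flagAfterInterruptedAcquire());
        System.out.println("stale flag breaks later acquire: " + staleFlagBreaksLaterAcquire());
    }
}
```

If a code path swallows an interrupt and leaves the flag set, a later blocking call in the same thread can fail in this way, which would be consistent with an intermittent failure on close.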

On Fri, Apr 14, 2017 at 2:46 PM, Dmitry Karachentsev <dkarachent...@gridgain.com> wrote:


    Vladisav,

    One more thing: this test [1] started failing on semaphore close
    when this fix [2] was introduced.
    Could you please check it?

    [1] http://ci.ignite.apache.org/viewLog.html?buildId=547151&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteDataStrucutures#testNameId-979977708202725050
    [2] https://issues.apache.org/jira/browse/IGNITE-1977

    Thanks!

    14.04.2017 15:27, Dmitry Karachentsev wrote:

    Vladisav,

    Yep, you're right. I'll fix it.

    Thanks!

    14.04.2017 15:18, Vladisav Jelisavcic wrote:

    Hi Dmitry,

    it looks to me that this test is not valid: after semaphore 2
    fails, the permits are redistributed, so the expected number of
    permits should really be 20, not 10. Do you agree?

    I guess that before the latest fix this test was (incorrectly)
    passing because permits weren't released properly.

    What do you think?

    On Fri, Apr 14, 2017 at 11:27 AM, Dmitry Karachentsev
    <dkarachent...@gridgain.com> wrote:

        Hi Vladisav,

        It looks like these tests [1] started failing after the fix
        was merged. Could you please take a look?

        [1] http://ci.ignite.apache.org/viewLog.html?buildId=544238&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteBinaryObjectsDataStrucutures

        Thanks!

        -Dmitry.

        13.04.2017 16:15, Dmitry Karachentsev wrote:

        Thanks a lot!

        12.04.2017 16:35, Vladisav Jelisavcic wrote:

        Hi Dmitry,

        sure, I made a fix; take a look at the PR and the comments
        in the ticket.

        Best regards,
        Vladisav

        On Tue, Apr 11, 2017 at 3:00 PM, Dmitry Karachentsev
        <dkarachent...@gridgain.com> wrote:

            Hi Vladisav,

            Thanks for your contribution! But it seems it doesn't fix
            the related tickets, in particular [1].
            Could you please take a look?

            [1] https://issues.apache.org/jira/browse/IGNITE-4173

            Thanks!

            06.04.2017 16:27, Vladisav Jelisavcic wrote:

            Hey Dmitry,

            sorry for the late reply, I'll try to prepare a PR later
            today.

            Best regards,
            Vladisav



            On Tue, Apr 4, 2017 at 11:05 AM, Dmitry Karachentsev
            <dkarachent...@gridgain.com> wrote:

                Hi Vladisav,

                I see you've been working on [1] for a while; have you
                had a chance to fix it? If not, is there any
                estimate?

                [1] https://issues.apache.org/jira/browse/IGNITE-1977

                Thanks!

                -Dmitry.



                20.03.2017 10:28, Alexey Goncharuk wrote:

                    I think re-creation should be handled by a
                    user who will make sure that
                    nobody else is currently executing the
                    guarded logic before the
                    re-creation. This is exactly the same
                    semantics as with
                    BrokenBarrierException for j.u.c.CyclicBarrier.
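
The CyclicBarrier comparison above can be sketched in a few lines. This is a minimal illustration of the JDK behaviour only; the class and method names are hypothetical, and nothing here touches Ignite. A lone waiter times out, which breaks the barrier; every later await() then fails with BrokenBarrierException until the caller decides it is safe to call reset().

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class for the j.u.c.CyclicBarrier analogy.
class BrokenBarrierDemo {
    /** Breaks a two-party barrier by letting a lone waiter time out. */
    static CyclicBarrier brokenBarrier() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        try {
            barrier.await(10, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Expected: nobody else arrived, so the barrier is now broken.
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
        return barrier;
    }

    /** Returns true if await() reports the broken state instead of waiting. */
    static boolean awaitReportsBroken(CyclicBarrier barrier) {
        try {
            barrier.await(10, TimeUnit.MILLISECONDS);
            return false;
        } catch (BrokenBarrierException e) {
            return true;
        } catch (InterruptedException | TimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        CyclicBarrier barrier = brokenBarrier();
        System.out.println("await reports broken: " + awaitReportsBroken(barrier));
        // Re-creation is an explicit user decision, made once no one else is waiting.
        barrier.reset();
        System.out.println("broken after reset: " + barrier.isBroken());
    }
}
```

The barrier never repairs itself; the reset is left to the user, which is the same division of responsibility proposed here for the lock.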

                    2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic
                    <vladis...@gmail.com>:

                        Hi everyone,

                        I agree with Val, he's got a point;
                        recreating the lock doesn't seem possible
                        (at least not with the transactional
                        cache lock/semaphore we have).
                        Is this re-create behavior really needed?

                        Best regards,
                        Vladisav



                        On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko
                        <valentin.kuliche...@gmail.com> wrote:

                            Guys,

                            How does recreation of the lock help? My
                            understanding is that the scenario is the
                            following:

                            1. Client A creates and acquires a lock, and then
                               starts to execute guarded logic.
                            2. Client B tries to acquire the same lock and
                               parks to wait.
                            3. Before client A unlocks, all affinity nodes for
                               the lock fail, and the lock disappears from the
                               cache.
                            4. Client B fails with an exception, recreates the
                               lock, acquires it, and starts to execute guarded
                               logic concurrently with client A.

                            In my view this is wrong anyway, regardless of
                            whether this happens silently or with an exception
                            handled in the user's code, because this code
                            doesn't have any way to know whether client A still
                            holds the lock or not.

                            Am I missing something?

                            -Val
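
The four steps above can be modelled without a cluster. The sketch below is a single-threaded toy, not Ignite code: a local java.util.concurrent.Semaphore with one permit stands in for the cache entry backing the lock, and replacing that object stands in for losing all affinity nodes and re-creating the lock. All names are hypothetical.

```java
import java.util.concurrent.Semaphore;

// Hypothetical toy model of the scenario; no Ignite APIs are involved.
class RecreatedLockDemo {
    /** Returns how many clients believe they own the lock after it is re-created. */
    static int ownersAfterRecreation() {
        int owners = 0;

        // 1. Client A creates and acquires the lock.
        Semaphore entry = new Semaphore(1);
        if (entry.tryAcquire()) {
            owners++;
        }

        // 2. Client B tries to acquire the same lock and cannot.
        boolean clientBWaiting = !entry.tryAcquire();

        // 3. All affinity nodes fail: the state that recorded A's ownership is gone.
        entry = null;

        // 4. Client B re-creates the lock and acquires the fresh instance.
        entry = new Semaphore(1);
        if (clientBWaiting && entry.tryAcquire()) {
            owners++;
        }

        // Client A never released anything and is still inside its guarded logic.
        return owners;
    }

    public static void main(String[] args) {
        System.out.println("clients inside guarded logic: " + ownersAfterRecreation());
    }
}
```

The fresh lock has no record of client A, so nothing in client B's code can detect that A is still running.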

                            On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy Setrakyan
                            <dsetrak...@apache.org> wrote:

                                On Tue, Mar 14, 2017 at 12:46 AM, Alexey Goncharuk
                                <alexey.goncha...@gmail.com> wrote:

                                        Which user operation would result in
                                        an exception? To my knowledge, the user
                                        may already be holding the lock and not
                                        invoking any Ignite APIs, no?

                                    Yes, this is exactly my point.

                                    Imagine that a node already holds a lock
                                    and another node is waiting for the lock.
                                    If all partition nodes leave the grid and
                                    the lock is re-created, this second node
                                    will immediately acquire the lock and we
                                    will have two lock owners. I think in this
                                    case this second node (blocked on lock())
                                    should get an exception saying that the
                                    lock was lost (which is, by the way, the
                                    current behavior), and the first node
                                    should get an exception on unlock.

                                Makes sense.
