[ 
https://issues.apache.org/jira/browse/IGNITE-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723513#comment-14723513
 ] 

Denis Magda edited comment on IGNITE-1294 at 8/31/15 2:59 PM:
--------------------------------------------------------------

I don't see an easy way to fully fix the race condition between shmem and TCP 
client creation. Fixing it seems to require more effort.

Thus, for now only the bug fixes on the TCP side have been merged. All the 
changes made to fix the race have been reverted.

Let's return to this task when it becomes a higher-priority issue.

The race is caused by this code snippet, which is part of the 
{{onFirstMessage}} method of {{TcpCommunicationSpi}}.
{noformat}
else {
    boolean reserved = recoveryDesc.tryReserve(msg0.connectCount(),
        new ConnectClosure(ses, recoveryDesc, rmtNode, msg0, !hasShmemClient, fut));

    if (reserved)
        connected(recoveryDesc, ses, rmtNode, msg0.received(), true, !hasShmemClient);
}
{noformat} 
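
For illustration only, the general shape of such a race can be sketched outside 
of Ignite: two paths (shmem and TCP) each build a client for the same node and 
then register it. The sketch below is not the reverted change and is not Ignite 
code; {{SketchClient}} and {{SketchClientRegistry}} are hypothetical names. It 
assumes a single map keyed by node ID and registers through {{putIfAbsent}}, so 
the path that loses closes its own client instead of tripping an assertion. It 
does not model the recovery descriptor reservation done in the snippet above.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for a communication client; not an Ignite class. */
final class SketchClient {
    /** Transport label, e.g. "shmem" or "tcp". */
    final String kind;

    /** Set once the client has been discarded. */
    volatile boolean closed;

    SketchClient(String kind) {
        this.kind = kind;
    }

    void close() {
        closed = true;
    }
}

/** Hypothetical registry that keeps at most one client per remote node. */
final class SketchClientRegistry {
    private final ConcurrentMap<UUID, SketchClient> clients = new ConcurrentHashMap<>();

    /**
     * Registers the candidate unless a client for the node is already present.
     * The losing candidate is closed and the already registered client is
     * returned, so callers never end up with two live clients for one node.
     */
    SketchClient register(UUID nodeId, SketchClient candidate) {
        SketchClient existing = clients.putIfAbsent(nodeId, candidate);

        if (existing == null)
            return candidate; // This path won the race.

        candidate.close(); // Another path registered first; drop our client.

        return existing;
    }
}
```

With this arrangement, whichever path registers second gets the first client 
back and its own client is closed.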



> Assertion in TCP communication SPI: client already created
> ----------------------------------------------------------
>
>                 Key: IGNITE-1294
>                 URL: https://issues.apache.org/jira/browse/IGNITE-1294
>             Project: Ignite
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 1.1.4
>            Reporter: Alexey Goncharuk
>            Assignee: Denis Magda
>         Attachments: ignite-1294.patch
>
>
> Observed this failure on TC in master branch:
> {code}
> [19:39:53]W:           [org.apache.ignite:ignite-core] 
> java.lang.AssertionError: Client already created [
>       node=TcpDiscoveryNode [id=00db22a2-37de-4d41-9a81-1b3ccb7a3000, 
> addrs=[127.0.0.1], sockAddrs=[/127.0.0.1:47500], discPort=47500, order=1, 
> intOrder=1, lastExchangeTime=1440434393018, loc=false, 
> ver=1.4.1#19700101-sha1:00000000, isClient=false],
>       client=GridShmemCommunicationClient 
> [shmem=IpcSharedMemoryClientEndpoint [inSpace=IpcSharedMemorySpace 
> [opSize=262144, shmemPtr=139828001624128, shmemId=815824901, semId=696811527, 
> closed=false, isReader=true, writerPid=23710, readerPid=23710, 
> tokFileName=/opt/TeamcityAgent/temp/buildTmp/ignite/work/ipc/shmem/00db22a2-37de-4d41-9a81-1b3ccb7a3000-23710/gg-shmem-space-1087-23710-262144,
>  closed=false], outSpace=IpcSharedMemorySpace [opSize=262144, 
> shmemPtr=139828001357888, shmemId=815792132, semId=696778758, closed=false, 
> isReader=false, writerPid=23710, readerPid=23710, 
> tokFileName=/opt/TeamcityAgent/temp/buildTmp/ignite/work/ipc/shmem/00db22a2-37de-4d41-9a81-1b3ccb7a3000-23710/gg-shmem-space-1086-23710-262144,
>  closed=false], checkIn=true, checkOut=true], 
> writeBuf=java.nio.HeapByteBuffer[pos=0 lim=8192 cap=8192], 
> formatter=org.apache.ignite.internal.managers.communication.GridIoManager$2@489a1849,
>  super=GridAbstractCommunicationClient [lastUsed=1440434393133, reserves=0]], 
>       oldClient=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl 
> [selectorIdx=0, queueSize=0, writeBuf=java.nio.DirectByteBuffer[pos=0 
> lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 
> cap=32768], recovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, 
> rcvCnt=2, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
> [id=00db22a2-37de-4d41-9a81-1b3ccb7a3000, addrs=[127.0.0.1], 
> sockAddrs=[/127.0.0.1:47500], discPort=47500, order=1, intOrder=1, 
> lastExchangeTime=1440434393018, loc=false, ver=1.4.1#19700101-sha1:00000000, 
> isClient=false], connected=true, connectCnt=0, queueLimit=5120], 
> super=GridNioSessionImpl [locAddr=/127.0.0.1:45254, rmtAddr=/127.0.0.1:53055, 
> createTime=1440434393174, closeTime=0, bytesSent=26, bytesRcvd=345, 
> sndSchedTime=1440434393174, lastSndTime=1440434393174, 
> lastRcvTime=1440434393184, readsPaused=false, 
> filterChain=FilterChain[filters=[GridNioCodecFilter 
> [parser=org.apache.ignite.internal.util.nio.GridDirectParser@1cc9616c, 
> directMode=true], GridConnectionBytesVerifyFilter], accepted=true]], 
> super=GridAbstractCommunicationClient [lastUsed=1440434393174, reserves=0]]]
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:1909)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1840)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1806)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1020)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1168)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:598)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendLocalPartitions(GridDhtPartitionsExchangeFuture.java:932)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendPartitions(GridDhtPartitionsExchangeFuture.java:973)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:839)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1122)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:108)
> [19:39:53]W:           [org.apache.ignite:ignite-core]        at java.lang.Thread.run(Thread.java:745)
> [19:39:53]W:           [org.apache.ignite:ignite-core] Exception in thread 
> "exchange-worker-#15005%replicated.GridCacheSyncReplicatedPreloadSelfTest45%" 
> java.lang.AssertionError: Client already created [node=TcpDiscoveryNode 
> [id=00db22a2-37de-4d41-9a81-1b3ccb7a3000, addrs=[127.0.0.1], 
> sockAddrs=[/127.0.0.1:47500], discPort=47500, order=1, intOrder=1, 
> lastExchangeTime=1440434393018, loc=false, ver=1.4.1#19700101-sha1:00000000, 
> isClient=false], client=GridShmemCommunicationClient 
> [shmem=IpcSharedMemoryClientEndpoint [inSpace=IpcSharedMemorySpace 
> [opSize=262144, shmemPtr=139828001624128, shmemId=815824901, semId=696811527, 
> closed=false, isReader=true, writerPid=23710, readerPid=23710, 
> tokFileName=/opt/TeamcityAgent/temp/buildTmp/ignite/work/ipc/shmem/00db22a2-37de-4d41-9a81-1b3ccb7a3000-23710/gg-shmem-space-1087-23710-262144,
>  closed=false], outSpace=IpcSharedMemorySpace [opSize=262144, 
> shmemPtr=139828001357888, shmemId=815792132, semId=696778758, closed=false, 
> isReader=false, writerPid=23710, readerPid=23710, 
> tokFileName=/opt/TeamcityAgent/temp/buildTmp/ignite/work/ipc/shmem/00db22a2-37de-4d41-9a81-1b3ccb7a3000-23710/gg-shmem-space-1086-23710-262144,
>  closed=false], checkIn=true, checkOut=true], 
> writeBuf=java.nio.HeapByteBuffer[pos=0 lim=8192 cap=8192], 
> formatter=org.apache.ignite.internal.managers.communication.GridIoManager$2@489a1849,
>  super=GridAbstractCommunicationClient [lastUsed=1440434393133, reserves=0]], 
> oldClient=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl 
> [selectorIdx=0, queueSize=0, writeBuf=java.nio.DirectByteBuffer[pos=0 
> lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 
> cap=32768], recovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, 
> rcvCnt=2, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode 
> [id=00db22a2-37de-4d41-9a81-1b3ccb7a3000, addrs=[127.0.0.1], 
> sockAddrs=[/127.0.0.1:47500], discPort=47500, order=1, intOrder=1, 
> lastExchangeTime=1440434393018, loc=false, ver=1.4.1#19700101-sha1:00000000, 
> isClient=false], connected=true, connectCnt=0, queueLimit=5120], 
> super=GridNioSessionImpl [locAddr=/127.0.0.1:45254, rmtAddr=/127.0.0.1:53055, 
> createTime=1440434393174, closeTime=0, bytesSent=26, bytesRcvd=345, 
> sndSchedTime=1440434393174, lastSndTime=1440434393174, 
> lastRcvTime=1440434393184, readsPaused=false, 
> filterChain=FilterChain[filters=[GridNioCodecFilter 
> [parser=org.apache.ignite.internal.util.nio.GridDirectParser@1cc9616c, 
> directMode=true], GridConnectionBytesVerifyFilter], accepted=true]], 
> super=GridAbstractCommunicationClient [lastUsed=1440434393174, reserves=0]]]
> {code}
> Because of this, the exchange hung. It looks like the shmem and TCP clients 
> were created concurrently.


