Hi,

What baseline topology does ./control.sh print? Is it possible that a node outside the baseline started before the baseline nodes did?
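In case it helps with collecting that information, here is a minimal sketch for pulling the baseline summary out of a saved console log. It assumes the "Baseline [id=..., size=..., online=..., offline=...]" format seen in the console output quoted below; the function name and the log path are only illustrative. On a running cluster the baseline itself should be printable with `./control.sh --baseline`, assuming your version ships that option.

```shell
#!/bin/sh
# Print every baseline summary found in an Ignite console log read from stdin.
# The pattern follows the "Baseline [id=7, size=2, online=2, offline=0]" line
# from the quoted console output; other versions may format it differently.
baseline_summary() {
    grep -o 'Baseline \[id=[0-9]*, size=[0-9]*, online=[0-9]*, offline=[0-9]*]'
}

# Example usage (the log path is hypothetical):
#   baseline_summary < /path/to/console.log
#
# On a live cluster, compare with the output of:
#   ./control.sh --baseline
```

A non-zero "offline" count, or a "size" that does not match the number of servers in the topology snapshot, would suggest that a node outside the baseline is involved.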
On Thu, Jun 7, 2018 at 9:54 AM, szj <[email protected]> wrote: > Well, it definitely does work in 2.4. Please notice that there needs to be > ignitevisorcmd.sh involved to trigger this bug (I didn't try with other > clients though). Here's what is printed by Java on the console: > > [09:28:33] > [09:28:33] To start Console Management & Monitoring run > ignitevisorcmd.{sh|bat} > [09:28:33] > [09:28:33] Ignite node started OK (id=ae8697ad) > [09:28:33] Topology snapshot [ver=33, servers=2, clients=0, CPUs=4, > offheap=2.1GB, heap=2.0GB] > [09:28:33] ^-- Node [id=AE8697AD-6421-4C0C-96FE-FC29ED9B6DCA, > clusterState=ACTIVE] > [09:28:33] ^-- Baseline [id=7, size=2, online=2, offline=0] > [09:28:33] Data Regions Configured: > [09:28:33] ^-- default [initSize=256.0 MiB, maxSize=1.4 GiB, > persistenceEnabled=true] > [09:29:25] Ignite node stopped OK [uptime=00:00:51.837] > [09:29:35] __________ ________________ > [09:29:35] / _/ ___/ |/ / _/_ __/ __/ > [09:29:35] _/ // (7 7 // / / / / _/ > [09:29:35] /___/\___/_/|_/___/ /_/ /___/ > [09:29:35] > [09:29:35] ver. 2.5.0#20180523-sha1:86e110c7 > [09:29:35] 2018 Copyright(C) Apache Software Foundation > [09:29:35] > [09:29:35] Ignite documentation: http://ignite.apache.org > [09:29:35] > [09:29:35] Quiet mode. 
> [09:29:35] ^-- Logging to file > '/usr/share/apache-ignite/work/log/ignite-d484e6c6.0.log' > [09:29:35] ^-- Logging by 'JavaLogger [quiet=true, config=null]' > [09:29:35] ^-- To see **FULL** console log here add -DIGNITE_QUIET=false > or "-v" to ignite.{sh|bat} > [09:29:35] > [09:29:35] OS: Linux 2.6.32-696.18.7.el6.x86_64 amd64 > [09:29:35] VM information: OpenJDK Runtime Environment 1.8.0_121-b13 Oracle > Corporation OpenJDK 64-Bit Server VM 25.121-b13 > [09:29:35] Configured plugins: > [09:29:35] ^-- None > [09:29:35] > [09:29:35] Configured failure handler: [hnd=StopNodeOrHaltFailureHandler > [tryStop=false, timeout=0]] > [09:29:35] Message queue limit is set to 0 which may lead to potential > OOMEs > when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to > message queues growth on sender and receiver sides. > [09:29:35] Security status [authentication=off, tls/ssl=off] > [09:29:36,435][SEVERE][tcp-disco-msg-worker-#2][TcpDiscoverySpi] > TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node > in order to prevent cluster wide instability. > class org.apache.ignite.IgniteException: Node with BaselineTopology cannot > join mixed cluster running in compatibility mode > at > org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor. > onGridDataReceived(GridClusterStateProcessor.java:714) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$5. > onExchange(GridDiscoveryManager.java:883) > at > org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi. > onExchange(TcpDiscoverySpi.java:1939) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. > processNodeAddedMessage(ServerImpl.java:4354) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. > processMessage(ServerImpl.java:2744) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. 
> processMessage(ServerImpl.java:2536) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body( > ServerImpl.java:6775) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body( > ServerImpl.java:2621) > at > org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) > [09:29:36,437][SEVERE][tcp-disco-msg-worker-#2][] Critical system error > detected. Will be handled accordingly to configured handler [hnd=class > o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext > [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Node > with > BaselineTopology cannot join mixed cluster running in compatibility mode]] > class org.apache.ignite.IgniteException: Node with BaselineTopology cannot > join mixed cluster running in compatibility mode > at > org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor. > onGridDataReceived(GridClusterStateProcessor.java:714) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$5. > onExchange(GridDiscoveryManager.java:883) > at > org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi. > onExchange(TcpDiscoverySpi.java:1939) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. > processNodeAddedMessage(ServerImpl.java:4354) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. > processMessage(ServerImpl.java:2744) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. 
> processMessage(ServerImpl.java:2536) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body( > ServerImpl.java:6775) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body( > ServerImpl.java:2621) > at > org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) > [09:29:36,438][SEVERE][tcp-disco-msg-worker-#2][] JVM will be halted > immediately due to the failure: [failureCtx=FailureContext > [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Node > with > BaselineTopology cannot join mixed cluster running in compatibility mode]] > > > And here's the log from the log folder (hostnames/IP addresses anonymised). > Apparently the exception is thrown when the other node connects to the one > that is starting (which happens pretty quickly during the startup): > > 09:29:35,205][INFO][main][IgniteKernal] > > >>> __________ ________________ > >>> / _/ ___/ |/ / _/_ __/ __/ > >>> _/ // (7 7 // / / / / _/ > >>> /___/\___/_/|_/___/ /_/ /___/ > >>> > >>> ver. 
2.5.0#20180523-sha1:86e110c7 > >>> 2018 Copyright(C) Apache Software Foundation > >>> > >>> Ignite documentation: http://ignite.apache.org > > [09:29:35,217][INFO][main][IgniteKernal] Config URL: > file:/etc/apache-ignite/saltstack.xml > [09:29:35,244][INFO][main][IgniteKernal] IgniteConfiguration > [igniteInstanceName=null, pubPoolSize=8, svcPoolSize=8, callbackPoolSize=8, > stripedPoolSize=8, sysPoolSize=8, mgmtPoolSize=4, igfsPoolSize=2, > dataStreamerPoolSize=8, utilityCachePoolSize=8, > utilityCacheKeepAliveTime=60000, p2pPoolSize=2, qryPoolSize=8, > igniteHome=/usr/share/apache-ignite, > igniteWorkDir=/usr/share/apache-ignite/work, > mbeanSrv=com.sun.jmx.mbeanserver.JmxMBeanServer@6f94fa3e, > nodeId=d484e6c6-5d54-4d13-8704-6b2cea991cde, > marsh=org.apache.ignite.internal.binary.BinaryMarshaller@7fa98a66, > marshLocJobs=false, daemon=false, p2pEnabled=false, netTimeout=5000, > sndRetryDelay=1000, sndRetryCnt=3, metricsHistSize=10000, > metricsUpdateFreq=2000, metricsExpTime=9223372036854775807, > discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=0, ackTimeout=0, > marsh=null, reconCnt=10, reconDelay=2000, maxAckTimeout=600000, > forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null], > segPlc=STOP, segResolveAttempts=2, waitForSegOnStart=true, > allResolversPassReq=true, segChkFreq=10000, commSpi=TcpCommunicationSpi > [connectGate=null, connPlc=null, enableForcibleNodeKill=false, > enableTroubleshootingLog=false, > srvLsnr=org.apache.ignite.spi.communication.tcp. 
> TcpCommunicationSpi$2@661972b0, > locAddr=null, locHost=null, locPort=47100, locPortRange=100, shmemPort=-1, > directBuf=true, directSndBuf=false, idleConnTimeout=600000, > connTimeout=5000, maxConnTimeout=600000, reconCnt=10, sockSndBuf=32768, > sockRcvBuf=32768, msgQueueLimit=0, slowClientQueueLimit=0, nioSrvr=null, > shmemSrv=null, usePairedConnections=false, connectionsPerNode=1, > tcpNoDelay=true, filterReachableAddresses=false, ackSndThreshold=32, > unackedMsgsBufSize=0, sockWriteTimeout=2000, lsnr=null, boundTcpPort=-1, > boundTcpShmemPort=-1, selectorsCnt=4, selectorSpins=0, addrRslvr=null, > ctxInitLatch=java.util.concurrent.CountDownLatch@5af3afd9[Count = 1], > stopping=false, > metricsLsnr=org.apache.ignite.spi.communication.tcp. > TcpCommunicationMetricsListener@323b36e0], > evtSpi=org.apache.ignite.spi.eventstorage.NoopEventStorageSpi@44ebcd03, > colSpi=NoopCollisionSpi [], deploySpi=LocalDeploymentSpi [lsnr=null], > indexingSpi=org.apache.ignite.spi.indexing.noop.NoopIndexingSpi@6356695f, > addrRslvr=null, clientMode=false, rebalanceThreadPoolSize=1, > txCfg=org.apache.ignite.configuration.TransactionConfiguration@4f18837a, > cacheSanityCheckEnabled=true, discoStartupDelay=60000, deployMode=SHARED, > p2pMissedCacheSize=100, locHost=null, timeSrvPortBase=31100, > timeSrvPortRange=100, failureDetectionTimeout=10000, > clientFailureDetectionTimeout=30000, metricsLogFreq=60000, hadoopCfg=null, > connectorCfg=org.apache.ignite.configuration.ConnectorConfiguration@ > 359f7cdf, > odbcCfg=null, warmupClos=null, atomicCfg=AtomicConfiguration > [seqReserveSize=1000, cacheMode=PARTITIONED, backups=1, aff=null, > grpName=null], classLdr=null, sslCtxFactory=null, platformCfg=null, > binaryCfg=null, memCfg=null, pstCfg=null, dsCfg=DataStorageConfiguration > [sysRegionInitSize=41943040, sysCacheMaxSize=104857600, pageSize=0, > concLvl=0, dfltDataRegConf=DataRegionConfiguration [name=default, > maxSize=1515879628, initSize=268435456, swapPath=null, > 
pageEvictionMode=DISABLED, evictionThreshold=0.9, emptyPagesPoolSize=100, > metricsEnabled=false, metricsSubIntervalCount=5, > metricsRateTimeInterval=60000, persistenceEnabled=true, > checkpointPageBufSize=0], storagePath=null, checkpointFreq=180000, > lockWaitTime=10000, checkpointThreads=4, checkpointWriteOrder=SEQUENTIAL, > walHistSize=20, walSegments=10, walSegmentSize=67108864, walPath=db/wal, > walArchivePath=db/wal/archive, metricsEnabled=false, walMode=LOG_ONLY, > walTlbSize=131072, walBuffSize=0, walFlushFreq=2000, walFsyncDelay=1000, > walRecordIterBuffSize=67108864, alwaysWriteFullPages=false, > fileIOFactory=org.apache.ignite.internal.processors. > cache.persistence.file.AsyncFileIOFactory@4f6ee6e4, > metricsSubIntervalCnt=5, metricsRateTimeInterval=60000, > walAutoArchiveAfterInactivity=-1, writeThrottlingEnabled=false, > walCompactionEnabled=false], activeOnStart=true, autoActivation=true, > longQryWarnTimeout=3000, sqlConnCfg=null, > cliConnCfg=ClientConnectorConfiguration [host=null, port=10800, > portRange=100, sockSndBufSize=0, sockRcvBufSize=0, tcpNoDelay=true, > maxOpenCursorsPerConn=128, threadPoolSize=8, idleTimeout=0, > jdbcEnabled=true, odbcEnabled=true, thinCliEnabled=true, sslEnabled=false, > useIgniteSslCtxFactory=true, sslClientAuth=false, sslCtxFactory=null], > authEnabled=false, failureHnd=null, commFailureRslvr=null] > [09:29:35,245][INFO][main][IgniteKernal] Daemon mode: off > [09:29:35,246][INFO][main][IgniteKernal] OS: Linux > 2.6.32-696.18.7.el6.x86_64 amd64 > [09:29:35,246][INFO][main][IgniteKernal] OS user: saltmndb > [09:29:35,246][INFO][main][IgniteKernal] PID: 30531 > [09:29:35,247][INFO][main][IgniteKernal] Language runtime: Java Platform > API > Specification ver. 
1.8 > [09:29:35,247][INFO][main][IgniteKernal] VM information: OpenJDK Runtime > Environment 1.8.0_121-b13 Oracle Corporation OpenJDK 64-Bit Server VM > 25.121-b13 > [09:29:35,249][INFO][main][IgniteKernal] VM total memory: 0.96GB > [09:29:35,249][INFO][main][IgniteKernal] Remote Management [restart: on, > REST: on, JMX (remote: on, port: 49171, auth: off, ssl: off)] > [09:29:35,250][INFO][main][IgniteKernal] Logger: JavaLogger [quiet=true, > config=null] > [09:29:35,250][INFO][main][IgniteKernal] > IGNITE_HOME=/usr/share/apache-ignite > [09:29:35,250][INFO][main][IgniteKernal] VM arguments: [-Xms1g, -Xmx1g, > -XX:+AggressiveOpts, -XX:MaxMetaspaceSize=256m, -DIGNITE_QUIET=true, > -DIGNITE_SUCCESS_FILE=/usr/share/apache-ignite/work/ > ignite_success_bff20ab1-13bd-44a4-a70c-2ae8d90e8486, > -Dcom.sun.management.jmxremote, -Dcom.sun.management.jmxremote.port=49171, > -Dcom.sun.management.jmxremote.authenticate=false, > -Dcom.sun.management.jmxremote.ssl=false, > -DIGNITE_HOME=/usr/share/apache-ignite, > -DIGNITE_PROG_NAME=/usr/share/apache-ignite/bin/ignite.sh] > [09:29:35,250][INFO][main][IgniteKernal] System cache's DataRegion size is > configured to 40 MB. Use DataStorageConfiguration.systemCacheMemorySize > property to change the setting. 
> [09:29:35,259][INFO][main][IgniteKernal] Configured caches [in 'sysMemPlc' > dataRegion: ['ignite-sys-cache']] > [09:29:35,276][INFO][main][IgniteKernal] 3-rd party licenses can be found > at: /usr/share/apache-ignite/libs/licenses > [09:29:35,353][INFO][main][IgnitePluginProcessor] Configured plugins: > [09:29:35,353][INFO][main][IgnitePluginProcessor] ^-- None > [09:29:35,354][INFO][main][IgnitePluginProcessor] > [09:29:35,356][INFO][main][FailureProcessor] Configured failure handler: > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0]] > [09:29:35,418][INFO][main][TcpCommunicationSpi] Successfully bound > communication NIO server to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0, > selectorsCnt=4, selectorSpins=0, pairedConn=false] > [09:29:35,422][WARNING][main][TcpCommunicationSpi] Message queue limit is > set to 0 which may lead to potential OOMEs when running cache operations in > FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and > receiver sides. > [09:29:35,450][WARNING][main][NoopCheckpointSpi] Checkpoints are disabled > (to enable configure any GridCheckpointSpi implementation) > [09:29:35,477][WARNING][main][GridCollisionManager] Collision resolution > is > disabled (all jobs will be activated upon arrival). 
> [09:29:35,478][INFO][main][IgniteKernal] Security status > [authentication=off, tls/ssl=off] > [09:29:35,515][INFO][main][CacheObjectBinaryProcessorImpl] Resolved > directory for serialized binary metadata: > /usr/share/apache-ignite/work/binary_meta/hostname123 > [09:29:35,816][INFO][main][FilePageStoreManager] Resolved page store work > directory: /usr/share/apache-ignite/work/db/hostname123 > [09:29:35,816][INFO][main][FileWriteAheadLogManager] Resolved write ahead > log work directory: /usr/share/apache-ignite/work/db/wal/hostname123 > [09:29:35,816][INFO][main][FileWriteAheadLogManager] Resolved write ahead > log archive directory: > /usr/share/apache-ignite/work/db/wal/archive/hostname123 > [09:29:35,838][INFO][main][FileWriteAheadLogManager] Started write-ahead > log > manager [mode=LOG_ONLY] > [09:29:35,889][INFO][main][GridCacheDatabaseSharedManager] Read checkpoint > status > [startMarker=/usr/share/apache-ignite/work/db/ > hostname123/cp/1528270113599-8440d970-63bb-44fd-97ef- > 72941ad111ac-START.bin, > endMarker=/usr/share/apache-ignite/work/db/hostname123/cp/ > 1528270113599-8440d970-63bb-44fd-97ef-72941ad111ac-END.bin] > [09:29:35,902][INFO][main][PageMemoryImpl] Started page memory > [memoryAllocated=100.0 MiB, pages=24814, tableSize=1.9 MiB, > checkpointBuffer=100.0 MiB] > [09:29:35,903][INFO][main][GridCacheDatabaseSharedManager] Checking memory > state [lastValidPos=FileWALPointer [idx=0, fileOff=6995549, len=11681], > lastMarked=FileWALPointer [idx=0, fileOff=6995549, len=11681], > lastCheckpointId=8440d970-63bb-44fd-97ef-72941ad111ac] > [09:29:35,931][INFO][main][GridCacheDatabaseSharedManager] Found last > checkpoint marker [cpId=8440d970-63bb-44fd-97ef-72941ad111ac, > pos=FileWALPointer [idx=0, fileOff=6995549, len=11681]] > [09:29:35,970][INFO][main][GridCacheDatabaseSharedManager] Applying lost > cache updates since last checkpoint record [lastMarked=FileWALPointer > [idx=0, fileOff=6995549, len=11681], > 
lastCheckpointId=8440d970-63bb-44fd-97ef-72941ad111ac] > [09:29:35,983][INFO][main][GridCacheDatabaseSharedManager] Finished > applying > WAL changes [updatesApplied=0, time=10ms] > [09:29:36,029][INFO][main][GridClusterStateProcessor] Restoring history > for > BaselineTopology[id=7] > [09:29:36,141][INFO][main][ClientListenerProcessor] Client connector > processor has started on TCP port 10800 > [09:29:36,193][INFO][main][GridTcpRestProtocol] Command protocol > successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11211] > [09:29:36,223][INFO][main][IgniteKernal] Non-loopback local IPs: 1.2.3.4 > [09:29:36,223][INFO][main][IgniteKernal] Enabled local MACs: 005056812C8A > [09:29:36,248][INFO][main][TcpDiscoverySpi] Successfully bound to TCP port > [port=47500, localHost=0.0.0.0/0.0.0.0, > locNodeId=d484e6c6-5d54-4d13-8704-6b2cea991cde] > [09:29:36,333][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery > accepted incoming connection [rmtAddr=/5.6.7.8, rmtPort=56991] > [09:29:36,344][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery > spawning a new thread for connection [rmtAddr=/5.6.7.8, rmtPort=56991] > [09:29:36,344][INFO][tcp-disco-sock-reader-#4][TcpDiscoverySpi] Started > serving remote node connection [rmtAddr=/5.6.7.8:56991, rmtPort=56991] > [09:29:36,435][SEVERE][tcp-disco-msg-worker-#2][TcpDiscoverySpi] > TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node > in order to prevent cluster wide instability. > class org.apache.ignite.IgniteException: Node with BaselineTopology cannot > join mixed cluster running in compatibility mode > at > org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor. > onGridDataReceived(GridClusterStateProcessor.java:714) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$5. > onExchange(GridDiscoveryManager.java:883) > at > org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi. 
> onExchange(TcpDiscoverySpi.java:1939) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. > processNodeAddedMessage(ServerImpl.java:4354) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. > processMessage(ServerImpl.java:2744) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. > processMessage(ServerImpl.java:2536) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body( > ServerImpl.java:6775) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body( > ServerImpl.java:2621) > at > org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) > [09:29:36,437][SEVERE][tcp-disco-msg-worker-#2][] Critical system error > detected. Will be handled accordingly to configured handler [hnd=class > o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext > [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Node > with > BaselineTopology cannot join mixed cluster running in compatibility mode]] > class org.apache.ignite.IgniteException: Node with BaselineTopology cannot > join mixed cluster running in compatibility mode > at > org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor. > onGridDataReceived(GridClusterStateProcessor.java:714) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$5. > onExchange(GridDiscoveryManager.java:883) > at > org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi. > onExchange(TcpDiscoverySpi.java:1939) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. > processNodeAddedMessage(ServerImpl.java:4354) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. > processMessage(ServerImpl.java:2744) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker. 
> processMessage(ServerImpl.java:2536) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body( > ServerImpl.java:6775) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body( > ServerImpl.java:2621) > at > org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) > [09:29:36,438][SEVERE][tcp-disco-msg-worker-#2][] JVM will be halted > immediately due to the failure: [failureCtx=FailureContext > [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Node > with > BaselineTopology cannot join mixed cluster running in compatibility mode]]

--
Best regards,
Andrey V. Mashenkov
