[jira] [Created] (IGNITE-11838) Improve usability of UriDeploymentSpi documentation
Dmitry Sherstobitov created IGNITE-11838:

Summary: Improve usability of UriDeploymentSpi documentation
Key: IGNITE-11838
URL: https://issues.apache.org/jira/browse/IGNITE-11838
Project: Ignite
Issue Type: Bug
Components: documentation
Affects Versions: 2.7
Reporter: Dmitry Sherstobitov

I was trying to run the UriDeploymentSpi feature and actually failed at it; I only managed to stop it using Java code. Here are the issues I found in the documentation:
1. It is not clear what a GAR file is and how a user can create one (manually? using some utility?)
2. "Local disk folder containing only compiled Java classes" - this does not work for me (and, judging by the Java code, it should not work)
3. "Local disk folder with structure of unpacked GAR file" - this DOES work, but META-INF/ is actually an optional folder (and for xyz.class, see the previous point). The only thing a user needs is to put a lib/ folder in the deployment URI and place a .jar file there
4. It is not clear what the ignite.xml descriptor file is or how a user can create it
5. The Windows paths in the examples are unfortunate (Linux paths are more common in the Ignite context; we could add a note with Windows path examples)
6. For a Linux path, a user should write something like this: file:///tmp/path/deployment (3 slashes instead of 2)
7. https://apacheignite.readme.io/docs/service-grid-28#section-service-updates-redeployment - the URI link here looks strange and does not work
8. On the previous page, the temporaryDirectoryPath value in the example is optional, so we may remove it

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
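For reference, a minimal sketch of the on-disk layout described in point 3 above: only a lib/ directory containing the task .jar appears to be required, and on Linux the resulting URI has three slashes. The jar name, directory names, and helper class below are all illustrative, not part of any Ignite API:

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative helper: builds the minimal deployment layout and its file:// URI. */
public class DeploymentLayout {
    /** Creates <base>/deployment/lib/ with a placeholder task jar inside. */
    public static Path createLayout(Path baseDir) throws Exception {
        Path deployDir = baseDir.resolve("deployment");
        Path lib = deployDir.resolve("lib");
        Files.createDirectories(lib);
        // The jar with the compute tasks goes into lib/ (empty placeholder here).
        Files.createFile(lib.resolve("my-tasks.jar"));
        return deployDir;
    }

    /** An absolute Linux path starts with '/', so "file://" + path yields three slashes. */
    public static String toDeploymentUri(Path deployDir) {
        return "file://" + deployDir.toAbsolutePath();
    }

    public static void main(String[] args) throws Exception {
        Path deployDir = createLayout(Files.createTempDirectory("igniteDeploy"));
        System.out.println(toDeploymentUri(deployDir));
    }
}
```

The printed URI (e.g. file:///tmp/.../deployment) is the form the SPI expects for a local directory, per point 6 of the report.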
[jira] [Created] (IGNITE-11667) OPTIMISTIC REPEATABLE_READ transactions do not guarantee transactional consistency in blinking node scenario
Dmitry Sherstobitov created IGNITE-11667:

Summary: OPTIMISTIC REPEATABLE_READ transactions do not guarantee transactional consistency in blinking node scenario
Key: IGNITE-11667
URL: https://issues.apache.org/jira/browse/IGNITE-11667
Project: Ignite
Issue Type: Bug
Reporter: Dmitry Sherstobitov

Scenario:
1. Start cluster, load data
2. Start transactional load (a simple transfer task with PESSIMISTIC and OPTIMISTIC REPEATABLE_READ transactions)
3. Repeat 10 times: stop one node, sleep 10 seconds, start it again
4. Wait for rebalance to finish (LocalNodeMovingPartitionsCount == 0 for each cache/cache group)
5. Validate that there are no conflicts in the sums of fields (the verify action of the transfer task)

With OPTIMISTIC REPEATABLE_READ transactions there is no guarantee that transactional consistency is preserved (the last validation step fails).
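To illustrate what the verify step checks, here is a self-contained sketch of the transfer task's invariant using a plain in-memory map in place of an Ignite cache. All names are hypothetical; in the real test each read-modify-write pair runs inside a PESSIMISTIC or OPTIMISTIC REPEATABLE_READ transaction (txStart ... commit), and it is the constant-total-balance check below that fails in the blinking-node scenario:

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch of the transfer task and its "verify" invariant. */
public class TransferInvariant {
    /** Moves money between two accounts; in the real test this is one transaction. */
    static void transfer(Map<String, Long> accounts, String from, String to, long amount) {
        accounts.put(from, accounts.get(from) - amount);
        accounts.put(to, accounts.get(to) + amount);
    }

    /** The "verify" action: the sum over all accounts must never change. */
    static long totalBalance(Map<String, Long> accounts) {
        return accounts.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Long> accounts = new HashMap<>();
        for (int i = 0; i < 10; i++)
            accounts.put("acc-" + i, 100L);

        long before = totalBalance(accounts);

        // Arbitrary transfers; each committed transfer must preserve the total.
        for (int i = 0; i < 100; i++)
            transfer(accounts, "acc-" + (i % 10), "acc-" + ((i + 3) % 10), 5);

        if (totalBalance(accounts) != before)
            throw new AssertionError("conflict in sum of fields: " + totalBalance(accounts));

        System.out.println("verify ok: total=" + totalBalance(accounts));
    }
}
```

In the ticket's scenario the committed transfers themselves are correct; the consistency violation appears only when OPTIMISTIC REPEATABLE_READ commits race with node restarts.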
[jira] [Created] (IGNITE-11609) Add support for authentication and SSL in the yardstick IgniteThinClient benchmark
Dmitry Sherstobitov created IGNITE-11609:

Summary: Add support for authentication and SSL in the yardstick IgniteThinClient benchmark
Key: IGNITE-11609
URL: https://issues.apache.org/jira/browse/IGNITE-11609
Project: Ignite
Issue Type: New Feature
Affects Versions: 2.7
Reporter: Dmitry Sherstobitov
Fix For: 2.8

Add support for the following keys:

Mandatory for authentication:
- USER
- PASSWORD

Mandatory for SSL:
- SSL_KEY_PASSWORD
- SSL_KEY_PATH

Optional for SSL:
- SSL_CLIENT_STORE_TYPE (default JKS)
- SSL_SERVER_STORE_TYPE (default JKS)
- SSL_KEY_ALGORITHM (default SunX509)
- SSL_TRUST_ALL (default false)
- SSL_PROTOCOL (default TLS)
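Purely as an illustration, a hypothetical properties fragment showing how these keys might be set once supported. The file name and all values are invented; only the key names and the defaults in parentheses come from the list above:

```properties
# Hypothetical values -- only the key names and defaults come from the ticket.
USER=benchmark_user
PASSWORD=benchmark_password
# Mandatory for SSL:
SSL_KEY_PATH=/opt/ignite/ssl/client.jks
SSL_KEY_PASSWORD=changeit
# Optional, shown with their proposed defaults:
SSL_CLIENT_STORE_TYPE=JKS
SSL_SERVER_STORE_TYPE=JKS
SSL_KEY_ALGORITHM=SunX509
SSL_TRUST_ALL=false
SSL_PROTOCOL=TLS
```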
[jira] [Created] (IGNITE-11408) AssertionError may occur on client start
Dmitry Sherstobitov created IGNITE-11408:

Summary: AssertionError may occur on client start
Key: IGNITE-11408
URL: https://issues.apache.org/jira/browse/IGNITE-11408
Project: Ignite
Issue Type: Bug
Reporter: Dmitry Sherstobitov

Scenario from: https://issues.apache.org/jira/browse/IGNITE-11406

An AssertionError may occur on client start:

{code}
2019-02-23T18:26:27,317][DEBUG][tcp-client-disco-msg-worker-#4][TcpDiscoverySpi] Grid runnable finished normally: tcp-client-disco-msg-worker-#4
Exception in thread "tcp-client-disco-msg-worker-#4" java.lang.AssertionError: TcpDiscoveryClientReconnectMessage [routerNodeId=76b33f1b-bef6-4805-bcca-0ea32df641ac, lastMsgId=null, super=TcpDiscoveryAbstractMessage [sndNodeId=76b33f1b-bef6-4805-bcca-0ea32df641ac, id=57c55fa1961-99d3d909-fa44-4b74-aea4-d375ad85e53e, verifierNodeId=6ba6bd09-4bc0-400c-ba11-a06d2507e983, topVer=0, pendingIdx=0, failedNodes=null, isClient=false]]
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.processClientReconnectMessage(ClientImpl.java:2311)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.processDiscoveryMessage(ClientImpl.java:1914)
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1798)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
{code}

Another trace:

{code:java}
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) [piclient-2.7.jar:?]
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) [piclient-2.7.jar:?]
at py4j.Gateway.invoke(Gateway.java:282) [piclient-2.7.jar:?]
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) [piclient-2.7.jar:?]
at py4j.commands.CallCommand.execute(CallCommand.java:79) [piclient-2.7.jar:?]
at py4j.GatewayConnection.run(GatewayConnection.java:238) [piclient-2.7.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: org.apache.ignite.IgniteCheckedException: Failed to start SPI: TcpDiscoverySpi [addrRslvr=null, sockTimeout=3, ackTimeout=6, marsh=JdkMarshaller [clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@59f2595b], reconCnt=2, reconDelay=2000, maxAckTimeout=30, forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, skipAddrsRandomization=false]
at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:300) ~[ignite-core-2.4.15.jar:2.4.15]
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:901) ~[ignite-core-2.4.15.jar:2.4.15]
at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1672) ~[ignite-core-2.4.15.jar:2.4.15]
... 22 more
Caused by: org.apache.ignite.spi.IgniteSpiException: Some error in join process.
at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1809) ~[ignite-core-2.4.15.jar:2.4.15]
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) ~[ignite-core-2.4.15.jar:2.4.15]
2019-02-23T18:26:27,320][ERROR][tcp-client-disco-sock-reader-#3][TcpDiscoverySpi] Connection failed [sock=Socket[addr=/172.25.1.34,port=47503,localport=60675], locNodeId=99d3d909-fa44-4b74-aea4-d375ad85e53e]
2019-02-23T18:26:27,320][ERROR][Thread-2][IgniteKernal] Got exception while starting (will rollback startup routine).{code}
[jira] [Created] (IGNITE-11407) AssertionError may occur on server start
Dmitry Sherstobitov created IGNITE-11407:

Summary: AssertionError may occur on server start
Key: IGNITE-11407
URL: https://issues.apache.org/jira/browse/IGNITE-11407
Project: Ignite
Issue Type: Bug
Reporter: Dmitry Sherstobitov

See https://issues.apache.org/jira/browse/IGNITE-11406 (same scenario).

The error occurs on the 5th iteration (each iteration performs 50 rounds of cluster node restarts):

{code:java}
java.lang.AssertionError
at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.stopRoutine(GridContinuousProcessor.java:743)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.executeQuery0(CacheContinuousQueryManager.java:705)
at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.executeInternalQuery(CacheContinuousQueryManager.java:542)
at org.apache.ignite.internal.processors.datastructures.DataStructuresProcessor.startQuery(DataStructuresProcessor.java:213)
at org.apache.ignite.internal.processors.datastructures.DataStructuresProcessor.getAtomic(DataStructuresProcessor.java:541)
at org.apache.ignite.internal.processors.datastructures.DataStructuresProcessor.atomicLong(DataStructuresProcessor.java:457)
at org.apache.ignite.internal.IgniteKernal.atomicLong(IgniteKernal.java:3468)
at org.apache.ignite.internal.IgniteKernal.atomicLong(IgniteKernal.java:3457)
at org.apache.ignite.piclient.bean.LifecycleAtomicLongBean.onLifecycleEvent(LifecycleAtomicLongBean.java:48)
at org.apache.ignite.internal.IgniteKernal.notifyLifecycleBeans(IgniteKernal.java:655)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1064)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1973)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1716)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1144)
at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1062)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:948)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:847)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:717)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:686)
at org.apache.ignite.Ignition.start(Ignition.java:352)
at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:302)
Failed to start grid: null{code}
[jira] [Created] (IGNITE-11406) NullPointerException may occur on client start
Dmitry Sherstobitov created IGNITE-11406:

Summary: NullPointerException may occur on client start
Key: IGNITE-11406
URL: https://issues.apache.org/jira/browse/IGNITE-11406
Project: Ignite
Issue Type: Bug
Reporter: Dmitry Sherstobitov

Found during testing of fixes for https://issues.apache.org/jira/browse/IGNITE-10878:
1. Start cluster, create caches without persistence and load data into them
2. Restart each node in the cluster in order (coordinator first); do not wait until the topology message occurs
3. Try to run utilities: activate, baseline (to check that the cluster is alive)
4. Run clients and load data into the alive caches

On step 4 one of the clients throws an NPE on start:

{code:java}
2019-02-23T18:36:24,045][DEBUG][tcp-client-disco-msg-worker-#4][TcpDiscoverySpi] Connection closed, local node received force fail message, will not try to restore connection
2019-02-23T18:36:24,045][DEBUG][tcp-client-disco-msg-worker-#4][TcpDiscoverySpi] Failed to restore closed connection, will try to reconnect [networkTimeout=5000, joinTimeout=0, failMsg=TcpDiscoveryNodeFailedMessage [failedNodeId=80f8b6ee-6a6d-4235-86e9-1b66ea310eb6, order=90, warning=Client node considered as unreachable and will be dropped from cluster, because no metrics update messages received in interval: TcpDiscoverySpi.clientFailureDetectionTimeout() ms. It may be caused by network problems or long GC pause on client node, try to increase this parameter.
[nodeId=80f8b6ee-6a6d-4235-86e9-1b66ea310eb6, clientFailureDetectionTimeout=3], super=TcpDiscoveryAbstractMessage [sndNodeId=987d4a03-8233-4130-af5b-c06900bdb6d7, id=3642cfa1961-987d4a03-8233-4130-af5b-c06900bdb6d7, verifierNodeId=d9abbff3-4b4d-4a13-9cb1-0ca4d2436164, topVer=167, pendingIdx=0, failedNodes=null, isClient=false]]] 2019-02-23T18:36:24,046][DEBUG][tcp-client-disco-msg-worker-#4][TcpDiscoverySpi] Discovery notification [node=TcpDiscoveryNode [id=80f8b6ee-6a6d-4235-86e9-1b66ea310eb6, addrs=[172.25.1.34], sockAddrs=[lab34.gridgain.local/172.25.1.34:0], discPort=0, order=165, intOrder=0, lastExchangeTime=1550936128313, loc=true, ver=2.4.15#20190222-sha1:36b1d676, isClient=true], type=CLIENT_NODE_DISCONNECTED, topVer=166] 2019-02-23T18:36:24,049][INFO ][tcp-client-disco-msg-worker-#4][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=165, minorTopVer=0], resVer=null, err=class org.apache.ignite.internal.IgniteClientDisconnectedCheckedException: Client node disconnected: null] [2019-02-23T18:36:24,061][ERROR][Thread-2][IgniteKernal] Got exception while starting (will rollback startup routine). 
java.lang.NullPointerException: null at org.apache.ignite.internal.processors.cache.GridCacheProcessor.internalCacheEx(GridCacheProcessor.java:3886) ~[ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.processors.cache.GridCacheProcessor.utilityCache(GridCacheProcessor.java:3858) ~[ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.processors.service.GridServiceProcessor.updateUtilityCache(GridServiceProcessor.java:290) ~[ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.processors.service.GridServiceProcessor.onKernalStart0(GridServiceProcessor.java:233) ~[ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.processors.service.GridServiceProcessor.onKernalStart(GridServiceProcessor.java:221) ~[ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1038) [ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1973) [ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1716) [ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1144) [ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1062) [ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:948) [ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:847) [ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:717) [ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:686) [ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.Ignition.start(Ignition.java:352) [ignite-core-2.4.15.jar:2.4.15] at org.apache.ignite.piclient.api.IgniteService.startIgniteClientNode(IgniteService.java:86) [piclient-2.7.jar:?] 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181] at
[jira] [Created] (IGNITE-11292) There is no way to disable WAL for a cache in a cache group
Dmitry Sherstobitov created IGNITE-11292:

Summary: There is no way to disable WAL for a cache in a cache group
Key: IGNITE-11292
URL: https://issues.apache.org/jira/browse/IGNITE-11292
Project: Ignite
Issue Type: Bug
Reporter: Dmitry Sherstobitov

The following code does not work if the cache belongs to a cache group:
{code}ignite.cluster().disableWal(cacheName){code}

When cacheName is the cache name:
{code}
Caused by: class org.apache.ignite.IgniteCheckedException: Cannot change WAL mode because not all cache names belonging to the group are provided [group=cache_group_1, missingCaches=[cache_group_1_005, cache_group_3_063, cache_group_1_003, cache_group_3_064, cache_group_1_004, cache_group_3_061, cache_group_3_062, cache_group_1_001, cache_group_1_002]]
{code}

When cacheName is the group name:
{code}
Caused by: class org.apache.ignite.IgniteCheckedException: Cache doesn't exist: cache_group_1
{code}
[jira] [Created] (IGNITE-11100) AssertionError LocalJoinCachesContext occurs during sequential cluster restart
Dmitry Sherstobitov created IGNITE-11100: Summary: AssertionError LocalJoinCachesContext occurs in sequential cluster restart Key: IGNITE-11100 URL: https://issues.apache.org/jira/browse/IGNITE-11100 Project: Ignite Issue Type: Bug Reporter: Dmitry Sherstobitov Same scenario as in https://issues.apache.org/jira/browse/IGNITE-10878 {code} [2019-01-26T03:32:22,226][ERROR][tcp-disco-msg-worker-#2][TcpDiscoverySpi] TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node in order to prevent cluster wide instability. java.lang.AssertionError: LocalJoinCachesContext [locJoinStartCaches=[IgniteBiTuple [val1=DynamicCacheDescriptor [deploymentId=bc3e0978861-fb98885f-92a5-47d2-9475-00173fab8ee1, staticCfg=true, sql=false, cacheType=UTILITY, template=false, updatesAllowed=true, cacheId=-2100569601, rcvdFrom=f97e4743-6cf2-488e-a7fc-14707e9a8eb0, objCtx=null, rcvdOnDiscovery=false, startTopVer=null, rcvdFromVer=null, clientCacheStartVer=null, schema=QuerySchema [], grpDesc=CacheGroupDescriptor [grpId=-2100569601, grpName=null, startTopVer=null, rcvdFrom=f97e4743-6cf2-488e-a7fc-14707e9a8eb0, deploymentId=bc3e0978861-fb98885f-92a5-47d2-9475-00173fab8ee1, caches={ignite-sys-cache=-2100569601}, rcvdFromVer=null, persistenceEnabled=false, walEnabled=false, cacheName=ignite-sys-cache], cacheName=ignite-sys-cache], val2=null], IgniteBiTuple [val1=DynamicCacheDescriptor [deploymentId=60771978861-398164df-6240-4d19-ad0b-308768d2a095, staticCfg=false, sql=false, cacheType=USER, template=false, updatesAllowed=true, cacheId=-1901084566, rcvdFrom=f00ec506-fc6c-45c5-b550-9308d17a39cf, objCtx=null, rcvdOnDiscovery=true, startTopVer=null, rcvdFromVer=null, clientCacheStartVer=null, schema=QuerySchema [], grpDesc=CacheGroupDescriptor [grpId=-1901084566, grpName=null, startTopVer=AffinityTopologyVersion [topVer=13, minorTopVer=20], rcvdFrom=f00ec506-fc6c-45c5-b550-9308d17a39cf, deploymentId=60771978861-398164df-6240-4d19-ad0b-308768d2a095, 
caches={config_third_copy=-1901084566}, rcvdFromVer=null, persistenceEnabled=false, walEnabled=false, cacheName=config_third_copy], cacheName=config_third_copy], val2=null], IgniteBiTuple [val1=DynamicCacheDescriptor [deploymentId=01771978861-398164df-6240-4d19-ad0b-308768d2a095, staticCfg=false, sql=false, cacheType=USER, template=false, updatesAllowed=true, cacheId=-1858528402, rcvdFrom=f00ec506-fc6c-45c5-b550-9308d17a39cf, objCtx=null, rcvdOnDiscovery=true, startTopVer=null, rcvdFromVer=null, clientCacheStartVer=null, schema=QuerySchema [], grpDesc=CacheGroupDescriptor [grpId=-1858528402, grpName=null, startTopVer=AffinityTopologyVersion [topVer=13, minorTopVer=22], rcvdFrom=f00ec506-fc6c-45c5-b550-9308d17a39cf, deploymentId=01771978861-398164df-6240-4d19-ad0b-308768d2a095, caches={trans_forth_copy=-1858528402}, rcvdFromVer=null, persistenceEnabled=false, walEnabled=false, cacheName=trans_forth_copy], cacheName=trans_forth_copy], val2=null], IgniteBiTuple [val1=DynamicCacheDescriptor [deploymentId=51771978861-398164df-6240-4d19-ad0b-308768d2a095, staticCfg=false, sql=false, cacheType=USER, template=false, updatesAllowed=true, cacheId=-1502999781, rcvdFrom=f00ec506-fc6c-45c5-b550-9308d17a39cf, objCtx=null, rcvdOnDiscovery=true, startTopVer=null, rcvdFromVer=null, clientCacheStartVer=null, schema=QuerySchema [], grpDesc=CacheGroupDescriptor [grpId=-1502999781, grpName=null, startTopVer=AffinityTopologyVersion [topVer=13, minorTopVer=23], rcvdFrom=f00ec506-fc6c-45c5-b550-9308d17a39cf, deploymentId=51771978861-398164df-6240-4d19-ad0b-308768d2a095, caches={id_forth_copy=-1502999781}, rcvdFromVer=null, persistenceEnabled=false, walEnabled=false, cacheName=id_forth_copy], cacheName=id_forth_copy], val2=null], IgniteBiTuple [val1=DynamicCacheDescriptor [deploymentId=8a671978861-398164df-6240-4d19-ad0b-308768d2a095, staticCfg=false, sql=false, cacheType=USER, template=false, updatesAllowed=true, cacheId=-1354792126, rcvdFrom=f00ec506-fc6c-45c5-b550-9308d17a39cf, 
objCtx=null, rcvdOnDiscovery=true, startTopVer=null, rcvdFromVer=null, clientCacheStartVer=null, schema=QuerySchema [], grpDesc=CacheGroupDescriptor [grpId=-1354792126, grpName=null, startTopVer=AffinityTopologyVersion [topVer=13, minorTopVer=5], rcvdFrom=f00ec506-fc6c-45c5-b550-9308d17a39cf, deploymentId=8a671978861-398164df-6240-4d19-ad0b-308768d2a095, caches={config=-1354792126}, rcvdFromVer=null, persistenceEnabled=false, walEnabled=false, cacheName=config], cacheName=config], val2=null], IgniteBiTuple [val1=DynamicCacheDescriptor [deploymentId=6d671978861-398164df-6240-4d19-ad0b-308768d2a095, staticCfg=false, sql=false, cacheType=USER, template=false, updatesAllowed=true, cacheId=-1176672452, rcvdFrom=f00ec506-fc6c-45c5-b550-9308d17a39cf, objCtx=null, rcvdOnDiscovery=true, startTopVer=null, rcvdFromVer=null,
[jira] [Created] (IGNITE-10995) GridDhtPartitionSupplier::handleDemandMessage suppresses errors
Dmitry Sherstobitov created IGNITE-10995:

Summary: GridDhtPartitionSupplier::handleDemandMessage suppresses errors
Key: IGNITE-10995
URL: https://issues.apache.org/jira/browse/IGNITE-10995
Project: Ignite
Issue Type: Bug
Reporter: Dmitry Sherstobitov
Attachments: Screenshot 2019-01-20 at 23.19.08.png

Scenario:
1. Cluster with data
2. Historical rebalance is triggered

If an OOM occurs on the supplier in this case, the failure handler is not triggered and the cluster stays alive with inconsistent data (the target node has MOVING partitions while the supplier does nothing).

Target rebalance node log:
{code:java}
[15:00:31,418][WARNING][sys-#86][GridDhtPartitionDemander] Rebalancing from node cancelled [grp=cache_group_4, topVer=AffinityTopologyVersion [topVer=17, minorTopVer=0], supplier=4cbc66d3-9d2c-4396-8366-2839a8d0cdb6, topic=5]]. Supplier has failed with error: java.lang.OutOfMemoryError: Java heap space{code}

Supplier stack trace: !Screenshot 2019-01-20 at 23.19.08.png!
[jira] [Created] (IGNITE-10943) Infinite "No next node in topology" messages in log after cyclic cluster node restart
Dmitry Sherstobitov created IGNITE-10943:

Summary: Infinite "No next node in topology" messages in log after cyclic cluster node restart
Key: IGNITE-10943
URL: https://issues.apache.org/jira/browse/IGNITE-10943
Project: Ignite
Issue Type: Bug
Affects Versions: 2.4
Reporter: Dmitry Sherstobitov
Attachments: grid.1.node.1.jstack.log

Same scenario as in https://issues.apache.org/jira/browse/IGNITE-10878

After the cluster restart there is one node with 100% CPU load and the following messages in the log:

{code:java}
2019-01-15T15:16:41,333][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] Message has been added to queue: TcpDiscoveryNodeFailedMessage [failedNodeId=e006e575-bbc8-4004-8ce3-ddc165d1748c, order=12, warning=null, super=TcpDiscoveryAbstractMessage [sndNodeId=null, id=3cfe0715861-24a27aff-e471-4db1-ac46-cda072de17b9, verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=false]]
2019-01-15T15:16:41,333][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] Pending messages will be resent to local node
2019-01-15T15:16:41,333][INFO ][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/172.25.1.40, rmtPort=59236]
2019-01-15T15:16:41,333][INFO ][tcp-disco-sock-reader-#21][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/172.25.1.40:59236, rmtPort=59236]
2019-01-15T15:16:41,333][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] Message has been added to queue: TcpDiscoveryStatusCheckMessage [creatorNode=TcpDiscoveryNode [id=24a27aff-e471-4db1-ac46-cda072de17b9, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.1, 172.25.1.40], sockAddrs=[/172.17.0.1:47500, /0:0:0:0:0:0:0:1%lo:47500, lab40.gridgain.local/172.25.1.40:47500, /127.0.0.1:47500], discPort=47500, order=0, intOrder=15, lastExchangeTime=1547554584282, loc=true, ver=2.4.13#20190114-sha1:a7667ae6, isClient=false], failedNodeId=null, status=0, super=TcpDiscoveryAbstractMessage [sndNodeId=null, id=4cfe0715861-24a27aff-e471-4db1-ac46-cda072de17b9, verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=false]]
2019-01-15T15:16:41,334][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] Ignore message failed nodes, sender node is in fail list [nodeId=e006e575-bbc8-4004-8ce3-ddc165d1748c, failedNodes=[a251994d-8df6-4b2d-a28c-18ec55a3a48c, a5fa9095-2e4b-48e5-803d-551a5ebde558]]
2019-01-15T15:16:41,334][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.
2019-01-15T15:16:41,334][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.
2019-01-15T15:16:41,334][DEBUG][tcp-disco-sock-reader-#21][TcpDiscoverySpi] Initialized connection with remote node [nodeId=6df245fe-6288-4d93-ab20-2b9ac1b35771, client=false]
2019-01-15T15:16:41,334][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.
2019-01-15T15:16:41,335][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.
2019-01-15T15:16:41,335][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.
2019-01-15T15:16:41,335][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.
2019-01-15T15:16:41,335][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.
2019-01-15T15:16:41,335][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.
2019-01-15T15:16:41,335][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.
2019-01-15T15:16:41,336][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.
2019-01-15T15:16:41,336][DEBUG][tcp-disco-msg-worker-#2][TcpDiscoverySpi] No next node in topology.{code}
[jira] [Created] (IGNITE-10935) "Invalid node order" error occurs during cyclic cluster node restart
Dmitry Sherstobitov created IGNITE-10935:

Summary: "Invalid node order" error occurs during cyclic cluster node restart
Key: IGNITE-10935
URL: https://issues.apache.org/jira/browse/IGNITE-10935
Project: Ignite
Issue Type: Bug
Reporter: Dmitry Sherstobitov

Same scenario as in https://issues.apache.org/jira/browse/IGNITE-10878

{code:java}
Exception in thread "tcp-disco-msg-worker-#2" java.lang.AssertionError: Invalid node order: TcpDiscoveryNode [id=9a332aa3-3d60-469a-9ff5-3deee8918451, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.1, 172.25.1.40], sockAddrs=[/172.25.1.40:47501, /0:0:0:0:0:0:0:1%lo:47501, /127.0.0.1:47501, /172.17.0.1:47501], discPort=47501, order=0, intOrder=16, lastExchangeTime=1547486771047, loc=false, ver=2.4.13#20190114-sha1:a7667ae6, isClient=false]
at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing$1.apply(TcpDiscoveryNodesRing.java:51)
at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing$1.apply(TcpDiscoveryNodesRing.java:48)
at org.apache.ignite.internal.util.lang.GridFunc.isAll(GridFunc.java:2030)
at org.apache.ignite.internal.util.IgniteUtils.arrayList(IgniteUtils.java:9635)
at org.apache.ignite.internal.util.IgniteUtils.arrayList(IgniteUtils.java:9608)
at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing.nodes(TcpDiscoveryNodesRing.java:625)
at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing.visibleNodes(TcpDiscoveryNodesRing.java:145)
at org.apache.ignite.spi.discovery.tcp.ServerImpl.notifyDiscovery(ServerImpl.java:1429)
at org.apache.ignite.spi.discovery.tcp.ServerImpl.access$2400(ServerImpl.java:176)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddFinishedMessage(ServerImpl.java:4565)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2732)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2554)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6955)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2634)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
{code}
[jira] [Created] (IGNITE-10878) "Failed to find completed exchange future" error occurs in a test with round cluster restart
Dmitry Sherstobitov created IGNITE-10878:

Summary: "Failed to find completed exchange future" error occurs in a test with round cluster restart
Key: IGNITE-10878
URL: https://issues.apache.org/jira/browse/IGNITE-10878
Project: Ignite
Issue Type: Bug
Reporter: Dmitry Sherstobitov

1. Start cluster, create caches without persistence and load data into them
2. Restart each node in the cluster in order (coordinator first); do not wait until the topology message occurs
3. At some moment there is a possibility of an error (1 out of 20 runs); this is the case when the topology version has had time to be reset

{code:java}
[23:27:17,218][INFO][exchange-worker-#62][GridCacheProcessor] Started cache [name=ENTITY_CONFIG, id=23889694, memoryPolicyName=no-evict, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647]
[23:27:17,222][SEVERE][exchange-worker-#62][GridDhtPartitionsExchangeFuture] Failed to reinitialize local partitions (preloading will be stopped): GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=83bd0a25-4574-4723-9594-b95ddaab19be, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.1, 172.25.1.40], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47503, /127.0.0.1:47503, /172.17.0.1:47503, lab40.gridgain.local/172.25.1.40:47503], discPort=47503, order=1, intOrder=1, lastExchangeTime=1547065626462, loc=true, ver=2.4.13#20181228-sha1:9033812f, isClient=false], topVer=1, nodeId8=83bd0a25, msg=null, type=NODE_JOINED, tstamp=1547065636782], nodeId=83bd0a25, evt=NODE_JOINED]
class org.apache.ignite.IgniteCheckedException: Failed to find completed exchange future to fetch affinity.
at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$18.applyx(CacheAffinitySharedManager.java:1798) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$18.applyx(CacheAffinitySharedManager.java:1743) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.forAllRegisteredCacheGroups(CacheAffinitySharedManager.java:1107) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.initCoordinatorCaches(CacheAffinitySharedManager.java:1743) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.initCoordinatorCaches(GridDhtPartitionsExchangeFuture.java:573) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:679) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2398) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:748) [23:27:17,222][INFO][exchange-worker-#62][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=1, minorTopVer=0], resVer=null, err=class org.apache.ignite.IgniteCheckedException: Failed to find completed exchange future to fetch affinity.] [23:27:17,238][SEVERE][main][IgniteKernal] Got exception while starting (will rollback startup routine). class org.apache.ignite.IgniteCheckedException: Failed to find completed exchange future to fetch affinity. 
at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$18.applyx(CacheAffinitySharedManager.java:1798) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$18.applyx(CacheAffinitySharedManager.java:1743) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.forAllRegisteredCacheGroups(CacheAffinitySharedManager.java:1107) at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.initCoordinatorCaches(CacheAffinitySharedManager.java:1743) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.initCoordinatorCaches(GridDhtPartitionsExchangeFuture.java:573) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:679) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2398) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:748) [23:27:17,238][INFO][exchange-worker-#62][GridDhtPartitionsExchangeFuture] Completed partition exchange [localNode=83bd0a25-4574-4723-9594-b95ddaab19be, exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[jira] [Created] (IGNITE-10672) Changing walSegments property leads to fallen node
Dmitry Sherstobitov created IGNITE-10672: Summary: Changing walSegments property leads to fallen node Key: IGNITE-10672 URL: https://issues.apache.org/jira/browse/IGNITE-10672 Project: Ignite Issue Type: Bug Reporter: Dmitry Sherstobitov Start a cluster with: {code} {code} Load some data, then restart the cluster with a new config: {code} {code} This causes the node to fail on start: {code} [14:51:00,852][SEVERE][main][IgniteKernal] Got exception while starting (will rollback startup routine). class org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter [] at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1784) at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1008) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153) at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1071) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:957) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:856) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:726) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:695) at org.apache.ignite.Ignition.start(Ignition.java:348) at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301) Caused by: class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to initialize wal (work directory contains incorrect number of segments) [cur=10, expected=5] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.checkOrPrepareFiles(FileWriteAheadLogManager.java:1408) at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.start0(FileWriteAheadLogManager.java:435) at 
org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.start(GridCacheSharedManagerAdapter.java:61) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.start(GridCacheProcessor.java:741) at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1781) ... 11 more [14:51:00,853][WARNING][main][IgniteKernal] Attempt to stop starting grid. This operation cannot be guaranteed to be successful. [14:51:00,855][SEVERE][main][IgniteKernal] Failed to stop component (ignoring): GridProcessorAdapter [] java.lang.NullPointerException at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.stop0(FileWriteAheadLogManager.java:631) at org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.stop(GridCacheSharedManagerAdapter.java:94) at org.apache.ignite.internal.processors.cache.GridCacheProcessor.stop(GridCacheProcessor.java:980) at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2312) at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2190) at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1164) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153) at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1071) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:957) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:856) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:726) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:695) at org.apache.ignite.Ignition.start(Ignition.java:348) at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-10189) SslContextFactory's ciphers doesn't work with control.sh utility
Dmitry Sherstobitov created IGNITE-10189: Summary: SslContextFactory's ciphers doesn't work with control.sh utility Key: IGNITE-10189 URL: https://issues.apache.org/jira/browse/IGNITE-10189 Project: Ignite Issue Type: Bug Reporter: Dmitry Sherstobitov There are no options for the control.sh utility for the case when the ciphers feature is enabled on the server. If this property is enabled on the server: {code} ... ... {code} the control.sh utility doesn't work. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9752) Fix ODBC documentation
Dmitry Sherstobitov created IGNITE-9752: --- Summary: Fix ODBC documentation Key: IGNITE-9752 URL: https://issues.apache.org/jira/browse/IGNITE-9752 Project: Ignite Issue Type: Bug Reporter: Dmitry Sherstobitov Attachments: image-2018-10-01-17-12-21-555.png See the attached screenshot: the default values do not match the values in the example. !image-2018-10-01-17-12-28-557.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9751) Fix ODBC driver description
Dmitry Sherstobitov created IGNITE-9751: --- Summary: Fix ODBC driver description Key: IGNITE-9751 URL: https://issues.apache.org/jira/browse/IGNITE-9751 Project: Ignite Issue Type: Bug Affects Versions: 2.7 Reporter: Dmitry Sherstobitov Attachments: Screen Shot 2018-10-01 at 14.55.21.png The driver description is missing the version and company name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8895) Update yardstick libraries
Dmitry Sherstobitov created IGNITE-8895: --- Summary: Update yardstick libraries Key: IGNITE-8895 URL: https://issues.apache.org/jira/browse/IGNITE-8895 Project: Ignite Issue Type: Bug Affects Versions: 2.5 Reporter: Dmitry Sherstobitov There are currently conflicts in the yardstick libraries: ||yardstick||core||problem|| |jline-0.9.94.jar|bin/include/sqlline/jline-2.4.3.jar|./sqlline.sh is unable to start if the yardstick libraries are on the path| -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8893) Blinking node in baseline may corrupt own WAL records
Dmitry Sherstobitov created IGNITE-8893: --- Summary: Blinking node in baseline may corrupt own WAL records Key: IGNITE-8893 URL: https://issues.apache.org/jira/browse/IGNITE-8893 Project: Ignite Issue Type: Bug Affects Versions: 2.5 Reporter: Dmitry Sherstobitov # Start cluster, load data # Start an additional node that is not in the BLT # Repeat 10 times: kill 1 node in the baseline and 1 node not in the baseline, then start a node in the BLT and a node not in the BLT At some point a node in the baseline may be unable to start because of a corrupted WAL (see the assertion below). Notice that there is no loading on the cluster at all, so there is no reason to corrupt the WAL; rebalance should be interruptible. There is also another scenario that may cause the same error (and may also cause a JVM crash): # Start cluster, load data, start nodes # Repeat 10 times: kill 1 node in the baseline, clean its LFS, start the node again; while it rebalances, blink the node that should rebalance data to the previously killed node The node that should rebalance data to the cleaned node may corrupt its own WAL. This second scenario has a configuration "error": the number of backups in each case is 1, so two blinking nodes may obviously cause data loss. {code:java} [2018-06-28 17:33:39,583][ERROR][wal-file-archiver%null-#63][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.AssertionError: lastArchived=757, current=42]] java.lang.AssertionError: lastArchived=757, current=42 at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.body(FileWriteAheadLogManager.java:1629) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8879) Blinking baseline node sometimes unable to connect to cluster
Dmitry Sherstobitov created IGNITE-8879: --- Summary: Blinking baseline node sometimes unable to connect to cluster Key: IGNITE-8879 URL: https://issues.apache.org/jira/browse/IGNITE-8879 Project: Ignite Issue Type: Bug Affects Versions: 2.5 Reporter: Dmitry Sherstobitov Almost the same scenario as in IGNITE-8874, but the node leaves the baseline while blinking. All caches have 2 backups; 4 nodes in the cluster. # Start cluster, load data # Start transactional loading (8 threads, 100 ops/second put/get in each op) # Repeat 10 times: kill one node, remove it from the baseline, start the node again (*with no LFS clean*), wait for rebalance # Check idle_verify, check for data corruption At some point the killed node is unable to start and join the cluster because of: {code:java} 080ee8-END.bin] [2018-06-26 19:01:43,039][INFO ][main][PageMemoryImpl] Started page memory [memoryAllocated=100.0 MiB, pages=24800, tableSize=1.9 MiB, checkpointBuffer=100.0 MiB] [2018-06-26 19:01:43,039][INFO ][main][GridCacheDatabaseSharedManager] Checking memory state [lastValidPos=FileWALPointer [idx=0, fileOff=583691, len=119], lastMarked=FileWALPointer [idx=0, fileOff=583691, len=119], lastCheckpointId=7fca4dbb-8f01-4b63-95e2-43283b080ee8] [2018-06-26 19:01:43,050][INFO ][main][GridCacheDatabaseSharedManager] Found last checkpoint marker [cpId=7fca4dbb-8f01-4b63-95e2-43283b080ee8, pos=FileWALPointer [idx=0, fileOff=583691, len=119]] [2018-06-26 19:01:43,082][INFO ][main][FileWriteAheadLogManager] Stopping WAL iteration due to an exception: EOF at position [100] expected to read [1] bytes, ptr=FileWALPointer [idx=0, fileOff=100, len=0] [2018-06-26 19:01:43,219][WARN ][main][FileWriteAheadLogManager] WAL segment tail is reached. 
[ Expected next state: {Index=19,Offset=794017}, Actual state : {Index=3602879702215753728,Offset=775434544} ] [2018-06-26 19:01:43,243][INFO ][main][GridCacheDatabaseSharedManager] Applying lost cache updates since last checkpoint record [lastMarked=FileWALPointer [idx=0, fileOff=583691, len=119], lastCheckpointId=7fca4dbb-8f01-4b63-95e2-43283b080ee8] [2018-06-26 19:01:43,246][INFO ][main][FileWriteAheadLogManager] Stopping WAL iteration due to an exception: EOF at position [100] expected to read [1] bytes, ptr=FileWALPointer [idx=0, fileOff=100, len=0] [2018-06-26 19:01:43,336][WARN ][main][FileWriteAheadLogManager] WAL segment tail is reached. [ Expected next state: {Index=19,Offset=794017}, Actual state : {Index=3602879702215753728,Offset=775434544} ] [2018-06-26 19:01:43,336][INFO ][main][GridCacheDatabaseSharedManager] Finished applying WAL changes [updatesApplied=0, time=101ms] [2018-06-26 19:01:43,450][INFO ][main][GridSnapshotAwareClusterStateProcessorImpl] Restoring history for BaselineTopology[id=4] [2018-06-26 19:01:43,454][ERROR][main][IgniteKernal] Exception during start processors, node will be stopped and close connections class org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter [] at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1769) at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1001) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153) at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1071) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:957) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:856) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:726) at 
org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:695) at org.apache.ignite.Ignition.start(Ignition.java:352) at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301) Caused by: class org.apache.ignite.IgniteCheckedException: Restoring of BaselineTopology history has failed, expected history item not found for id=1 at org.apache.ignite.internal.processors.cluster.BaselineTopologyHistory.restoreHistory(BaselineTopologyHistory.java:54) at org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.onReadyForRead(GridClusterStateProcessor.java:222) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetastorageReadyForRead(GridCacheDatabaseSharedManager.java:381) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:643) at
[jira] [Created] (IGNITE-8874) Blinking node in cluster may cause data corruption
Dmitry Sherstobitov created IGNITE-8874: --- Summary: Blinking node in cluster may cause data corruption Key: IGNITE-8874 URL: https://issues.apache.org/jira/browse/IGNITE-8874 Project: Ignite Issue Type: Bug Affects Versions: 2.5 Reporter: Dmitry Sherstobitov All caches have 2 backups; 4 nodes in the cluster. # Start cluster, load data # Start transactional loading (8 threads, 100 ops/second put/get in each op) # Repeat 10 times: kill one node, clean its LFS, start the node again, wait for rebalance # Check idle_verify, check for data corruption Here is the idle_verify report (node2 is the node that was blinking during the test). Update counters are equal between partitions, but the data differs. {code:java} Conflict partition: PartitionKey [grpId=374280886, grpName=cache_group_3, partId=41] Partition instances: [PartitionHashRecord [isPrimary=true, partHash=885018783, updateCntr=16, size=15, consistentId=node4], PartitionHashRecord [isPrimary=false, partHash=885018783, updateCntr=16, size=15, consistentId=node3], PartitionHashRecord [isPrimary=false, partHash=-357162793, updateCntr=16, size=15, consistentId=node2]] Conflict partition: PartitionKey [grpId=1586135625, grpName=cache_group_1_015, partId=15] Partition instances: [PartitionHashRecord [isPrimary=true, partHash=-562597978, updateCntr=22, size=16, consistentId=node3], PartitionHashRecord [isPrimary=false, partHash=-562597978, updateCntr=22, size=16, consistentId=node1], PartitionHashRecord [isPrimary=false, partHash=780813725, updateCntr=22, size=16, consistentId=node2]] Conflict partition: PartitionKey [grpId=374280885, grpName=cache_group_2, partId=75] Partition instances: [PartitionHashRecord [isPrimary=true, partHash=-1500797699, updateCntr=21, size=16, consistentId=node3], PartitionHashRecord [isPrimary=false, partHash=-1500797699, updateCntr=21, size=16, consistentId=node1], PartitionHashRecord [isPrimary=false, partHash=-1592034435, updateCntr=21, size=16, consistentId=node2]] Conflict partition: PartitionKey [grpId=374280884, 
grpName=cache_group_1, partId=713] Partition instances: [PartitionHashRecord [isPrimary=false, partHash=-63058826, updateCntr=4, size=2, consistentId=node3], PartitionHashRecord [isPrimary=true, partHash=-63058826, updateCntr=4, size=2, consistentId=node1], PartitionHashRecord [isPrimary=false, partHash=670869467, updateCntr=4, size=2, consistentId=node2]] Conflict partition: PartitionKey [grpId=374280886, grpName=cache_group_3, partId=11] Partition instances: [PartitionHashRecord [isPrimary=false, partHash=-224572810, updateCntr=17, size=16, consistentId=node3], PartitionHashRecord [isPrimary=true, partHash=-224572810, updateCntr=17, size=16, consistentId=node1], PartitionHashRecord [isPrimary=false, partHash=176419075, updateCntr=17, size=16, consistentId=node2]]{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8620) Remove intOrder and loc keys from node info in control.sh --tx utility
Dmitry Sherstobitov created IGNITE-8620: --- Summary: Remove intOrder and loc keys from node info in control.sh --tx utility Key: IGNITE-8620 URL: https://issues.apache.org/jira/browse/IGNITE-8620 Project: Ignite Issue Type: Improvement Reporter: Dmitry Sherstobitov Currently this information is displayed by the control.sh utility for each node: TcpDiscoveryNode [id=2ed402d5-b5a7-4ade-a77a-12c2feea95ec, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.25.1.47], sockAddrs=[/0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0, /172.25.1.47:0], discPort=0, order=6, intOrder=6, lastExchangeTime=1526482701193, loc=false, ver=2.5.1#20180510-sha1:ee417b82, isClient=true] The loc and intOrder values are internal information and there is no need to display them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8602) Add support filter label=null for control.sh tx utility
Dmitry Sherstobitov created IGNITE-8602: --- Summary: Add support filter label=null for control.sh tx utility Key: IGNITE-8602 URL: https://issues.apache.org/jira/browse/IGNITE-8602 Project: Ignite Issue Type: Improvement Affects Versions: 2.5 Reporter: Dmitry Sherstobitov -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8601) Add to control.sh tx utility information about transaction start time
Dmitry Sherstobitov created IGNITE-8601: --- Summary: Add to control.sh tx utility information about transaction start time Key: IGNITE-8601 URL: https://issues.apache.org/jira/browse/IGNITE-8601 Project: Ignite Issue Type: Improvement Affects Versions: 2.5 Reporter: Dmitry Sherstobitov This information will be useful -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8476) AssertionError exception occurs when trying to remove node from baseline by consistentId under loading
Dmitry Sherstobitov created IGNITE-8476: --- Summary: AssertionError exception occurs when trying to remove node from baseline by consistentId under loading Key: IGNITE-8476 URL: https://issues.apache.org/jira/browse/IGNITE-8476 Project: Ignite Issue Type: Bug Reporter: Dmitry Sherstobitov Run 6 nodes and start loading (8 threads, 1000 cache.put() calls in each thread per second). Kill 2 nodes and try to remove one node from the baseline using control.sh --baseline remove node1. The utility hangs and this assertion appears in the log: {code:java} [18:40:12,858][SEVERE][sys-stripe-10-#11][GridCacheIoManager] Failed to process message [senderId=9fde40b1-3b21-49de-b1ad-cdd0d9d902e5, messageType=class o.a.i.i.processors.cache.distributed.near.GridNearSingleGetRequest] java.lang.AssertionError: Wrong ready topology version for invalid partitions response [topVer=AffinityTopologyVersion [topVer=11, minorTopVer=0], req=GridNearSingleGetRequest [futId=1526053201329, key=KeyCacheObjectImpl [part=42, val=1514, hasValBytes=true], flags=1, topVer=AffinityTopologyVersion [topVer=11, minorTopVer=0], subjId=9fde40b1-3b21-49de-b1ad-cdd0d9d902e5, taskNameHash=0, createTtl=-1, accessTtl=-1]] at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$6.apply(GridDhtCacheAdapter.java:943) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$6.apply(GridDhtCacheAdapter.java:906) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:353) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.processNearSingleGetRequest(GridDhtCacheAdapter.java:906) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$300(GridDhtAtomicCache.java:130) at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$4.apply(GridDhtAtomicCache.java:252) at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$4.apply(GridDhtAtomicCache.java:247) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1054) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99) at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:511) at java.lang.Thread.run(Thread.java:748){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8467) minSize filter for transactions utility control.sh doesn't work
Dmitry Sherstobitov created IGNITE-8467: --- Summary: minSize filter for transactions utility control.sh doesn't work Key: IGNITE-8467 URL: https://issues.apache.org/jira/browse/IGNITE-8467 Project: Ignite Issue Type: Bug Affects Versions: 2.5 Reporter: Dmitry Sherstobitov I get the following output when running the control.sh utility with the minSize filter. It looks like the filter doesn't work: every listed transaction has size=1 despite minSize 10. {code:java} Control utility --tx minDuration 15 minSize 10 order SIZE Control utility 2018 Copyright(C) Apache Software Foundation User: Matching transactions: [16:52:30][:688] TcpDiscoveryNode [id=02f47e9a-efca-4d8c-a49f-3de4ca82d3ee, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.1, 172.25.1.40], sockAddrs=[/172.17.0.1:0, /0:0:0:0:0:0:0:1%lo:0, lab40.gridgain.local/172.25.1.40:0, /127.0.0.1:0], discPort=0, order=5, intOrder=5, lastExchangeTime=1525960350163, loc=false, ver=2.5.1#20180427-sha1:48601cbd, isClient=true] Tx: [xid=0f1d25a4361--0831-2c15--0005, label=tx_5, state=ACTIVE, duration=16, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, size=1, dhtNodes=[63e05a51]] Tx: [xid=05ad25a4361--0831-2c15--0005, label=tx_6, state=ACTIVE, duration=15, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, size=1, dhtNodes=[473df74e]] Tx: [xid=7b2b25a4361--0831-2c15--0005, label=tx_1, state=ACTIVE, duration=20, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, size=1, dhtNodes=[63e05a51]] Tx: [xid=73ab25a4361--0831-2c15--0005, label=tx_2, state=ACTIVE, duration=19, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, size=1, dhtNodes=[473df74e]] Tx: [xid=47ca25a4361--0831-2c15--0005, label=tx_0, state=ACTIVE, duration=22, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, size=1, dhtNodes=[63e05a51]] Tx: [xid=b0ac25a4361--0831-2c15--0005, label=tx_4, state=ACTIVE, duration=17, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, size=1, dhtNodes=[473df74e]] Tx: [xid=3a1c25a4361--0831-2c15--0005, label=tx_3, state=ACTIVE, duration=18, 
isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, size=1, dhtNodes=[0cd15184]]{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
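For reference, the expected semantics of minSize can be sketched in a few lines. This is a hypothetical stand-in written for this report, not the actual VisorTxTask code; the method name filterBySize is invented for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MinSizeFilter {
    // Expected behaviour of the --tx minSize filter:
    // keep only transactions whose size is at least minSize.
    static List<Integer> filterBySize(List<Integer> txSizes, int minSize) {
        return txSizes.stream()
            .filter(s -> s >= minSize)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Every transaction in the report above has size=1, so with
        // minSize=10 none of them should have been listed.
        System.out.println(filterBySize(Arrays.asList(1, 1, 1, 12), 10)); // prints [12]
    }
}
```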
[jira] [Created] (IGNITE-8466) Control.sh transactions utility may hang under loading
Dmitry Sherstobitov created IGNITE-8466: --- Summary: Control.sh transactions utility may hang under loading Key: IGNITE-8466 URL: https://issues.apache.org/jira/browse/IGNITE-8466 Project: Ignite Issue Type: Bug Affects Versions: 2.5 Reporter: Dmitry Sherstobitov Start nodes. Start a client and run transactional loading (8 threads with ~1000 ops/second, moving some amount from one value to another). Start long-running transactions (transactions with a flexible sleep inside) with label tx_*. Start control.sh --tx label tx kill -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8465) Support compatibility in control.sh utility
Dmitry Sherstobitov created IGNITE-8465: --- Summary: Support compatibility in control.sh utility Key: IGNITE-8465 URL: https://issues.apache.org/jira/browse/IGNITE-8465 Project: Ignite Issue Type: Bug Affects Versions: 2.5 Reporter: Dmitry Sherstobitov control.sh --tx from 2.5.1 can currently be launched against a 2.4.x cluster, for example, which may cause errors on the cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8438) Change boxed types to primitives in Transactions.TransactionMXBeanImpl.getActiveTransactions operation
Dmitry Sherstobitov created IGNITE-8438: --- Summary: Change boxed types to primitives in Transactions.TransactionMXBeanImpl.getActiveTransactions operation Key: IGNITE-8438 URL: https://issues.apache.org/jira/browse/IGNITE-8438 Project: Ignite Issue Type: Bug Affects Versions: 2.5 Reporter: Dmitry Sherstobitov Boxed types are unnecessary for this operation. Also, the default values in jconsole are wrong for the java.lang.Long type: 0 is used instead of 0L. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-8437) Control utility fails to connect to cluster if zookeeper discovery used
Dmitry Sherstobitov created IGNITE-8437: --- Summary: Control utility fails to connect to cluster if zookeeper discovery used Key: IGNITE-8437 URL: https://issues.apache.org/jira/browse/IGNITE-8437 Project: Ignite Issue Type: Improvement Components: zookeeper Affects Versions: 2.5 Reporter: Dmitry Sherstobitov The utility is unable to connect: start a cluster with ZooKeeper discovery and try to run the control.sh --tx utility. {code:java} 2018-05-03 16:56:36.225 [ERROR][mgmt-#115268%DPL_GRID%DplGridNodeName%][o.a.i.i.p.r.p.t.GridTcpRestProtocol] Failed to process client request [ses=GridSelectorNioSessionImpl [worker=ByteBufferNioClientWorker [readBuf=java.nio.HeapByteBuffer[pos=395 lim=395 cap=8192], super=AbstractNioClientWorker [idx=0, bytesRcvd=0, bytesSent=0, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-rest-0, igniteInstanceName=DPL_GRID%DplGridNodeName, finished=false, hashCode=1766410348, interrupted=false, runner=grid-nio-worker-tcp-rest-0-#48%DPL_GRID%DplGridNodeName%]]], writeBuf=null, readBuf=null, inRecovery=null, outRecovery=null, super=GridNioSessionImpl [locAddr=/10.116.158.48:11211, rmtAddr=/10.78.10.31:55847, createTime=1525355795984, closeTime=1525355796217, bytesSent=521553, bytesRcvd=461, bytesSent0=521553, bytesRcvd0=461, sndSchedTime=1525355796217, lastSndTime=1525355796075, lastRcvTime=1525355796175, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=GridTcpRestParser [marsh=JdkMarshaller [clsFilter=o.a.i.i.IgniteKernal$5@3e3d1d0d], routerClient=false], directMode=false]], accepted=true]], msg=GridClientTaskRequest [taskName=o.a.i.i.v.tx.VisorTxTask, arg=VisorTaskArgument [debug=false]]] org.apache.ignite.internal.util.nio.GridNioException: class org.apache.ignite.IgniteCheckedException: Failed to serialize object: GridClientResponse [clientId=587ea745-dd1e-4631-aa85-feb5d49acc36, reqId=2, destId=null, status=0, errMsg=null, result=GridClientTaskResultBean [res={ZookeeperClusterNode 
[id=c4cc818d-b29f-427b-86b5-9a625287feb6, addrs=[10.116.159.100], order=30, loc=false, client=true]=VisorTxTaskResult []}, error=null, finished=true, id=~a7245b33-a37a-4084-a954-460c31834442]] at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onSessionWrite(GridNioCodecFilter.java:100) at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionWrite(GridNioFilterAdapter.java:121) at org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onSessionWrite(GridNioFilterChain.java:269) at org.apache.ignite.internal.util.nio.GridNioFilterChain.onSessionWrite(GridNioFilterChain.java:192) at org.apache.ignite.internal.util.nio.GridNioSessionImpl.send(GridNioSessionImpl.java:110) at org.apache.ignite.internal.processors.rest.protocols.tcp.GridTcpRestNioListener$1.apply(GridTcpRestNioListener.java:261) at org.apache.ignite.internal.processors.rest.protocols.tcp.GridTcpRestNioListener$1.apply(GridTcpRestNioListener.java:232) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:495) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:474) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:451) at org.apache.ignite.internal.processors.rest.GridRestProcessor$2$1.apply(GridRestProcessor.java:165) at org.apache.ignite.internal.processors.rest.GridRestProcessor$2$1.apply(GridRestProcessor.java:162) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) at 
org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:495) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:474) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:451) at org.apache.ignite.internal.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78) at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70) at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) at
[jira] [Created] (IGNITE-8418) Add separate operation with kill argument in Transactions bean
Dmitry Sherstobitov created IGNITE-8418: --- Summary: Add separate operation with kill argument in Transactions bean Key: IGNITE-8418 URL: https://issues.apache.org/jira/browse/IGNITE-8418 Project: Ignite Issue Type: Improvement Reporter: Dmitry Sherstobitov The getActiveTransactions operation takes kill as its last argument. By default in jconsole this argument has the value true, which may kill transactions by accident. We should add a new, separate operation for the kill argument. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7899) Write Zookeeper Discovery documentation in java docs
Dmitry Sherstobitov created IGNITE-7899: --- Summary: Write Zookeeper Discovery documentation in java docs Key: IGNITE-7899 URL: https://issues.apache.org/jira/browse/IGNITE-7899 Project: Ignite Issue Type: Task Components: zookeeper Affects Versions: 2.5 Reporter: Dmitry Sherstobitov Fix For: 2.5 Describe Zookeeper Discovery in java docs -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7900) Write Zookeeper Discovery documentation in readme.io
Dmitry Sherstobitov created IGNITE-7900: --- Summary: Write Zookeeper Discovery documentation in readme.io Key: IGNITE-7900 URL: https://issues.apache.org/jira/browse/IGNITE-7900 Project: Ignite Issue Type: Task Components: zookeeper Affects Versions: 2.5 Reporter: Dmitry Sherstobitov Fix For: 2.5 Describe Zookeeper Discovery in readme.io -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7786) Changing baseline topology on big cluster may have error in control.sh utility
Dmitry Sherstobitov created IGNITE-7786: --- Summary: Changing baseline topology on big cluster may have error in control.sh utility Key: IGNITE-7786 URL: https://issues.apache.org/jira/browse/IGNITE-7786 Project: Ignite Issue Type: Bug Affects Versions: 2.3 Reporter: Dmitry Sherstobitov It looks like there is a hardcoded timeout for waiting for the result of the change-baseline operation. On a big cluster (154 nodes) the behaviour is as follows: # Set a new baseline topology version # The utility connects, but then fails with a connection error # The cluster is nevertheless activated successfully -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-7449) Rebalancing metrics don't display actual information about current rebalance state
Dmitry Sherstobitov created IGNITE-7449: --- Summary: Rebalancing metrics don't display actual information about current rebalance state Key: IGNITE-7449 URL: https://issues.apache.org/jira/browse/IGNITE-7449 Project: Ignite Issue Type: Bug Reporter: Dmitry Sherstobitov

When a node is shut down and restarted with a cleared LFS, the following metrics don't display correct information about rebalancing: RebalancingKeysRate, RebalancingBytesRate, KeysToRebalanceLeft. The RebalancingPartitionsCount metric, on the other hand, displays information correctly.

Steps to reproduce:
1. Start a cluster with statistics enabled
2. Shut down a node and clear its LFS
3. Start the node again
4. Ask the node for the current rebalance state through JMX

Current result:
1st tick: RebalancingKeysRate 0, RebalancingBytesRate 0, KeysToRebalanceLeft 0, RebalancingPartitionsCount 342
2nd tick: RebalancingKeysRate 0, RebalancingBytesRate 0, KeysToRebalanceLeft 0, RebalancingPartitionsCount 80

Expected:
2nd tick: RebalancingKeysRate SOME_NON_ZERO_VALUE, RebalancingBytesRate SOME_NON_ZERO_VALUE, KeysToRebalanceLeft SOME_NON_ZERO_VALUE, RebalancingPartitionsCount 80

UPD: -DIGNITE_REBALANCE_STATISTICS_TIME_INTERVAL=1000 doesn't affect the results.
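The JMX check from the reproduction steps can be sketched with the standard javax.management API. The DemoMetrics bean, its ObjectName, and the hardcoded values below are assumptions for illustration only; on a real node the four attribute names would be read from Ignite's cache metrics MBean instead:

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class RebalancePoll {
    // Attribute names taken from the issue text.
    static final String[] ATTRS = {
        "RebalancingKeysRate", "RebalancingBytesRate",
        "KeysToRebalanceLeft", "RebalancingPartitionsCount"
    };

    public interface DemoMetricsMBean {
        long getRebalancingKeysRate();
        long getRebalancingBytesRate();
        long getKeysToRebalanceLeft();
        long getRebalancingPartitionsCount();
    }

    // Stand-in bean reproducing the reported shape: the partitions count moves
    // while the rate/left metrics are stuck at zero.
    public static class DemoMetrics implements DemoMetricsMBean {
        public long getRebalancingKeysRate() { return 0; }
        public long getRebalancingBytesRate() { return 0; }
        public long getKeysToRebalanceLeft() { return 0; }
        public long getRebalancingPartitionsCount() { return 342; }
    }

    public static ObjectName registerDemo() throws Exception {
        ObjectName name = new ObjectName("demo:type=RebalanceMetrics"); // placeholder name
        MBeanServer srv = ManagementFactory.getPlatformMBeanServer();
        if (!srv.isRegistered(name))
            srv.registerMBean(new DemoMetrics(), name);
        return name;
    }

    // One "tick" of the check: read all four attributes in one pass.
    public static Map<String, Long> tick(ObjectName name) throws Exception {
        MBeanServer srv = ManagementFactory.getPlatformMBeanServer();
        Map<String, Long> out = new LinkedHashMap<>();
        for (String attr : ATTRS)
            out.put(attr, ((Number) srv.getAttribute(name, attr)).longValue());
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(tick(registerDemo()));
    }
}
```

Running such a poll once per statistics interval produces the per-tick snapshots quoted in the issue.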
[jira] [Created] (IGNITE-6540) Human readable WAL parser result has no human readable data
Dmitry Sherstobitov created IGNITE-6540: --- Summary: Human readable WAL parser result has no human readable data Key: IGNITE-6540 URL: https://issues.apache.org/jira/browse/IGNITE-6540 Project: Ignite Issue Type: Bug Components: persistence Reporter: Dmitry Sherstobitov Fix For: 2.3

A simple example that puts a single record into a cache generates a lot of data in the WAL:
{code:java}
try (Ignite ignite = Ignition.start("examples/config/persistentstore/example-persistent-store.xml")) {
    try (IgniteCache<Integer, String> cache = ignite.getOrCreateCache(CACHE_NAME)) {
        cache.put(1, "NEW_VALUE_1");
    }
}
{code}
See the result of the script in the attachments. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (IGNITE-6539) Human readable WAL parser fails if empty log files exist in directory
Dmitry Sherstobitov created IGNITE-6539: --- Summary: Human readable WAL parser fails if empty log files exist in directory Key: IGNITE-6539 URL: https://issues.apache.org/jira/browse/IGNITE-6539 Project: Ignite Issue Type: Bug Components: persistence Reporter: Dmitry Sherstobitov Fix For: 2.3

While scanning WAL files, the script may encounter empty files. In this case it throws SegmentEofException: "Reached logical end of the segment".
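A minimal sketch of one possible guard, not the parser's actual fix: skip zero-length files before handing segments to the reader, so an empty segment never triggers the exception. The `*.wal` file-name pattern here is an assumption for the demo, not the parser's real naming scheme:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class WalFileFilter {
    // Collect only non-empty segment files; empty ones are the failure trigger.
    public static List<Path> nonEmptySegments(Path dir) throws IOException {
        List<Path> out = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir, "*.wal")) {
            for (Path p : ds)
                if (Files.size(p) > 0)
                    out.add(p);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("wal-demo");
        Files.write(dir.resolve("0001.wal"), new byte[] { 1, 2, 3 });
        Files.createFile(dir.resolve("0002.wal")); // empty segment
        System.out.println(nonEmptySegments(dir)); // only 0001.wal survives
    }
}
```

Logging the skipped files instead of silently dropping them would preserve the signal that truncated segments exist.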