[jira] [Commented] (IGNITE-12935) Disadvantages in log of historical rebalance
[ https://issues.apache.org/jira/browse/IGNITE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146658#comment-17146658 ] Ignite TC Bot commented on IGNITE-12935: {panel:title=Branch: [pull/7722/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel} {panel:title=Branch: [pull/7722/head] Base: [master] : New Tests (11)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1} {color:#8b}PDS 2{color} [tests 3] * {color:#013220}IgnitePdsTestSuite2: IgniteWalRebalanceLoggingTest.testFullRebalanceLogMsgs - PASSED{color} * {color:#013220}IgnitePdsTestSuite2: IgniteWalRebalanceLoggingTest.testHistoricalRebalanceLogMsg - PASSED{color} * {color:#013220}IgnitePdsTestSuite2: IgniteWalRebalanceLoggingTest.testFullRebalanceWithShortCpHistoryLogMsgs - PASSED{color} {color:#8b}Service Grid{color} [tests 4] * {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple [val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest [id=96c5501f271-66e9bc30-1170-484b-9193-d8f9cfd301fd, reqs=SingletonList [ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent [evtNode=a304cc77-4e87-4ce2-b494-a989bf3ebf48, topVer=0, nodeId8=a304cc77, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593181559909]], val2=AffinityTopologyVersion [topVer=8179224952790605856, minorTopVer=0]]] - PASSED{color} * {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple [val1=DiscoveryEvent [evtNode=9b6e82d8-5428-41aa-a108-47cc665e6094, topVer=0, nodeId8=19cbc984, msg=, type=NODE_JOINED, tstamp=1593181559909], val2=AffinityTopologyVersion [topVer=-1440090742220744401, minorTopVer=0]]] - PASSED{color} * {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple [val1=DiscoveryEvent [evtNode=9b6e82d8-5428-41aa-a108-47cc665e6094, topVer=0, 
nodeId8=19cbc984, msg=, type=NODE_JOINED, tstamp=1593181559909], val2=AffinityTopologyVersion [topVer=-1440090742220744401, minorTopVer=0]]] - PASSED{color} * {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple [val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest [id=96c5501f271-66e9bc30-1170-484b-9193-d8f9cfd301fd, reqs=SingletonList [ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent [evtNode=a304cc77-4e87-4ce2-b494-a989bf3ebf48, topVer=0, nodeId8=a304cc77, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593181559909]], val2=AffinityTopologyVersion [topVer=8179224952790605856, minorTopVer=0]]] - PASSED{color} {color:#8b}Service Grid (legacy mode){color} [tests 4] * {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple [val1=DiscoveryEvent [evtNode=abd5e161-4ec6-4fdc-b350-86058cb1800c, topVer=0, nodeId8=40dcc9b2, msg=, type=NODE_JOINED, tstamp=1593181679755], val2=AffinityTopologyVersion [topVer=2068679071159579793, minorTopVer=0]]] - PASSED{color} * {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple [val1=DiscoveryEvent [evtNode=abd5e161-4ec6-4fdc-b350-86058cb1800c, topVer=0, nodeId8=40dcc9b2, msg=, type=NODE_JOINED, tstamp=1593181679755], val2=AffinityTopologyVersion [topVer=2068679071159579793, minorTopVer=0]]] - PASSED{color} * {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple [val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest [id=84d3111f271-9687f3aa-7b2b-4782-8da2-1f301caf60e5, reqs=SingletonList [ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent [evtNode=719469e6-3ad6-4407-be76-fee0a142a03c, topVer=0, nodeId8=719469e6, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593181679755]], val2=AffinityTopologyVersion [topVer=-4572596404447967496, 
minorTopVer=0]]] - PASSED{color} * {color:#013220}IgniteServiceGridTestSuite: ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple [val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest [id=84d3111f271-9687f3aa-7b2b-4782-8da2-1f301caf60e5, reqs=SingletonList [ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent [evtNode=719469e6-3ad6-4407-be76-fee0a142a03c, topVer=0, nodeId8=719469e6, msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593181679755]], val2=AffinityTopologyVersion [topVer=-4572596404447967496, minorTopVer=0]]] - PASSED{color} {panel} [TeamCity *-- Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=5419518&buildTypeId=IgniteTests24Java8_RunAll] > Disadvantages in log of historical rebalance > > > Key: IGNITE-12935 > URL:
[jira] [Updated] (IGNITE-9321) MVCC: support cache events
[ https://issues.apache.org/jira/browse/IGNITE-9321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9321: -- Fix Version/s: (was: 2.9) > MVCC: support cache events > -- > > Key: IGNITE-9321 > URL: https://issues.apache.org/jira/browse/IGNITE-9321 > Project: Ignite > Issue Type: Task > Components: mvcc > Reporter: Vladimir Ozerov > Priority: Major > Attachments: EventsProblems.java > > > Currently cache events are not fired for MVCC caches; we need to restore all cache events. A number of problems were found in the Events framework. Let's outline them before proceeding with the implementation for MVCC caches. The attached reproducer demonstrates several of the problems. > h2. Bugs > 1. {{IgniteCache.remove}} fires an event regardless of entry presence in the cache. > 2. CACHE_OBJECT_PUT can report {{hasOldValue==true,oldValue==null}} for a transactional cache. > See the attached reproducer. This also means that test coverage is not sufficient; negative tests and an event content check could be added. > h2. Inconsistency > In the current vision, the same operations with different cache modes produce a different number of events. An ATOMIC cache fires events for each operation. A TRANSACTIONAL cache fires only the final changes on commit (_put remove put_ on the same key results in only one CACHE_OBJECT_PUT event) and nothing on rollback. The current plan for MVCC is to fire events right away with the operation, so events for rolled-back transactions will be fired as well. So the behavior differs across all three modes. This is hard to understand and could subsequently lead to usage errors. > Additionally, there are points of confusion for SQL operations. For SELECT, a CACHE_QUERY_OBJECT_READ event is triggered and CACHE_OBJECT_READ is not. For DML operations a weird mix of events occurs. > h2. Use cases > It is also good to understand in which use cases it is a good idea to use IgniteEvents. Audit was mentioned as an example.
But it looks like the events framework currently solves only a _beforehand_ audit, since an event is triggered before the actual operation. We could document _when_ each type of event is triggered and what _ordering_ guarantees (if any) exist. > h2. Other > 1. EVT_CACHE_OBJECT_LOCKED and EVT_CACHE_OBJECT_UNLOCKED provoke questions for MVCC. Should we reuse the same event for a lock while unlock happens implicitly on transaction commit? Do we need some specific events? > 2. EVT_CACHE_ENTRY_CREATED and EVT_CACHE_ENTRY_DESTROYED events are almost useless for a user, but this is not immediately obvious. > 3. {{EventType}} javadoc states that all events are enabled by default, while {{IgniteEvents}} javadoc states the opposite; the latter seems to be true. > 4. The {{IgniteEvents}} facade encapsulates 2 event processing workflows: event listening and querying events from the _event store_. While the workflows are related, at first glance the separation between them is not obvious. -- This message was sent by Atlassian Jira (v8.3.4#803005)
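The TRANSACTIONAL-mode behavior described above (put, remove, put on the same key collapsing into a single CACHE_OBJECT_PUT event on commit) can be illustrated with a small standalone sketch. This is an illustration of the semantics only, not Ignite's actual event machinery; the class and method names are made up:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrates the TRANSACTIONAL-mode event collapsing: within one
// transaction, only the final operation per key produces an event on
// commit (here, one PUT for key 1), whereas ATOMIC mode would fire
// one event per operation.
public class TxEventCollapse {
    /** ops: each entry is {key, 1=put / 0=remove}; returns the final event type per key. */
    public static Map<Integer, String> commitEvents(int[][] ops) {
        Map<Integer, String> last = new LinkedHashMap<>();
        for (int[] op : ops)
            last.put(op[0], op[1] == 1 ? "CACHE_OBJECT_PUT" : "CACHE_OBJECT_REMOVED");
        return last; // only the last state per key survives to commit time
    }
}
```

For the _put remove put_ sequence from the description, this yields exactly one CACHE_OBJECT_PUT event, matching the inconsistency being discussed.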
[jira] [Updated] (IGNITE-9314) MVCC TX: Datastreamer operations
[ https://issues.apache.org/jira/browse/IGNITE-9314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9314: -- Fix Version/s: (was: 2.9) > MVCC TX: Datastreamer operations > > > Key: IGNITE-9314 > URL: https://issues.apache.org/jira/browse/IGNITE-9314 > Project: Ignite > Issue Type: Task > Components: mvcc > Reporter: Igor Seliverstov > Priority: Major > > We need to change the DataStreamer semantics. > The {{allowOverwrite=false}} mode is currently inconsistent with the internal _partition counters_ update approach used by MVCC transactions. > The {{allowOverwrite=true}} mode is terribly slow when using single {{cache.put}} operations (snapshot request, tx commit on coordinator overhead). A batched mode using {{cache.putAll}} should handle write conflicts and possible deadlocks. > There is also a problem where a {{DataStreamer}} with {{allowOverwrite == false}} does not insert a value when versions for the entry exist but all of them are aborted. Proper transactional semantics should be developed for this case. After that, attention should be paid to the Cache.size method behavior. Cache.size, addressed in https://issues.apache.org/jira/browse/IGNITE-8149, could be decremented improperly in the {{org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManager#mvccRemoveAll}} method (called during streamer processing) when all existing mvcc row versions are aborted or the last committed one is a _remove_ version. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9319) CacheAsyncOperationsFailoverTxTest.testPutAllAsyncFailover is flaky in master.
[ https://issues.apache.org/jira/browse/IGNITE-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9319: -- Fix Version/s: (was: 2.9) > CacheAsyncOperationsFailoverTxTest.testPutAllAsyncFailover is flaky in master. > -- > > Key: IGNITE-9319 > URL: https://issues.apache.org/jira/browse/IGNITE-9319 > Project: Ignite > Issue Type: Bug >Reporter: Alexey Scherbakov >Assignee: Anton Kalashnikov >Priority: Major > Labels: MakeTeamcityGreenAgain > > https://ci.ignite.apache.org/viewLog.html?buildId=1688647=queuedBuildOverviewTab > https://ci.ignite.apache.org/viewLog.html?buildId=1688542=queuedBuildOverviewTab -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9295) Add Warning message for multiple data streamers
[ https://issues.apache.org/jira/browse/IGNITE-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9295: -- Fix Version/s: (was: 2.9) > Add Warning message for multiple data streamers > --- > > Key: IGNITE-9295 > URL: https://issues.apache.org/jira/browse/IGNITE-9295 > Project: Ignite > Issue Type: Improvement > Reporter: Alexander Gerus > Priority: Major > > A DataStreamer is designed to allocate as many resources as are available. If a user starts more than one instance per cache, it can cause a significant slowdown of the streaming because of the combined resource consumption. > The proposal is to add a warning message to the application log when two or more data streamers exist per cache: > {quote}You are running multiple instances of IgniteDataStreamer. > For best performance use a single instance and share it between threads. > {quote} > Also, if a user is using the addData(key, value) method, we should print a warning suggesting the addData(Map) method (we should update the Javadoc as well): > {quote}You are using IgniteDataStreamer.addData(key, value) method. > For best performance use addData(Map) method. > {quote} > Each of these warnings should be printed only once, the first time such a case is detected. -- This message was sent by Atlassian Jira (v8.3.4#803005)
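The "printed only once" requirement at the end of the proposal can be sketched with a simple atomic guard. This is a hypothetical sketch, not the actual Ignite implementation; the class and method names are made up:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of a warn-once guard: the proposal asks for each warning to be
// printed only the first time the condition is detected, even if the
// detection happens repeatedly and concurrently.
public class WarnOnce {
    private final AtomicBoolean fired = new AtomicBoolean(false);

    /** Returns true only on the first call, so the caller logs exactly once. */
    public boolean shouldWarn() {
        return fired.compareAndSet(false, true);
    }

    public static void main(String[] args) {
        WarnOnce multipleStreamersWarning = new WarnOnce();

        for (int i = 0; i < 3; i++) {
            if (multipleStreamersWarning.shouldWarn())
                System.out.println("You are running multiple instances of IgniteDataStreamer. "
                    + "For best performance use a single instance and share it between threads.");
        }
        // The warning is printed once despite three detections.
    }
}
```

The compare-and-set makes the guard safe even when several streamer instances are created from different threads at the same time.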
[jira] [Updated] (IGNITE-9293) CPP: Add API to set baseline topology for C++
[ https://issues.apache.org/jira/browse/IGNITE-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9293: -- Fix Version/s: (was: 2.9) > CPP: Add API to set baseline topology for C++ > -- > > Key: IGNITE-9293 > URL: https://issues.apache.org/jira/browse/IGNITE-9293 > Project: Ignite > Issue Type: New Feature > Components: platforms >Affects Versions: 2.6 >Reporter: Igor Sapego >Priority: Major > Labels: cpp > > We need to add API for C++ to set baseline topology. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9291) IgniteJdbcThinDriver with SSL doesn't work for sqlline on Windows
[ https://issues.apache.org/jira/browse/IGNITE-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9291: -- Fix Version/s: (was: 2.9) > IgniteJdbcThinDriver with SSL doesn't work for sqlline on Windows > - > > Key: IGNITE-9291 > URL: https://issues.apache.org/jira/browse/IGNITE-9291 > Project: Ignite > Issue Type: Bug > Components: jdbc >Affects Versions: 2.6 >Reporter: Sergey Kozlov >Priority: Major > > I tried to run AI 2.6 with same configuration under Linux (Ubuntu) and > Windows 10 and then connect {sqlline} with activated SSL. > It works fine for Linux and doesn't work for Windows. > There's no errors on server nodes but sqlline prints out following: > {noformat} > C:/BuildAgent/work/dd4d79acf76cc870/i2test/var/suite-client/gg-pro-fab-nolgpl/bin/sqlline.bat, > WARN: IGNITE_HOME environment variable may be pointing to wrong folder: > C:/BuildAgent/work/dd4d79acf76cc870/i2test/var/suite-client/gg-pro-fab-nolgpl > Setting property: [force, true] > Setting property: [showWarnings, true] > Setting property: [showNestedErrs, true] > issuing: !connect jdbc:ignite:thin://127.0.0.1/?sslMode=require '' '' > org.apache.ignite.IgniteJdbcThinDriver > Connecting to jdbc:ignite:thin://127.0.0.1/?sslMode=require > Error: Failed to SSL connect to server > [url=jdbc:ignite:thin://127.0.0.1:10800] (state=08001,code=0) > java.sql.SQLException: Failed to SSL connect to server > [url=jdbc:ignite:thin://127.0.0.1:10800] > at > org.apache.ignite.internal.jdbc.thin.JdbcThinSSLUtil.createSSLSocket(JdbcThinSSLUtil.java:93) > at > org.apache.ignite.internal.jdbc.thin.JdbcThinTcpIo.connect(JdbcThinTcpIo.java:217) > at > org.apache.ignite.internal.jdbc.thin.JdbcThinTcpIo.start(JdbcThinTcpIo.java:159) > at > org.apache.ignite.internal.jdbc.thin.JdbcThinTcpIo.start(JdbcThinTcpIo.java:134) > at > org.apache.ignite.internal.jdbc.thin.JdbcThinConnection.ensureConnected(JdbcThinConnection.java:151) > at > 
org.apache.ignite.internal.jdbc.thin.JdbcThinConnection.(JdbcThinConnection.java:140) > at > org.apache.ignite.IgniteJdbcThinDriver.connect(IgniteJdbcThinDriver.java:157) > at sqlline.DatabaseConnection.connect(DatabaseConnection.java:156) > at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:204) > at sqlline.Commands.connect(Commands.java:1095) > at sqlline.Commands.connect(Commands.java:1001) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38) > at sqlline.SqlLine.dispatch(SqlLine.java:791) > at sqlline.SqlLine.initArgs(SqlLine.java:566) > at sqlline.SqlLine.begin(SqlLine.java:643) > at sqlline.SqlLine.start(SqlLine.java:373) > at sqlline.SqlLine.main(SqlLine.java:265) > Caused by: javax.net.ssl.SSLHandshakeException: > sun.security.validator.ValidatorException: PKIX path building failed: > sun.security.provider.certpath.SunCertPathBuilderException: unable to find > valid certification path to requested target > at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) > at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949) > at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) > at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) > at > sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1509) > at > sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) > at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979) > at sun.security.ssl.Handshaker.process_record(Handshaker.java:914) > at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062) > at > sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375) > at > 
sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403) > at > sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387) > at > org.apache.ignite.internal.jdbc.thin.JdbcThinSSLUtil.createSSLSocket(JdbcThinSSLUtil.java:88) > ... 20 more > Caused by: sun.security.validator.ValidatorException: PKIX path building > failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to > find valid certification path to requested target > at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387) > at >
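The "PKIX path building failed" root cause in the trace above means the JVM running sqlline has no trust anchor for the server's certificate, which is a plausible reason the same setup works on one machine and not another. One standard JSSE-level remedy, independent of the Ignite driver itself, is to point the JVM at a trust store before the connection is opened; the path and password below are placeholders, not values from this issue:

```java
// Points the JVM's default SSL context at a trust store containing the
// server certificate (or its CA). These are standard JSSE system
// properties; they must be set before the first SSL handshake.
public class TrustStoreSetup {
    public static void configure(String trustStorePath, String password) {
        System.setProperty("javax.net.ssl.trustStore", trustStorePath);
        System.setProperty("javax.net.ssl.trustStorePassword", password);
    }
}
```

For sqlline the equivalent is passing `-Djavax.net.ssl.trustStore=...` on the JVM command line; whether the Windows run was missing exactly this is an assumption, but the exception class shown is the canonical symptom of a missing trust anchor.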
[jira] [Updated] (IGNITE-9270) Design thread per partition model
[ https://issues.apache.org/jira/browse/IGNITE-9270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9270: -- Fix Version/s: (was: 2.9) > Design thread per partition model > - > > Key: IGNITE-9270 > URL: https://issues.apache.org/jira/browse/IGNITE-9270 > Project: Ignite > Issue Type: Sub-task > Components: cache > Reporter: Pavel Kovalenko > Assignee: Pavel Kovalenko > Priority: Major > Labels: thread-per-partition > > A new model of executing cache partition operations (READ, CREATE, UPDATE, DELETE) should satisfy the following conditions: > 1) All modify operations (CREATE, UPDATE, DELETE) on some partition must be performed by the same thread. > 2) Read operations can be executed by any thread. > 3) The ordering of modify operations on primary and backup nodes should be the same. > We should investigate performance if we choose a dedicated executor service for such operations, or we could use a messaging model and use network threads to perform such operations. -- This message was sent by Atlassian Jira (v8.3.4#803005)
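Condition (1) above — every modify operation on a given partition executed by the same thread — can be sketched with one single-threaded executor per stripe. This is a minimal sketch under assumed names, not the design the ticket will actually produce:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch of a thread-per-partition executor: every modify operation on a
// given partition is routed to the same single-threaded executor, which
// serializes them in submission order (conditions 1 and 3); read
// operations can bypass this and run on any thread (condition 2).
public class PartitionStripedExecutor {
    private final ExecutorService[] stripes;

    public PartitionStripedExecutor(int stripeCnt) {
        stripes = new ExecutorService[stripeCnt];
        for (int i = 0; i < stripeCnt; i++)
            stripes[i] = Executors.newSingleThreadExecutor();
    }

    /** Routes a modify operation to the partition's dedicated thread. */
    public void submit(int partition, Runnable op) {
        stripes[partition % stripes.length].execute(op);
    }

    /** Shuts down and waits for all queued operations to finish. */
    public void shutdownAndWait() {
        for (ExecutorService s : stripes) {
            s.shutdown();
            try {
                s.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

Because each stripe is a single thread with a FIFO queue, two updates to the same partition are always applied in submission order, no matter which caller threads submitted them.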
[jira] [Updated] (IGNITE-9271) Implement transaction commit using thread per partition model
[ https://issues.apache.org/jira/browse/IGNITE-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9271: -- Fix Version/s: (was: 2.9) > Implement transaction commit using thread per partition model > - > > Key: IGNITE-9271 > URL: https://issues.apache.org/jira/browse/IGNITE-9271 > Project: Ignite > Issue Type: Sub-task > Components: cache >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Major > Labels: thread-per-partition > > Currently, we perform commit of a transaction from sys thread and do write > operations with multiple partitions. > We should delegate such operations to an appropriate thread and wait for > results. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9257) LOCAL cache doesn't evict entries from heap
[ https://issues.apache.org/jira/browse/IGNITE-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9257: -- Fix Version/s: (was: 2.9) > LOCAL cache doesn't evict entries from heap > --- > > Key: IGNITE-9257 > URL: https://issues.apache.org/jira/browse/IGNITE-9257 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.6 >Reporter: Dmitry Karachentsev >Priority: Critical > > Reproducer > http://apache-ignite-users.70518.x6.nabble.com/When-using-CacheMode-LOCAL-OOM-td23285.html > This happens because all entries are kept in GridCacheAdapter#map. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9265) MVCC TX: Two rows with the same key in one MERGE statement produce an exception
[ https://issues.apache.org/jira/browse/IGNITE-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9265: -- Fix Version/s: (was: 2.9) > MVCC TX: Two rows with the same key in one MERGE statement produce an > exception > --- > > Key: IGNITE-9265 > URL: https://issues.apache.org/jira/browse/IGNITE-9265 > Project: Ignite > Issue Type: Bug > Components: mvcc > Reporter: Igor Seliverstov > Priority: Major > Labels: transactions > > When an operation like {{MERGE INTO INTEGER (_key, _val) KEY(_key) VALUES (1,1),(1,2)}} is called, an exception occurs. > Correct behavior: each next update on the same key overwrites the previous one -- This message was sent by Atlassian Jira (v8.3.4#803005)
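The expected "last update wins" semantics for duplicate keys in a single statement can be illustrated with a plain map. This is an illustration of the intended behavior only, not how the SQL engine implements MERGE:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrates the intended MERGE semantics for duplicate keys in one
// statement: each later (key, value) pair overwrites the earlier one,
// so VALUES (1,1),(1,2) should leave key 1 mapped to 2 rather than fail.
public class MergeSemantics {
    /** rows: each entry is {key, value}; returns the final value per key. */
    public static Map<Integer, Integer> merge(int[][] rows) {
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (int[] row : rows)
            result.put(row[0], row[1]); // last write wins, no exception
        return result;
    }
}
```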
[jira] [Updated] (IGNITE-9206) Node can't join to ring if all existing nodes have stopped and another new node joined ahead
[ https://issues.apache.org/jira/browse/IGNITE-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9206: -- Fix Version/s: (was: 2.9) > Node can't join to ring if all existing nodes have stopped and another new > node joined ahead > > > Key: IGNITE-9206 > URL: https://issues.apache.org/jira/browse/IGNITE-9206 > Project: Ignite > Issue Type: Bug > Components: cache > Affects Versions: 2.5 > Reporter: Pavel Kovalenko > Priority: Major > > A TcpDiscovery SPI problem. > Situation: an existing cluster with nodes 1 and 2; nodes 1 and 2 are stopping. > 1) Node 3 joins the cluster and sends a JoinMessage to node 2. > 2) Node 2 is stopping and unable to handle the JoinMessage from node 3. Node 3 chooses node 1 as the next node to send the message to. > 3) Node 3 sends the JoinMessage to node 1. > 4) Node 4 joins the cluster. > 5) Node 1 is stopping and unable to handle the JoinMessage from node 3. > 6) Node 4 sees that there are no alive nodes in the ring at that time and becomes the first node in the topology. > 7) Node 3 sends the JoinMessage to node 4 and this process repeats again and again without any success. > In node 4's logs we can see that the remote connection from node 3 is established but no active actions have been performed. Node 3 remains in the CONNECTING state forever. At the same time node 4 thinks that node 3 is already in the ring. 
> Failed test: > GridCacheReplicatedDataStructuresFailoverSelfTest#testAtomicSequenceConstantTopologyChange > Link to TC: > https://ci.ignite.apache.org/viewLog.html?buildId=1594376=buildResultsDiv=IgniteTests24Java8_DataStructures > Shrinked log: > {code:java} > [00:09:13] : [Step 3/4] [2018-08-04 21:09:13,733][INFO ][main][root] >>> > Stopping grid > [name=replicated.GridCacheReplicatedDataStructuresFailoverSelfTest0, > id=3e2c94bd-8e98-4dd9-8d1a-befbfe00] > [00:09:13] : [Step 3/4] [2018-08-04 21:09:13,739][INFO > ][thread-replicated.GridCacheReplicatedDataStructuresFailoverSelfTest7][root] > Start node: replicated.GridCacheReplicatedDataStructuresFailoverSelfTest7 > [00:09:13] : [Step 3/4] [2018-08-04 21:09:13,740][INFO > ][tcp-disco-msg-worker-#2146%replicated.GridCacheReplicatedDataStructuresFailoverSelfTest6%][TcpDiscoverySpi] > New next node [newNext=TcpDiscoveryNode > [id=3e2c94bd-8e98-4dd9-8d1a-befbfe00, addrs=ArrayList [127.0.0.1], > sockAddrs=HashSet [/127.0.0.1:47500], discPort=47500, order=1, intOrder=1, > lastExchangeTime=1533416953738, loc=false, ver=2.7.0#20180803-sha1:3ab8bbad, > isClient=false]] > [00:09:13] : [Step 3/4] [2018-08-04 21:09:13,741][INFO > ][tcp-disco-srvr-#2100%replicated.GridCacheReplicatedDataStructuresFailoverSelfTest0%][TcpDiscoverySpi] > TCP discovery accepted incoming connection [rmtAddr=/127.0.0.1, > rmtPort=50099] > [00:09:13] : [Step 3/4] [2018-08-04 21:09:13,741][INFO > ][tcp-disco-srvr-#2100%replicated.GridCacheReplicatedDataStructuresFailoverSelfTest0%][TcpDiscoverySpi] > TCP discovery spawning a new thread for connection [rmtAddr=/127.0.0.1, > rmtPort=50099] > [00:09:13] : [Step 3/4] [2018-08-04 21:09:13,743][INFO > ][tcp-disco-sock-reader-#2151%replicated.GridCacheReplicatedDataStructuresFailoverSelfTest0%][TcpDiscoverySpi] > Started serving remote node connection [rmtAddr=/127.0.0.1:50099, > rmtPort=50099] > [00:09:13] : [Step 3/4] [2018-08-04 21:09:13,746][INFO > 
][thread-replicated.GridCacheReplicatedDataStructuresFailoverSelfTest7][GridCacheReplicatedDataStructuresFailoverSelfTest7] > > [00:09:13] : [Step 3/4] > [00:09:13] : [Step 3/4] >>>__ > [00:09:13] : [Step 3/4] >>> / _/ ___/ |/ / _/_ __/ __/ > [00:09:13] : [Step 3/4] >>> _/ // (7 7// / / / / _/ > [00:09:13] : [Step 3/4] >>> /___/\___/_/|_/___/ /_/ /___/ > [00:09:13] : [Step 3/4] >>> > [00:09:13] : [Step 3/4] >>> ver. 2.7.0-SNAPSHOT#20180803-sha1:3ab8bbad > [00:09:13] : [Step 3/4] >>> 2018 Copyright(C) Apache Software Foundation > [00:09:13] : [Step 3/4] >>> > [00:09:13] : [Step 3/4] >>> Ignite documentation: http://ignite.apache.org > [00:09:13] : [Step 3/4] > [00:09:13] : [Step 3/4] [2018-08-04 21:09:13,746][INFO > ][thread-replicated.GridCacheReplicatedDataStructuresFailoverSelfTest7][GridCacheReplicatedDataStructuresFailoverSelfTest7] > Config URL: n/a > [00:09:13] : [Step 3/4] [2018-08-04 21:09:13,747][INFO > ][thread-replicated.GridCacheReplicatedDataStructuresFailoverSelfTest7][GridCacheReplicatedDataStructuresFailoverSelfTest7] > IgniteConfiguration > [igniteInstanceName=replicated.GridCacheReplicatedDataStructuresFailoverSelfTest7, > pubPoolSize=8, svcPoolSize=8, callbackPoolSize=8, stripedPoolSize=8, > sysPoolSize=8, mgmtPoolSize=4, igfsPoolSize=5,
[jira] [Updated] (IGNITE-9255) -DIGNITE_QUIET=false not work in windows, -v work ok
[ https://issues.apache.org/jira/browse/IGNITE-9255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9255: -- Fix Version/s: (was: 2.9) > -DIGNITE_QUIET=false not work in windows, -v work ok > > > Key: IGNITE-9255 > URL: https://issues.apache.org/jira/browse/IGNITE-9255 > Project: Ignite > Issue Type: Bug > Components: general > Affects Versions: 2.5 > Reporter: ARomantsov > Priority: Major > > I tried to run: > 1) Ignite.bat - works as anticipated > 2) Ignite.bat -v - works as anticipated > 3) Ignite.bat -J-DIGNITE_QUIET=false - works like the first, but is expected to work like the second variant -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9090) When client node make cache.QueryCursorImpl.getAll they have OOM and continue working
[ https://issues.apache.org/jira/browse/IGNITE-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9090: -- Fix Version/s: (was: 2.9) > When client node make cache.QueryCursorImpl.getAll they have OOM and continue > working > - > > Key: IGNITE-9090 > URL: https://issues.apache.org/jira/browse/IGNITE-9090 > Project: Ignite > Issue Type: Bug > Components: clients >Affects Versions: 2.4 > Environment: 2 server node, 1 client, 1 cache with 15 kk size >Reporter: ARomantsov >Priority: Critical > > {code:java} > [12:21:22,390][SEVERE][query-#69][GridCacheIoManager] Failed to process > message [senderId=30cab4ec-1da7-4e9f-a262-bdfa4d466865, messageType=class > o.a.i.i.processors.cache.query.GridCacheQueryResponse] > java.lang.OutOfMemoryError: GC overhead limit exceeded > at java.lang.Long.valueOf(Long.java:840) > at > org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObject0(OptimizedObjectInputStream.java:250) > at > org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:198) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:421) > at > org.apache.ignite.internal.processors.cache.query.GridCacheQueryResponseEntry.readExternal(GridCacheQueryResponseEntry.java:90) > at > org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readExternalizable(OptimizedObjectInputStream.java:555) > at > org.apache.ignite.internal.marshaller.optimized.OptimizedClassDescriptor.read(OptimizedClassDescriptor.java:917) > at > org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObject0(OptimizedObjectInputStream.java:346) > at > org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride(OptimizedObjectInputStream.java:198) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:421) > at > 
org.apache.ignite.internal.marshaller.optimized.OptimizedMarshaller.unmarshal0(OptimizedMarshaller.java:227) > at > org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:94) > at > org.apache.ignite.internal.binary.BinaryUtils.doReadOptimized(BinaryUtils.java:1777) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1964) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716) > at > org.apache.ignite.internal.binary.GridBinaryMarshaller.deserialize(GridBinaryMarshaller.java:310) > at > org.apache.ignite.internal.binary.BinaryMarshaller.unmarshal0(BinaryMarshaller.java:99) > at > org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:82) > at > org.apache.ignite.internal.processors.cache.query.GridCacheQueryResponse.unmarshalCollection0(GridCacheQueryResponse.java:189) > at > org.apache.ignite.internal.processors.cache.query.GridCacheQueryResponse.finishUnmarshal(GridCacheQueryResponse.java:162) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.unmarshall(GridCacheIoManager.java:1530) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:576) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$700(GridCacheIoManager.java:101) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1613) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4100(GridIoManager.java:125) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2752) > at > 
org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1516) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4400(GridIoManager.java:125) > at > org.apache.ignite.internal.managers.communication.GridIoManager$10.run(GridIoManager.java:1485) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [12:21:28,573][INFO][ignite-update-notifier-timer][GridUpdateNotifier] Update > status is not available.
[jira] [Updated] (IGNITE-9087) testClientInForceServerModeStopsOnExchangeHistoryExhaustion refactoring
[ https://issues.apache.org/jira/browse/IGNITE-9087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9087: -- Fix Version/s: (was: 2.9) > testClientInForceServerModeStopsOnExchangeHistoryExhaustion refactoring > --- > > Key: IGNITE-9087 > URL: https://issues.apache.org/jira/browse/IGNITE-9087 > Project: Ignite > Issue Type: Test > Affects Versions: 2.6 > Reporter: Sergey Chugunov > Priority: Major > Labels: MakeTeamcityGreenAgain > > The initial implementation of the test relied on a massive parallel client start to get into a situation of exchange history exhaustion. > But after the fix for IGNITE-8998, even in a massive start scenario the probability of a client's exchange being cleaned up from the exchange history is much smaller. > The test should be refactored so that it doesn't rely on parallel operations but instead delays exchange finish (e.g. by delaying particular messages). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9088) Add ability to dump persistence after particular test
[ https://issues.apache.org/jira/browse/IGNITE-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9088: -- Fix Version/s: (was: 2.9) > Add ability to dump persistence after particular test > - > > Key: IGNITE-9088 > URL: https://issues.apache.org/jira/browse/IGNITE-9088 > Project: Ignite > Issue Type: Improvement > Components: persistence > Reporter: Pavel Kovalenko > Assignee: Pavel Kovalenko > Priority: Major > > Sometimes it's necessary to analyze persistence after a particular test finishes on TeamCity. > We need to add the ability to dump persistence dirs/files to a specified directory on the test-running host for further analysis. > This should be managed by a property. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9076) Service Grid: Introduce timeout to interrupt long Service#init on deployment
[ https://issues.apache.org/jira/browse/IGNITE-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9076: -- Fix Version/s: (was: 2.9) > Service Grid: Introduce timeout to interrupt long Service#init on deployment > > > Key: IGNITE-9076 > URL: https://issues.apache.org/jira/browse/IGNITE-9076 > Project: Ignite > Issue Type: Task > Components: managed services >Reporter: Vyacheslav Daradur >Assignee: Vyacheslav Daradur >Priority: Major > > According to IGNITE-3392, we should propagate service deployment > results to the initiator, which means we should wait for the Service#init method > to complete. > We should introduce some kind of timeout to interrupt a long-running Service#init method on > deployment. > Probably it should be configurable at the per-service level. -- This message was sent by Atlassian Jira (v8.3.4#803005)
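The timeout proposed above is not in the product yet; the following is only a minimal sketch of the idea, assuming a generic `Callable` stands in for `Service#init` and with a made-up method name (`runWithTimeout`), not an actual Ignite API:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class InitTimeout {
    /** Runs a task (standing in for Service#init) and interrupts it if it exceeds the timeout. */
    public static <T> T runWithTimeout(Callable<T> task, long timeoutMs) throws Exception {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            Future<T> fut = exec.submit(task);
            try {
                return fut.get(timeoutMs, TimeUnit.MILLISECONDS);
            }
            catch (TimeoutException e) {
                fut.cancel(true); // interrupt the long-running init
                throw e;
            }
        }
        finally {
            exec.shutdownNow();
        }
    }
}
```

A per-service timeout would just mean each service deployment supplies its own `timeoutMs` value here.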
[jira] [Updated] (IGNITE-9067) Clients to do a random delay when reconnecting to the cluster
[ https://issues.apache.org/jira/browse/IGNITE-9067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9067: -- Fix Version/s: (was: 2.9) > Clients to do a random delay when reconnecting to the cluster > - > > Key: IGNITE-9067 > URL: https://issues.apache.org/jira/browse/IGNITE-9067 > Project: Ignite > Issue Type: Improvement >Affects Versions: 2.6 >Reporter: Sergey Chugunov >Priority: Major > > IGNITE-8657 and subsequent tickets fixed a situation where client nodes might > hang when too many clients join the topology simultaneously. > After IGNITE-8657, client nodes try to reconnect immediately upon receiving a > reply from the coordinator. This may cause a starvation-like issue where many > clients make frequent reconnect attempts and prevent each other from joining the > cluster. > A random delay before the next reconnect attempt may significantly improve the speed of > a massive reconnect. -- This message was sent by Atlassian Jira (v8.3.4#803005)
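One common way to realize the random delay described above is exponential backoff with full jitter, so that clients rejected at the same moment do not retry in lockstep. This is an illustrative sketch under that assumption, not the actual discovery SPI code (the class and method names are invented):

```java
import java.util.concurrent.ThreadLocalRandom;

public class ReconnectBackoff {
    /**
     * Delay before the given reconnect attempt: exponential backoff capped at maxMs,
     * with full jitter (uniform in [0, cap]) to spread out simultaneous retries.
     */
    public static long delayMs(int attempt, long baseMs, long maxMs) {
        long cap = Math.min(maxMs, baseMs << Math.min(attempt, 20)); // baseMs * 2^attempt, capped
        return ThreadLocalRandom.current().nextLong(cap + 1);
    }
}
```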
[jira] [Updated] (IGNITE-9045) TxRecord is logged to WAL during node stop procedure
[ https://issues.apache.org/jira/browse/IGNITE-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9045: -- Fix Version/s: (was: 2.9) > TxRecord is logged to WAL during node stop procedure > > > Key: IGNITE-9045 > URL: https://issues.apache.org/jira/browse/IGNITE-9045 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.6 >Reporter: Sergey Chugunov >Priority: Major > > When *IGNITE_WAL_LOG_TX_RECORDS* flag is set to true special TxRecords are > logged to WAL on changes of transaction state. > It turned out that during node stop transaction futures (e.g. > GridDhtTxPrepareFuture) change transaction state which is logged to WAL. > This situation may violate transactional consistency and should be fixed: no > writes to WAL should be issued during node stop. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9008) CPP Thin: Implement benchmark for C++ thin
[ https://issues.apache.org/jira/browse/IGNITE-9008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9008: -- Fix Version/s: (was: 2.9) > CPP Thin: Implement benchmark for C++ thin > -- > > Key: IGNITE-9008 > URL: https://issues.apache.org/jira/browse/IGNITE-9008 > Project: Ignite > Issue Type: New Feature > Components: platforms >Reporter: Igor Sapego >Assignee: Igor Sapego >Priority: Major > > We need a benchmark for the C++ thin client to understand how its performance > compares to other clients. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9011) RendezvousAffinity.excludeNeighbors should be removed and be a default behavior
[ https://issues.apache.org/jira/browse/IGNITE-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-9011: -- Fix Version/s: (was: 2.9) (was: 3.0) > RendezvousAffinity.excludeNeighbors should be removed and be a default > behavior > --- > > Key: IGNITE-9011 > URL: https://issues.apache.org/jira/browse/IGNITE-9011 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.6 >Reporter: Dmitry Karachentsev >Priority: Major > > According to this [discussion | > http://apache-ignite-developers.2346864.n4.nabble.com/Neighbors-exclusion-td32550.html], > cache backup distribution should be more straightforward. > Right now the logic for how backups are stored across nodes is not obvious. For example: > 1. If nodeFilter is set, it will filter backup nodes, and if there are not enough > nodes there will be fewer backups... > 2. If the excludeNeighbors property is set, a manually set backupFilter is ignored. > 3. By default excludeNeighbors is false. > There seems to be no need to keep the excludeNeighbors property at all, and it should be > removed. Instead, a node must always do its best to distribute backups to > different machines. > If the user sets backupFilter, it must be used; otherwise backups should be distributed to > other machines when possible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8986) Need to add configuration validation template on startup
[ https://issues.apache.org/jira/browse/IGNITE-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8986: -- Fix Version/s: (was: 2.9) 2.10 > Need to add configuration validation template on startup > > > Key: IGNITE-8986 > URL: https://issues.apache.org/jira/browse/IGNITE-8986 > Project: Ignite > Issue Type: Improvement > Components: general >Reporter: Dmitriy Setrakyan >Priority: Major > Fix For: 2.10 > > > We should have a validation template file (e.g. ignite-validate.xml), and > make sure on startup that all config properties specified in that file match. > This way a user could put this file somewhere on a shared network drive and > have an extra degree of confidence that the configuration is valid. -- This message was sent by Atlassian Jira (v8.3.4#803005)
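The validation step proposed above could be as simple as comparing the node's effective properties against the template file on startup. A minimal sketch of that comparison; the template format and all names (`ConfigValidator`, `mismatches`) are assumptions for illustration, not an actual Ignite API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class ConfigValidator {
    /** Returns the keys whose actual values differ from (or are missing against) the template. */
    public static List<String> mismatches(Map<String, String> template, Map<String, String> actual) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, String> e : template.entrySet()) {
            if (!Objects.equals(e.getValue(), actual.get(e.getKey())))
                bad.add(e.getKey());
        }
        return bad;
    }
}
```

On startup the node would load the shared template (e.g. ignite-validate.xml), run this check against its own configuration, and fail fast if the returned list is non-empty.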
[jira] [Updated] (IGNITE-8983) .NET long-running suite fails in master
[ https://issues.apache.org/jira/browse/IGNITE-8983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8983: -- Fix Version/s: (was: 2.9) 2.10 > .NET long-running suite fails in master > --- > > Key: IGNITE-8983 > URL: https://issues.apache.org/jira/browse/IGNITE-8983 > Project: Ignite > Issue Type: Test >Affects Versions: 2.6 >Reporter: Alexey Goncharuk >Priority: Major > Labels: MakeTeamcityGreenAgain > Fix For: 2.10 > > > One of the following changes triggered the fails: > {code} > IGNITE-8681 Using ExpiryPolicy with persistence causes significant slowdown > - Fixes #4285. > IGNITE-7149 : Gradient boosting for decision tree > IGNITE-8746 EVT_CACHE_REBALANCE_PART_DATA_LOST event received twice on the > coordinator IGNITE-8821 Reduced amount of logs for BPlusTreeSelfTest > put/remove family tests - Fixes #4218. > IGNITE-8203 Handle ClosedByInterruptionException in FilePageStore - Fixes > #4211. > IGNITE-8857 new IgnitePredicate filtering credential attribute > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8913) Uninformative SQL query cancellation message
[ https://issues.apache.org/jira/browse/IGNITE-8913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8913: -- Fix Version/s: (was: 2.9) 2.10 > Uninformative SQL query cancellation message > > > Key: IGNITE-8913 > URL: https://issues.apache.org/jira/browse/IGNITE-8913 > Project: Ignite > Issue Type: Improvement > Components: sql >Affects Versions: 2.5 >Reporter: Vladislav Pyatkov >Priority: Major > Labels: iep-29, sql-stability > Fix For: 2.10 > Attachments: sgrimstad_IGNITE_8913_Query_cancelled_me.patch > > Time Spent: 10m > Remaining Estimate: 0h > > When a query times out, is cancelled, or fails with another exception, we get the message: > "The query was cancelled while executing". > The message should be made more informative: the text of the query, the node that cancelled it, > the reason for cancellation, etc. > {noformat} > 2018-06-19 > 00:00:10.653[ERROR][query-#93192%DPL_GRID%DplGridNodeName%][o.a.i.i.p.q.h.t.GridMapQueryExecutor] > Failed to execute local query. > org.apache.ignite.cache.query.QueryCancelledException: The query was > cancelled while executing. 
> at > org.apache.ignite.internal.processors.query.GridQueryCancel.set(GridQueryCancel.java:53) > at > org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQuery(IgniteH2Indexing.java:1115) > at > org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQueryWithTimer(IgniteH2Indexing.java:1207) > at > org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.executeSqlQueryWithTimer(IgniteH2Indexing.java:1185) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onQueryRequest0(GridMapQueryExecutor.java:683) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onQueryRequest(GridMapQueryExecutor.java:527) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onMessage(GridMapQueryExecutor.java:218) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor$2.onMessage(GridMapQueryExecutor.java:178) > at > org.apache.ignite.internal.managers.communication.GridIoManager$ArrayListener.onMessage(GridIoManager.java:2333) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) > at > org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) > at > org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 2018-06-19 > 00:00:11.629[ERROR][query-#93187%DPL_GRID%DplGridNodeName%][o.a.i.i.p.q.h.t.GridMapQueryExecutor] > Failed to execute local query. > org.apache.ignite.cache.query.QueryCancelledException: The query was > cancelled while executing. 
> at > org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onQueryRequest0(GridMapQueryExecutor.java:670) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onQueryRequest(GridMapQueryExecutor.java:527) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor.onMessage(GridMapQueryExecutor.java:218) > at > org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor$2.onMessage(GridMapQueryExecutor.java:178) > at > org.apache.ignite.internal.managers.communication.GridIoManager$ArrayListener.onMessage(GridIoManager.java:2333) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) > at > org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125) > at > org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1091) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8921) Add control.sh --cache affinity command to output current and ideal assignment and optionally show diff between them
[ https://issues.apache.org/jira/browse/IGNITE-8921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8921: -- Fix Version/s: (was: 2.9) 2.10 > Add control.sh --cache affinity command to output current and ideal > assignment and optionally show diff between them > > > Key: IGNITE-8921 > URL: https://issues.apache.org/jira/browse/IGNITE-8921 > Project: Ignite > Issue Type: Improvement >Reporter: Alexey Scherbakov >Priority: Major > Fix For: 2.10 > > > Will help debugging. > Ex: > control.sh --cache affinity current > control.sh --cache affinity ideal > control.sh --cache affinity diff -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8885) GridCacheDhtPreloadMultiThreadedSelfTest.testConcurrentNodesStartStop fails flakily
[ https://issues.apache.org/jira/browse/IGNITE-8885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8885: -- Fix Version/s: (was: 2.9) 2.10 > GridCacheDhtPreloadMultiThreadedSelfTest.testConcurrentNodesStartStop fails > flakily > --- > > Key: IGNITE-8885 > URL: https://issues.apache.org/jira/browse/IGNITE-8885 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.5 >Reporter: Andrey Kuznetsov >Priority: Major > Labels: MakeTeamcityGreenAgain > Fix For: 2.10 > > > Sometimes the following assertion failure can be observed (master branch). > {noformat} > [11:05:34] (err) Failed to notify listener: > o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$8$1$1...@e18f6e4java.lang.AssertionError > at > org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$13.applyx(CacheAffinitySharedManager.java:1335) > at > org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$13.applyx(CacheAffinitySharedManager.java:1326) > at > org.apache.ignite.internal.util.lang.IgniteInClosureX.apply(IgniteInClosureX.java:38) > at > org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.forAllCacheGroups(CacheAffinitySharedManager.java:1119) > at > org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.onLocalJoin(CacheAffinitySharedManager.java:1326) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processFullMessage(GridDhtPartitionsExchangeFuture.java:3281) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onBecomeCoordinator(GridDhtPartitionsExchangeFuture.java:3730) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.access$3300(GridDhtPartitionsExchangeFuture.java:130) > at > 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$8$1$1.apply(GridDhtPartitionsExchangeFuture.java:3626) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$8$1$1.apply(GridDhtPartitionsExchangeFuture.java:3615) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:495) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:474) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:451) > at > org.apache.ignite.internal.util.future.GridCompoundFuture.checkComplete(GridCompoundFuture.java:285) > at > org.apache.ignite.internal.util.future.GridCompoundFuture.apply(GridCompoundFuture.java:144) > at > org.apache.ignite.internal.util.future.GridCompoundFuture.apply(GridCompoundFuture.java:45) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:495) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:474) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:440) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.InitNewCoordinatorFuture.onMessage(InitNewCoordinatorFuture.java:254) > at > 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onReceiveSingleMessage(GridDhtPartitionsExchangeFuture.java:2115) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.processSinglePartitionUpdate(GridCachePartitionExchangeManager.java:1580) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.access$1000(GridCachePartitionExchangeManager.java:138) > at >
[jira] [Updated] (IGNITE-8880) Add setIgnite() in SpringCacheManager and SpringTransactionManager
[ https://issues.apache.org/jira/browse/IGNITE-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8880: -- Fix Version/s: (was: 2.9) 2.10 > Add setIgnite() in SpringCacheManager and SpringTransactionManager > -- > > Key: IGNITE-8880 > URL: https://issues.apache.org/jira/browse/IGNITE-8880 > Project: Ignite > Issue Type: Improvement > Components: spring >Reporter: Amir Akhmedov >Assignee: Amir Akhmedov >Priority: Major > Labels: newbie > Fix For: 2.10 > > > Need to add setIgnite() in SpringCacheManager and SpringTransactionManager to > allow explicit injection of an Ignite instance. > For more details refer: > https://issues.apache.org/jira/browse/IGNITE-8740?focusedCommentId=16520894=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16520894 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8848) Introduce new split-brain tests when topology is under load
[ https://issues.apache.org/jira/browse/IGNITE-8848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8848: -- Fix Version/s: (was: 2.9) 2.10 > Introduce new split-brain tests when topology is under load > --- > > Key: IGNITE-8848 > URL: https://issues.apache.org/jira/browse/IGNITE-8848 > Project: Ignite > Issue Type: Improvement > Components: cache, zookeeper >Affects Versions: 2.5 >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Major > Fix For: 2.10 > > > We should check following cases: > 1) Primary node of transaction located at a part of a cluster that will > survive, while backup doesn't. > 2) Backup node of transaction located at a part of a cluster that will > survive, while primary doesn't. > 3) A client has a connection to both split-brained parts. > 4) A client has a connection to only 1 part of a split cluster. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8878) Make WebConsole agent logging more verbose.
[ https://issues.apache.org/jira/browse/IGNITE-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8878: -- Fix Version/s: (was: 2.9) 2.10 > Make WebConsole agent logging more verbose. > --- > > Key: IGNITE-8878 > URL: https://issues.apache.org/jira/browse/IGNITE-8878 > Project: Ignite > Issue Type: Bug > Components: wizards >Reporter: Andrey Mashenkov >Priority: Major > Fix For: 2.10 > > > Currently it is impossible to debug SSL handshake failures in the console agent, > as it just logs a meaningless message with no cause or stacktrace and calls > System.exit(1). > > We should make logging more verbose, at least at debug level. > > This is the full log of the failure: > > [2018-06-26 15:01:54,060][INFO ][main][AgentLauncher] Connecting to: > [https://|https://10.44.38.53/] [x|http://10.44.38.53:8080/] > [.x.x.x|https://10.44.38.53/] > [2018-06-26 15:01:54,233][ERROR][EventThread][AgentLauncher] Failed to > establish SSL connection to server, due to errors with SSL handshake. > [2018-06-26 15:01:54,233][ERROR][EventThread][AgentLauncher] Add to > environment variable JVM_OPTS parameter "-Dtrust.all=true" to skip > certificate validation in case of using self-signed certificate. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8845) GridUnsafe.allocateMemory throws OutOfMemoryError which isn't handled
[ https://issues.apache.org/jira/browse/IGNITE-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8845: -- Fix Version/s: (was: 2.9) 2.10 > GridUnsafe.allocateMemory throws OutOfMemoryError which isn't handled > - > > Key: IGNITE-8845 > URL: https://issues.apache.org/jira/browse/IGNITE-8845 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 2.5 >Reporter: Mikhail Cherkasov >Priority: Major > Fix For: 2.10 > > Attachments: Main.java > > > If there is no more native memory, Unsafe.allocateMemory throws > java.lang.OutOfMemoryError. Errors are a type of throwable after which the > application normally cannot recover and must be closed and restarted. I think in > this case we can handle it and throw an Ignite OOM exception instead. > > Reproducer is attached, it throws the following exception: > > Exception in thread "main" java.lang.OutOfMemoryError > at sun.misc.Unsafe.allocateMemory(Native Method) > at > org.apache.ignite.internal.util.GridUnsafe.allocateMemory(GridUnsafe.java:1068) > at > org.apache.ignite.internal.mem.unsafe.UnsafeMemoryProvider.nextRegion(UnsafeMemoryProvider.java:80) > at > org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.addSegment(PageMemoryNoStoreImpl.java:612) > at > org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.allocatePage(PageMemoryNoStoreImpl.java:287) -- This message was sent by Atlassian Jira (v8.3.4#803005)
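The handling proposed above, catching the Error at the allocation site and rethrowing a recoverable exception, can be sketched as follows. The allocator is abstracted as a `LongSupplier` so the pattern is self-contained, and `NativeOomException` is an invented name, not the real Ignite exception type:

```java
import java.util.function.LongSupplier;

public class SafeAlloc {
    /** Checked exception the caller can recover from, unlike java.lang.OutOfMemoryError. */
    public static class NativeOomException extends Exception {
        public NativeOomException(String msg, Throwable cause) { super(msg, cause); }
    }

    /** Wraps an allocator call (e.g. one backed by Unsafe.allocateMemory) and translates the Error. */
    public static long allocate(LongSupplier allocator, long bytes) throws NativeOomException {
        try {
            return allocator.getAsLong();
        }
        catch (OutOfMemoryError e) {
            throw new NativeOomException("Failed to allocate " + bytes + " bytes of native memory", e);
        }
    }
}
```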
[jira] [Updated] (IGNITE-8844) Provide example how to implement auto-activation policy when cluster is activated first time
[ https://issues.apache.org/jira/browse/IGNITE-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8844: -- Fix Version/s: (was: 2.9) 2.10 > Provide example how to implement auto-activation policy when cluster is > activated first time > > > Key: IGNITE-8844 > URL: https://issues.apache.org/jira/browse/IGNITE-8844 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.4, 2.5 >Reporter: Pavel Kovalenko >Priority: Major > Labels: activation, cluster, usability > Fix For: 2.10 > > > Some of our users who embed Ignite face the problem of how to > activate the cluster for the first time, when no baseline has been established yet. > We should provide an example of such a policy, as we did with > BaselineWatcher. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8823) Incorrect transaction state in tx manager
[ https://issues.apache.org/jira/browse/IGNITE-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8823: -- Fix Version/s: (was: 2.9) 2.10 > Incorrect transaction state in tx manager > - > > Key: IGNITE-8823 > URL: https://issues.apache.org/jira/browse/IGNITE-8823 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.6 >Reporter: Andrey N. Gura >Priority: Major > Fix For: 2.10 > > Attachments: Ignite8823ReproducerTest.java > > > Reproducable by test method {{testCreateConsistencyMultithreaded}} in > {{IgfsPrimaryMultiNodeSelfTest}} and > {{IgfsPrimaryRelaxedConsistencyMultiNodeSelfTest}}: > {noformat} > 18:34:40,701][SEVERE][sys-stripe-0-#44%ignite%][GridCacheIoManager] Failed > processing message [senderId=e273c3f8-02ed-4201-9ac8-09f9ab6a1d31, > msg=GridNearTxPrepareResponse [pending=[], > futId=b4df8831461-9735f9d5-79a0-47a3-a951-e62a03af71ef, miniId=1, > dhtVer=GridCacheVersion [topVer=140816081, order=1529336085358, nodeOrder=3], > writeVer=GridCacheVersion [topVer=140816081, order=1529336085360, > nodeOrder=3], ownedVals=null, retVal=GridCacheReturn [v=null, cacheObj=null, > success=true, invokeRes=true, loc=true, cacheId=0], clientRemapVer=null, > super=GridDistributedTxPrepareResponse > [txState=IgniteTxImplicitSingleStateImpl [init=true, recovery=false], > part=-1, err=null, super=GridDistributedBaseMessage [ver=GridCacheVersion > [topVer=140816081, order=1529336085224, nodeOrder=1], committedVers=null, > rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=0] > java.lang.AssertionError: true instead of GridCacheReturnCompletableWrapper > at > org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager.removeTxReturn(IgniteTxManager.java:1098) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.ackBackup(GridNearTxFinishFuture.java:533) > at > 
org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.doFinish(GridNearTxFinishFuture.java:500) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:417) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$19.apply(GridNearTxLocal.java:3341) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal$19.apply(GridNearTxLocal.java:3335) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:495) > at > org.apache.ignite.internal.processors.cache.GridCacheCompoundFuture.onDone(GridCacheCompoundFuture.java:56) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:474) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onComplete(GridNearOptimisticTxPrepareFuture.java:310) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onDone(GridNearOptimisticTxPrepareFuture.java:288) > at > org.apache.ignite.internal.processors.cache.distributed.near.GridNearOptimisticTxPrepareFuture.onDone(GridNearOptimisticTxPrepareFuture.java:78) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:451) > at > org.apache.ignite.internal.util.future.GridCompoundFuture.checkComplete(GridCompoundFuture.java:285) > at > org.apache.ignite.internal.util.future.GridCompoundFuture.apply(GridCompoundFuture.java:144) > at > org.apache.ignite.internal.util.future.GridCompoundFuture.apply(GridCompoundFuture.java:45) > at > 
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:495) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:474) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:451) > at >
[jira] [Updated] (IGNITE-8828) Detecting and stopping unresponsive nodes during Partition Map Exchange
[ https://issues.apache.org/jira/browse/IGNITE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8828: -- Fix Version/s: 2.10 > Detecting and stopping unresponsive nodes during Partition Map Exchange > --- > > Key: IGNITE-8828 > URL: https://issues.apache.org/jira/browse/IGNITE-8828 > Project: Ignite > Issue Type: Improvement > Components: general >Reporter: Sergey Chugunov >Priority: Major > Labels: iep-25 > Fix For: 2.10 > > Original Estimate: 264h > Remaining Estimate: 264h > > During PME process coordinator (1) gathers local partition maps from all > nodes and (2) sends calculated full partition map back to all nodes in the > topology. > However if one or more nodes fail to send local information on step 1 for any > reason, PME process hangs blocking all operations. The only solution will be > to manually identify and stop nodes which failed to send info to coordinator. > This should be done by coordinator itself: in case it didn't receive in time > local partition maps from any nodes, it should check that stopping these > nodes won't lead to data loss and then stop them forcibly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8828) Detecting and stopping unresponsive nodes during Partition Map Exchange
[ https://issues.apache.org/jira/browse/IGNITE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8828: -- Fix Version/s: (was: 2.9) > Detecting and stopping unresponsive nodes during Partition Map Exchange > --- > > Key: IGNITE-8828 > URL: https://issues.apache.org/jira/browse/IGNITE-8828 > Project: Ignite > Issue Type: Improvement > Components: general >Reporter: Sergey Chugunov >Priority: Major > Labels: iep-25 > Original Estimate: 264h > Remaining Estimate: 264h > > During PME process coordinator (1) gathers local partition maps from all > nodes and (2) sends calculated full partition map back to all nodes in the > topology. > However if one or more nodes fail to send local information on step 1 for any > reason, PME process hangs blocking all operations. The only solution will be > to manually identify and stop nodes which failed to send info to coordinator. > This should be done by coordinator itself: in case it didn't receive in time > local partition maps from any nodes, it should check that stopping these > nodes won't lead to data loss and then stop them forcibly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8802) Need to fix Ignite INFO output
[ https://issues.apache.org/jira/browse/IGNITE-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8802: -- Fix Version/s: (was: 2.9) 2.10 > Need to fix Ignite INFO output > -- > > Key: IGNITE-8802 > URL: https://issues.apache.org/jira/browse/IGNITE-8802 > Project: Ignite > Issue Type: Improvement > Components: general >Reporter: Dmitriy Setrakyan >Assignee: Alexey Goncharuk >Priority: Major > Fix For: 2.10 > > > I have noticed that we output a lot of garbage, almost trace level > information into the log. Moreover, such information is logged every time a > topology changes. > > Here are examples: > > {quote}[22:32:06,330][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture] > Finished waiting for partition release future > [topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], waitTime=0ms, > futInfo=NA] > [22:32:06,624][INFO][grid-nio-worker-tcp-comm-0-#25][TcpCommunicationSpi] > Accepted incoming communication connection [locAddr=/127.0.0.1:48100, > rmtAddr=/127.0.0.1:62157] > [22:32:06,663][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture] > Finished waiting for partitions release latch: ServerLatch [permits=0, > pendingAcks=[], super=CompletableLatch [id=exchange, > topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]] > [22:32:06,664][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture] > Finished waiting for partition release future [topVer=AffinityTopologyVersion > [topVer=2, minorTopVer=0], waitTime=0ms, futInfo=NA] > [22:32:06,667][INFO][exchange-worker-#42][time] Finished exchange init > [topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], crd=true] > [22:32:06,676][INFO][sys-#46][GridDhtPartitionsExchangeFuture] Coordinator > received single message [ver=AffinityTopologyVersion [topVer=2, > minorTopVer=0], node=bf2a5abd-4a7c-4a89-b760-1b8c8021cff3, allReceived=true] > [22:32:06,694][INFO][sys-#46][GridDhtPartitionsExchangeFuture] Coordinator > received all 
messages, try merge [ver=AffinityTopologyVersion [topVer=2, > minorTopVer=0]] > [22:32:06,694][INFO][sys-#46][GridDhtPartitionsExchangeFuture] > finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=2, > minorTopVer=0], resVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]] > [22:32:06,703][INFO][sys-#46][GridDhtPartitionsExchangeFuture] Finish > exchange future [startVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], > resVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], err=null]{quote} > > The information above does not belong at INFO level. This is a debug level or > trace level output. I understand that it makes it easier to solve user > issues, but in this case we should create a separate log category and log > this stuff into a separate file. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8793) Introduce metrics for File I/O operations to monitor disk performance
[ https://issues.apache.org/jira/browse/IGNITE-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8793: -- Fix Version/s: (was: 2.9) 2.10 > Introduce metrics for File I/O operations to monitor disk performance > - > > Key: IGNITE-8793 > URL: https://issues.apache.org/jira/browse/IGNITE-8793 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.5 >Reporter: Pavel Kovalenko >Priority: Major > Fix For: 2.10 > > > It would be good to introduce some kind of wrapper for File I/O to measure > read/write times for a better understanding of what is happening with > persistence. Measurements should be exposed as JMX metrics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8785) Node may hang indefinitely in CONNECTING state during cluster segmentation
[ https://issues.apache.org/jira/browse/IGNITE-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8785: -- Fix Version/s: (was: 2.9) 2.10 > Node may hang indefinitely in CONNECTING state during cluster segmentation > -- > > Key: IGNITE-8785 > URL: https://issues.apache.org/jira/browse/IGNITE-8785 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.5 >Reporter: Pavel Kovalenko >Priority: Major > Fix For: 2.10 > > > Affected test: > org.apache.ignite.internal.processors.cache.IgniteTopologyValidatorGridSplitCacheTest#testTopologyValidatorWithCacheGroup > Node hangs with following stacktrace: > {noformat} > "grid-starter-testTopologyValidatorWithCacheGroup-22" #117619 prio=5 > os_prio=0 tid=0x7f17dd19b800 nid=0x304a in Object.wait() > [0x7f16b19df000] >java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:931) > - locked <0x000705ee4a60> (a java.lang.Object) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:373) > at > org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:1948) > at > org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:915) > at > org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1739) > at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1046) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723) > - locked <0x000705995ec0> (a > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance) > at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151) > at 
org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:649) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:882) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:845) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:833) > at > org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:799) > at > org.apache.ignite.testframework.junits.GridAbstractTest$3.call(GridAbstractTest.java:742) > at > org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86) > {noformat} > It seems that the node never receives an acknowledgment from the coordinator. > There were some failures before: > {noformat} > [org.apache.ignite:ignite-core] [2018-06-10 04:59:18,876][WARN > ][grid-starter-testTopologyValidatorWithCacheGroup-22][IgniteCacheTopologySplitAbstractTest$SplitTcpDiscoverySpi] > Node has not been connected to topology and will repeat join process. Check > remote nodes logs for possible error messages. Note that large topology may > require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' > configuration property if getting this message on the starting nodes > [networkTimeout=5000] > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8784) Deadlock during simultaneous client reconnect and node stop
[ https://issues.apache.org/jira/browse/IGNITE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8784: -- Fix Version/s: (was: 2.9) 2.10 > Deadlock during simultaneous client reconnect and node stop > --- > > Key: IGNITE-8784 > URL: https://issues.apache.org/jira/browse/IGNITE-8784 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.5 >Reporter: Pavel Kovalenko >Priority: Critical > Fix For: 2.10 > > > {noformat} > [18:48:22,665][ERROR][tcp-client-disco-msg-worker-#467%client%][IgniteKernal%client] > Failed to reconnect, will stop node > class org.apache.ignite.IgniteException: Failed to wait for local node joined > event (grid is stopping). > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.localJoin(GridDiscoveryManager.java:2193) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.onKernalStart(GridCachePartitionExchangeManager.java:583) > at > org.apache.ignite.internal.processors.cache.GridCacheSharedContext.onReconnected(GridCacheSharedContext.java:396) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.onReconnected(GridCacheProcessor.java:1159) > at > org.apache.ignite.internal.IgniteKernal.onReconnected(IgniteKernal.java:3915) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:830) > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery(GridDiscoveryManager.java:589) > at > org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.notifyDiscovery(ClientImpl.java:2423) > at > org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.notifyDiscovery(ClientImpl.java:2402) > at > org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.processNodeAddFinishedMessage(ClientImpl.java:2047) > at > org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.processDiscoveryMessage(ClientImpl.java:1896) > at 
> org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1788) > at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) > Caused by: class org.apache.ignite.IgniteCheckedException: Failed to wait for > local node joined event (grid is stopping). > at > org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.onKernalStop0(GridDiscoveryManager.java:1657) > at > org.apache.ignite.internal.managers.GridManagerAdapter.onKernalStop(GridManagerAdapter.java:652) > at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2218) > at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2166) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2588) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2551) > at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:372) > at org.apache.ignite.Ignition.stop(Ignition.java:229) > at > org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1088) > at > org.apache.ignite.testframework.junits.GridAbstractTest.stopAllGrids(GridAbstractTest.java:1128) > at > org.apache.ignite.testframework.junits.GridAbstractTest.stopAllGrids(GridAbstractTest.java:1109) > at > org.gridgain.grid.internal.processors.cache.database.IgniteDbSnapshotNotStableTopologiesTest.afterTest(IgniteDbSnapshotNotStableTopologiesTest.java:250) > at > org.apache.ignite.testframework.junits.GridAbstractTest.tearDown(GridAbstractTest.java:1694) > at > org.apache.ignite.testframework.junits.common.GridCommonAbstractTest.tearDown(GridCommonAbstractTest.java:492) > at junit.framework.TestCase.runBare(TestCase.java:146) > at junit.framework.TestResult$1.protect(TestResult.java:122) > at junit.framework.TestResult.runProtected(TestResult.java:142) > at junit.framework.TestResult.run(TestResult.java:125) > at junit.framework.TestCase.run(TestCase.java:129) > at 
junit.framework.TestSuite.runTest(TestSuite.java:255) > at junit.framework.TestSuite.run(TestSuite.java:250) > at junit.framework.TestSuite.runTest(TestSuite.java:255) > at junit.framework.TestSuite.run(TestSuite.java:250) > at > org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275) > at >
[jira] [Updated] (IGNITE-8750) IgniteWalFlushDefaultSelfTest.testFailAfterStart fails on TC
[ https://issues.apache.org/jira/browse/IGNITE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8750: -- Fix Version/s: (was: 2.9) 2.10 > IgniteWalFlushDefaultSelfTest.testFailAfterStart fails on TC > > > Key: IGNITE-8750 > URL: https://issues.apache.org/jira/browse/IGNITE-8750 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.5 >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Major > Labels: MakeTeamcityGreenAgain > Fix For: 2.10 > > > {noformat} > org.apache.ignite.IgniteException: Failed to get object field > [obj=GridCacheSharedManagerAdapter [starting=true, stop=false], > fieldNames=[mmap]] > Caused by: java.lang.NoSuchFieldException: mmap > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8759) TcpDiscoverySpi: detailed info about current state in MBean
[ https://issues.apache.org/jira/browse/IGNITE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8759: -- Fix Version/s: (was: 2.9) 2.10 > TcpDiscoverySpi: detailed info about current state in MBean > --- > > Key: IGNITE-8759 > URL: https://issues.apache.org/jira/browse/IGNITE-8759 > Project: Ignite > Issue Type: Improvement > Components: general >Reporter: Sergey Chugunov >Priority: Major > Labels: discovery > Fix For: 2.10 > > > When TcpDiscoverySpi is used, all nodes keep information about the current > topology locally. The discovery protocol is responsible for keeping this > information consistent across all nodes. > However, in situations of network glitches some nodes may have different > pictures of the current state and topology of the cluster. > To make it easier to monitor the state of the whole cluster and identify such > nodes quicker, the following information should be presented via an MBean > interface on each node: > * Current topology version; > * Hash of the current topology (e.g. the sum of the hash codes of all nodeIds) (to allow > quick matching); > * Pretty-formatted snapshot of the current topology visible from the node; > * Information about nodes visible/invisible to the local one; > * Other information useful for monitoring the topology state. -- This message was sent by Atlassian Jira (v8.3.4#803005)
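The "hash of current topology" item proposed above (a sum of the hash codes of all node IDs) could be sketched as follows. The class and method names here are illustrative assumptions for discussion, not the actual TcpDiscoverySpi MBean API:

```java
import java.util.Collection;
import java.util.List;
import java.util.UUID;

public class TopologyHashSketch {
    /**
     * Order-independent topology hash: the sum of the hash codes of all
     * node IDs, as suggested in the issue. Two nodes that see the same
     * set of nodes compute the same value, whatever the iteration order.
     */
    static int topologyHash(Collection<UUID> nodeIds) {
        int sum = 0;
        for (UUID id : nodeIds)
            sum += id.hashCode();
        return sum;
    }

    public static void main(String[] args) {
        UUID a = UUID.fromString("a304cc77-4e87-4ce2-b494-a989bf3ebf48");
        UUID b = UUID.fromString("9b6e82d8-5428-41aa-a108-47cc665e6094");

        // Same node set, different order: the hashes match, so a mismatch
        // between two nodes would point at inconsistent topology views.
        assert topologyHash(List.of(a, b)) == topologyHash(List.of(b, a));
        System.out.println("topologyHash=" + topologyHash(List.of(a, b)));
    }
}
```

Addition is commutative, so nodes need not agree on iteration order; the trade-off is that hash-code sums can collide, which is acceptable for a quick monitoring comparison.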
[jira] [Updated] (IGNITE-8728) "IllegalStateException: Duplicate Key" on node join
[ https://issues.apache.org/jira/browse/IGNITE-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8728: -- Fix Version/s: (was: 2.9) 2.10 > "IllegalStateException: Duplicate Key" on node join > --- > > Key: IGNITE-8728 > URL: https://issues.apache.org/jira/browse/IGNITE-8728 > Project: Ignite > Issue Type: Bug > Components: sql >Affects Versions: 2.7 >Reporter: Mahesh Renduchintala >Priority: Critical > Fix For: 2.10 > > Attachments: NS1_ignite-9676df15.0.log, NS2_ignite-7cfc8008.0.log, > node-config.xml > > Time Spent: 0.5h > Remaining Estimate: 0h > > I have two nodes on which we have 3 tables which are partitioned. Index are > also built on these tables. > For 24 hours caches work fine. The tables are definitely distributed across > both the nodes > Node 2 reboots due to some issue - goes out of the baseline - comes back and > joins the baseline. Other baseline nodes crash and in the logs we see > duplicate Key error > [10:38:35,437][INFO][tcp-disco-srvr-#2|#2][TcpDiscoverySpi] TCP discovery > accepted incoming connection [rmtAddr=/192.168.1.7, rmtPort=45102] > [10:38:35,437][INFO][tcp-disco-srvr-#2|#2][TcpDiscoverySpi] TCP discovery > spawning a new thread for connection [rmtAddr=/192.168.1.7, rmtPort=45102] > [10:38:35,437][INFO][tcp-disco-sock-reader-#12|#12][TcpDiscoverySpi] Started > serving remote node connection [rmtAddr=/192.168.1.7:45102, rmtPort=45102] > [10:38:35,451][INFO][tcp-disco-sock-reader-#12|#12][TcpDiscoverySpi] > Finished serving remote node connection [rmtAddr=/192.168.1.7:45102, > rmtPort=45102 > [10:38:35,457][SEVERE][tcp-disco-msg-worker-#3|#3][TcpDiscoverySpi] > TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node > in order to prevent cluster wide instability. 
> *java.lang.IllegalStateException: Duplicate key* > at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:223) > at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:174) > at > org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114) > at > org.apache.ignite.internal.processors.cache.DynamicCacheDescriptor.makeSchemaPatch(DynamicCacheDescriptor.java:360) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.validateNode(GridCacheProcessor.java:2536) > at > org.apache.ignite.internal.managers.GridManagerAdapter$1.validateNode(GridManagerAdapter.java:566) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processJoinRequestMessage(ServerImpl.java:3629) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2736) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621) > at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) > [10:38:35,459][SEVERE][tcp-disco-msg-worker-#3|#3][] Critical system error > detected. 
Will be handled accordingly to configured handler [hnd=class > o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext > [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: > Duplicate key]] > java.lang.IllegalStateException: Duplicate key > at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:223) > at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:174) > at > org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114) > at > org.apache.ignite.internal.processors.cache.DynamicCacheDescriptor.makeSchemaPatch(DynamicCacheDescriptor.java:360) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.validateNode(GridCacheProcessor.java:2536) > at > org.apache.ignite.internal.managers.GridManagerAdapter$1.validateNode(GridManagerAdapter.java:566) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processJoinRequestMessage(ServerImpl.java:3629) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2736) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621) > at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) > [10:38:35,460][SEVERE][tcp-disco-msg-worker-#3|#3][] JVM will be halted > immediately due to the failure:
[jira] [Updated] (IGNITE-8723) Detect index corruption at runtime
[ https://issues.apache.org/jira/browse/IGNITE-8723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8723: -- Fix Version/s: (was: 2.9) 2.10 > Detect index corruption at runtime > -- > > Key: IGNITE-8723 > URL: https://issues.apache.org/jira/browse/IGNITE-8723 > Project: Ignite > Issue Type: Improvement >Reporter: Alexey Goncharuk >Priority: Major > Fix For: 2.10 > > > Several times, Ignite users have faced a situation where SQL indexes were corrupted > (linked to invalid or removed data). > I think it makes sense to slightly rework the handling of such errors during index > dereference: > 1) Fail all ongoing and further SQL queries assigned to this node. The SQL mapper > should cache this and try to re-route SQL requests to other OWNING nodes. > 2) After index corruption is detected, all indexes or only the corrupted ones should > be deallocated (the decision depends on what is faster) and rebuilt. > 3) After the indexes are rebuilt, we should notify other nodes that the node is > available for queries again. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-13176) C++: Remove autotools build after merging CMake
[ https://issues.apache.org/jira/browse/IGNITE-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146558#comment-17146558 ] Ignite TC Bot commented on IGNITE-13176: {panel:title=Branch: [pull/7956/head] Base: [master] : Possible Blockers (3)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1} {color:#d04437}Platform C++ (Win x64 / Debug){color} [[tests 0 BuildFailureOnMessage |https://ci.ignite.apache.org/viewLog.html?buildId=5419522]] {color:#d04437}Platform C++ (Linux Clang){color} [[tests 0 Exit Code |https://ci.ignite.apache.org/viewLog.html?buildId=5419520]] {color:#d04437}Platform C++ (Linux)*{color} [[tests 0 Exit Code |https://ci.ignite.apache.org/viewLog.html?buildId=5419521]] {panel} {panel:title=Branch: [pull/7956/head] Base: [master] : New Tests (876)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1} {color:#8b}Platform C++ (Win x64 / Debug){color} [tests 876] * {color:#013220}IgniteOdbcTest: SqlNumericFunctionTestSuite: TestNumericFunctionFloor - PASSED{color} * {color:#013220}IgniteOdbcTest: SqlNumericFunctionTestSuite: TestNumericFunctionLog - PASSED{color} * {color:#013220}IgniteOdbcTest: SqlDateTimeFunctionTestSuite: TestCurrentDate - PASSED{color} * {color:#013220}IgniteCoreTest: CacheQueryTestSuite: TestFieldsQueryByteArrayInsertSelect - PASSED{color} * {color:#013220}IgniteCoreTest: ContinuousQueryTestSuite: TestBasic - PASSED{color} * {color:#013220}IgniteCoreTest: ContinuousQueryTestSuite: TestInitialQueryScan - PASSED{color} * {color:#013220}IgniteOdbcTest: ApiRobustnessTestSuite: TestSQLGetStmtAttr - PASSED{color} * {color:#013220}IgniteOdbcTest: ApplicationDataBufferTestSuite: TestPutStringToLong - PASSED{color} * {color:#013220}IgniteCoreTest: ContinuousQueryTestSuite: TestInitialQuerySql - PASSED{color} * {color:#013220}IgniteOdbcTest: ApplicationDataBufferTestSuite: TestPutStringToTiny - PASSED{color} * {color:#013220}IgniteCoreTest: ContinuousQueryTestSuite: TestInitialQueryText - PASSED{color} 
... and 865 tests blockers {panel} [TeamCity *- Run :: CPP* Results|https://ci.ignite.apache.org/viewLog.html?buildId=5419529buildTypeId=IgniteTests24Java8_RunCpp] > C++: Remove autotools build after merging CMake > --- > > Key: IGNITE-13176 > URL: https://issues.apache.org/jira/browse/IGNITE-13176 > Project: Ignite > Issue Type: Improvement > Components: platforms >Reporter: Ivan Daschinskiy >Assignee: Ivan Daschinskiy >Priority: Trivial > Fix For: 2.9 > > Time Spent: 10m > Remaining Estimate: 0h > > Old autotools build scripts and release steps in {{pom.xml}} should be > removed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (IGNITE-7499) DataRegionMetricsImpl#getPageSize returns ZERO for system data regions
[ https://issues.apache.org/jira/browse/IGNITE-7499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Kuznetsov reassigned IGNITE-7499: Assignee: (was: Andrey Kuznetsov) > DataRegionMetricsImpl#getPageSize returns ZERO for system data regions > -- > > Key: IGNITE-7499 > URL: https://issues.apache.org/jira/browse/IGNITE-7499 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Alexey Kuznetsov >Priority: Major > Fix For: 2.10 > > > While working on IGNITE-7492, I found that DataRegionMetricsImpl#getPageSize returns > ZERO for system data regions. > Meanwhile, there is also the > org.apache.ignite.internal.pagemem.PageMemory#systemPageSize method. > That looks a bit strange: why do we need both PageSize and SystemPageSize? > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8635) Add a method to inspect BinaryObject size
[ https://issues.apache.org/jira/browse/IGNITE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8635: -- Fix Version/s: (was: 2.9) 2.10 > Add a method to inspect BinaryObject size > - > > Key: IGNITE-8635 > URL: https://issues.apache.org/jira/browse/IGNITE-8635 > Project: Ignite > Issue Type: Improvement >Reporter: Alexey Goncharuk >Priority: Major > Fix For: 2.10 > > > Currently, only concrete implementations of the {{BinaryObject}} interface provide > some information regarding the object's serialized size. This makes it hard for > users to reason about storage size and estimate required storage capacity. > We need to add a method to the {{BinaryObject}} interface itself that returns > the actual size required to store the object. -- This message was sent by Atlassian Jira (v8.3.4#803005)
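Until a {{BinaryObject}} size method exists, one rough stand-in is to measure the byte count a serializer produces. The sketch below uses plain Java serialization purely for illustration — Ignite's binary format is more compact, so numbers obtained this way are only a loose upper-bound estimate, and the class name is a made-up example:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializedSizeEstimate {
    /** Rough size estimate: bytes produced by plain Java serialization. */
    static int serializedSize(Serializable obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.size();
    }

    public static void main(String[] args) {
        int small = serializedSize("a");
        int large = serializedSize("a".repeat(1000));

        // A bigger payload serializes to more bytes; the absolute values
        // include Java-serialization overhead absent from Ignite's format.
        assert large > small;
        System.out.println("small=" + small + " bytes, large=" + large + " bytes");
    }
}
```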
[jira] [Updated] (IGNITE-8646) Setting different MAC addresses to nodes in test environment causes mass test fail
[ https://issues.apache.org/jira/browse/IGNITE-8646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8646: -- Fix Version/s: (was: 2.9) 2.10 > Setting different MAC addresses to nodes in test environment causes mass test > fail > -- > > Key: IGNITE-8646 > URL: https://issues.apache.org/jira/browse/IGNITE-8646 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.5 >Reporter: Ivan Rakov >Priority: Major > Fix For: 2.10 > > > There are some parts of the logic in Ignite that check whether two nodes are > actually hosted on the same physical machine (e.g. the excludeNeighbors flag in > the affinity function, load balancing for replicated caches, etc.) and choose the > appropriate behavior. These parts can be tracked by usages of the > IgniteNodeAttributes#ATTR_MACS attribute. > I've tried to emulate a distributed environment in tests by overriding > ATTR_MACS with a random UUID. This caused mass consistency failures in basic > and Full API tests. We should investigate this: probably, many bugs are > hidden by the fact that nodes are always started on the same physical machine > in our TeamCity tests. > PR with macs override: https://github.com/apache/ignite/pull/4084 > TC run: > https://ci.ignite.apache.org/viewLog.html?buildId=1342076=buildResultsDiv=IgniteTests24Java8_RunAll -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8652) Cache dynamically started from client while there are no affinity server nodes will be forever considered in-memory
[ https://issues.apache.org/jira/browse/IGNITE-8652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8652: -- Fix Version/s: (was: 2.9) 2.10 > Cache dynamically started from client while there are no affinity server > nodes will be forever considered in-memory > --- > > Key: IGNITE-8652 > URL: https://issues.apache.org/jira/browse/IGNITE-8652 > Project: Ignite > Issue Type: Bug >Reporter: Ivan Rakov >Priority: Major > Labels: IEP-4, Phase-2 > Fix For: 2.10 > > > We implemented stealing the data storage configuration from an affinity server node > during initialization of a dynamic cache on a client (IGNITE-8476). However, if > there are no affinity nodes at the moment of cache start, the client will forever > consider the cache in-memory, even when an affinity node with the proper data storage > configuration (telling that it's actually a persistent cache) appears later. > That means cache operations on the client may fail with the same error: > {noformat} > java.lang.AssertionError: Wrong ready topology version for invalid partitions > response > {noformat} > ClientAffinityAssignmentWithBaselineTest#testDynamicCacheStartNoAffinityNodes > should pass after the fix. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8572) StandaloneWalRecordsIterator may throw NPE if compressed WAL segment is empty
[ https://issues.apache.org/jira/browse/IGNITE-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8572: -- Fix Version/s: (was: 2.9) 2.10 > StandaloneWalRecordsIterator may throw NPE if compressed WAL segment is empty > - > > Key: IGNITE-8572 > URL: https://issues.apache.org/jira/browse/IGNITE-8572 > Project: Ignite > Issue Type: Bug > Components: persistence >Reporter: Ivan Rakov >Assignee: Ivan Rakov >Priority: Major > Fix For: 2.10 > > > In case a ZIP archive with a WAL segment doesn't contain any ZIP entries, an attempt > to iterate through it with the standalone WAL iterator will throw an NPE: > {noformat} > Caused by: java.lang.NullPointerException > at > org.apache.ignite.internal.processors.cache.persistence.file.UnzipFileIO.(UnzipFileIO.java:53) > at > org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.initReadHandle(AbstractWalRecordsIterator.java:265) > at > org.apache.ignite.internal.processors.cache.persistence.wal.reader.StandaloneWalRecordsIterator.advanceSegment(StandaloneWalRecordsIterator.java:262) > at > org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advance(AbstractWalRecordsIterator.java:155) > at > org.apache.ignite.internal.processors.cache.persistence.wal.reader.StandaloneWalRecordsIterator.(StandaloneWalRecordsIterator.java:111) > at > org.apache.ignite.internal.processors.cache.persistence.wal.reader.IgniteWalIteratorFactory.iteratorArchiveDirectory(IgniteWalIteratorFactory.java:156) > ... 6 more > {noformat} > We should throw an exception with a descriptive error message instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
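The fix suggested in the issue above — a descriptive exception instead of the NPE — amounts to checking for a missing ZIP entry up front. This is an illustrative sketch, not the actual UnzipFileIO code; the method names and message text are assumptions:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class EmptyZipGuard {
    /** A minimal valid ZIP archive with zero entries: only the end-of-central-directory record. */
    static final byte[] EMPTY_ZIP = new byte[22];
    static {
        EMPTY_ZIP[0] = 0x50; EMPTY_ZIP[1] = 0x4b; EMPTY_ZIP[2] = 0x05; EMPTY_ZIP[3] = 0x06;
    }

    /** True when the archive has no entries (getNextEntry() returns null immediately). */
    static boolean isEmptyArchive(byte[] zipBytes) {
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            return in.getNextEntry() == null;
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the first entry, failing with a descriptive message instead of a later NPE. */
    static ZipEntry firstEntry(ZipInputStream in, String segmentName) throws IOException {
        ZipEntry entry = in.getNextEntry(); // null for an archive with no entries
        if (entry == null)
            throw new IOException("Failed to read WAL segment (ZIP archive is empty): " + segmentName);
        return entry;
    }

    public static void main(String[] args) throws IOException {
        assert isEmptyArchive(EMPTY_ZIP);

        // Instead of an opaque NullPointerException, the caller now sees
        // which segment file was empty.
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(EMPTY_ZIP))) {
            firstEntry(in, "wal-0001.wal.zip");
        }
        catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```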
[jira] [Updated] (IGNITE-8500) [Cassandra] Allow dynamic cache creation with Cassandra Store
[ https://issues.apache.org/jira/browse/IGNITE-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8500: -- Fix Version/s: (was: 2.9) 2.10 > [Cassandra] Allow dynamic cache creation with Cassandra Store > - > > Key: IGNITE-8500 > URL: https://issues.apache.org/jira/browse/IGNITE-8500 > Project: Ignite > Issue Type: Improvement > Components: cassandra >Affects Versions: 2.5 >Reporter: Dmitry Karachentsev >Priority: Major > Fix For: 2.10 > > > It is currently not possible to dynamically create caches with a Cassandra cache > store: > {noformat} > class org.apache.ignite.IgniteCheckedException: null > at > org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7307) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.resolve(GridFutureAdapter.java:259) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:207) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:159) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:151) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2433) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2299) > at > org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NullPointerException > at > org.apache.ignite.cache.store.cassandra.persistence.PojoField.calculatedField(PojoField.java:155) > at > org.apache.ignite.cache.store.cassandra.persistence.PersistenceController.prepareLoadStatements(PersistenceController.java:313) > at > org.apache.ignite.cache.store.cassandra.persistence.PersistenceController.(PersistenceController.java:85) > at >
org.apache.ignite.cache.store.cassandra.CassandraCacheStore.(CassandraCacheStore.java:106) > at > org.apache.ignite.cache.store.cassandra.CassandraCacheStoreFactory.create(CassandraCacheStoreFactory.java:59) > at > org.apache.ignite.cache.store.cassandra.CassandraCacheStoreFactory.create(CassandraCacheStoreFactory.java:34) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.createCache(GridCacheProcessor.java:1437) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStart(GridCacheProcessor.java:1945) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.startReceivedCaches(GridCacheProcessor.java:1864) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:674) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2419) > ... 3 more > {noformat} > PR with test https://github.com/apache/ignite/pull/4003 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8621) TcpDiscoverySpi's UpTime/StartTimestamp methods do not work
[ https://issues.apache.org/jira/browse/IGNITE-8621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8621: -- Fix Version/s: (was: 2.9) 2.10 > TcpDiscoverySpi's UpTime/StartTimestamp methods do not work > --- > > Key: IGNITE-8621 > URL: https://issues.apache.org/jira/browse/IGNITE-8621 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.5 >Reporter: Max Shonichev >Priority: Major > Labels: jmx > Fix For: 2.10 > > Attachments: Screenshot from 2018-05-28 12-57-13.png > > > The getUpTime and getStartTimestamp methods of TcpDiscoverySpiMBeanImpl return > 0, which does not look like a normal value. > Also, getUpTimeFormatted returns 00:00:00.000 and getStartTimestampFormatted > returns Jan 1, 1970. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8596) SQL: remove unnecessary index lookups when query parallelism is enabled
[ https://issues.apache.org/jira/browse/IGNITE-8596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8596: -- Fix Version/s: (was: 2.9) 2.10 > SQL: remove unnecessary index lookups when query parallelism is enabled > --- > > Key: IGNITE-8596 > URL: https://issues.apache.org/jira/browse/IGNITE-8596 > Project: Ignite > Issue Type: Improvement > Components: sql >Affects Versions: 2.5 >Reporter: Vladimir Ozerov >Priority: Major > Labels: performance > Fix For: 2.10 > > > See > {{org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor#onQueryRequest}} > method. If a table is segmented, we submit as many SQL requests as there are > segments. But consider a case when the target cache partition(s) are already > defined by the user or derived through partition pruning. In this case most > segments will not contain useful information and will return an empty result set. At > the same time these queries may impose index or data page scans, thus > consuming resources without reason. > To mitigate the problem, we should not submit SQL requests to segments we are > not interested in. > Note that it is not sufficient to simply skip SQL requests on the mapper, because > the reducer expects a separate response for every message. We should fix both the local > mapper logic and the protocol. -- This message was sent by Atlassian Jira (v8.3.4#803005)
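The segment-skipping idea above can be sketched as a small helper that maps the pruned partitions to the index segments that could actually hold them. This assumes the common partition-to-segment mapping `partition % segmentCount`; it illustrates the concept only and is not Ignite's actual internal code:

```java
import java.util.Set;
import java.util.TreeSet;

public class SegmentPruningSketch {
    /**
     * Segments that can contain any of the given partitions, assuming each
     * partition maps to segment (partition % segmentCount). Only these
     * segments need to receive a map-side SQL request.
     */
    static Set<Integer> relevantSegments(int[] partitions, int segmentCount) {
        Set<Integer> segments = new TreeSet<>();
        for (int p : partitions)
            segments.add(p % segmentCount);
        return segments;
    }

    public static void main(String[] args) {
        // With 4 index segments, partitions 5 and 9 both map to segment 1,
        // so three of the four per-segment queries could be skipped.
        Set<Integer> segments = relevantSegments(new int[] {5, 9}, 4);
        assert segments.equals(Set.of(1));
        System.out.println("segments to query: " + segments);
    }
}
```

As the issue notes, skipping requests on the mapper alone is not enough — the reducer would still wait for a response per segment, so the protocol has to learn about skipped segments too.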
[jira] [Updated] (IGNITE-8481) VisorValidateIndexesJob works very slowly in case of many partitions/keys for each partition.
[ https://issues.apache.org/jira/browse/IGNITE-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8481: -- Fix Version/s: (was: 2.9) 2.10 > VisorValidateIndexesJob works very slowly in case of many partitions/keys for > each partition. > - > > Key: IGNITE-8481 > URL: https://issues.apache.org/jira/browse/IGNITE-8481 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.5 >Reporter: Alexey Scherbakov >Priority: Major > Fix For: 2.10 > > Attachments: ignite.zip, thrdump-server.log > > > I tried to validate indexes using the newly introduced VisorValidateIndexesTask > from control.sh and found that on a large data set it works very slowly: the > process had not finished 12 hours after start. > Looking through a thread dump I've noticed the following problems: > 1. ValidateIndexesClosure works in a suboptimal way, doing a B-tree lookup for > each index for each entry of each partition. It should be faster to validate > by scanning the index tree. > 2. The thread dump shows contention on acquiring the segment read lock by worker > pool-XXX threads, but there is no obvious reason for holding the write lock (no load on the > grid). > 3. > org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.Segment#partGeneration > generates garbage on each page access. > Check the attachments for the log and thread dump. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8496) Check necessity of JDK check in SqlLine script
[ https://issues.apache.org/jira/browse/IGNITE-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8496: -- Fix Version/s: (was: 2.9) 2.10 > Check necessity of JDK check in SqlLine script > -- > > Key: IGNITE-8496 > URL: https://issues.apache.org/jira/browse/IGNITE-8496 > Project: Ignite > Issue Type: Task >Affects Versions: 2.5 >Reporter: Peter Ivanov >Assignee: Peter Ivanov >Priority: Major > Fix For: 2.10 > > > Investigate the necessity of the JDK checks in the SqlLine scripts (both sh and cmd). > Currently, they check that Java has version 1.8 or 9. With the current Java release > cycle, regularly updating this check can be excessive. -- This message was sent by Atlassian Jira (v8.3.4#803005)
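For reference, a less version-pinned variant of such a check could derive the Java major version and compare it numerically instead of enumerating versions. A sketch of the idea, not the actual sqlline.sh contents:

```shell
#!/bin/sh
# Hypothetical, less version-pinned check: derive the major version and
# require >= 8, instead of testing for exactly "1.8 or 9".

# Maps a version string to its major version: "1.8.0_171" -> 8; "11.0.2" -> 11.
parse_major() {
    echo "$1" | awk -F. '{ if ($1 == 1) print $2; else print $1 }'
}

if command -v java >/dev/null 2>&1; then
    version=$(java -version 2>&1 | awk -F '"' '/version/ {print $2}')
    major=$(parse_major "$version")
    if [ "${major:-0}" -lt 8 ]; then
        echo "Java 8 or newer is required (found: $version)" >&2
        exit 1
    fi
fi
```

This avoids touching the script for every new Java release, which is the excessive maintenance the ticket questions.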
[jira] [Commented] (IGNITE-8411) Binary Client Protocol spec: other parts clarifications
[ https://issues.apache.org/jira/browse/IGNITE-8411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146430#comment-17146430 ] Aleksey Plekhanov commented on IGNITE-8411: --- [~isapego], Do we have a chance to include this ticket in the 2.9 release? > Binary Client Protocol spec: other parts clarifications > --- > > Key: IGNITE-8411 > URL: https://issues.apache.org/jira/browse/IGNITE-8411 > Project: Ignite > Issue Type: Improvement > Components: documentation, thin client >Affects Versions: 2.4 >Reporter: Alexey Kosenchuk >Assignee: Igor Sapego >Priority: Major > Fix For: 2.9 > > > Issues against previous parts: IGNITE-8039 IGNITE-8212 > Cache Configuration > --- > > [https://apacheignite.readme.io/docs/binary-client-protocol-cache-configuration-operations] > - OP_CACHE_GET_CONFIGURATION and OP_CACHE_CREATE_WITH_CONFIGURATION - > QueryEntity - Structure of QueryField: > Absent "default value - type Object" - it is the last field of the > QueryField in reality. > - OP_CACHE_GET_CONFIGURATION - Structure of Cache Configuration: > Absent CacheAtomicityMode - is the first field in reality. > Absent MaxConcurrentAsyncOperations - is between DefaultLockTimeout and > MaxQueryIterators in reality. > "Invalidate" field - does not exist in reality. > - The meaning and possible values of every configuration parameter must be > clarified. If clarified in other docs, this spec must have link(s) to those > docs. > - Suggest combining the Cache Configuration descriptions in > OP_CACHE_GET_CONFIGURATION and OP_CACHE_CREATE_WITH_CONFIGURATION somehow - to avoid > duplicated descriptions. > SQL and Scan Queries > > [https://apacheignite.readme.io/docs/binary-client-protocol-sql-operations] > - "Flag. Pass 0 for default, or 1 to keep the value in binary form.": > the "value in binary form" flag should be kept and clarified in the > operations to which it is applicable. > - OP_QUERY_SQL: > most of the fields in the request must be clarified. 
If clarified in other > docs, this spec must have link(s) to those docs. > For example: > ** "Name of a type or SQL table": name of what type? > - OP_QUERY_SQL_FIELDS: > most of the fields in the request must be clarified. If clarified in other > docs, this spec must have link(s) to those docs. > For example: > ** is there any correlation between "Query cursor page size" and "Max rows"? > ** "Statement type": why are there only three types? What about INSERT, etc.? > - OP_QUERY_SQL_FIELDS_CURSOR_GET_PAGE Response does not contain Cursor id. > But responses for all other query operations contain it. Is it intentional? > - OP_QUERY_SCAN_CURSOR_GET_PAGE Response - Cursor id is absent in reality. > - OP_QUERY_SCAN_CURSOR_GET_PAGE Response - Row count field: says type > "long". Should be "int". > - OP_QUERY_SCAN: > the format and rules of the Filter object must be clarified. If clarified in > other docs, this spec must have link(s) to those docs. > - OP_QUERY_SCAN: > in general, it's not clear how this operation should be supported on > platforms other than the one mentioned in the "Filter platform" field. > - OP_QUERY_SCAN: "Number of partitions to query" > should be updated to "A partition number to query". > > Binary Types > > > [https://apacheignite.readme.io/docs/binary-client-protocol-binary-type-operations] > - It should be explained somewhere when and why these operations need to be > supported by a client. > - Type id and Field id: > it should be clarified that before an Id calculation, Type and Field names must > be converted to lower case. > - OP_GET_BINARY_TYPE and OP_PUT_BINARY_TYPE - BinaryField - Type id: > in reality it is not a type id (hash code) but a type code (1, 2,... 10,... > 103,...). > - OP_GET_BINARY_TYPE and OP_PUT_BINARY_TYPE - "Affinity key field name": > it should be explained what it is. If explained in other docs, this spec must > have link(s) to those docs. > - OP_PUT_BINARY_TYPE - schema id: > the mandatory algorithm of schema Id calculation must be described somewhere. 
If > described in other docs, this spec must have link(s) to those docs. > - OP_REGISTER_BINARY_TYPE_NAME and OP_GET_BINARY_TYPE_NAME: > it should be explained when and why these operations need to be supported by a > client. > How should this operation be supported on platforms other than the one mentioned > in the "Platform id" field? > - OP_REGISTER_BINARY_TYPE_NAME: > Type name - is it the "full" or "short" name here? > Type id - is it a hash from the "full" or "short" name here? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8415) Manual cache().rebalance() invocation may cancel currently running rebalance
[ https://issues.apache.org/jira/browse/IGNITE-8415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8415: -- Fix Version/s: (was: 2.9) 2.10 > Manual cache().rebalance() invocation may cancel currently running rebalance > > > Key: IGNITE-8415 > URL: https://issues.apache.org/jira/browse/IGNITE-8415 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Kovalenko >Priority: Major > Fix For: 2.10 > > > If a historical rebalance is in progress and during this process we manually invoke > {noformat} > Ignite.cache(CACHE_NAME).rebalance().get(); > {noformat} > then the currently running rebalance will be cancelled and a new one started, which > does not seem right. Moreover, after the new rebalance finishes we can lose some > data in case of rebalancing entry removes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8392) Removing WAL history directory leads to JVM crash on that node.
[ https://issues.apache.org/jira/browse/IGNITE-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8392: -- Fix Version/s: (was: 2.9) 2.10 > Removing WAL history directory leads to JVM crush on that node. > --- > > Key: IGNITE-8392 > URL: https://issues.apache.org/jira/browse/IGNITE-8392 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 > Environment: Ubuntu 17.10 > Oracle JVM Server (1.8.0_151-b12) >Reporter: Pavel Kovalenko >Priority: Major > Fix For: 2.10 > > > Problem: > 1) Start node, load some data, deactivate cluster > 2) Remove WAL history directory. > 3) Activate cluster. > Cluster activation will be failed due to JVM crush like this: > {noformat} > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGBUS (0x7) at pc=0x7feda1052526, pid=29331, tid=0x7fed193d7700 > # > # JRE version: Java(TM) SE Runtime Environment (8.0_151-b12) (build > 1.8.0_151-b12) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.151-b12 mixed mode > linux-amd64 compressed oops) > # Problematic frame: > # v ~StubRoutines::jshort_disjoint_arraycopy > # > # Failed to write core dump. Core dumps have been disabled. 
To enable core > dumping, try "ulimit -c unlimited" before starting Java again > # > # If you would like to submit a bug report, please visit: > # http://bugreport.java.com/bugreport/crash.jsp > # > --- T H R E A D --- > Current thread (0x7fec8b202800): JavaThread > "db-checkpoint-thread-#243%wal.IgniteWalRebalanceTest0%" [_thread_in_Java, > id=29655, stack(0x7fed192d7000,0x7fed193d8000)] > siginfo: si_signo: 7 (SIGBUS), si_code: 2 (BUS_ADRERR), si_addr: > 0x7fed198ee0b2 > Registers: > RAX=0x0007710a9f28, RBX=0x000120b2, RCX=0x0800, > RDX=0xfe08 > RSP=0x7fed193d5c60, RBP=0x7fed193d5c60, RSI=0x7fed198ef0aa, > RDI=0x0007710a9f20 > R8 =0x1000, R9 =0x000120b2, R10=0x7feda1052da0, > R11=0x1004 > R12=0x, R13=0x0007710a9f28, R14=0x1000, > R15=0x7fec8b202800 > RIP=0x7feda1052526, EFLAGS=0x00010282, CSGSFS=0x002b0033, > ERR=0x0006 > TRAPNO=0x000e > Top of Stack: (sp=0x7fed193d5c60) > 0x7fed193d5c60: 0007710a9f28 7feda1be314f > 0x7fed193d5c70: 00010002 7feda17747fd > 0x7fed193d5c80: a8008c96 7feda11cfb3e > 0x7fed193d5c90: > 0x7fed193d5ca0: > 0x7fed193d5cb0: > 0x7fed193d5cc0: 0007710a9f28 7feda1fb37e0 > 0x7fed193d5cd0: 0007710a8ef0 00076fa5f5c0 > 0x7fed193d5ce0: 0007710a9f28 0007710a8ef0 > 0x7fed193d5cf0: 0007710a8ef0 7fed193d5d18 > 0x7fed193d5d00: 7fedb8428c76 > 0x7fed193d5d10: 1014 00076fa5f650 > 0x7fed193d5d20: f8043261 7feda1ee597c > 0x7fed193d5d30: 00076fa5f5a8 0007710a9f28 > 0x7fed193d5d40: 0007710a8ef0 000120a2 > 0x7fed193d5d50: 00012095 1021 > 0x7fed193d5d60: edf4bec3 0001209e > 0x7fed193d5d70: 0007710a9f28 00076fa5f650 > 0x7fed193d5d80: 7fed193d5da8 1014 > 0x7fed193d5d90: 0007710a8ef0 7fed198dc000 > 0x7fed193d5da0: 00076fa5f650 7feda1b7a040 > 0x7fed193d5db0: 0007710a9f28 00076fa700d0 > 0x7fed193d5dc0: 0007710a9f68 ee2153e5f8043261 > 0x7fed193d5dd0: 0007710a8ef0 0007710a9f98 > 0x7fed193d5de0: 00012095 0007710a9f28 > 0x7fed193d5df0: 1fa0 > 0x7fed193d5e00: > 0x7fed193d5e10: 0007710a8ef0 7feda2001530 > 0x7fed193d5e20: 0007710a8ef0 00076f7c05e8 > 0x7fed193d5e30: edef80bd 
> 0x7fed193d5e40: > 0x7fed193d5e50: 7fedb2266000 7feda1cb1f8c > Instructions: (pc=0x7feda1052526) > 0x7feda1052506: 00 00 74 08 66 8b 47 08 66 89 46 08 48 33 c0 c9 > 0x7feda1052516: c3 66 0f 1f 84 00 00 00 00 00 c5 fe 6f 44 d7 c8 > 0x7feda1052526: c5 fe 7f 44 d6 c8 c5 fe 6f 4c d7 e8 c5 fe 7f 4c > 0x7feda1052536: d6 e8 48 83 c2 08 7e e2 48 83 ea 04 7f 10 c5 fe > Register to memory mapping: > RAX=0x0007710a9f28 is an oop > java.nio.DirectByteBuffer > - klass: 'java/nio/DirectByteBuffer' > RBX=0x000120b2 is an unknown value > RCX=0x0800 is an unknown
[jira] [Updated] (IGNITE-8391) Removing some WAL history segments leads to WAL rebalance hanging
[ https://issues.apache.org/jira/browse/IGNITE-8391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8391: -- Fix Version/s: (was: 2.9) 2.10 > Removing some WAL history segments leads to WAL rebalance hanging > - > > Key: IGNITE-8391 > URL: https://issues.apache.org/jira/browse/IGNITE-8391 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Kovalenko >Priority: Major > Fix For: 2.10 > > > Problem: > 1) Start 2 nodes, load some data to it. > 2) Stop node 2, load some data to cache. > 3) Remove WAL archived segment which doesn't contain Checkpoint record needed > to find start point for WAL rebalance, but contains necessary data for > rebalancing. > 4) Start node 2, this node will start rebalance data from node 1 using WAL. > Rebalance will be hanged with following assertion: > {noformat} > java.lang.AssertionError: Partitions after rebalance should be either done or > missing: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, > 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionSupplier.handleDemandMessage(GridDhtPartitionSupplier.java:417) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleDemandMessage(GridDhtPreloader.java:364) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:379) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1054) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579) > at > 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$700(GridCacheIoManager.java:99) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1603) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4100(GridIoManager.java:125) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2752) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1516) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4400(GridIoManager.java:125) > at > org.apache.ignite.internal.managers.communication.GridIoManager$10.run(GridIoManager.java:1485) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {noformat} > > This happened because we never reached the necessary data and updateCounters > contained in the removed WAL segment. > To resolve such problems we should introduce a fallback strategy for the case when > rebalance by WAL has failed. An example of a fallback strategy is to re-run > a full rebalance for partitions that could not be properly rebalanced using > WAL. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8375) NPE due to race on cache stop and timeout handler execution.
[ https://issues.apache.org/jira/browse/IGNITE-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8375: -- Fix Version/s: (was: 2.9) 2.10 > NPE due to race on cache stop and timeout handler execution. > > > Key: IGNITE-8375 > URL: https://issues.apache.org/jira/browse/IGNITE-8375 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.4 >Reporter: Alexey Scherbakov >Priority: Major > Fix For: 2.10 > > > {noformat} > 2018-04-22 22:03:08.547 [INFO > ][Thread-25420][o.a.i.i.p.cache.GridCacheProcessor] Stopped cache > [cacheName=com.sbt.cdm.api.model.published.dictionary.PublishedSystem, > group=CACHEGROUP_DICTIONARY] > 2018-04-22 22:03:08.548 > [ERROR][grid-timeout-worker-#119%DPL_GRID%DplGridNodeName%][o.a.i.i.p.t.GridTimeoutProcessor] > Error when executing timeout callback: LockTimeoutObject [] > java.lang.NullPointerException: null > at > org.apache.ignite.internal.processors.cache.GridCacheContext.loadPreviousValue(GridCacheContext.java:1450) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.loadMissingFromStore(GridDhtLockFuture.java:1030) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:731) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$900(GridDhtLockFuture.java:82) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1133) > at > org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:163) > at > org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) > at java.lang.Thread.run(Thread.java:745) > {noformat} > NPE caused by execution of method [1] during timeout handler execution [2]: > cacheCfg.isLoadPreviousValue() throws NPE because cacheCfg can be nulled by > [3] on stop. 
> [1] > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture#loadMissingFromStore > [2] > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.LockTimeoutObject#onTimeout > [3] org.apache.ignite.internal.processors.cache.GridCacheContext#cleanup -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8377) Add cluster (de)activation LifecycleBean callbacks
[ https://issues.apache.org/jira/browse/IGNITE-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8377: -- Fix Version/s: (was: 2.9) 2.10 > Add cluster (de)activation LifecycleBean callbacks > -- > > Key: IGNITE-8377 > URL: https://issues.apache.org/jira/browse/IGNITE-8377 > Project: Ignite > Issue Type: Improvement >Reporter: Alexey Goncharuk >Assignee: Sergey Dorozhkin >Priority: Major > Labels: newbie > Fix For: 2.10 > > > I suggest to add new {{LifecycleEventType}}, {{BEFORE_CLUSTER_ACTIVATE}}, > {{AFTER_CLUSTER_ACTIVATE}}, {{BEFORE_CLUSTER_DEACTIVATE}}, > {{AFTER_CLUSTER_DEACTIVATE}} and fire corresponding lifecycle events along > with regular events. -- This message was sent by Atlassian Jira (v8.3.4#803005)
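To make the proposal above concrete, a bean reacting to the proposed event types might look like the following plain-Java sketch. The ClusterLifecycleEventType constants here model the ticket's proposal locally; they do not exist in Ignite's LifecycleEventType, and a real bean would implement LifecycleBean and be registered via IgniteConfiguration#setLifecycleBeans:

```java
import java.util.ArrayList;
import java.util.List;

public class ActivationLifecycleSketch {
    // Constants proposed by the ticket, modeled locally for illustration.
    enum ClusterLifecycleEventType {
        BEFORE_CLUSTER_ACTIVATE, AFTER_CLUSTER_ACTIVATE,
        BEFORE_CLUSTER_DEACTIVATE, AFTER_CLUSTER_DEACTIVATE
    }

    // Shape of a bean that audits cluster (de)activation transitions.
    static class ActivationAuditBean {
        final List<String> log = new ArrayList<>();

        void onLifecycleEvent(ClusterLifecycleEventType evt) {
            log.add(evt.name()); // e.g. persist or report the transition
        }
    }

    public static void main(String[] args) {
        ActivationAuditBean bean = new ActivationAuditBean();
        bean.onLifecycleEvent(ClusterLifecycleEventType.BEFORE_CLUSTER_ACTIVATE);
        bean.onLifecycleEvent(ClusterLifecycleEventType.AFTER_CLUSTER_ACTIVATE);
        System.out.println(bean.log); // [BEFORE_CLUSTER_ACTIVATE, AFTER_CLUSTER_ACTIVATE]
    }
}
```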
[jira] [Updated] (IGNITE-8356) Possible race at the discovery on the start node
[ https://issues.apache.org/jira/browse/IGNITE-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8356: -- Fix Version/s: (was: 2.9) 2.10 > Possible race at the discovery on the start node > > > Key: IGNITE-8356 > URL: https://issues.apache.org/jira/browse/IGNITE-8356 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.4 >Reporter: Taras Ledkov >Priority: Major > Fix For: 2.10 > > > The problem was discovered on the IGNITE-8355. > But *root cause* is the race on the start of the node discovery. > The race happens because a joining node may start processing NodeAddMessage > before processing local node's NodeAddFinishedMessage. Because of this, the > local node will not have any constructed DiscoCache yet and NPE happens. > We need to take a look at the workaround suggested in IGNITE-8355 and > 1) Check if any public API changes are needed on DiscoverySpi interface > 2) Verify it works for ZookeeperDiscoverySpi. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8360) Page recovery from WAL can be very slow.
[ https://issues.apache.org/jira/browse/IGNITE-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8360: -- Fix Version/s: (was: 2.9) 2.10 > Page recovery from WAL can be very slow. > > > Key: IGNITE-8360 > URL: https://issues.apache.org/jira/browse/IGNITE-8360 > Project: Ignite > Issue Type: Improvement > Components: persistence >Affects Versions: 2.4 >Reporter: Alexey Scherbakov >Priority: Major > Fix For: 2.10 > > > The current implementation tries to recover a corrupted page from the WAL, potentially > scanning all archived segments [1]. > If the archive is very large, for example due to a long history, this might take > significant time, preventing cache start, with consequences like a hanging PME. > [1] > org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl#tryToRestorePage -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8366) Replace service instance parameter with a class name in ServiceConfiguration
[ https://issues.apache.org/jira/browse/IGNITE-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8366: -- Fix Version/s: (was: 2.9) 2.10 > Replace service instance parameter with a class name in ServiceConfiguration > > > Key: IGNITE-8366 > URL: https://issues.apache.org/jira/browse/IGNITE-8366 > Project: Ignite > Issue Type: Improvement > Components: managed services >Reporter: Denis Mekhanikov >Priority: Major > Labels: iep-17 > Fix For: 2.10 > > > {{ServiceConfiguration#service}} parameter should be replaced with a > {{className}}. All parameters, needed for service initialization, should be > provided as a map of properties in {{ServiceConfiguration}}. > This approach has two advantages: > # It allows service redeployment with changed classes, because there will be > no need to deserialize the service object. > # Changes of initialization parameters will be able to be detected, when > manual redeployment happens. -- This message was sent by Atlassian Jira (v8.3.4#803005)
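A sketch of the configuration shape the ticket proposes, with the service described by a class name plus initialization properties as plain data (all names here are illustrative, not the actual Ignite API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ServiceConfigSketch {
    // Hypothetical shape of the proposal: because the service is referenced
    // by class name and configured via a property map, redeployment never
    // needs to deserialize a service instance, and changed properties can
    // be detected by a simple map comparison.
    static class ServiceConfiguration {
        String name;
        String className; // replaces the Service instance parameter
        Map<String, String> properties = new LinkedHashMap<>();
    }

    public static void main(String[] args) {
        ServiceConfiguration cfg = new ServiceConfiguration();
        cfg.name = "counterService";
        cfg.className = "com.example.CounterServiceImpl"; // hypothetical class
        cfg.properties.put("initialValue", "0");

        System.out.println(cfg.className + " " + cfg.properties);
    }
}
```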
[jira] [Updated] (IGNITE-8338) Cache operations hang after cluster deactivation and activation again
[ https://issues.apache.org/jira/browse/IGNITE-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8338: -- Fix Version/s: (was: 2.9) 2.10 > Cache operations hang after cluster deactivation and activation again > - > > Key: IGNITE-8338 > URL: https://issues.apache.org/jira/browse/IGNITE-8338 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Kovalenko >Priority: Major > Fix For: 2.10 > > > Problem: > 1) Start several nodes > 2) Activate cluster > 3) Run cache load > 4) Deactivate cluster > 5) Activate again > After second activation cache operations hang with following stacktrace: > {noformat} > "cache-load-2" #210 prio=5 os_prio=0 tid=0x7efbb401b800 nid=0x602b > waiting on condition [0x7efb809b3000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.publicJCache(GridCacheProcessor.java:3782) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.publicJCache(GridCacheProcessor.java:3753) > at > org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.checkProxyIsValid(GatewayProtectedCacheProxy.java:1486) > at > org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.onEnter(GatewayProtectedCacheProxy.java:1508) > at > org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:785) > at > org.apache.ignite.internal.processors.cache.IgniteClusterActivateDeactivateTestWithPersistence.lambda$testDeactivateDuringEviction$0(IgniteClusterActivateDeactivateTestWithPersistence.java:316) > at > 
org.apache.ignite.internal.processors.cache.IgniteClusterActivateDeactivateTestWithPersistence$$Lambda$39/832408842.run(Unknown > Source) > at > org.apache.ignite.testframework.GridTestUtils$6.call(GridTestUtils.java:1254) > at > org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86) > {noformat} > It seems, dynamicStartCache future never completes after second activation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8352) IgniteStandByClusterSuite: IgniteClusterActivateDeactivateTest.testClientReconnectClusterDeactivateInProgress flaky fail rate 12%
[ https://issues.apache.org/jira/browse/IGNITE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8352: -- Fix Version/s: (was: 2.9) 2.10 > IgniteStandByClusterSuite: > IgniteClusterActivateDeactivateTest.testClientReconnectClusterDeactivateInProgress > flaky fail rate 12% > - > > Key: IGNITE-8352 > URL: https://issues.apache.org/jira/browse/IGNITE-8352 > Project: Ignite > Issue Type: Test >Reporter: Maxim Muzafarov >Priority: Major > Labels: MakeTeamcityGreenAgain > Fix For: 2.10 > > > IgniteStandByClusterSuite: > IgniteClusterActivateDeactivateTest.testClientReconnectClusterDeactivateInProgress > (master fail rate 12,8%) > [https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=5210129488604303757=%3Cdefault%3E=testDetails] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8299) Optimize allocations and CPU consumption in active page replacement scenario
[ https://issues.apache.org/jira/browse/IGNITE-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8299: -- Fix Version/s: (was: 2.9) 2.10 > Optimize allocations and CPU consumption in active page replacement scenario > > > Key: IGNITE-8299 > URL: https://issues.apache.org/jira/browse/IGNITE-8299 > Project: Ignite > Issue Type: Improvement > Components: persistence >Reporter: Ivan Rakov >Priority: Major > Fix For: 2.10 > > Attachments: loader-2018-04-17T12-12-21.jfr, > loader-2018-04-17T12-12-21.jfr, loader-2018-04-17T15-10-52.jfr, > loader-2018-04-17T15-10-52.jfr > > > Ignite performance significantly decreases when the total size of local data is > much greater than the size of RAM. It can be explained by the change of disk access > pattern (random reads + random writes is complex even for SSDs), but after > analysis of the persistence code and JFRs it's clear that there's still room for > optimization. > The following possible optimizations should be investigated: > 1) PageMemoryImpl.Segment#partGeneration performs allocation of a > GroupPartitionId during HashMap.get - we can get rid of it. > 2) LoadedPagesMap#getNearestAt is invoked at least 5 times in > PageMemoryImpl.Segment#removePageForReplacement. It performs two allocations > - we can get rid of it. > 3) If one of 5 evict candidates was erroneous, we'll find 5 new ones - we can > reuse the remaining 4 instead. > JFRs that highlight excessive CPU usage by the page replacement code are attached. > See the 1st and 3rd positions in the "Hot Methods" section: > Stack Trace | Sample Count | Percentage (%) > PageMemoryImpl.acquirePage(int, long, boolean) | 4 963 | 19,73 > scala.Some.equals(Object) | 4 932 | 19,606 > java.util.HashMap.getNode(int, Object) | 3 236 | 12,864 -- This message was sent by Atlassian Jira (v8.3.4#803005)
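For item 1, a common way to avoid allocating a key object (here, a GroupPartitionId) per HashMap.get is to pack the two int identifiers into one primitive long, so a long-keyed map needs no key allocation at all. An illustrative sketch of the packing, not the Ignite implementation:

```java
public class PartKeyPacking {
    // Pack (cacheGroupId, partitionId) into one primitive long: the group id
    // occupies the high 32 bits, the partition id the low 32 bits. A map keyed
    // by primitive longs can then be probed without allocating a key object.
    static long pack(int grpId, int partId) {
        return ((long) grpId << 32) | (partId & 0xFFFFFFFFL);
    }

    static int grpId(long key)  { return (int) (key >>> 32); }
    static int partId(long key) { return (int) key; }

    public static void main(String[] args) {
        // Group ids are hash codes and may be negative; packing must round-trip.
        long key = pack(-2100569601, 17);
        System.out.println(grpId(key) + " " + partId(key)); // -2100569601 17
    }
}
```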
[jira] [Updated] (IGNITE-8293) BinaryUtils#isCustomJavaSerialization fails when only readObject is declared in a class
[ https://issues.apache.org/jira/browse/IGNITE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8293: -- Fix Version/s: (was: 2.9) 2.10 > BinaryUtils#isCustomJavaSerialization fails when only readObject is declared > in a class > --- > > Key: IGNITE-8293 > URL: https://issues.apache.org/jira/browse/IGNITE-8293 > Project: Ignite > Issue Type: Bug > Components: binary >Affects Versions: 2.4 >Reporter: MihkelJ >Assignee: MihkelJ >Priority: Minor > Fix For: 2.10 > > > Consider this class: > > {code:java} > public class Test implements Serializable { > private transient AtomicBoolean dirty = new AtomicBoolean(false); > private void readObject(java.io.ObjectInputStream in) throws IOException, > ClassNotFoundException { > dirty = new AtomicBoolean(false); > } > //methods to check and mark class as dirty > }{code} > {{isCustomJavaSerialization}} will get a {{NoSuchMethodException}} when > trying to grab the {{writeObject}} method and falsely conclude that Test > doesn't use custom serialization. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
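A corrected version of the check would probe for readObject and writeObject independently, so a NoSuchMethodException for one method does not mask the presence of the other. A self-contained sketch (the isCustomJavaSerialization below is a simplified stand-in for the BinaryUtils method, not its actual code):

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicBoolean;

public class CustomSerializationCheck {
    // A class customizes Java serialization if it declares EITHER readObject
    // OR writeObject; each lookup is handled separately.
    static boolean isCustomJavaSerialization(Class<?> cls) {
        return hasMethod(cls, "writeObject", ObjectOutputStream.class)
            || hasMethod(cls, "readObject", ObjectInputStream.class);
    }

    private static boolean hasMethod(Class<?> cls, String name, Class<?> arg) {
        try {
            cls.getDeclaredMethod(name, arg);
            return true;
        }
        catch (NoSuchMethodException e) {
            return false; // absence of one method must not abort the whole check
        }
    }

    // The class from the ticket: readObject only, no writeObject.
    static class Test implements Serializable {
        private transient AtomicBoolean dirty = new AtomicBoolean(false);

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            dirty = new AtomicBoolean(false);
        }
    }

    public static void main(String[] args) {
        System.out.println(isCustomJavaSerialization(Test.class)); // true
    }
}
```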
[jira] [Updated] (IGNITE-8283) CPP: Implement 'varint' solution to be configurable via BinaryConfiguration
[ https://issues.apache.org/jira/browse/IGNITE-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8283: -- Fix Version/s: (was: 2.9) 2.10 > CPP: Implement 'varint' solution to be configurable via BinaryConfiguration > --- > > Key: IGNITE-8283 > URL: https://issues.apache.org/jira/browse/IGNITE-8283 > Project: Ignite > Issue Type: Sub-task >Reporter: Vyacheslav Daradur >Priority: Major > Labels: c++ > Fix For: 2.10 > > > Need to finish the solution that has been prepared into IGNITE-5153 to be > configurable via {{BinaryConfiguration}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8225) Add a command to control script to print current topology version
[ https://issues.apache.org/jira/browse/IGNITE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8225: -- Fix Version/s: (was: 2.9) 2.10 > Add a command to control script to print current topology version > - > > Key: IGNITE-8225 > URL: https://issues.apache.org/jira/browse/IGNITE-8225 > Project: Ignite > Issue Type: Improvement >Reporter: Alexey Goncharuk >Assignee: ruchir choudhry >Priority: Critical > Fix For: 2.10 > > > The command should be {{./control.sh --topology}} and should print a short > summary about the current topology (topology version, number of client nodes, > number of server nodes, baseline topology information) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8224) Print out a warning message if there are partitions mapped only to offline nodes
[ https://issues.apache.org/jira/browse/IGNITE-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8224: -- Fix Version/s: (was: 2.9) 2.10 > Print out a warning message if there are partitions mapped only to offline > nodes > > > Key: IGNITE-8224 > URL: https://issues.apache.org/jira/browse/IGNITE-8224 > Project: Ignite > Issue Type: Improvement >Reporter: Alexey Goncharuk >Assignee: Sergey Skudnov >Priority: Critical > Fix For: 2.10 > > > We need to print out the following: > 1) A warning on partition map exchange if we got a partition mapped only to > offline baseline nodes > 2) Add periodic printouts if we have a cache with configured backups, but the > actual number of backups for any partition is 0 because of offline baseline > nodes (the warning should suggest ways to fix this - either start the offline > baseline node or change the baseline topology) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8213) control.sh: make a correct error message in case of adding non-existent node to baseline
[ https://issues.apache.org/jira/browse/IGNITE-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8213: -- Fix Version/s: (was: 2.9) 2.10 > control.sh: make a correct error message in case of adding non-existen node > to baseline > --- > > Key: IGNITE-8213 > URL: https://issues.apache.org/jira/browse/IGNITE-8213 > Project: Ignite > Issue Type: Improvement >Reporter: Pavel Konstantinov >Priority: Major > Fix For: 2.10 > > > We need to print out a readable error message with a corresponding error code > instead of this > {code} > PS C:\work\master\bin> .\control.bat --baseline add 121212 > Warning: the command will perform changes in baseline. > Press 'y' to continue...y > Failed to add nodes to baseline. > Error: Failed to reduce job results due to undeclared user exception > [task=org.apache.ignite.internal.visor.baseline.VisorBaselineTask@71e96f24, > err=class org.apache.ignite.compute.ComputeUserUndeclaredException: Failed to > execute job due to unexpected runtime exception > [jobId=3064672b261-df38f2fd-ca81-423c-b4c3-dd002a77f0af, > ses=GridJobSessionImpl [ses=GridTaskSessionImpl > [taskName=org.apache.ignite.internal.visor.baseline.VisorBaselineTask, > dep=LocalDeployment [super=GridDeployment [ts=1523412519601, depMode=SHARED, > clsLdr=sun.misc.Launcher$AppClassLoader@764c12b6, > clsLdrId=ba54672b261-df38f2fd-ca81-423c-b4c3-dd002a77f0af, userVer=0, > loc=true, sampleClsName=java.lang.String, pendingUndeploy=false, > undeployed=false, usage=0]], > taskClsName=org.apache.ignite.internal.visor.baseline.VisorBaselineTask, > sesId=2064672b261-df38f2fd-ca81-423c-b4c3-dd002a77f0af, > startTime=1523412552937, endTime=9223372036854775807, > taskNodeId=df38f2fd-ca81-423c-b4c3-dd002a77f0af, > clsLdr=sun.misc.Launcher$AppClassLoader@764c12b6, closed=false, cpSpi=null, > failSpi=null, loadSpi=null, usage=1, fullSup=false, internal=true, > topPred=null, subjId=df38f2fd-ca81-423c-b4c3-dd002a77f0af, > mapFut=IgniteFuture 
[orig=GridFutureAdapter [ignoreInterrupts=false, > state=INIT, res=null, hash=1531276047]], execName=null], > jobId=3064672b261-df38f2fd-ca81-423c-b4c3-dd002a77f0af], err=Node not found > for consistent ID: 121212]] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
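The readable message the ticket asks for amounts to a fail-fast check before the compute task is even submitted. A minimal plain-Java sketch — `validateBaselineAddition` and the message wording are hypothetical illustrations, not the actual control.sh code:

```java
import java.util.Set;

public class BaselineValidation {
    /**
     * Returns null when the consistent ID is known to the cluster; otherwise a
     * short, human-readable error instead of the raw
     * ComputeUserUndeclaredException dump shown above.
     */
    static String validateBaselineAddition(Set<String> knownConsistentIds, String requestedId) {
        if (knownConsistentIds.contains(requestedId))
            return null; // Safe to proceed with the baseline change.

        return "Failed to add node to baseline. Node not found for consistent ID: " + requestedId;
    }
}
```

In the real command the set of known IDs would come from the current server topology; the point is only that the lookup failure is detected and reported before any task plumbing is involved.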
[jira] [Updated] (IGNITE-8120) Improve test coverage of rebalance failing
[ https://issues.apache.org/jira/browse/IGNITE-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8120: -- Fix Version/s: (was: 2.9) 2.10 > Improve test coverage of rebalance failing > -- > > Key: IGNITE-8120 > URL: https://issues.apache.org/jira/browse/IGNITE-8120 > Project: Ignite > Issue Type: Test > Components: general >Affects Versions: 2.4 >Reporter: Ivan Daschinskiy >Assignee: Ivan Daschinskiy >Priority: Minor > Labels: test > Fix For: 2.10 > > > Need to cover the situation when some archived WAL segments, which are not > reserved by IgniteWriteAheadLogManager, are deleted during rebalancing or > were deleted before. However, rebalancing from WAL is currently broken. Once the fix for > [IGNITE-8116|https://issues.apache.org/jira/browse/IGNITE-8116] is available, > this will be implemented. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8118) sqlline.bat throws NPE under PowerShell in corner case
[ https://issues.apache.org/jira/browse/IGNITE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8118: -- Fix Version/s: (was: 2.9) 2.10 > sqlline.bat throws NPE under PowerShell in corner case > -- > > Key: IGNITE-8118 > URL: https://issues.apache.org/jira/browse/IGNITE-8118 > Project: Ignite > Issue Type: Bug >Reporter: Pavel Konstantinov >Priority: Major > Fix For: 2.10 > > > {code} > .\sqlline.bat -u > "'jdbc:ignite:thin://18.17.12.22:9652?user=ignite=111'" > No known driver to handle "'jdbc:ignite:thin://18.17.12.22:9652?user=ignite". > Searching for known drivers... > java.lang.NullPointerException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8103) Node with BLT is not allowed to join cluster without one
[ https://issues.apache.org/jira/browse/IGNITE-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8103: -- Fix Version/s: (was: 2.9) 2.10 > Node with BLT is not allowed to join cluster without one > > > Key: IGNITE-8103 > URL: https://issues.apache.org/jira/browse/IGNITE-8103 > Project: Ignite > Issue Type: Improvement > Components: general >Affects Versions: 2.4 >Reporter: Alexander Belyak >Priority: Major > Fix For: 2.10 > > Attachments: ActivateTest.java > > > 1) Start cluster of 2-3 nodes and activate it, fill some data > 2) Stop cluster, clear LFS on first node > 3) Start cluster from first node (or start all nodes synchronously) > Expected result: ? > Actual result: "Node with set up BaselineTopology is not allowed to join > cluster without one: cons_srv2" > From a technical point of view this is expected behaviour, because the first node > with cleared storage becomes the grid coordinator and rejects any connection > attempts from nodes with a different baseline. But it's bad for usability: if > we always start all nodes together and want to clear storage on one node for > some reason, we need to define a start sequence. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8063) Transaction rollback is unmanaged in case when commit produced Runtime exception
[ https://issues.apache.org/jira/browse/IGNITE-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8063: -- Fix Version/s: (was: 2.9) 2.10 > Transaction rollback is unmanaged in case when commit produced Runtime > exception > > > Key: IGNITE-8063 > URL: https://issues.apache.org/jira/browse/IGNITE-8063 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Minor > Labels: newbie > Fix For: 2.10 > > > When 'userCommit' produces a runtime exception, the transaction state is moved to > UNKNOWN and tx.finishFuture() completes; after that the rollback process runs > asynchronously and there is no simple way to await rollback completion on > such transactions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
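The missing hook the ticket describes could be a dedicated rollback future that completes only once the asynchronous rollback work is actually done, rather than when the transaction merely reaches a terminal state. A self-contained sketch — the class and method names are illustrative, not Ignite's transaction API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

public class TxRollbackFuture {
    final AtomicBoolean rolledBack = new AtomicBoolean();

    // Completes only after the asynchronous rollback work finishes,
    // unlike a finishFuture() that may complete while rollback is in flight.
    final CompletableFuture<Void> rollbackFut = new CompletableFuture<>();

    void commit() {
        // Simulated commit failure: state goes to UNKNOWN and rollback
        // runs asynchronously, as in the description above.
        CompletableFuture.runAsync(() -> {
            rolledBack.set(true);
            rollbackFut.complete(null);
        });
    }

    void awaitRollback() {
        rollbackFut.join(); // Callers can now deterministically await rollback.
    }
}
```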
[jira] [Updated] (IGNITE-8027) Expired entry appears back in case of node restart with persistence enabled
[ https://issues.apache.org/jira/browse/IGNITE-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8027: -- Fix Version/s: (was: 2.9) 2.10 > Expired entry appears back in case of node restart with persistence enabled > --- > > Key: IGNITE-8027 > URL: https://issues.apache.org/jira/browse/IGNITE-8027 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.4 >Reporter: Valentin Kulichenko >Priority: Major > Fix For: 2.10 > > > Detailed description of the issue and a reproducer can be found in this > thread: > http://apache-ignite-users.70518.x6.nabble.com/Ignite-Expiry-Inconsistency-with-Native-Persistence-td20390.html > In short, if entry expires, it can then be read again after node restart. > This is reproduced ONLY if node is killed with 'kill -9', in case of graceful > stop everything works fine. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-8061) GridCachePartitionedDataStructuresFailoverSelfTest.testCountDownLatchConstantMultipleTopologyChange may hang on TeamCity
[ https://issues.apache.org/jira/browse/IGNITE-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-8061: -- Fix Version/s: (was: 2.9) 2.10 > GridCachePartitionedDataStructuresFailoverSelfTest.testCountDownLatchConstantMultipleTopologyChange > may hang on TeamCity > > > Key: IGNITE-8061 > URL: https://issues.apache.org/jira/browse/IGNITE-8061 > Project: Ignite > Issue Type: Bug > Components: data structures >Affects Versions: 2.4 >Reporter: Andrey Kuznetsov >Priority: Major > Fix For: 2.10 > > Attachments: log.txt > > > The log attached contains 'Test has been timed out and will be interrupted' > message, but does not contain subsequent 'Test has been timed out [test=...'. > Known facts: > * There is pending GridDhtColocatedLockFuture in the log. > * On timeout, InterruptedException comes to doTestCountDownLatch, but > finally-block contains the code leading to distributed locking. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7913) Current implementation of Internal Diagnostics may cause OOM on server nodes.
[ https://issues.apache.org/jira/browse/IGNITE-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7913: -- Fix Version/s: (was: 2.9) 2.10 > Current implementation of Internal Diagnostics may cause OOM on server nodes. > - > > Key: IGNITE-7913 > URL: https://issues.apache.org/jira/browse/IGNITE-7913 > Project: Ignite > Issue Type: Improvement >Affects Versions: 2.3 >Reporter: Alexey Scherbakov >Priority: Major > Fix For: 2.10 > > > If many transactions are active in grid, Internal Diagnostics can cause OOM > on server nodes serving IgniteDiagnosticMessage because of heap buffering. > See the stack trace demonstrating the issue: > {noformat} > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:1012) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:762) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:710) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.toString(GridDhtCacheEntry.java:818) > at java.lang.String.valueOf(String.java:2994) > at > org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101) > at > org.apache.ignite.internal.util.tostring.SBLimitedLength.a(SBLimitedLength.java:88) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:939) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:1005) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:826) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:783) > at > org.apache.ignite.internal.processors.cache.transactions.IgniteTxEntry.toString(IgniteTxEntry.java:1267) > at java.lang.String.valueOf(String.java:2994) > at 
java.lang.StringBuilder.append(StringBuilder.java:131) > at java.util.AbstractMap.toString(AbstractMap.java:559) > at java.lang.String.valueOf(String.java:2994) > at > org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101) > at > org.apache.ignite.internal.util.tostring.SBLimitedLength.a(SBLimitedLength.java:88) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:939) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:1005) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:864) > at > org.apache.ignite.internal.processors.cache.transactions.IgniteTxRemoteStateImpl.toString(IgniteTxRemoteStateImpl.java:180) > at java.lang.String.valueOf(String.java:2994) > at > org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101) > at > org.apache.ignite.internal.util.tostring.SBLimitedLength.a(SBLimitedLength.java:88) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:939) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:1005) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:826) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:783) > at > org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.toString(GridDistributedTxRemoteAdapter.java:926) > at > org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxRemote.toString(GridDhtTxRemote.java:373) > at java.lang.String.valueOf(String.java:2994) > at > org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101) > at > org.apache.ignite.internal.util.tostring.SBLimitedLength.a(SBLimitedLength.java:88) > at > 
org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:939) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:1005) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:826) > at > org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:783) > at >
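The stack trace above is dominated by GridToStringBuilder appending into a length-limited buffer; the fix direction is to bound the heap a single toString() dump can consume. A sketch of a capped builder in the spirit of Ignite's SBLimitedLength (the class name and cap semantics here are illustrative):

```java
/**
 * Length-capped string builder: once the cap is reached, further appends are
 * dropped and an ellipsis marker is added, bounding the heap used when
 * rendering huge diagnostic dumps of transactions and cache entries.
 */
public class LimitedStringBuilder {
    private final StringBuilder sb = new StringBuilder();
    private final int limit;
    private boolean truncated;

    LimitedStringBuilder(int limit) { this.limit = limit; }

    LimitedStringBuilder a(String s) {
        if (truncated)
            return this; // Everything past the cap is ignored.

        if (sb.length() + s.length() > limit) {
            sb.append(s, 0, limit - sb.length()).append("...");
            truncated = true;
        }
        else
            sb.append(s);

        return this;
    }

    @Override public String toString() { return sb.toString(); }
}
```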
[jira] [Updated] (IGNITE-7948) SQL: read only necessary fields into the row when possible
[ https://issues.apache.org/jira/browse/IGNITE-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7948: -- Fix Version/s: (was: 2.9) 2.10 > SQL: read only necessary fields into the row when possible > -- > > Key: IGNITE-7948 > URL: https://issues.apache.org/jira/browse/IGNITE-7948 > Project: Ignite > Issue Type: Task > Components: sql >Reporter: Vladimir Ozerov >Priority: Major > Fix For: 2.10 > > Time Spent: 10m > Remaining Estimate: 0h > > When an H2 row is read, we always fill it with data eagerly through link > materialization. Materialization is performed under a page "read lock", which > guarantees row-level consistency. This may lead to excessive memory pressure > due to memory copying. For example, consider a class with 50 fields and a > query which reads only 2 of them. The other 48 fields will be copied for no > reason. Lazy initialization is not an option because it will only defer the > memcpy, not eliminate it. > Instead we can leverage H2: it passes the {{TableFilter}} class to some of the index > access methods*. We can analyze this class and create the list of required > fields. Then we can read these fields under the read lock from offheap and put > them into the row. > In addition to the saved memcpy this could give us more benefits: > 1) No more need for the field cache ({{GridH2KeyValueRowOnheap#valCache}}) > 2) No more need to read the {{_VER}} column and possibly {{_KEY}} or {{_VAL}} > But there are a number of drawbacks as well. E.g. it is impossible to read > strings from offheap efficiently, so queries with VARCHAR will definitely > suffer from this change. > \* {{org.h2.index.Index#find(org.h2.table.TableFilter, > org.h2.result.SearchRow, org.h2.result.SearchRow)}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
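The core idea — copy only the columns the query actually touches instead of materializing the whole row — can be sketched independently of H2. A hypothetical illustration (the method name and array-based row model are assumptions, not Ignite's row format):

```java
public class PartialRowRead {
    /**
     * Copies only the requested column indexes from the full row image,
     * leaving the remaining slots null instead of materializing all fields.
     * For a 50-field class queried on 2 fields, 48 copies are avoided.
     */
    static Object[] readRow(Object[] fullRow, int[] requiredCols) {
        Object[] row = new Object[fullRow.length];

        for (int col : requiredCols)
            row[col] = fullRow[col]; // Copy under the page read lock only.

        return row;
    }
}
```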
[jira] [Updated] (IGNITE-7899) Write Zookeeper Discovery documentation in java docs
[ https://issues.apache.org/jira/browse/IGNITE-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7899: -- Fix Version/s: (was: 2.9) 2.10 > Write Zookeeper Discovery documentation in java docs > > > Key: IGNITE-7899 > URL: https://issues.apache.org/jira/browse/IGNITE-7899 > Project: Ignite > Issue Type: Task > Components: zookeeper >Affects Versions: 2.5 >Reporter: Dmitry Sherstobitov >Priority: Major > Fix For: 2.10 > > > Describe Zookeeper Discovery in java docs -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7882) Atomic update requests should always use topology mappings instead of affinity
[ https://issues.apache.org/jira/browse/IGNITE-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7882: -- Fix Version/s: (was: 2.9) 2.10 > Atomic update requests should always use topology mappings instead of affinity > -- > > Key: IGNITE-7882 > URL: https://issues.apache.org/jira/browse/IGNITE-7882 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Kovalenko >Assignee: Pavel Kovalenko >Priority: Major > Fix For: 2.10 > > > Currently for mapping cache atomic updates we can use two ways: > 1) Use nodes reporting status OWNING for the partition where we send the update. > 2) Use only the affinity nodes mapping if rebalance is finished. > Using the second way we may route the update request only to an affinity node, while > there is also a node which is still an owner and can process read requests. > It can lead to reading null values for some key, while the update for such a key > was successful a moment ago. > - > Problem with using topology mapping: > 1) We send an update request with key K to near node N > 2) N performs mapping for K to nodes P, B1, B2, B3 (primary and backups) and > starts waiting for successful update responses from all of these nodes. > 3) N sends the update request to P. During this time B3 changes status to RENTING > (eviction). > 4) P also performs mapping for K to backup nodes B1, B2. > 5) All updates are successful, but N is still waiting for a response from B3. > The update request will never finish and hangs. -- This message was sent by Atlassian Jira (v8.3.4#803005)
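The first mapping strategy above — route the update to every node that still reports OWNING for the partition, not just the current affinity assignment — can be sketched as a set union. A hypothetical illustration (the types and method are not Ignite's mapping code):

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class UpdateMapping {
    enum PartState { OWNING, MOVING, RENTING }

    /**
     * Topology-based mapping: the update targets the affinity nodes plus any
     * node that still reports OWNING for the partition, so a not-yet-evicted
     * old owner (which can still serve reads) also receives the update.
     */
    static Set<String> mapUpdate(Map<String, PartState> partStates, List<String> affinityNodes) {
        Set<String> targets = new LinkedHashSet<>(affinityNodes);

        partStates.forEach((node, state) -> {
            if (state == PartState.OWNING)
                targets.add(node); // RENTING/MOVING nodes are excluded to avoid hangs.
        });

        return targets;
    }
}
```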
[jira] [Updated] (IGNITE-7798) Give user an ability to check driver metrics in Cassandra store
[ https://issues.apache.org/jira/browse/IGNITE-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7798: -- Fix Version/s: (was: 2.9) 2.10 > Give user an ability to check driver metrics in Cassandra store > --- > > Key: IGNITE-7798 > URL: https://issues.apache.org/jira/browse/IGNITE-7798 > Project: Ignite > Issue Type: Improvement > Components: cassandra >Affects Versions: 2.3 >Reporter: Valentin Kulichenko >Priority: Major > Fix For: 2.10 > > > Cassandra store uses generic client driver to connect to the cluster. The > driver provides {{Metrics}} object which can be useful for monitoring, > however there is no way for user to acquire it. Need to find a way to expose > this information to public API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7795) Correct handling partitions restored in RENTING state
[ https://issues.apache.org/jira/browse/IGNITE-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7795: -- Fix Version/s: (was: 2.9) 2.10 > Correct handling partitions restored in RENTING state > - > > Key: IGNITE-7795 > URL: https://issues.apache.org/jira/browse/IGNITE-7795 > Project: Ignite > Issue Type: Bug > Components: cache, persistence >Affects Versions: 2.1, 2.2, 2.3, 2.4 >Reporter: Pavel Kovalenko >Priority: Major > Fix For: 2.10 > > > Suppose we have a node which has a partition in state RENTING after start. It could > happen if the node was stopped during partition eviction. > The started node is the only owner by affinity for this partition. > Currently we will own this partition during the rebalance preparation phase, which > seems incorrect. > If we don't have owners for some partitions we should fail the activation > process, move this partition to MOVING state and clear it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
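The proposed handling can be reduced to a small state decision at restore time. A hypothetical sketch — not Ignite's GridDhtLocalPartition logic, just the rule the ticket describes:

```java
public class RestoredPartition {
    enum State { OWNING, MOVING, RENTING }

    /**
     * A partition restored as RENTING must not be blindly owned: if no other
     * node owns it, it goes back to MOVING (to be cleared and rebalanced);
     * if other owners exist, eviction can simply continue.
     */
    static State onRestore(State restored, boolean hasOtherOwners) {
        if (restored == State.RENTING)
            return hasOtherOwners ? State.RENTING : State.MOVING;

        return restored; // OWNING/MOVING are restored as-is.
    }
}
```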
[jira] [Updated] (IGNITE-7820) Investigate and fix performance drop of WAL for FSYNC mode
[ https://issues.apache.org/jira/browse/IGNITE-7820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7820: -- Fix Version/s: (was: 2.9) 2.10 > Investigate and fix performance drop of WAL for FSYNC mode > -- > > Key: IGNITE-7820 > URL: https://issues.apache.org/jira/browse/IGNITE-7820 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.4 >Reporter: Andrey N. Gura >Assignee: Andrey N. Gura >Priority: Critical > Fix For: 2.10 > > > The WAL performance drop was introduced by the > https://issues.apache.org/jira/browse/IGNITE-6339 fix. In order to provide > better performance for {{FSYNC}} WAL mode, the > {{FsyncModeFileWriteAheadLogManager}} implementation was added as a result of > fixing issue https://issues.apache.org/jira/browse/IGNITE-7594. > *What we know about this performance drop:* > * It affects {{IgnitePutAllBenchmark}} and {{IgnitePutAllTxBenchmark}}, and > measurements show a 10-15% drop and a ~50% drop respectively. > * It is not reproducible for all hardware configurations. That is, for some > configurations we see performance improvements instead of a drop. > * It is reproducible for the [Many clients --> One server] topology. > * If {{IGNITE_WAL_MMAP == false}} then we have better performance. > * If {{fsyncDelay == 0}} then we have better performance. > *What was tried during the initial investigation:* > * Replacing {{LockSupport.park/unpark}} with spinning leads to an improvement of about > 2%. > * Using {{FileWriteHandle.fsync(null)}} (unconditional flush) instead of > {{FileWriteHandle.fsync(position)}} (conditional flush) doesn't affect > benchmarks. > *What should we do:* > Investigate the problem and provide a fix or a recommendation for system tuning. -- This message was sent by Atlassian Jira (v8.3.4#803005)
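The conditional vs. unconditional flush distinction mentioned above can be illustrated with a tiny model: a position-based fsync skips the syscall when the requested position is already durable, while a null-position fsync always issues it. A hypothetical sketch, not the actual FileWriteHandle code:

```java
public class WalFsync {
    long flushedPos;  // Highest WAL position known to be durable.
    int fsyncCalls;   // Counts simulated fsync syscalls.

    /**
     * fsync(position): conditional flush — the syscall happens only when the
     * requested position is beyond what was already flushed.
     * fsync(null): unconditional flush — the syscall happens every time.
     */
    void fsync(Long pos) {
        if (pos != null && pos <= flushedPos)
            return; // Already durable up to this position; skip the syscall.

        fsyncCalls++;

        if (pos != null && pos > flushedPos)
            flushedPos = pos;
    }
}
```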
[jira] [Updated] (IGNITE-7679) Move all test plugins to a separate module
[ https://issues.apache.org/jira/browse/IGNITE-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7679: -- Fix Version/s: (was: 2.9) 2.10 > Move all test plugins to a separate module > -- > > Key: IGNITE-7679 > URL: https://issues.apache.org/jira/browse/IGNITE-7679 > Project: Ignite > Issue Type: Test >Reporter: Dmitriy Govorukhin >Priority: Major > Fix For: 2.10 > > > Currently all tests run with plugins (TestReconnectPlugin > 1.0, StanByClusterTestProvider 1.0, because they are on the classpath); this makes > the tests not clean. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7692) affinityCall and affinityRun may execute code on backup partitions
[ https://issues.apache.org/jira/browse/IGNITE-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7692: -- Fix Version/s: (was: 2.9) 2.10 > affinityCall and affinityRun may execute code on backup partitions > -- > > Key: IGNITE-7692 > URL: https://issues.apache.org/jira/browse/IGNITE-7692 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.1 >Reporter: Alexey Goncharuk >Assignee: Sergey Chugunov >Priority: Major > Labels: MakeTeamcityGreenAgain, Muted_test, usability > Fix For: 2.10 > > > Apparently, the affinityCall and affinityRun methods reserve partitions and > check their state to be OWNING; however, if the topology changes and the partition > role is changed from primary to backup, the code is still executed. > This can be an issue if a user executes a local SQL query inside the > affinityCall runnable. In this case, the query result may return null. > This can be observed in > IgniteCacheLockPartitionOnAffinityRunTest#getPersonsCountSingleCache - note > an additional check I've added to make the test pass. > I think it is ok to keep the old semantics for the API, because in some cases > (scan query, local gets) a backup OWNER is enough. However, it looks like we > need to add another API method to enforce that affinity run be executed on > primary nodes and forbid primary role change. > Another option is to detect a topology version of the affinity run and use > that version for local SQL queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
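The proposed stricter API can be sketched as a predicate evaluated at the locked topology version: old semantics accept any owner (primary or backup), new semantics require the primary. A hypothetical illustration, not Ignite's reservation code:

```java
import java.util.List;

public class AffinityRunCheck {
    /**
     * Decides whether an affinityRun closure may execute on the local node,
     * given the partition's affinity assignment (index 0 = primary).
     * requirePrimary=true models the proposed primary-only API method.
     */
    static boolean canExecute(List<String> affinity, String localNode, boolean requirePrimary) {
        if (requirePrimary)
            return !affinity.isEmpty() && affinity.get(0).equals(localNode);

        return affinity.contains(localNode); // Old semantics: a backup OWNER is enough.
    }
}
```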
[jira] [Updated] (IGNITE-7671) Fix '.gitignore files are tracked' error
[ https://issues.apache.org/jira/browse/IGNITE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7671: -- Fix Version/s: (was: 2.9) 2.10 > Fix '.gitignore files are tracked' error > > > Key: IGNITE-7671 > URL: https://issues.apache.org/jira/browse/IGNITE-7671 > Project: Ignite > Issue Type: Task >Reporter: Peter Ivanov >Assignee: Peter Ivanov >Priority: Major > Fix For: 2.10 > > > Current {{.gitignore}} has definitions of files to ignore, which are already > under the version control system (some *.sh scripts for example). It needs to > be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7644) Add a utility to export all key-value data from a persisted partition
[ https://issues.apache.org/jira/browse/IGNITE-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7644: -- Fix Version/s: (was: 2.9) 2.10 > Add a utility to export all key-value data from a persisted partition > -- > > Key: IGNITE-7644 > URL: https://issues.apache.org/jira/browse/IGNITE-7644 > Project: Ignite > Issue Type: Improvement > Components: persistence >Affects Versions: 2.1 >Reporter: Alexey Goncharuk >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.10 > > > We need an emergency utility analogous to pgdump which will be able to > full-scan all PDS partition pages and extract all surviving data in some form > that can later be uploaded back to an Ignite cluster -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7678) Persistence: newly created cache creates partitions in MOVING state
[ https://issues.apache.org/jira/browse/IGNITE-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7678: -- Fix Version/s: (was: 2.9) 2.10 > Persistence: newly created cache creates partitions in MOVING state > --- > > Key: IGNITE-7678 > URL: https://issues.apache.org/jira/browse/IGNITE-7678 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.4 >Reporter: Alexey Goncharuk >Assignee: Alexey Goncharuk >Priority: Major > Fix For: 2.10 > > > initPartitions0 will generate partitions in MOVING state if there are no > partition files in the FS. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7572) Local cache fails to start on client node
[ https://issues.apache.org/jira/browse/IGNITE-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7572: -- Fix Version/s: (was: 2.9) 2.10 > Local cache fails to start on client node > - > > Key: IGNITE-7572 > URL: https://issues.apache.org/jira/browse/IGNITE-7572 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.3 >Reporter: Mikhail Cherkasov >Assignee: Dmitry Karachentsev >Priority: Major > Fix For: 2.10 > > > Client node doesn't have default configuration for data region and fails to > start local cache with the following exception: > {noformat} > Caused by: class org.apache.ignite.IgniteCheckedException: null > at org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7272) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.resolve(GridFutureAdapter.java:259) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:232) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:159) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:151) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.onKernalStart(GridCachePartitionExchangeManager.java:625) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStart(GridCacheProcessor.java:819) > at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1041) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1973) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1716) > at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1144) > at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:664) > at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:589) > at org.apache.ignite.Ignition.start(Ignition.java:322) > ... 
1 more > Caused by: java.lang.NullPointerException > at > org.apache.ignite.internal.processors.cache.CacheGroupContext.<init>(CacheGroupContext.java:203) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCacheGroup(GridCacheProcessor.java:1971) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStart(GridCacheProcessor.java:1869) > at > org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCachesOnLocalJoin(GridCacheProcessor.java:1759) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.initCachesOnLocalJoin(GridDhtPartitionsExchangeFuture.java:744) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:626) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2337) > at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) > at java.lang.Thread.run(Thread.java:748) > {noformat}
> Reproducer:
>
> {code:java}
> import java.util.Arrays;
> import org.apache.ignite.Ignite;
> import org.apache.ignite.Ignition;
> import org.apache.ignite.cache.CacheMode;
> import org.apache.ignite.configuration.CacheConfiguration;
> import org.apache.ignite.configuration.IgniteConfiguration;
> import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
> import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;
> import org.jetbrains.annotations.NotNull;
>
> public class LocalCache {
>     private static int id;
>
>     public static void main(String[] args) throws InterruptedException {
>         Ignition.setClientMode(false);
>         Ignite server = Ignition.start(getConfiguration());
>         System.out.println("Server is up");
>
>         Ignition.setClientMode(true);
>         Ignite client = Ignition.start(getConfiguration());
>         System.out.println("Client is up");
>     }
>
>     @NotNull private static IgniteConfiguration getConfiguration() {
>         IgniteConfiguration cfg = new IgniteConfiguration();
>
>         TcpDiscoveryVmIpFinder finder = new TcpDiscoveryVmIpFinder(true);
>         finder.setAddresses(Arrays.asList("localhost:47500..47600"));
>
>         cfg.setIgniteInstanceName("test" + id++);
>
>         CacheConfiguration cacheConfiguration = new CacheConfiguration("TEST");
>         cacheConfiguration.setCacheMode(CacheMode.LOCAL);
>
>         cfg.setCacheConfiguration(cacheConfiguration);
>         cfg.setDiscoverySpi(new TcpDiscoverySpi().setIpFinder(finder));
>
>         return cfg;
>     }
> }
> {code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7641) Add CacheEntry#ttl method
[ https://issues.apache.org/jira/browse/IGNITE-7641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7641: -- Fix Version/s: (was: 2.9) 2.10 > Add CacheEntry#ttl method > - > > Key: IGNITE-7641 > URL: https://issues.apache.org/jira/browse/IGNITE-7641 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.3 >Reporter: Valentin Kulichenko >Priority: Major > Fix For: 2.10 > > > Ignite provides a way to specify an expiry policy on per entry level, but > there is no way to know the current TTL for a particular key. > We can add {{CacheEntry#ttl()}} and/or {{IgniteCache#ttl(K key)}} method that > will provide this information. Looks like it's already available via > {{GridCacheMapEntry#ttl()}}, so we just need to properly expose it to public > API. > Here is the user forum discussion about this: > http://apache-ignite-users.70518.x6.nabble.com/Get-TTL-of-the-specific-K-V-entry-td19817.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
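Deriving the remaining TTL from an entry's absolute expire time (the value `GridCacheMapEntry#ttl()` exposes internally) is straightforward. A hypothetical sketch of how a public `CacheEntry#ttl()` could compute it — the zero-means-no-expiry convention is an assumption for illustration:

```java
public class EntryTtl {
    /**
     * Remaining TTL in milliseconds for an entry with the given absolute
     * expire time. A stored expire time of 0 is treated as "no expiry set";
     * an already-expired entry reports 0 remaining.
     */
    static long ttl(long expireTime, long now) {
        if (expireTime == 0)
            return 0; // Entry never expires.

        return Math.max(0, expireTime - now);
    }
}
```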
[jira] [Updated] (IGNITE-7640) Refactor DiscoveryDataClusterState to be immutable
[ https://issues.apache.org/jira/browse/IGNITE-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7640: -- Fix Version/s: (was: 2.9) 2.10 > Refactor DiscoveryDataClusterState to be immutable > -- > > Key: IGNITE-7640 > URL: https://issues.apache.org/jira/browse/IGNITE-7640 > Project: Ignite > Issue Type: Improvement > Components: cache >Affects Versions: 2.4 >Reporter: Alexey Goncharuk >Priority: Major > Fix For: 2.10 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7552) .NET: AtomicConfiguration.DefaultBackups should be 1
[ https://issues.apache.org/jira/browse/IGNITE-7552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7552: -- Fix Version/s: (was: 2.9) 2.10 > .NET: AtomicConfiguration.DefaultBackups should be 1 > > > Key: IGNITE-7552 > URL: https://issues.apache.org/jira/browse/IGNITE-7552 > Project: Ignite > Issue Type: Bug > Components: platforms >Affects Versions: 2.4 >Reporter: Pavel Tupitsyn >Priority: Major > Labels: .NET, newbie > Fix For: 2.10 > > > Defaults have changed in Java (see {{AtomicConfiguration.DFLT_BACKUPS}}), > update .NET part, add test (we usually check that .NET and Java defaults > match in {{IgniteConfigurationTest.TestSpringXml}}). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7510) IgnitePdsClientNearCachePutGetTest fails flaky on TC
[ https://issues.apache.org/jira/browse/IGNITE-7510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Plekhanov updated IGNITE-7510: -- Fix Version/s: (was: 2.9) 2.10 > IgnitePdsClientNearCachePutGetTest fails flaky on TC > > > Key: IGNITE-7510 > URL: https://issues.apache.org/jira/browse/IGNITE-7510 > Project: Ignite > Issue Type: Test > Components: persistence >Affects Versions: 2.4 >Reporter: Alexey Goncharuk >Assignee: Alexey Goncharuk >Priority: Major > Labels: MakeTeamcityGreenAgain > Fix For: 2.10 > > > Muting this test until this ticket is fixed -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7499) DataRegionMetricsImpl#getPageSize returns ZERO for system data regions
[ https://issues.apache.org/jira/browse/IGNITE-7499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Plekhanov updated IGNITE-7499:
--------------------------------------
    Fix Version/s:     (was: 2.9)
                       2.10

> DataRegionMetricsImpl#getPageSize returns ZERO for system data regions
> ----------------------------------------------------------------------
>
>                 Key: IGNITE-7499
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7499
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>            Reporter: Alexey Kuznetsov
>            Assignee: Andrey Kuznetsov
>            Priority: Major
>             Fix For: 2.10
>
> Working on IGNITE-7492 I found that DataRegionMetricsImpl#getPageSize
> returns ZERO for system data regions.
> Meanwhile there is also the
> org.apache.ignite.internal.pagemem.PageMemory#systemPageSize method.
> That looks a bit strange: why do we need both pageSize and systemPageSize?
[jira] [Updated] (IGNITE-7445) Thin client: SQL batching
[ https://issues.apache.org/jira/browse/IGNITE-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Plekhanov updated IGNITE-7445:
--------------------------------------
    Fix Version/s:     (was: 2.9)
                       2.10

> Thin client: SQL batching
> -------------------------
>
>                 Key: IGNITE-7445
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7445
>             Project: Ignite
>          Issue Type: Improvement
>          Components: thin client
>    Affects Versions: 2.4
>            Reporter: Pavel Tupitsyn
>            Priority: Major
>             Fix For: 2.10
>
> SQL batching allows executing multiple SQL queries in one client request,
> improving performance.
> See how JDBC does it in {{JdbcBatchExecuteRequest}} and add a similar thing
> to the thin client protocol.
[jira] [Updated] (IGNITE-7388) MVCC TX Support async tx rollback on timeout during version requesting.
[ https://issues.apache.org/jira/browse/IGNITE-7388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Plekhanov updated IGNITE-7388:
--------------------------------------
    Fix Version/s:     (was: 2.9)
                       2.10

> MVCC TX Support async tx rollback on timeout during version requesting.
> -----------------------------------------------------------------------
>
>                 Key: IGNITE-7388
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7388
>             Project: Ignite
>          Issue Type: Task
>          Components: mvcc
>            Reporter: Igor Seliverstov
>            Priority: Major
>              Labels: transactions
>             Fix For: 2.10
>
> Currently the TX timeout isn't taken into consideration while the MVCC
> version is being requested.
[jira] [Updated] (IGNITE-7384) MVCC TX: Support historical rebalance
[ https://issues.apache.org/jira/browse/IGNITE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Plekhanov updated IGNITE-7384:
--------------------------------------
    Fix Version/s:     (was: 2.9)
                       2.10

> MVCC TX: Support historical rebalance
> -------------------------------------
>
>                 Key: IGNITE-7384
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7384
>             Project: Ignite
>          Issue Type: Task
>          Components: mvcc
>            Reporter: Igor Seliverstov
>            Priority: Major
>              Labels: rebalance
>             Fix For: 2.10
>
> Currently MVCC doesn't support historical (delta) rebalance.
> The main difficulty is that MVCC writes changes during the tx active phase,
> while the partition update version, aka update counter, is applied on tx
> finish. This means we cannot start iteration over the WAL right from the
> pointer where the update counter was updated, but should also include the
> updates made by the transaction that updated the counter.
> Currently proposed approach:
> * Maintain a list of active TXs with the update counter (UC) which was
>   actual before the TX did its first update (on a per-partition basis).
> * On each checkpoint save two counters: the update counter (UC) and the
>   back counter (BC), which is the earliest UC mapped to a tx from the
>   active list at checkpoint time.
> * During local restore, move UC and BC forward as far as possible.
> * Send BC instead of the update counter in the demand message.
> * Start iteration from the first checkpoint having UC less than or equal to
>   the received BC.
> See the [linked dev list thread|http://apache-ignite-developers.2346864.n4.nabble.com/Historical-rebalance-td38380.html]
> for details.
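The UC/BC bookkeeping proposed above can be illustrated with a toy model. This is a sketch in plain Java, not Ignite code: the class and method names (`PartitionCounters`, `onTxFirstUpdate`, `backCounter`, etc.) are hypothetical and only model the per-partition counters described in the ticket.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/**
 * Toy model of the proposed per-partition counter tracking: each active
 * transaction is registered with the update counter (UC) observed before its
 * first update; the back counter (BC) taken at checkpoint time is the
 * earliest such UC, or the current UC when no transactions are active.
 */
class PartitionCounters {
    /** UC: advanced on every applied update. */
    private long updCntr;

    /** txId -> UC observed before the tx's first update. */
    private final NavigableMap<Long, Long> activeTxs = new TreeMap<>();

    /** Registers a tx in the active list before its first update. */
    void onTxFirstUpdate(long txId) {
        activeTxs.putIfAbsent(txId, updCntr);
    }

    /** Applies one update to the partition. */
    void onUpdate() {
        updCntr++;
    }

    /** Removes a finished tx from the active list. */
    void onTxFinish(long txId) {
        activeTxs.remove(txId);
    }

    /** BC: earliest UC mapped to an active tx, or the current UC if none. */
    long backCounter() {
        return activeTxs.values().stream().min(Long::compare).orElse(updCntr);
    }

    long updateCounter() {
        return updCntr;
    }
}
```

In this model, sending BC (instead of UC) in the demand message guarantees the WAL iteration starts early enough to replay every update of the longest-running in-flight transaction.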
[jira] [Resolved] (IGNITE-7330) When client connects during cluster activation process it hangs on obtaining cache proxy
[ https://issues.apache.org/jira/browse/IGNITE-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Plekhanov resolved IGNITE-7330.
---------------------------------------
    Fix Version/s:     (was: 2.9)
                       (was: 3.0)
       Resolution: Cannot Reproduce

> When client connects during cluster activation process it hangs on obtaining
> cache proxy
> ----------------------------------------------------------------------------
>
>                 Key: IGNITE-7330
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7330
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Chugunov
>            Priority: Critical
>              Labels: IEP-4, Phase-2
>
> The test below reproduces the issue:
> {noformat}
> public void testClientJoinWhenActivationInProgress() throws Exception {
>     Ignite srv = startGrids(5);
>
>     srv.active(true);
>
>     srv.createCaches(Arrays.asList(cacheConfigurations1()));
>
>     Map cacheData = new LinkedHashMap<>();
>
>     for (int i = 1; i <= 100; i++) {
>         for (CacheConfiguration ccfg : cacheConfigurations1()) {
>             srv.cache(ccfg.getName()).put(-i, i);
>
>             cacheData.put(-i, i);
>         }
>     }
>
>     stopAllGrids();
>
>     srv = startGrids(5);
>
>     final CountDownLatch clientStartLatch = new CountDownLatch(1);
>
>     IgniteInternalFuture clStartFut = GridTestUtils.runAsync(new Runnable() {
>         @Override public void run() {
>             try {
>                 clientStartLatch.await();
>
>                 Thread.sleep(10);
>
>                 client = true;
>
>                 Ignite cl = startGrid("client0");
>
>                 IgniteCache atomicCache = cl.cache(CACHE_NAME_PREFIX + '0');
>
>                 IgniteCache txCache = cl.cache(CACHE_NAME_PREFIX + '1');
>
>                 assertEquals(100, atomicCache.size());
>
>                 assertEquals(100, txCache.size());
>             }
>             catch (Exception e) {
>                 log.error("Error occurred", e);
>             }
>         }
>     }, "client-starter-thread");
>
>     clientStartLatch.countDown();
>
>     srv.active(true);
>
>     clStartFut.get();
> }
> {noformat}
> Expected behavior: test finishes successfully.
> Actual behavior: test hangs waiting for the client start future to complete,
> while "client-starter-thread" hangs on obtaining a reference to the first
> cache.
[jira] [Commented] (IGNITE-7330) When client connects during cluster activation process it hangs on obtaining cache proxy
[ https://issues.apache.org/jira/browse/IGNITE-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146395#comment-17146395 ]

Aleksey Plekhanov commented on IGNITE-7330:
-------------------------------------------
I've tried the attached reproducer on the current master and it passed.
Looks like this ticket is no longer relevant, so I closed it. If it is still
relevant, feel free to reopen it.

> When client connects during cluster activation process it hangs on obtaining
> cache proxy
> ----------------------------------------------------------------------------
>
>                 Key: IGNITE-7330
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7330
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Chugunov
>            Priority: Critical
>              Labels: IEP-4, Phase-2
>             Fix For: 3.0, 2.9
>
[jira] [Updated] (IGNITE-7326) Fix ignitevisorcmd | sqlline scripts to be able to run from /usr/bin installed as symbolic links
[ https://issues.apache.org/jira/browse/IGNITE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Plekhanov updated IGNITE-7326:
--------------------------------------
    Fix Version/s:     (was: 2.9)
                       2.10

> Fix ignitevisorcmd | sqlline scripts to be able to run from /usr/bin
> installed as symbolic links
> --------------------------------------------------------------------
>
>                 Key: IGNITE-7326
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7326
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Peter Ivanov
>            Assignee: Peter Ivanov
>            Priority: Major
>             Fix For: 2.10
>
> Currently, {{ignitevisorcmd.sh}} and {{sqlline.sh}} installed into
> {{/usr/bin}} will fail to run because of:
> * their unawareness of their real location;
> * the necessity to write to {{$\{IGNITE_HOME}/work}}, which can have
>   different permissions and owner (in packages, for example).
> It is required to rewrite these scripts so they can run from anywhere via
> their symbolic links and with some temporary dir ({{/tmp}}, for example) as
> the workdir.