[jira] [Created] (IGNITE-14197) Checkpoint thread can't take checkpoint write lock because it waits for parked threads to complete their work

2021-02-17 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14197:
--

 Summary: Checkpoint thread can't take checkpoint write lock 
because it waits for parked threads to complete their work
 Key: IGNITE-14197
 URL: https://issues.apache.org/jira/browse/IGNITE-14197
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


When write throttling is enabled and the node parks, for example, a data 
streamer thread, the parked thread still holds the checkpoint read lock, which 
leads to long pauses while the checkpointer waits for the checkpoint write lock:
[2020-07-23 07:09:21,614][INFO 
][db-checkpoint-thread-#371][GridCacheDatabaseSharedManager] Checkpoint started 
[checkpointId=f964c8f2-daa5-41b2-80ef-944326f26f8a, startPtr=FileWALPointer 
[idx=56913, fileOff=10362905, len=41972], checkpointBeforeLockTime=1983ms, 
*checkpointLockWait=812117ms*, checkpointListenersExecuteTime=90ms, 
checkpointLockHoldTime=93ms, walCpRecordFsyncDuration=123ms, 
writeCheckpointEntryDuration=4ms, splitAndSortCpPagesDuration=4155ms, 
pages=10516815, reason='too big size of WAL without checkpoint']
All operations at this moment are blocked.
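The mechanics can be reproduced outside Ignite with a plain read-write lock. The sketch below is illustrative only (not Ignite code, and the class/method names are invented): a "streamer" thread takes the read lock, is then parked by the throttler while still holding it, and the "checkpointer" cannot take the write lock until the park ends.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Minimal sketch of the failure mode: a parked thread still holds the
 * checkpoint read lock, so the write lock cannot be acquired until the park ends. */
public class ParkedReadLockDemo {
    public static boolean writerBlockedWhileReaderParked() throws Exception {
        ReentrantReadWriteLock cpLock = new ReentrantReadWriteLock();
        CountDownLatch readLocked = new CountDownLatch(1);

        Thread streamer = new Thread(() -> {
            cpLock.readLock().lock();   // checkpoint read lock taken for a page update
            readLocked.countDown();
            try {
                Thread.sleep(500);      // throttler "parks" the thread here, lock still held
            }
            catch (InterruptedException ignored) {
            }
            finally {
                cpLock.readLock().unlock();
            }
        });
        streamer.start();

        readLocked.await();
        // Checkpointer: write lock acquisition fails while the parked thread holds the read lock.
        boolean duringPark = cpLock.writeLock().tryLock(100, TimeUnit.MILLISECONDS);

        streamer.join();                // park over, read lock released
        boolean afterPark = cpLock.writeLock().tryLock(1, TimeUnit.SECONDS);
        if (afterPark)
            cpLock.writeLock().unlock();

        return !duringPark && afterPark;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(writerBlockedWhileReaderParked()); // true
    }
}
```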

Sometimes, it can lead to a complete disaster:
Parking thread=data-streamer-stripe-47-#144 for timeout(ms)=*21278855*
{quote}“data-streamer-stripe-78-#175” #209 prio=5 os_prio=0 
tid=0x7f6161d6a800 nid=0xf932 waiting on condition [0x7f5c292d1000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PagesWriteSpeedBasedThrottle.doPark(PagesWriteSpeedBasedThrottle.java:244)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PagesWriteSpeedBasedThrottle.onMarkDirty(PagesWriteSpeedBasedThrottle.java:227)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1730)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:491)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:483)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeUnlock(PageHandler.java:394)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:369)
at 
org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:296)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.access$11300(BPlusTree.java:98)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.tryInsert(BPlusTree.java:3864)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Put.access$7100(BPlusTree.java:3544)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.onNotFound(BPlusTree.java:4103)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$5800(BPlusTree.java:3894)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:2022)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:1997)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1904)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1662)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1645)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:2473)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:436)
at 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4306)
at 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry.initialValue(GridCacheMapEntry.java:3441)
at 
org.apache.ignite.internal.processors.cache.GridCacheEntryEx.initialValue(GridCacheEntryEx.java:770)
at 
org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$IsolatedUpdater.receive(DataStreamerImpl.java:2278)
at 
org.apache.ignite.internal.processors.datastreamer.DataStreamerUpdateJob.call(DataStreamerUpdateJob.java:139)
at 
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7104)
at 
org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:966)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:559)
at 

[jira] [Created] (IGNITE-14110) Create networking module

2021-02-02 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14110:
--

 Summary: Create networking module
 Key: IGNITE-14110
 URL: https://issues.apache.org/jira/browse/IGNITE-14110
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


We need to create a networking module with an initial API and a simple 
implementation, to be improved further.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-14092) Design network address resolver

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14092:
--

 Summary: Design network address resolver
 Key: IGNITE-14092
 URL: https://issues.apache.org/jira/browse/IGNITE-14092
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


We need to design a network address resolver / IP finder / discovery mechanism 
which would help to choose the right IP/port for a connection. Perhaps we don't 
need such a service at all, but that should be explicitly agreed upon.





[jira] [Created] (IGNITE-14091) Implement messaging service

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14091:
--

 Summary: Implement messaging service
 Key: IGNITE-14091
 URL: https://issues.apache.org/jira/browse/IGNITE-14091
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


We need to implement the ability to send/receive messages to/from network 
members:
 * There is a requirement to send idempotent messages with very weak guarantees:
 ** no delivery guarantees are required;
 ** multiple copies of the same message might be sent;
 ** no acknowledgement of any kind is needed.
 * There is another requirement for common use:
 ** a message must be sent exactly once, with an acknowledgement that it has 
actually been received (not necessarily processed);
 ** messages must be received in the same order they were sent.

These types of messages might utilize the current recovery protocol with acks 
every 32 (or so) messages. This setting must be flexible enough so that we 
don't get an OOM in big topologies.
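The "ack every N messages" idea can be sketched on the receiver side as follows. This is an illustrative model only (class and method names are invented, not the actual protocol classes): the receiver accepts messages in order and emits a batch acknowledgement every 32 messages, so the sender can trim its resend buffer and memory stays bounded.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of batch acknowledgements for the guaranteed, ordered channel.
 * ACK_EVERY must be tunable to bound sender-side memory in big topologies. */
public class BatchAckReceiver {
    static final int ACK_EVERY = 32;

    private long lastReceived = -1;
    private final List<Long> acksSent = new ArrayList<>();

    /** Returns true if the message was accepted in order. */
    public boolean onMessage(long seq) {
        if (seq != lastReceived + 1)
            return false;               // out of order: the recovery protocol would resend
        lastReceived = seq;
        if ((seq + 1) % ACK_EVERY == 0)
            acksSent.add(seq);          // batch ack: "everything up to seq received"
        return true;
    }

    public List<Long> acksSent() {
        return acksSent;
    }

    public static void main(String[] args) {
        BatchAckReceiver rcv = new BatchAckReceiver();
        for (long s = 0; s < 100; s++)
            rcv.onMessage(s);
        System.out.println(rcv.acksSent()); // [31, 63, 95]
    }
}
```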





[jira] [Created] (IGNITE-14090) Networking API

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14090:
--

 Summary: Networking API
 Key: IGNITE-14090
 URL: https://issues.apache.org/jira/browse/IGNITE-14090
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


We need to design a convenient public API for the networking module which 
allows getting information about network members and sending/receiving 
messages to/from them.

Draft:

{noformat}
public interface NetworkService {
    static NetworkService create(NetworkConfiguration cfg);

    void shutdown() throws ???;

    NetworkMember localMember();

    Collection remoteMembers();

    void weakSend(NetworkMember member, Message msg);

    Future guaranteedSend(NetworkMember member, Message msg);

    void listenMembers(MembershipListener lsnr);

    void listenMessages(Consumer lsnr);
}

public interface MembershipListener {
    void onAppeared(NetworkMember member);

    void onDisappeared(NetworkMember member);

    void onAcceptedByGroup(List remoteMembers);
}

public interface NetworkMember {
    UUID id();
}
{noformat}





[jira] [Created] (IGNITE-14089) Override scalecube internal message by custom one

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14089:
--

 Summary: Override scalecube internal message by custom one
 Key: IGNITE-14089
 URL: https://issues.apache.org/jira/browse/IGNITE-14089
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


The networking module contains some custom logic, such as a specific handshake, 
message recovery, etc., which requires having specific messages; at the same 
time, the default scalecube behaviour should keep working correctly. So we need 
to implement one logic on top of the other.





[jira] [Created] (IGNITE-14088) Implement scalecube transport API over netty

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14088:
--

 Summary: Implement scalecube transport API over netty
 Key: IGNITE-14088
 URL: https://issues.apache.org/jira/browse/IGNITE-14088
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


scalecube has its own netty inside, but the idea is to integrate our extended 
netty into it instead. That will help us support more features such as our own 
handshake, marshalling, etc.





[jira] [Created] (IGNITE-14086) Implement retry of establishing connection if it was lost

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14086:
--

 Summary: Implement retry of establishing connection if it was lost
 Key: IGNITE-14086
 URL: https://issues.apache.org/jira/browse/IGNITE-14086
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


We need to implement retrying of connection establishment. It is not clear 
which way is better to implement this, because the current implementation is 
too difficult to configure (number of retries, several retry-time properties). 
So we need to think of a better way to configure it, and then implement it.

Perhaps scalecube (the gossip protocol) already does all the work and we should 
do nothing here. This needs to be rechecked.





[jira] [Created] (IGNITE-14085) Implement message recovery protocol over handshake

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14085:
--

 Summary: Implement message recovery protocol over handshake
 Key: IGNITE-14085
 URL: https://issues.apache.org/jira/browse/IGNITE-14085
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


The central idea of the recovery protocol is the same as in the current 
implementation. So we need to implement a similar idea with a recovery 
descriptor. This means that information about the last sent/received messages 
should be exchanged during the handshake, and, based on this information, the 
messages which were not received should be sent one more time.
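The recovery-descriptor idea can be sketched like this. It is an illustrative model only (the class and method names are invented, not Ignite's actual recovery classes): the sender keeps unacknowledged messages, the handshake carries the peer's last received sequence number, and everything newer is resent.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Sketch of a sender-side recovery descriptor: buffer unacked messages,
 * resend on reconnect based on the peer's last received sequence number. */
public class RecoveryDescriptor {
    private long nextSeq;
    private final ArrayDeque<Long> unacked = new ArrayDeque<>(); // seqs of in-flight messages

    /** Assigns a sequence number and remembers the message until it is acked. */
    public long send() {
        long seq = nextSeq++;
        unacked.add(seq);
        return seq;
    }

    /** Peer acknowledged everything up to and including ackedSeq. */
    public void onAck(long ackedSeq) {
        while (!unacked.isEmpty() && unacked.peek() <= ackedSeq)
            unacked.poll();
    }

    /** On handshake: resend every buffered message the peer reports as not received. */
    public List<Long> onHandshake(long peerLastReceived) {
        List<Long> resend = new ArrayList<>();
        for (long seq : unacked)
            if (seq > peerLastReceived)
                resend.add(seq);
        return resend;
    }

    public static void main(String[] args) {
        RecoveryDescriptor desc = new RecoveryDescriptor();
        for (int i = 0; i < 5; i++)
            desc.send();                         // seqs 0..4
        desc.onAck(1);                           // peer confirmed 0 and 1
        System.out.println(desc.onHandshake(2)); // peer actually got 2 as well -> resend [3, 4]
    }
}
```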





[jira] [Created] (IGNITE-14084) Integrate direct marshalling to networking

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14084:
--

 Summary: Integrate direct marshalling to networking
 Key: IGNITE-14084
 URL: https://issues.apache.org/jira/browse/IGNITE-14084
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


Direct marshalling can be extracted from ignite2.x and integrated into 
ignite3.0. It helps to avoid extra data copies while sending/receiving messages.





[jira] [Created] (IGNITE-14083) Add SSL support to networking

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14083:
--

 Summary: Add SSL support to networking
 Key: IGNITE-14083
 URL: https://issues.apache.org/jira/browse/IGNITE-14083
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


We need to add the ability to establish an SSL connection. It looks like it 
should not be a problem, but at the very least we need to design a 
configuration which allows managing SSL (path to the certificate, password, 
etc.).





[jira] [Created] (IGNITE-14082) Implementation of handshake for new connection

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14082:
--

 Summary: Implementation of handshake for new connection
 Key: IGNITE-14082
 URL: https://issues.apache.org/jira/browse/IGNITE-14082
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


We need to implement the handshake performed after netty establishes the 
connection. Perhaps it makes sense to use netty handlers for this. During the 
handshake, the endpoints need to exchange their instanceIds.
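The instanceId exchange can be sketched as follows. This is an illustrative model only: a real implementation would live in netty channel handlers, while here the "channel" is a direct method call, and all names are invented.

```java
import java.util.UUID;

/** Sketch of the post-connect handshake: each endpoint sends its instanceId
 * and records the peer's. */
public class HandshakeSketch {
    static final class Endpoint {
        final UUID instanceId = UUID.randomUUID();
        UUID remoteInstanceId;

        /** Receive the peer's instanceId and answer with our own. */
        UUID onHandshake(UUID remoteId) {
            this.remoteInstanceId = remoteId;
            return instanceId;
        }
    }

    /** The initiator sends its id first; the acceptor replies with its own. */
    static void handshake(Endpoint initiator, Endpoint acceptor) {
        initiator.remoteInstanceId = acceptor.onHandshake(initiator.instanceId);
    }

    public static void main(String[] args) {
        Endpoint a = new Endpoint();
        Endpoint b = new Endpoint();
        handshake(a, b);
        System.out.println(a.remoteInstanceId.equals(b.instanceId)); // true
        System.out.println(b.remoteInstanceId.equals(a.instanceId)); // true
    }
}
```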





[jira] [Created] (IGNITE-14081) Networking module

2021-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14081:
--

 Summary: Networking module
 Key: IGNITE-14081
 URL: https://issues.apache.org/jira/browse/IGNITE-14081
 Project: Ignite
  Issue Type: New Feature
Reporter: Anton Kalashnikov








[jira] [Created] (IGNITE-14055) Deadlock in timeoutObjectProcessor between 'send message' & 'handshake timeout'

2021-01-25 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-14055:
--

 Summary: Deadlock in timeoutObjectProcessor between 'send message' & 'handshake timeout'
 Key: IGNITE-14055
 URL: https://issues.apache.org/jira/browse/IGNITE-14055
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


The cluster hangs after JVM pauses on one of the server nodes.
Scenario:
1. Start three server nodes with put operations using StartServerWithTxPuts.
2. Emulate JVM freezes on one server node by running the attached script:
{{sh freeze.sh}}
3. Wait until the script has finished.

Result:
The cluster hangs on tx put operations.

The first server node continuously prints:
{noformat}
[2020-11-03 09:36:01,719][INFO][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:57714]
[2020-11-03 09:36:01,720][INFO][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
[2020-11-03 09:36:01,922][INFO][grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:57716]
[2020-11-03 09:36:01,922][INFO][grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
[2020-11-03 09:36:02,124][INFO][grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:57718]
[2020-11-03 09:36:02,125][INFO][grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
[2020-11-03 09:36:02,326][INFO][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:57720]
[2020-11-03 09:36:02,327][INFO][grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
[2020-11-03 09:36:02,528][INFO][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:57722]
[2020-11-03 09:36:02,529][INFO][grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%][TcpCommunicationSpi] Received incoming connection from remote node while connecting to this node, rejecting [locNode=5defd32f-5bdb-4b9e-8a6e-5ee268edac42, locNodeOrder=1, rmtNode=07583a9d-36c8-4100-a69c-8cbd26ca82c9, rmtNodeOrder=3]
{noformat}
The second node keeps printing long-running transactions in the prepared state, 
ignoring the default tx timeout:

{noformat}
[2020-11-03 09:36:46,199][WARN][sys-#83%56b4f715-82d6-4d63-ba99-441ffcd673b4%][diagnostic] >>> Future [startTime=09:33:08.496, curTime=09:36:46.181, fut=GridNearTxFinishFuture [futId=425decc8571-4ce98554-8c56-4daf-a7a9-5b9bff52fa08, tx=GridNearTxLocal [mappings=IgniteTxMappingsSingleImpl [mapping=GridDistributedTxMapping [entries=LinkedHashSet [IgniteTxEntry [txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=833, val=833, hasValBytes=true], cacheId=-923393186], val=TxEntryValueHolder [val=CacheObjectByteArrayImpl [arrLen=1048576], op=CREATE], prevVal=TxEntryValueHolder [val=null, op=NOOP], oldVal=TxEntryValueHolder [val=null, op=NOOP], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=true, entry=GridDhtDetachedCacheEntry [super=GridDistributedCacheEntry [super=GridCacheMapEntry [key=KeyCacheObjectImpl [part=833, val=833, hasValBytes=true], val=null, ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=833, extras=null, flags=0]]], prepared=0, locked=false, nodeId=07583a9d-36c8-4100-a69c-8cbd26ca82c9, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion [topVer=215865159, order=1604385188157, nodeOrder=2]]], explicitLock=false, queryUpdate=false,
{noformat}

[jira] [Created] (IGNITE-13972) Clear the item id before moving the page to the reuse bucket

2021-01-11 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13972:
--

 Summary: Clear the item id before moving the page to the reuse  
bucket
 Key: IGNITE-13972
 URL: https://issues.apache.org/jira/browse/IGNITE-13972
 Project: Ignite
  Issue Type: Task
Reporter: Anton Kalashnikov


There is an assert - 'Incorrectly recycled pageId in reuse 
bucket:' (org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList#takeEmptyPage).
This assert sometimes fails. The reason is not clear, because the same 
condition is checked before putting the page into the reuse bucket. (Perhaps we 
have more than one link to this page?)

The idea is to reset the item id to 1 before putting the page into the reuse 
bucket, in order to reduce the set of possible invariants which can break this 
assert. This already holds for all data pages, but the item id can still be 
greater than 1 if it is not a data page (e.g. an inner page).

After that, we can change this assert from checking a range to checking 
equality to 1, which should theoretically help us detect the problem faster.

Maybe it is also not a bad idea to set the itemId to an impossible value (e.g. 
0 or 255). Then we can add an assert on every take from the free list which 
checks that the itemId is greater than 0; if that check fails, it means we have 
a link to a reuse-bucket page from a bucket which is not the reuse bucket, 
which is a bug.
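The proposal can be sketched with plain bit manipulation. Note the bit layout below is hypothetical (item id assumed in the top byte of the link), chosen only to illustrate the idea; it is not Ignite's actual PageIdUtils layout.

```java
/** Sketch of clearing the item id when recycling a page into the reuse bucket.
 * Hypothetical layout: top 8 bits of a link = item id, the rest = page id. */
public class ReuseBucketSketch {
    static final int ITEM_ID_SHIFT = 56;                    // hypothetical: item id in the top byte
    static final long PAGE_ID_MASK = (1L << ITEM_ID_SHIFT) - 1;

    static long itemId(long link) {
        return link >>> ITEM_ID_SHIFT;
    }

    /** Reset the item id to 1 before putting the page into the reuse bucket. */
    static long recycle(long link) {
        return (link & PAGE_ID_MASK) | (1L << ITEM_ID_SHIFT);
    }

    /** The proposed stricter assert: a recycled page must carry item id == 1 exactly. */
    static void assertRecycled(long link) {
        assert itemId(link) == 1 : "Incorrectly recycled pageId in reuse bucket: " + link;
    }

    public static void main(String[] args) {
        long innerPageLink = (5L << ITEM_ID_SHIFT) | 0xABCDL; // e.g. an inner page with item id 5
        long recycled = recycle(innerPageLink);
        assertRecycled(recycled);
        System.out.println(itemId(recycled));                 // 1
        System.out.println(recycled & PAGE_ID_MASK);          // page part preserved: 43981
    }
}
```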





[jira] [Created] (IGNITE-13843) Wrapper/Converter for primitive configuration

2020-12-11 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13843:
--

 Summary: Wrapper/Converter for primitive configuration 
 Key: IGNITE-13843
 URL: https://issues.apache.org/jira/browse/IGNITE-13843
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


Do we need the ability to use a complex type, such as InternetAddress, as a 
wrapper over some string property?





[jira] [Created] (IGNITE-13842) Creating the new configuration on old cluster

2020-12-11 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13842:
--

 Summary: Creating the new configuration on old cluster
 Key: IGNITE-13842
 URL: https://issues.apache.org/jira/browse/IGNITE-13842
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


Do we need the ability to create a new configuration/property on a working 
cluster?





[jira] [Created] (IGNITE-13841) Cluster bootstrapping

2020-12-11 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13841:
--

 Summary: Cluster bootstrapping 
 Key: IGNITE-13841
 URL: https://issues.apache.org/jira/browse/IGNITE-13841
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


What should cluster bootstrapping look like? What is the format of the files? 
What is the right moment for applying the configuration? What is the state of 
the cluster before it is applied?





[jira] [Created] (IGNITE-13840) Rethink API of Init*, change* classes

2020-12-11 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13840:
--

 Summary: Rethink API of Init*, change* classes
 Key: IGNITE-13840
 URL: https://issues.apache.org/jira/browse/IGNITE-13840
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


Right now, the API of the Init*, change* classes looks too heavy and contains a 
lot of boilerplate code. We need to think about how to simplify it.





[jira] [Created] (IGNITE-13837) Configuration initialization

2020-12-10 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13837:
--

 Summary: Configuration initialization
 Key: IGNITE-13837
 URL: https://issues.apache.org/jira/browse/IGNITE-13837
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


We need to think about what the first initialization of a node/cluster should 
look like. What is the format of the initial properties (json/hocon, etc.)? How 
should they be handled?





[jira] [Created] (IGNITE-13836) Multiple property roots support

2020-12-10 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13836:
--

 Summary: Multiple property roots support
 Key: IGNITE-13836
 URL: https://issues.apache.org/jira/browse/IGNITE-13836
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov


Right now, the Configurator is able to manage only one root, and it looks like 
that is not enough. The current idea is to provide the ability to maintain 
multiple property roots, which allows other modules to create their own roots 
as needed.

ex.:
 * indexing.query.bufferSize
 * persistence.pageSize

NB! There is no local/cluster root, because it looks like local/cluster 
shouldn't be part of the path at all. Perhaps it should be a storage-specific 
feature rather than a property-path-specific one.
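The multi-root idea can be sketched as a registry keyed by the first path segment. This is an illustrative model only (the class and API are invented, not the actual Configurator): each module registers its own root, and properties are addressed by fully qualified paths like the examples above.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a configurator maintaining several property roots
 * (e.g. "indexing", "persistence"). */
public class MultiRootConfigurator {
    private final Map<String, Map<String, Object>> roots = new HashMap<>();

    /** A module registers its own root on demand. */
    public void registerRoot(String root) {
        roots.putIfAbsent(root, new HashMap<>());
    }

    /** Set a property by a fully qualified path like "indexing.query.bufferSize". */
    public void set(String path, Object val) {
        int dot = path.indexOf('.');
        Map<String, Object> root = roots.get(path.substring(0, dot));
        if (root == null)
            throw new IllegalArgumentException("Unknown root: " + path);
        root.put(path.substring(dot + 1), val);
    }

    public Object get(String path) {
        int dot = path.indexOf('.');
        Map<String, Object> root = roots.get(path.substring(0, dot));
        return root == null ? null : root.get(path.substring(dot + 1));
    }

    public static void main(String[] args) {
        MultiRootConfigurator cfg = new MultiRootConfigurator();
        cfg.registerRoot("indexing");
        cfg.registerRoot("persistence");
        cfg.set("indexing.query.bufferSize", 1024);
        cfg.set("persistence.pageSize", 4096);
        System.out.println(cfg.get("indexing.query.bufferSize")); // 1024
    }
}
```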





[jira] [Created] (IGNITE-13720) Defragmentation parallelism implementation

2020-11-18 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13720:
--

 Summary: Defragmentation parallelism implementation
 Key: IGNITE-13720
 URL: https://issues.apache.org/jira/browse/IGNITE-13720
 Project: Ignite
  Issue Type: Sub-task
  Components: persistence
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Defragmentation is executed in a single thread right now. It makes sense to 
defragment the partitions of one group in parallel.

Several parameters will be added to the defragmentation configuration:
 * checkpointThreadPoolSize - the size of the thread pool used by the 
checkpointer for writing defragmented pages to disk.
 * executionThreadPoolSize - the size of the thread pool, i.e. the maximum 
number of partitions that can be defragmented at the same time.
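The executionThreadPoolSize semantics can be sketched with a fixed thread pool. This is an illustrative model only: defragmentPartition() is a hypothetical stand-in for the real per-partition work, and the names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch of defragmenting the partitions of one cache group in parallel;
 * the pool size bounds how many partitions are processed at once. */
public class ParallelDefragSketch {
    /** Hypothetical stand-in for the real per-partition defragmentation work. */
    static String defragmentPartition(int part) {
        return "part-" + part + ":done";
    }

    static List<String> defragmentGroup(int partitions, int executionThreadPoolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(executionThreadPoolSize);
        try {
            List<Future<String>> futs = new ArrayList<>();
            for (int p = 0; p < partitions; p++) {
                final int part = p;
                futs.add(pool.submit(() -> defragmentPartition(part)));
            }
            List<String> res = new ArrayList<>();
            for (Future<String> f : futs)
                res.add(f.get());         // preserve partition order in the result
            return res;
        }
        catch (Exception e) {
            throw new RuntimeException(e);
        }
        finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(defragmentGroup(4, 2)); // [part-0:done, part-1:done, part-2:done, part-3:done]
    }
}
```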





[jira] [Created] (IGNITE-13684) Rewrite PageIo resolver from static to explicit dependency

2020-11-06 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13684:
--

 Summary: Rewrite PageIo resolver from static to explicit dependency
 Key: IGNITE-13684
 URL: https://issues.apache.org/jira/browse/IGNITE-13684
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov
Assignee: Ivan Bessonov


Right now, ignite has a static pageIo resolver, which does not allow 
substituting a different implementation when needed. So the current 
implementation needs to be rewritten to make the resolver an explicit 
dependency.





[jira] [Created] (IGNITE-13683) Added MVCC validation to ValidateIndexesClosure

2020-11-06 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13683:
--

 Summary: Added MVCC validation to ValidateIndexesClosure
 Key: IGNITE-13683
 URL: https://issues.apache.org/jira/browse/IGNITE-13683
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov
Assignee: Semyon Danilov


MVCC index validation should be added to ValidateIndexesClosure.





[jira] [Created] (IGNITE-13682) Added generic to maintenance mode feature

2020-11-06 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13682:
--

 Summary: Added generic to maintenance mode feature
 Key: IGNITE-13682
 URL: https://issues.apache.org/jira/browse/IGNITE-13682
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


MaintenanceAction has no generic type parameter right now, which leads to 
raw-type parameterization problems.





[jira] [Created] (IGNITE-13681) Non markers checkpoint implementation

2020-11-06 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13681:
--

 Summary: Non markers checkpoint implementation
 Key: IGNITE-13681
 URL: https://issues.apache.org/jira/browse/IGNITE-13681
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


We need to implement a new version of the checkpoint which will be simpler than 
the current one. The main differences compared to the current checkpoint:
* It doesn't write anything to the WAL.
* It doesn't create checkpoint markers.
* It should be possible to configure a checkpoint listener on an exact data 
region only.

This checkpoint will be helpful for defragmentation and for recovery (it is not 
possible to use the current checkpoint during recovery right now).





[jira] [Created] (IGNITE-13569) disable archiving + walCompactionEnabled probably broke reading from wal on server restart

2020-10-09 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13569:
--

 Summary: disable archiving + walCompactionEnabled probably broke 
reading from wal on server restart
 Key: IGNITE-13569
 URL: https://issues.apache.org/jira/browse/IGNITE-13569
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


* Start a cluster with 4 server nodes
* Preload
* Start 4 clients
* Start transactional loading
* Wait 10 sec
While loading, for each node in the server nodes:
   Kill -9 the node
   Wait 20 sec
   Bring the node back
   Wait 20 sec

Wal + Wal_archive - lab40, lab41 - 
/storage/hdd/aromantsov/GG-18739

It looks like the node can't read all the WAL files that were generated before 
the node was brought back.

{noformat}
[12:50:27,001][SEVERE][wal-file-compressor-%null%-1-#71][FileWriteAheadLogManager]
 Compression of WAL segment [idx=0] was skipped due to unexpected error
class org.apache.ignite.IgniteCheckedException: WAL archive segment is missing: 
/storage/ssd/aromantsov/tiden/snapshots-190514-121520/test_pitr/node_1_1/.wal
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.body0(FileWriteAheadLogManager.java:2076)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.body(FileWriteAheadLogManager.java:2054)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
[12:50:27,001][SEVERE][wal-file-compressor-%null%-0-#69][FileWriteAheadLogManager]
 Compression of WAL segment [idx=2] was skipped due to unexpected error
class org.apache.ignite.IgniteCheckedException: WAL archive segment is missing: 
/storage/ssd/aromantsov/tiden/snapshots-190514-121520/test_pitr/node_1_1/0002.wal
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.body0(FileWriteAheadLogManager.java:2076)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.access$4800(FileWriteAheadLogManager.java:2019)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressor.body(FileWriteAheadLogManager.java:1995)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
[12:50:27,001][SEVERE][wal-file-compressor-%null%-3-#73][FileWriteAheadLogManager]
 Compression of WAL segment [idx=3] was skipped due to unexpected error
class org.apache.ignite.IgniteCheckedException: WAL archive segment is missing: 
/storage/ssd/aromantsov/tiden/snapshots-190514-121520/test_pitr/node_1_1/0003.wal
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.body0(FileWriteAheadLogManager.java:2076)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.body(FileWriteAheadLogManager.java:2054)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
[12:50:27,001][SEVERE][wal-file-compressor-%null%-2-#72][FileWriteAheadLogManager]
 Compression of WAL segment [idx=1] was skipped due to unexpected error
class org.apache.ignite.IgniteCheckedException: WAL archive segment is missing: 
/storage/ssd/aromantsov/tiden/snapshots-190514-121520/test_pitr/node_1_1/0001.wal
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.body0(FileWriteAheadLogManager.java:2076)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.body(FileWriteAheadLogManager.java:2054)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
[12:50:27,002][SEVERE][wal-file-compressor-%null%-1-#71][FileWriteAheadLogManager]
 Compression of WAL segment [idx=4] was skipped due to unexpected error
class org.apache.ignite.IgniteCheckedException: WAL archive segment is missing: 
/storage/ssd/aromantsov/tiden/snapshots-190514-121520/test_pitr/node_1_1/0004.wal
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.body0(FileWriteAheadLogManager.java:2076)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileCompressorWorker.body(FileWriteAheadLogManager.java:2054)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
[12:50:27,002][SEVERE][wal-file-compressor-%null%-0-#69][FileWriteAheadLogManager]
 

[jira] [Created] (IGNITE-13562) Prototype dynamic configuration

2020-10-08 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13562:
--

 Summary: Prototype dynamic configuration
 Key: IGNITE-13562
 URL: https://issues.apache.org/jira/browse/IGNITE-13562
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov
Assignee: Semyon Danilov


The main target is to add a new configuration module with a framework that 
allows us to create dynamic properties (node-local and cluster-wide?).

The framework should provide the following:
* Describing the rules for the schema from which public and private property 
classes are generated
* Generating the public and private classes from the schema
* Describing a view of the public POJO (update/insert/get) for interacting with 
properties in a type-safe way
* Converting a property from HOCON to the inner view







--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13511) Unified configuration

2020-10-02 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13511:
--

 Summary: Unified configuration
 Key: IGNITE-13511
 URL: https://issues.apache.org/jira/browse/IGNITE-13511
 Project: Ignite
  Issue Type: New Feature
Reporter: Anton Kalashnikov


https://cwiki.apache.org/confluence/display/IGNITE/IEP-55+Unified+Configuration





[jira] [Created] (IGNITE-13500) Checkpoint read lock fails if it is taken under write lock while the node is stopping

2020-09-30 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13500:
--

 Summary: Checkpoint read lock fails if it is taken under write lock while the node is stopping
 Key: IGNITE-13500
 URL: https://issues.apache.org/jira/browse/IGNITE-13500
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov


org.apache.ignite.internal.processors.cache.index.BasicIndexTest#testDynamicIndexesDropWithPersistence

{noformat}
[2020-09-30 
15:09:26,085][ERROR][db-checkpoint-thread-#371%index.BasicIndexTest0%][Checkpointer]
 Runtime error caught during grid runnable execution: GridWorker 
[name=db-checkpoint-thread, igniteInstanceName=index.BasicIndexTest0, 
finished=false, heartbeatTs=1601467766063, hashCode=963964001, 
interrupted=false, runner=db-checkpoint-thread-#371%index.BasicIndexTest0%]
class org.apache.ignite.IgniteException: Failed to perform cache update: node 
is stopping.
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:396)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:263)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
Caused by: class org.apache.ignite.IgniteException: Failed to perform cache 
update: node is stopping.
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:128)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1298)
at 
org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.removeDurableBackgroundTask(DurableBackgroundTasksProcessor.java:245)
at 
org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.onMarkCheckpointBegin(DurableBackgroundTasksProcessor.java:277)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointBegin(CheckpointWorkflow.java:274)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:387)
... 3 more
Caused by: class org.apache.ignite.internal.NodeStoppingException: Failed to 
perform cache update: node is stopping.
... 9 more
{noformat}





[jira] [Created] (IGNITE-13368) Speed-based throttling unexpectedly degraded to zero

2020-08-18 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13368:
--

 Summary: Speed-based throttling unexpectedly degraded to zero
 Key: IGNITE-13368
 URL: https://issues.apache.org/jira/browse/IGNITE-13368
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


New test failure in master: PagesWriteThrottleSmokeTest.testThrottle 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=2808794487465215609=%3Cdefault%3E=testDetails

Throttling degraded to zero.





[jira] [Created] (IGNITE-13207) Checkpointer code refactoring: Splitting GridCacheDatabaseSharedManager and Checkpointer

2020-07-02 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13207:
--

 Summary: Checkpointer code refactoring: Splitting GridCacheDatabaseSharedManager and Checkpointer
 Key: IGNITE-13207
 URL: https://issues.apache.org/jira/browse/IGNITE-13207
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov








[jira] [Created] (IGNITE-13080) Incorrect hash calculation for binaryObject in case of deduplication

2020-05-27 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13080:
--

 Summary: Incorrect hash calculation for binaryObject in case of 
deduplication
 Key: IGNITE-13080
 URL: https://issues.apache.org/jira/browse/IGNITE-13080
 Project: Ignite
  Issue Type: Bug
  Components: binary
Reporter: Anton Kalashnikov


Let's suppose we have the following two classes (the implementation of SubKey 
doesn't matter here):
{noformat}
public static class Key {
    private SubKey subKey;

    Key(SubKey subKey) {
        this.subKey = subKey;
    }
}

public static class Value {
    private SubKey subKey;
    private Key key;

    Value(SubKey subKey, Key key) {
        this.subKey = subKey;
        this.key = key;
    }
}
{noformat}
If subKey is the same in Key and Value, and we do the following:
{noformat}
SubKey subKey = new SubKey();
Key key = new Key(subKey);
Value value = new Value(subKey, key);

cache.put(key, value);

assert cache.size() == 1; // true

BinaryObject keyAsBinaryObject = cache.get(key).field("key");

cache.put(keyAsBinaryObject, value); // cache.size() should be 1, but it is 2

assert cache.size() == 1; // false: we now have two different keys, which is wrong
{noformat}
We get two different records instead of one.

Reason:
When we put the raw Key class into the cache, Ignite converts it to a binary 
object (literally a byte array), calculates the hash over this byte array, and 
stores the hash in the object.

When we put the raw Value class, the same thing happens, but since Value 
contains two references to the same object (subKey), deduplication occurs. The 
first time we meet an object, we save it as-is and remember its location; when 
we meet the same object again, instead of saving all its bytes we mark that 
place as a HANDLE and record only the offset at which the saved object can be 
found.
After that, when we extract an object (Key) from the BinaryObject of Value, we 
don't get a new BinaryObject with a new byte array; instead we get a 
BinaryObject backed by the same byte array, with an offset pointing at the 
requested value (Key). And when we try to store this object in the cache, 
Ignite does it incorrectly: first of all, the byte array contains a HANDLE mark 
with an offset instead of the real bytes of the inner object, which is already 
wrong, but on top of that we also calculate the hash incorrectly.
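The effect described above can be illustrated with a toy model of handle-based deduplication (a hypothetical sketch with made-up names and a one-byte offset; this is not Ignite's actual binary format): a repeated reference is written as a HANDLE marker plus an offset, so the resulting bytes, and therefore any hash computed over them, differ from the bytes of the same object written twice in full.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy model of handle-based deduplication; names and format are illustrative. */
public class ToyDedupSerializer {
    static final byte OBJ = 1;    // marker: full object bytes follow
    static final byte HANDLE = 2; // marker: offset of a previously written object follows

    /** Serializes each object in order, replacing repeated references with HANDLE + offset. */
    public static byte[] write(Object... objects) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Map<Object, Integer> seen = new IdentityHashMap<>(); // identity, not equals()
        for (Object o : objects) {
            Integer off = seen.get(o);
            if (off != null) {       // same reference seen before: write a handle only
                out.write(HANDLE);
                out.write(off);      // toy simplification: offsets fit in one byte
            } else {
                seen.put(o, out.size());
                out.write(OBJ);
                for (byte b : o.toString().getBytes())
                    out.write(b);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String subKey = "subKey";
        // The second reference to the SAME instance becomes a HANDLE...
        byte[] deduplicated = write(subKey, subKey);
        // ...while two distinct instances with equal content are written in full twice.
        byte[] plain = write("subKey", new String("subKey"));
        // The byte arrays differ, so a hash computed over them differs too.
        System.out.println(Arrays.equals(deduplicated, plain)); // false
    }
}
```

The same logical content thus produces two different byte sequences depending on reference sharing, which is exactly why a byte-array-based hash disagrees between the inlined and the HANDLE representation.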

Problem:
Right now, Ignite isn't able to store a BinaryObject that contains a HANDLE, 
and as I understand it, this is not easy to fix. Maybe it makes sense to 
explicitly forbid working with a BinaryObject like the one described above, but 
of course that is open to discussion.

Workaround:
We can change the order of the fields in Value, like this:
{noformat}
public static class Value {
    private Key key;
    private SubKey subKey;
}
{noformat}
After that, subKey is inlined inside key, and the subKey field of Value is 
represented as a HANDLE.

Also, we can rebuild the object like this:
{noformat}
keyAsBinaryObject.toBuilder().build();
{noformat}
During this procedure, all HANDLEs are restored to real objects.






[jira] [Created] (IGNITE-13041) PDS (Indexing) is failed with 137 code

2020-05-20 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13041:
--

 Summary: PDS (Indexing) is failed with 137 code
 Key: IGNITE-13041
 URL: https://issues.apache.org/jira/browse/IGNITE-13041
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Process exited with code 137

https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_PdsIndexing?branch=%3Cdefault%3E=overview=builds





[jira] [Created] (IGNITE-12817) Streamer threads don't update timestamp

2020-03-20 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12817:
--

 Summary: Streamer threads don't update timestamp
 Key: IGNITE-12817
 URL: https://issues.apache.org/jira/browse/IGNITE-12817
 Project: Ignite
  Issue Type: Bug
  Components: streaming
Reporter: Anton Kalashnikov


Scenario:
1. Start 3 data nodes 
2. Start load with a streamer on 6 clients
3. Start the data node restarter

Result:
Keys weren't loaded in all (1000) caches.
In the server node log I see:
{noformat}
[2019-07-17 16:52:36,881][ERROR][tcp-disco-msg-worker-#2] Blocked 
system-critical thread has been detected. This can lead to cluster-wide 
undefined behaviour [threadName=data-streamer-stripe-7, blockedFor=16s]
[2019-07-17 16:52:36,883][WARN ][tcp-disco-msg-worker-#2] Thread 
[name="data-streamer-stripe-7-#24", id=43, state=WAITING, blockCnt=111, 
waitCnt=169964]
[2019-07-17 16:52:36,885][ERROR][tcp-disco-msg-worker-#2] Critical system error 
detected. Will be handled accordingly to configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class 
o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-7, 
igniteInstanceName=null, finished=false, heartbeatTs=1563371540069]]]
org.apache.ignite.IgniteException: GridWorker [name=data-streamer-stripe-7, 
igniteInstanceName=null, finished=false, heartbeatTs=1563371540069]
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1838)
 ~[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1833)
 ~[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:230)
 ~[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297) 
~[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.lambda$new$0(ServerImpl.java:2804)
 ~[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7568)
 [ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2866)
 [ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) 
[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7506)
 [ignite-core-2.5.9.jar:2.5.9]
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) 
[ignite-core-2.5.9.jar:2.5.9]
{noformat}



The problem is in the data streamer threads: they should update their progress timestamps, but they don't.
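The blocked-worker detection shown in the log above relies on each worker periodically refreshing a heartbeat timestamp that a watchdog compares against a threshold. A minimal sketch of that contract follows (illustrative names, not Ignite's actual GridWorker/WorkersRegistry API):

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy worker/watchdog pair; names are illustrative, not Ignite's API. */
public class HeartbeatWatchdog {
    static class Worker {
        final AtomicLong heartbeatTs = new AtomicLong(System.currentTimeMillis());

        /** Long-running loops must call this; otherwise the watchdog flags the thread. */
        void onIdle() {
            heartbeatTs.set(System.currentTimeMillis());
        }
    }

    /** Returns true if the worker looks blocked: no heartbeat within the timeout. */
    static boolean isBlocked(Worker w, long timeoutMs) {
        return System.currentTimeMillis() - w.heartbeatTs.get() > timeoutMs;
    }

    public static void main(String[] args) throws InterruptedException {
        Worker streamerStripe = new Worker();
        Thread.sleep(50);                                   // worker "stuck": no onIdle() calls
        System.out.println(isBlocked(streamerStripe, 10));  // heartbeat is stale
        streamerStripe.onIdle();                            // heartbeat refreshed
        System.out.println(isBlocked(streamerStripe, 10));
    }
}
```

A streamer stripe that never calls its equivalent of onIdle() looks permanently blocked to the watchdog, which is what triggers the SYSTEM_WORKER_BLOCKED failure above even though the thread is making progress.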





[jira] [Created] (IGNITE-12801) Possible extra page release when the throttling and checkpoint threads store it concurrently

2020-03-18 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12801:
--

 Summary: Possible extra page release when the throttling and checkpoint threads store it concurrently
 Key: IGNITE-12801
 URL: https://issues.apache.org/jira/browse/IGNITE-12801
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


* User thread acquires a page on write release
* Checkpoint thread sees that the page was acquired
* Throttling thread sees that the page was acquired
* Checkpoint thread saves the page to disk and releases it
* Throttling thread sees that the page was already saved but nonetheless 
releases it again: this is not ok.
{noformat}
java.lang.AssertionError: null
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.copyPageForCheckpoint(PageMemoryImpl.java:1181)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.checkpointWritePage(PageMemoryImpl.java:1160)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$WriteCheckpointPages.writePages(GridCacheDatabaseSharedManager.java:4868)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$WriteCheckpointPages.run(GridCacheDatabaseSharedManager.java:4792)
... 3 common frames omitted
{noformat}
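The race above boils down to two threads (checkpointer and throttler) both believing they own the final release of a page. One common way to make the release idempotent is a compare-and-set guard, sketched here in plain Java (an illustrative fix pattern, not PageMemoryImpl's actual code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy page with a CAS guard so only one of two racing threads performs the release. */
public class PageReleaseGuard {
    private final AtomicBoolean released = new AtomicBoolean(false);
    private int releaseCount;

    /** Returns true only for the single caller that wins the CAS. */
    boolean tryRelease() {
        if (released.compareAndSet(false, true)) {
            releaseCount++;  // the actual unmap/free would happen here, exactly once
            return true;
        }
        return false;        // the page was already released by the other thread
    }

    int releaseCount() { return releaseCount; }

    public static void main(String[] args) throws InterruptedException {
        PageReleaseGuard page = new PageReleaseGuard();
        Thread checkpointer = new Thread(page::tryRelease);
        Thread throttler = new Thread(page::tryRelease);
        checkpointer.start(); throttler.start();
        checkpointer.join(); throttler.join();
        System.out.println(page.releaseCount()); // 1: the second release is a no-op
    }
}
```

With such a guard the losing thread simply observes that the page was already saved and backs off instead of releasing it a second time, which is what the AssertionError above catches.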





[jira] [Created] (IGNITE-12714) Absence of default value of IGNITE_SYSTEM_WORKER_BLOCKED_TIMEOUT

2020-02-21 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12714:
--

 Summary: Absence of default value of IGNITE_SYSTEM_WORKER_BLOCKED_TIMEOUT
 Key: IGNITE-12714
 URL: https://issues.apache.org/jira/browse/IGNITE-12714
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Scenario:
1. Start 3 data nodes 
2. Start load with a streamer on 6 clients
3. Start the data node restarter

Result:
Keys weren't loaded in all (1000) caches.
In the server node log I see:
{noformat}
[2019-07-17 16:52:36,881][ERROR][tcp-disco-msg-worker-#2] Blocked 
system-critical thread has been detected. This can lead to cluster-wide 
undefined behaviour [threadName=data-streamer-stripe-7, blockedFor=16s]
[2019-07-17 16:52:36,883][WARN ][tcp-disco-msg-worker-#2] Thread 
[name="data-streamer-stripe-7-#24", id=43, state=WAITING, blockCnt=111, 
waitCnt=169964]
[2019-07-17 16:52:36,885][ERROR][tcp-disco-msg-worker-#2] Critical system error 
detected. Will be handled accordingly to configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class 
o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-7, 
igniteInstanceName=null, finished=false, heartbeatTs=1563371540069]]]
org.apache.ignite.IgniteException: GridWorker [name=data-streamer-stripe-7, 
igniteInstanceName=null, finished=false, heartbeatTs=1563371540069]
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1838)
 ~[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1833)
 ~[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:230)
 ~[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297) 
~[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.lambda$new$0(ServerImpl.java:2804)
 ~[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7568)
 [ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2866)
 [ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) 
[ignite-core-2.5.9.jar:2.5.9]
at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7506)
 [ignite-core-2.5.9.jar:2.5.9]
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) 
[ignite-core-2.5.9.jar:2.5.9]
{noformat}

Logs: ftp://gg@172.25.2.50/poc-tester-logs/1723/log-2019-07-17-17-33-23
Log with dumps: 
ftp://gg@172.25.2.50/poc-tester-logs/1723/log-2019-07-17-17-33-23/servers/172.25.1.12/poc-tester-server-172.25.1.12-id-0-2019-07-17-16-46-58.log-1-2019-07-17.log.gz


*Solution:*
Increase the timeout to 2 minutes: 
org.apache.ignite.IgniteSystemProperties#IGNITE_SYSTEM_WORKER_BLOCKED_TIMEOUT
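The property is read by name, so the proposed default can also be overridden per JVM, either with -DIGNITE_SYSTEM_WORKER_BLOCKED_TIMEOUT=120000 on the command line or programmatically before the node starts. A minimal sketch (the two-minute value mirrors the proposed default; the snippet only sets the system property and does not start a node):

```java
/** Sets the blocked-worker timeout (in milliseconds) before Ignite would start. */
public class BlockedWorkerTimeout {
    public static void main(String[] args) {
        long twoMinutesMs = 2 * 60 * 1000L;
        // Same effect as passing -DIGNITE_SYSTEM_WORKER_BLOCKED_TIMEOUT=120000 to the JVM.
        System.setProperty("IGNITE_SYSTEM_WORKER_BLOCKED_TIMEOUT", Long.toString(twoMinutesMs));
        System.out.println(System.getProperty("IGNITE_SYSTEM_WORKER_BLOCKED_TIMEOUT")); // 120000
    }
}
```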





[jira] [Created] (IGNITE-12713) [Suite] PDS 1 flaky failed on TC

2020-02-21 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12713:
--

 Summary: [Suite] PDS 1 flaky failed on TC
 Key: IGNITE-12713
 URL: https://issues.apache.org/jira/browse/IGNITE-12713
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


IgnitePdsTestSuite: BPlusTreeReuseListPageMemoryImplTest.testIterateConcurrentPutRemove_2
IgnitePdsTestSuite: BPlusTreeReuseListPageMemoryImplTest.testMassiveRemove2_false





[jira] [Created] (IGNITE-12712) NPE in checkpoint thread

2020-02-21 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12712:
--

 Summary: NPE in checkpoint thread
 Key: IGNITE-12712
 URL: https://issues.apache.org/jira/browse/IGNITE-12712
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


An NPE occurred in the checkpoint thread (rarely reproducible):
{noformat}
[2019-11-04 20:54:58,018][INFO ][sys-#50][GridDhtPartitionsExchangeFuture] 
Received full message, will finish exchange 
[node=1784645d-3bef-44fe-8288-e0c16202f5e3, resVer=AffinityTopologyVersion 
[topVer=4, minorTopVer=9]]
[2019-11-04 20:54:58,023][INFO ][sys-#50][GridDhtPartitionsExchangeFuture] 
Finish exchange future [startVer=AffinityTopologyVersion [topVer=4, 
minorTopVer=9], resVer=AffinityTopologyVersion [topVer=4, minorTopVer=9], 
err=null]
[2019-11-04 20:54:58,029][INFO ][sys-#50][GridCacheProcessor] Finish proxy 
initialization, cacheName=SQL_PUBLIC_T8, 
localNodeId=5b153e14-70f2-4408-a125-584752532ebd
[2019-11-04 20:54:58,030][INFO ][sys-#50][GridDhtPartitionsExchangeFuture] 
Completed partition exchange [localNode=5b153e14-70f2-4408-a125-584752532ebd, 
exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion 
[topVer=4, minorTopVer=9], evt=DISCOVERY_CUSTOM_EVT, evtNode=TcpDiscoveryNode 
[id=1784645d-3bef-44fe-8288-e0c16202f5e3, consistentId=1, addrs=ArrayList 
[127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47500], discPort=47500, order=1, 
intOrder=1, lastExchangeTime=1572890071469, loc=false, 
ver=8.7.8#20191101-sha1:e344ed04, isClient=false], done=true, newCrdFut=null], 
topVer=AffinityTopologyVersion [topVer=4, minorTopVer=9]]
[2019-11-04 20:54:58,030][INFO ][sys-#50][GridDhtPartitionsExchangeFuture] 
Exchange timings [startVer=AffinityTopologyVersion [topVer=4, minorTopVer=9], 
resVer=AffinityTopologyVersion [topVer=4, minorTopVer=9], stage="Waiting in 
exchange queue" (0 ms), stage="Exchange parameters initialization" (0 ms), 
stage="Update caches registry" (0 ms), stage="Start caches" (52 ms), 
stage="Affinity initialization on cache group start" (1 ms), stage="Determine 
exchange type" (0 ms), stage="Preloading notification" (0 ms), stage="WAL 
history reservation" (0 ms), stage="Wait partitions release" (1 ms), 
stage="Wait partitions release latch" (5 ms), stage="Wait partitions release" 
(0 ms), stage="Restore partition states" (7 ms), stage="After states restored 
callback" (10 ms), stage="Waiting for Full message" (59 ms), stage="Affinity 
recalculation" (0 ms), stage="Full map updating" (4 ms), stage="Exchange done" 
(7 ms), stage="Total time" (146 ms)]
[2019-11-04 20:54:58,030][INFO ][sys-#50][GridDhtPartitionsExchangeFuture] 
Exchange longest local stages [startVer=AffinityTopologyVersion [topVer=4, 
minorTopVer=9], resVer=AffinityTopologyVersion [topVer=4, minorTopVer=9], 
stage="Affinity initialization on cache group start [grp=SQL_PUBLIC_T8]" (1 ms) 
(parent=Affinity initialization on cache group start), stage="Restore partition 
states [grp=SQL_PUBLIC_T8]" (6 ms) (parent=Restore partition states), 
stage="Restore partition states [grp=ignite-sys-cache]" (3 ms) (parent=Restore 
partition states), stage="Restore partition states [grp=cache_group_3]" (0 ms) 
(parent=Restore partition states)]
[2019-11-04 20:54:58,037][INFO 
][exchange-worker-#45][GridCachePartitionExchangeManager] Skipping rebalancing 
(nothing scheduled) [top=AffinityTopologyVersion [topVer=4, minorTopVer=9], 
force=false, evt=DISCOVERY_CUSTOM_EVT, 
node=1784645d-3bef-44fe-8288-e0c16202f5e3]
[2019-11-04 20:54:58,713][INFO 
][db-checkpoint-thread-#53][GridCacheDatabaseSharedManager] Checkpoint started 
[checkpointId=82969270-b1a5-4480-9513-3af65bab0e17, startPtr=FileWALPointer 
[idx=0, fileOff=3550077, len=12350], checkpointBeforeLockTime=8ms, 
checkpointLockWait=4ms, checkpointListenersExecuteTime=56ms, 
checkpointLockHoldTime=61ms, walCpRecordFsyncDuration=4ms, 
writeCheckpointEntryDuration=8ms, splitAndSortCpPagesDuration=1ms,  pages=178, 
reason='timeout']
[2019-11-04 20:54:58,914][INFO ][exchange-worker-#45][time] Started exchange 
init [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=10], crd=false, 
evt=DISCOVERY_CUSTOM_EVT, evtNode=1784645d-3bef-44fe-8288-e0c16202f5e3, 
customEvt=DynamicCacheChangeBatch 
[id=8b06d873e61-af9e27a6-8fe9-4da1-bc0a-d19cd0eabd36, reqs=ArrayList 
[DynamicCacheChangeRequest [cacheName=SQL_PUBLIC_T9, hasCfg=true, 
nodeId=1784645d-3bef-44fe-8288-e0c16202f5e3, clientStartOnly=false, stop=false, 
destroy=false, disabledAfterStartfalse]], exchangeActions=ExchangeActions 
[startCaches=[SQL_PUBLIC_T9], stopCaches=null, startGrps=[cache_group_4], 
stopGrps=[], resetParts=null, stateChangeRequest=null], startCaches=false], 
allowMerge=false]
[2019-11-04 20:54:58,930][INFO ][exchange-worker-#45][PageMemoryImpl] Started 
page memory [memoryAllocated=200.0 MiB, pages=49630, tableSize=3.9 MiB, 
checkpointBuffer=200.0 MiB]
[2019-11-04 

[jira] [Created] (IGNITE-12709) Server latch initialized after client latch in Zookeeper discovery

2020-02-20 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12709:
--

 Summary: Server latch initialized after client latch in Zookeeper 
discovery
 Key: IGNITE-12709
 URL: https://issues.apache.org/jira/browse/IGNITE-12709
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


The coordinator node missed the latch message from the client because it didn't 
receive the message that triggered the exchange. This leads to an infinite wait 
for an answer from the coordinator.

{noformat}

[2019-10-23 
12:49:42,110]\[ERROR]\[sys-#39470%continuous.GridEventConsumeSelfTest0%]\[GridIoManager]
 An error occurred processing the message \[msg=GridIoMessage \[plc=2, 
topic=TOPIC_EXCHANGE, topicOrd=31, ordered=fa
lse, timeout=0, skipOnTimeout=false, 
msg=org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.LatchAckMessage@7699f4f2],
 nodeId=857a40a8-f384-4740-816c-dd54d3a1].
class org.apache.ignite.IgniteException: Topology AffinityTopologyVersion 
\[topVer=54, minorTopVer=0] not found in discovery history ; consider 
increasing IGNITE_DISCOVERY_HISTORY_SIZE property. Current value is
-1
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.aliveNodesForTopologyVer(ExchangeLatchManager.java:292)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.getLatchCoordinator(ExchangeLatchManager.java:334)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.processAck(ExchangeLatchManager.java:379)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.lambda$new$0(ExchangeLatchManager.java:119)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1632)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1252)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:143)
at 
org.apache.ignite.internal.managers.communication.GridIoManager$8.execute(GridIoManager.java:1143)
at 
org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:50)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

[2019-10-23 12:50:02,106]\[WARN 
]\[exchange-worker-#39517%continuous.GridEventConsumeSelfTest1%]\[GridDhtPartitionsExchangeFuture]
 Unable to await partitions release latch within timeout: ClientLatch 
\[coordinator=ZookeeperClusterNode \[id=760ca6b5-f30b-4c40-81b1-5b602c20, 
addrs=\[127.0.0.1], order=1, loc=false, client=false], ackSent=true, 
super=CompletableLatch \[id=CompletableLatchUid \[id=exchange, 
topVer=AffinityTopologyVersion \[topVer=54, minorTopVer=0

[2019-10-23 12:50:02,192]\[WARN 
]\[exchange-worker-#39469%continuous.GridEventConsumeSelfTest0%]\[GridDhtPartitionsExchangeFuture]
 Unable to await partitions release latch within timeout: ServerLatch 
\[permits=1, pendingAcks=HashSet \[06c3094b-c1f3-4fe8-81e8-22cb6602], 
super=CompletableLatch \[id=CompletableLatchUid \[id=exchange, 
topVer=AffinityTopologyVersion \[topVer=54, minorTopVer=0

{noformat}

Reproduced by 
org.apache.ignite.internal.processors.continuous.GridEventConsumeSelfTest#testMultithreadedWithNodeRestart





[jira] [Created] (IGNITE-12653) Add example of baseline auto-adjust feature

2020-02-10 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12653:
--

 Summary: Add example of baseline auto-adjust feature
 Key: IGNITE-12653
 URL: https://issues.apache.org/jira/browse/IGNITE-12653
 Project: Ignite
  Issue Type: Task
  Components: examples
Reporter: Anton Kalashnikov


Work on Phase II of IEP-4 (Baseline topology) [1] has finished, so it makes 
sense to implement some examples of "Baseline auto-adjust" [2].

The "Baseline auto-adjust" feature automatically adjusts the baseline to the 
current topology after a join/leave event. It is needed because if a node 
leaves the grid and nobody changes the baseline manually, data can be lost 
(when more nodes leave the grid, depending on the backup factor), but 
permanently watching the grid is not always possible/desirable. In many cases, 
auto-adjusting the baseline after some timeout is very helpful.

Distributed metastore [3] (already done):

First of all, we need the ability to store configuration data consistently and 
cluster-wide. Ignite doesn't have a specific API for such configurations, and 
we don't want many similar implementations of the same feature in our code. 
After some thought, it was proposed to implement it as a kind of distributed 
metastorage that gives the ability to store any data in it.
The first implementation is based on the existing local metastorage API for 
persistent clusters (in-memory clusters will store data in memory). Write and 
remove operations use the Discovery SPI to send updates to the cluster, which 
guarantees the order of updates and the fact that all existing (alive) nodes 
have handled the update message. To find out which node has the latest data, 
the distributed metastorage keeps a "version" value. All update history up to 
some point in the past is stored along with the data, so when an outdated node 
connects to the cluster it receives all the missing updates and applies them 
locally. If there isn't enough history stored, or the joining node is clean, it 
receives a snapshot of the distributed metastorage instead, so there are no 
inconsistencies.
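The version-and-history mechanism described above can be modeled with a toy in-memory metastorage (a sketch under simplified assumptions: a single map, a monotonically increasing version, and an unbounded history used to catch up outdated joiners; none of this is Ignite's actual API):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy metastorage: every update bumps the version and is appended to history. */
public class ToyMetastorage {
    record Update(long ver, String key, String val) {}

    private final Map<String, String> data = new HashMap<>();
    private final List<Update> history = new ArrayList<>();
    private long ver;

    void write(String key, String val) {
        history.add(new Update(++ver, key, val));
        data.put(key, val);
    }

    long version() { return ver; }

    String get(String key) { return data.get(key); }

    /** Replays every update newer than our version, as a catching-up node would. */
    void catchUpFrom(ToyMetastorage upToDate) {
        for (Update u : upToDate.history)
            if (u.ver() > ver) {
                data.put(u.key(), u.val());
                ver = u.ver();
            }
    }

    public static void main(String[] args) {
        ToyMetastorage coordinator = new ToyMetastorage();
        ToyMetastorage joiner = new ToyMetastorage();
        coordinator.write("baselineAutoAdjustTimeout", "30000");
        coordinator.write("baselineAutoAdjustEnabled", "true");
        joiner.catchUpFrom(coordinator);                             // outdated node catches up
        System.out.println(joiner.version());                       // 2
        System.out.println(joiner.get("baselineAutoAdjustEnabled")); // true
    }
}
```

Comparing versions tells the joiner how far behind it is; replaying only the newer updates is what makes the catch-up incremental rather than a full snapshot transfer.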

Baseline auto-adjust: 

Main scenario: 
- There is a grid whose baseline equals the current topology 
- A new node joins the grid, or some node leaves (fails) it 
- The new mechanism detects this event and adds a baseline-change task to 
a queue with the configured timeout 
- If a new event happens before the baseline is changed, the task is 
removed from the queue and a new task is added 
- When the timeout expires, the task tries to set a new baseline 
corresponding to the current topology 
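The queue-with-timeout behavior in the scenario above is essentially a debounce: each topology event cancels the pending baseline change and schedules a new one. A minimal sketch with a ScheduledExecutorService (illustrative only, not Ignite's implementation):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Debounced baseline adjustment: only the last event within the timeout fires. */
public class BaselineDebouncer {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger adjustments = new AtomicInteger();
    private ScheduledFuture<?> pending;
    private final long timeoutMs;

    BaselineDebouncer(long timeoutMs) { this.timeoutMs = timeoutMs; }

    /** Called on every node join/leave: replaces the pending task with a fresh one. */
    synchronized void onTopologyEvent() {
        if (pending != null)
            pending.cancel(false);
        // Stand-in for "set new baseline to current topology": just count executions.
        pending = scheduler.schedule(adjustments::incrementAndGet, timeoutMs, TimeUnit.MILLISECONDS);
    }

    int adjustments() { return adjustments.get(); }

    void shutdown() { scheduler.shutdown(); }

    public static void main(String[] args) throws InterruptedException {
        BaselineDebouncer d = new BaselineDebouncer(100);
        d.onTopologyEvent();        // node joins
        d.onTopologyEvent();        // another node joins: previous task is cancelled
        Thread.sleep(300);          // wait past the timeout
        System.out.println(d.adjustments()); // 1: baseline adjusted once, not twice
        d.shutdown();
    }
}
```

Cancelling the pending task on every new event is what keeps the baseline from flapping while a rolling restart is in progress; only a quiet period of timeoutMs triggers the actual adjustment.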

First of all, we need to add two parameters [4]: 
- baselineAutoAdjustEnabled - enables/disables the "Baseline auto-adjust" 
feature 
- baselineAutoAdjustTimeout - timeout after which the baseline should be 
changed 

These parameters are cluster-wide and can be changed at runtime because they 
are based on the "Distributed metastore". 

Restrictions: 
- This mechanism handles events only on an active grid 
- Enabled by default for in-memory nodes; disabled for persistent nodes
- If lost partitions are detected, this feature is disabled 
- If the baseline is adjusted manually while baselineNodes != gridNodes, an 
exception is thrown

[1] 
https://cwiki.apache.org/confluence/display/IGNITE/IEP-4+Baseline+topology+for+caches
[2] https://issues.apache.org/jira/browse/IGNITE-8571
[3] https://issues.apache.org/jira/browse/IGNITE-10640
[4] https://issues.apache.org/jira/browse/IGNITE-8573





[jira] [Created] (IGNITE-12652) Add example of failure handling

2020-02-10 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12652:
--

 Summary: Add example of failure handling
 Key: IGNITE-12652
 URL: https://issues.apache.org/jira/browse/IGNITE-12652
 Project: Ignite
  Issue Type: Task
  Components: examples
Reporter: Anton Kalashnikov


Ignite has the following feature: 
https://apacheignite.readme.io/docs/critical-failures-handling, but there is 
no example of how to use it correctly, so it would be good to add some.

Also, Ignite has a DiagnosticProcessor which is invoked when the failure 
handler is triggered. Maybe it is a good idea to add some samples of diagnostic 
work to this example.





[jira] [Created] (IGNITE-12647) Get rid of IGFS and Hadoop Accelerator

2020-02-10 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12647:
--

 Summary: Get rid of IGFS and Hadoop Accelerator
 Key: IGNITE-12647
 URL: https://issues.apache.org/jira/browse/IGNITE-12647
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


There is no single committer who maintains these integrations; they are no 
longer tested and, moreover, the community stopped providing the binaries as of 
the Ignite 2.6.0 release (look for the In-Memory Hadoop Accelerator table).

So it makes sense to get rid of IGFS and the Hadoop Accelerator.





[jira] [Created] (IGNITE-12631) Incorrect rewriting of the WAL record type in marshalled mode during iteration

2020-02-06 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12631:
--

 Summary: Incorrect rewriting of the WAL record type in marshalled mode during iteration 
 Key: IGNITE-12631
 URL: https://issues.apache.org/jira/browse/IGNITE-12631
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


The failure happens on iteration over a WAL record that was written in 
marshalled mode, in the case when RecordType#ordinal != RecordType#index
{noformat}
[16:46:58,800][SEVERE][pitr-ctx-exec-#399][GridRecoveryProcessor] Fail scan wal 
log for recovery localNodeConstId=node_1_1
 class org.apache.ignite.IgniteCheckedException: Failed to read WAL record at 
position: 45905 size: -1
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.handleRecordException(AbstractWalRecordsIterator.java:292)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.handleRecordException(FileWriteAheadLogManager.java:3302)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advanceRecord(AbstractWalRecordsIterator.java:258)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advance(AbstractWalRecordsIterator.java:154)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.onNext(AbstractWalRecordsIterator.java:123)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.onNext(AbstractWalRecordsIterator.java:52)
at 
org.apache.ignite.internal.util.GridCloseableIteratorAdapter.nextX(GridCloseableIteratorAdapter.java:41)
at 
org.apache.ignite.internal.util.lang.GridIteratorAdapter.next(GridIteratorAdapter.java:35)
... 7 more
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to read WAL 
record at position: 45905 size: -1
at 
org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readWithCrc(RecordV1Serializer.java:394)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer.readRecord(RecordV2Serializer.java:235)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advanceRecord(AbstractWalRecordsIterator.java:243)
... 12 more
Caused by: java.io.IOException: Unknown record type: null, expected pointer 
[idx=2, offset=45905]
at 
org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer$2.readWithHeaders(RecordV2Serializer.java:122)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readWithCrc(RecordV1Serializer.java:373)
... 14 more
Suppressed: class 
org.apache.ignite.internal.processors.cache.persistence.wal.crc.IgniteDataIntegrityViolationException:
 val: 1445348818 writtenCrc: 374280888
at 
org.apache.ignite.internal.processors.cache.persistence.wal.io.FileInput$Crc32CheckingFileInput.close(FileInput.java:106)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readWithCrc(RecordV1Serializer.java:380)
... 14 more
{noformat}
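The mismatch is easy to model with a plain-Java sketch (the enum below is hypothetical and only mirrors the ordinal-vs-index divergence of WALRecord.RecordType, not its real constants):

```java
// Hypothetical sketch: an enum whose declaration order (ordinal) diverges
// from its stable on-disk serialization index.
enum SketchRecordType {
    PAGE_RECORD(0), CHECKPOINT_RECORD(2), DATA_RECORD(1);

    private final int index;

    SketchRecordType(int index) { this.index = index; }

    /** Stable on-disk index. */
    int index() { return index; }

    /** Resolve a type by its on-disk index; null models "Unknown record type: null". */
    static SketchRecordType fromIndex(int idx) {
        for (SketchRecordType t : values())
            if (t.index == idx)
                return t;
        return null;
    }
}

class RecordTypeMismatch {
    // The modelled bug: the writer rewrites the record type using ordinal()
    // while the reader resolves it by index(), so a wrong (or null) type
    // comes back whenever ordinal != index.
    static SketchRecordType rewriteWithOrdinalThenRead(SketchRecordType t) {
        int written = t.ordinal();          // incorrect: should be t.index()
        return SketchRecordType.fromIndex(written);
    }
}
```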



--


[jira] [Created] (IGNITE-12594) Deadlock between GridCacheDataStore#purgeExpiredInternal and GridNearTxLocal#enlistWriteEntry

2020-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12594:
--

 Summary: Deadlock between GridCacheDataStore#purgeExpiredInternal 
and GridNearTxLocal#enlistWriteEntry
 Key: IGNITE-12594
 URL: https://issues.apache.org/jira/browse/IGNITE-12594
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


The deadlock is reproduced occasionally in the PDS3 suite and can be seen in the 
thread dump below.
One thread attempts to unwind evicts, acquires the checkpoint read lock and then 
locks {{GridCacheMapEntry}}. Another thread does {{GridCacheMapEntry#unswap}}, 
determines that the entry is expired and acquires the checkpoint read lock to 
remove the entry from the store.
We should not acquire the checkpoint read lock while holding a locked 
{{GridCacheMapEntry}}.
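Reduced to plain java.util.concurrent, the intended lock ordering looks roughly like this (names are illustrative; the read-write lock stands in for GridCacheDatabaseSharedManager#checkpointReadLock and the plain lock for GridCacheMapEntry#lockEntry):

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the lock-order rule: both code paths must acquire
// the checkpoint read lock BEFORE the per-entry lock, never the other way.
class LockOrderSketch {
    static final ReentrantReadWriteLock checkpoint = new ReentrantReadWriteLock();
    static final ReentrantLock entry = new ReentrantLock();

    // Path 1 (unwind evicts): checkpoint read lock -> entry lock.
    static void unwindEvicts() {
        checkpoint.readLock().lock();
        try {
            entry.lock();
            try { /* evict the entry */ } finally { entry.unlock(); }
        } finally {
            checkpoint.readLock().unlock();
        }
    }

    // Path 2 (unswap of an expired entry), shown with the same ordering.
    // The buggy version locked the entry first and took the checkpoint read
    // lock inside it, which deadlocks once the checkpointer and another
    // entry-locking thread get involved.
    static void unswapExpired() {
        checkpoint.readLock().lock();
        try {
            entry.lock();
            try { /* remove the expired entry from the store */ } finally { entry.unlock(); }
        } finally {
            checkpoint.readLock().unlock();
        }
    }
}
```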

{code:java}Thread [name="updater-1", id=29900, state=WAITING, blockCnt=2, 
waitCnt=4450]
Lock 
[object=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@2fc51685, 
ownerName=null, ownerId=-1]
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
at 
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
at 
o.a.i.i.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1632)
   <- CP read lock
at 
o.a.i.i.processors.cache.GridCacheMapEntry.onExpired(GridCacheMapEntry.java:4081)
at 
o.a.i.i.processors.cache.GridCacheMapEntry.unswap(GridCacheMapEntry.java:559)
at 
o.a.i.i.processors.cache.GridCacheMapEntry.unswap(GridCacheMapEntry.java:519)   
   <- locked entry
at 
o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.enlistWriteEntry(GridNearTxLocal.java:1437)
at 
o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.enlistWrite(GridNearTxLocal.java:1303)
at 
o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.putAllAsync0(GridNearTxLocal.java:957)
at 
o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.putAllAsync(GridNearTxLocal.java:491)
at 
o.a.i.i.processors.cache.GridCacheAdapter$29.inOp(GridCacheAdapter.java:2526)
at 
o.a.i.i.processors.cache.GridCacheAdapter$SyncInOp.op(GridCacheAdapter.java:4727)
at 
o.a.i.i.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:3740)
at 
o.a.i.i.processors.cache.GridCacheAdapter.putAll0(GridCacheAdapter.java:2524)
at 
o.a.i.i.processors.cache.GridCacheAdapter.putAll(GridCacheAdapter.java:2513)
at 
o.a.i.i.processors.cache.IgniteCacheProxyImpl.putAll(IgniteCacheProxyImpl.java:1264)
at 
o.a.i.i.processors.cache.GatewayProtectedCacheProxy.putAll(GatewayProtectedCacheProxy.java:863)
at 
o.a.i.i.processors.cache.persistence.IgnitePdsContinuousRestartTest$1.call(IgnitePdsContinuousRestartTest.java:291)
at o.a.i.testframework.GridTestThread.run(GridTestThread.java:83)

Locked synchronizers:
java.util.concurrent.locks.ReentrantLock$NonfairSync@762613f7


Thread 
[name="sys-stripe-0-#24086%persistence.IgnitePdsContinuousRestartTestWithExpiryPolicy0%",
 id=29617, state=WAITING, blockCnt=2, waitCnt=65381]
Lock [object=java.util.concurrent.locks.ReentrantLock$NonfairSync@762613f7, 
ownerName=updater-1, ownerId=29900]
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at 
java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)   
<- lock entry
at 
o.a.i.i.processors.cache.GridCacheMapEntry.lockEntry(GridCacheMapEntry.java:5017)
at 
o.a.i.i.processors.cache.GridCacheMapEntry.markObsoleteVersion(GridCacheMapEntry.java:2799)
at 
o.a.i.i.processors.cache.distributed.dht.topology.GridDhtLocalPartition.removeVersionedEntry(GridDhtLocalPartition.java:392)
at 

[jira] [Created] (IGNITE-12593) Corruption of B+Tree caused by byte array values and TTL

2020-01-28 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12593:
--

 Summary: Corruption of B+Tree caused by byte array values and TTL
 Key: IGNITE-12593
 URL: https://issues.apache.org/jira/browse/IGNITE-12593
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


It seems that the following set of parameters may lead to corruption of the 
B+Tree:
 - persistence is enabled
 - TTL is enabled 
 - Expiry policy - AccessedExpiryPolicy 1 sec.
 - cache value type is byte[]
 - all caches belong to the same cache group
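
For reference, a configuration sketch wiring those parameters together (cache and group names are illustrative; the configuration classes and AccessedExpiryPolicy are the standard public API):

```java
import java.util.concurrent.TimeUnit;
import javax.cache.expiry.AccessedExpiryPolicy;
import javax.cache.expiry.Duration;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class TtlCorruptionRepro {
    static IgniteConfiguration config() {
        // Persistence enabled in the default data region.
        DataStorageConfiguration storage = new DataStorageConfiguration()
            .setDefaultDataRegionConfiguration(
                new DataRegionConfiguration().setPersistenceEnabled(true));

        // byte[]-valued caches with AccessedExpiryPolicy(1 sec), same cache group.
        CacheConfiguration<Integer, byte[]> cache1 = new CacheConfiguration<Integer, byte[]>("cache1")
            .setGroupName("sameGroup")
            .setEagerTtl(true)
            .setExpiryPolicyFactory(AccessedExpiryPolicy.factoryOf(new Duration(TimeUnit.SECONDS, 1)));
        CacheConfiguration<Integer, byte[]> cache2 = new CacheConfiguration<Integer, byte[]>("cache2")
            .setGroupName("sameGroup")
            .setEagerTtl(true)
            .setExpiryPolicyFactory(AccessedExpiryPolicy.factoryOf(new Duration(TimeUnit.SECONDS, 1)));

        return new IgniteConfiguration()
            .setDataStorageConfiguration(storage)
            .setCacheConfiguration(cache1, cache2);
    }
}
```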

Example of the stack trace:
{code:java}
[2019-07-16 
21:13:19,288][ERROR][sys-stripe-2-#46%db.IgnitePdsWithTtlDeactivateOnHighloadTest1%][IgniteTestResources]
 Critical system error detected. Will be handled accordingly to configured 
handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler 
[ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, 
SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext 
[type=CRITICAL_ERROR, err=class 
o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is 
corrupted [pages(groupId, pageId)=[IgniteBiTuple [val1=-1237460590, 
val2=281586645860358]], msg=Runtime failure on search row: SearchRow 
[key=KeyCacheObjectImpl [part=26, val=378, hasValBytes=true], hash=378, 
cacheId=-1806498247
class 
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
 B+Tree is corrupted [pages(groupId, pageId)=[IgniteBiTuple [val1=-1237460590, 
val2=281586645860358]], msg=Runtime failure on search row: SearchRow 
[key=KeyCacheObjectImpl [part=26, val=378, hasValBytes=true], hash=378, 
cacheId=-1806498247]]
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.corruptedTreeException(BPlusTree.java:5910)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1859)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1662)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1645)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:2410)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:445)
at 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2309)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2570)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:2030)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1848)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1668)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3235)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:139)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:273)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:268)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1141)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:392)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:318)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:109)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:308)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1558)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1186)
at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:125)
at 

[jira] [Created] (IGNITE-12463) Inconsistency of checkpoint progress future with its state

2019-12-17 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12463:
--

 Summary: Inconsistency of checkpoint progress future with its 
state
 Key: IGNITE-12463
 URL: https://issues.apache.org/jira/browse/IGNITE-12463
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


The checkpoint futures (start, finish) need to be reorganized so that they 
match the corresponding checkpoint states.



--


[jira] [Created] (IGNITE-12460) Cluster fails to find the node by consistent ID

2019-12-17 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12460:
--

 Summary: Cluster fails to find the node by consistent ID
 Key: IGNITE-12460
 URL: https://issues.apache.org/jira/browse/IGNITE-12460
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Steps to reproduce 1:

* Start cluster of three nodes
* Navigate to Baseline screen
* Start one more node
* Include it into baseline
* Hit the 'Save' button

Expected:

* Success alert, node enters baseline

Actual:

* An exception is thrown and displayed

Steps to reproduce 2:

# Start topology with 2 nodes.
# Activate cluster.
# Start third node.
# Stop second node.
# Try to add third node to baseline in Web console.

Also reproduced with the *control.sh --baseline set* command.



--


[jira] [Created] (IGNITE-12459) Searching checkpoint record in WAL doesn't work with segment compaction

2019-12-17 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12459:
--

 Summary: Searching checkpoint record in WAL doesn't work with 
segment compaction
 Key: IGNITE-12459
 URL: https://issues.apache.org/jira/browse/IGNITE-12459
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


During iteration over the WAL we have two invariants about the resulting tuple:
* WALPointer is equal to WALRecord.position() when the segment is uncompacted
* WALPointer is not equal to WALRecord.position() when the segment is compacted
Unfortunately, the second invariant is broken in 
FileWriteAheadLogManager#read(WALPointer ptr).



--


[jira] [Created] (IGNITE-12227) Default auto-adjust baseline enabled flag calculated incorrectly in some cases

2019-09-24 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12227:
--

 Summary: Default auto-adjust baseline enabled flag calculated 
incorrectly in some cases
 Key: IGNITE-12227
 URL: https://issues.apache.org/jira/browse/IGNITE-12227
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


baselineAutoAdjustEnabled can end up different on different nodes because the 
default value is calculated locally on each node and takes only the local 
configuration into account. This issue can happen for the following reasons:
*  If the IGNITE_BASELINE_AUTO_ADJUST_ENABLED flag is set to a different value 
on different nodes, the cluster hangs because the baseline calculation finishes 
in an unpredictable state on each node.
* If the cluster is in mixed mode (both in-memory and persistent nodes), the 
flag is sometimes set to a different value because the calculation does not 
consider the remote nodes' configuration.

Possible solution (both points are required):
* Get rid of IGNITE_BASELINE_AUTO_ADJUST_ENABLED and replace it with an 
explicit call of IgniteCluster#baselineAutoAdjustEnabled where required (tests 
only).
* Calculate the default value on the first started node as early as possible 
(instead of on activation) and always store this value in the distributed 
metastorage (unlike what happens now). This means that instead of awaiting 
activation, the default value is calculated by the first started node.




--


[jira] [Created] (IGNITE-12179) Test and javadoc fixes

2019-09-17 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12179:
--

 Summary: Test and javadoc fixes
 Key: IGNITE-12179
 URL: https://issues.apache.org/jira/browse/IGNITE-12179
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Some javadoc package descriptions are missing:
* org.apache.ignite.spi.communication.tcp.internal
* org.apache.ignite.spi.discovery.zk
* org.apache.ignite.spi.discovery.zk.internal
* org.apache.ignite.ml.structures.partition
* org.gridgain.grid.persistentstore.snapshot.file.copy
Unclear CLEANUP_RESTARTING_CACHES command in snapshot utility
Unclear error when connecting to a secure cluster (SSL + Auth)
Update log message to avoid confusion for a user

*.testTtlNoTx flaky failed on TC
TcpCommunicationSpiFreezingClientTest failed
TcpCommunicationSpiFaultyClientSslTest.testNotAcceptedConnection failed
testCacheIdleVerifyPrintLostPartitions failed



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (IGNITE-12154) Test testCheckpointFailBeforeMarkEntityWrite fails in the compression suite

2019-09-10 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12154:
--

 Summary: Test testCheckpointFailBeforeMarkEntityWrite fails in the 
compression suite
 Key: IGNITE-12154
 URL: https://issues.apache.org/jira/browse/IGNITE-12154
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


CheckpointFailBeforeWriteMarkTest.testCheckpointFailBeforeMarkEntityWrite


https://ci.ignite.apache.org/viewLog.html?buildId=4584051=IgniteTests24Java8_DiskPageCompressions=buildResultsDiv_IgniteTests24Java8=%3Cdefault%3E



--


[jira] [Created] (IGNITE-12121) Double checkpoint triggering due to incorrect place of update current checkpoint

2019-08-29 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-12121:
--

 Summary: Double checkpoint triggering due to incorrect place of 
update current checkpoint
 Key: IGNITE-12121
 URL: https://issues.apache.org/jira/browse/IGNITE-12121
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


A double checkpoint is triggered because the current checkpoint is updated in 
the wrong place. This can lead to two checkpoints running one after another if 
the checkpoint trigger was 'too many dirty pages'.



--


[jira] [Created] (IGNITE-11982) Fix bugs of pds

2019-07-15 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11982:
--

 Summary: Fix bugs of pds
 Key: IGNITE-11982
 URL: https://issues.apache.org/jira/browse/IGNITE-11982
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Fixed PDS crashes:
* Fail during logical recovery
* JVM crash in all compatibility LFS tests
* WAL segments serialization problem
* Unable to read last WAL record after crash during checkpoint
* Node failed on detecting storage block size if page compression enabled on 
many caches
* Can not change baseline for in-memory cluster
* SqlFieldsQuery DELETE FROM causes JVM crash
* Fixed IgniteCheckedException: Compound exception for CountDownFuture.

Fixed tests:
* WalCompactionAndPageCompressionTest
* IgnitePdsRestartAfterFailedToWriteMetaPageTest.test
 * GridPointInTimeRecoveryRebalanceTest.testRecoveryNotFailsIfWalSomewhereEnab
* 
IgniteClusterActivateDeactivateTest.testDeactivateSimple_5_Servers_5_Clients_Fro
* IgniteCacheReplicatedQuerySelfTest.testNodeLeft 
* .NET tests

Optimizations:
* Replace TcpDiscoveryNode with nodeId in TcpDiscoveryMessages
* Failures to deserialize discovery data should be handled by a failure handler
* Optimize GridToStringBuilder



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (IGNITE-11969) Incorrect DefaultConcurrencyLevel value in .net test

2019-07-09 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11969:
--

 Summary: Incorrect DefaultConcurrencyLevel value in .net test
 Key: IGNITE-11969
 URL: https://issues.apache.org/jira/browse/IGNITE-11969
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Incorrect DefaultConcurrencyLevel value in the .NET test after the default 
configuration in Java was changed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11892) Incorrect assert in wal scanner test

2019-06-04 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11892:
--

 Summary: Incorrect assert in wal scanner test
 Key: IGNITE-11892
 URL: https://issues.apache.org/jira/browse/IGNITE-11892
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


https://ci.ignite.apache.org/viewLog.html?buildId=4038516=IgniteTests24Java8_Pds2

{noformat}
junit.framework.AssertionFailedError: Next WAL record :: Record : PAGE_RECORD - 
Unable to convert to string representation.
at 
org.apache.ignite.internal.processors.cache.persistence.wal.scanner.WalScannerTest.shouldDumpToFileFoundRecord(WalScannerTest.java:254)
{noformat}



--


[jira] [Created] (IGNITE-11818) Support JMX/control.sh for debug page info

2019-04-26 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11818:
--

 Summary: Support JMX/control.sh for debug page info
 Key: IGNITE-11818
 URL: https://issues.apache.org/jira/browse/IGNITE-11818
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Support JMX/control.sh for debug page info



--


[jira] [Created] (IGNITE-11816) Debug processor for dump page history info

2019-04-26 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11816:
--

 Summary: Debug processor for dump page history info
 Key: IGNITE-11816
 URL: https://issues.apache.org/jira/browse/IGNITE-11816
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Debug processor for dump page history info



--


[jira] [Created] (IGNITE-11782) WAL iterator for collect per-pageId info

2019-04-18 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11782:
--

 Summary: WAL iterator for collect per-pageId info
 Key: IGNITE-11782
 URL: https://issues.apache.org/jira/browse/IGNITE-11782
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Implement a WAL iterator to collect per-pageId info (page is root)



--


[jira] [Created] (IGNITE-11678) Forbidding joining persistence node to in-memory cluster

2019-04-03 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11678:
--

 Summary: Forbidding joining persistence node to in-memory cluster
 Key: IGNITE-11678
 URL: https://issues.apache.org/jira/browse/IGNITE-11678
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Forbid a persistent node from joining an in-memory cluster when baseline 
auto-adjust is enabled and the timeout equals 0.



--


[jira] [Created] (IGNITE-11650) Communication worker doesn't kick client node after expired idleConnTimeout

2019-03-28 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11650:
--

 Summary: Communication worker doesn't kick client node after 
expired idleConnTimeout
 Key: IGNITE-11650
 URL: https://issues.apache.org/jira/browse/IGNITE-11650
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Reproduced by TcpCommunicationSpiFreezingClientTest.testFreezingClient
{noformat}
java.lang.AssertionError: Client node must be kicked from topology
at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.ignite.testframework.junits.JUnitAssertAware.fail(JUnitAssertAware.java:49)
at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpiFreezingClientTest.testFreezingClient(TcpCommunicationSpiFreezingClientTest.java:122)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2102)
{noformat}



--


[jira] [Created] (IGNITE-11627) Test CheckpointFreeListTest.testRestoreFreeListCorrectlyAfterRandomStop always fails in DiskCompression suite

2019-03-26 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11627:
--

 Summary: Test 
CheckpointFreeListTest.testRestoreFreeListCorrectlyAfterRandomStop always fails 
in DiskCompression suite
 Key: IGNITE-11627
 URL: https://issues.apache.org/jira/browse/IGNITE-11627
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=5828425958400232265=testDetails_IgniteTests24Java8=%3Cdefault%3E



--


[jira] [Created] (IGNITE-11605) Incorrect check condition in BinaryTypeRegistrationTest.shouldSendOnlyOneMetadataMessage

2019-03-22 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11605:
--

 Summary: Incorrect check condition in 
BinaryTypeRegistrationTest.shouldSendOnlyOneMetadataMessage
 Key: IGNITE-11605
 URL: https://issues.apache.org/jira/browse/IGNITE-11605
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


BinaryTypeRegistrationTest.shouldSendOnlyOneMetadataMessage is flaky.
{noformat}
java.lang.AssertionError: 
Expected :1
Actual :2


at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:94)
at 
org.apache.ignite.internal.processors.cache.BinaryTypeRegistrationTest.shouldSendOnlyOneMetadataMessage(BinaryTypeRegistrationTest.java:106)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2102)
at java.lang.Thread.run(Thread.java:748)
{noformat}



--


[jira] [Created] (IGNITE-11590) NPE during onKernalStop in mvcc processor

2019-03-21 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11590:
--

 Summary: NPE during onKernalStop in mvcc processor 
 Key: IGNITE-11590
 URL: https://issues.apache.org/jira/browse/IGNITE-11590
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


IgniteProjectionStartStopRestartSelfTest#testStopNodesByIds
{noformat}
java.lang.NullPointerException
at 
java.util.concurrent.ConcurrentHashMap.replaceNode(ConcurrentHashMap.java:1106)
at 
java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:1097)
at 
org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onCoordinatorFailed(MvccProcessorImpl.java:527)
at 
org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.onKernalStop(MvccProcessorImpl.java:459)
at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2335)
at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2283)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2570)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2533)
at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:330)
at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:297)
at org.apache.ignite.Ignition.stop(Ignition.java:200)
at 
org.apache.ignite.internal.IgniteProjectionStartStopRestartSelfTest.afterTest(IgniteProjectionStartStopRestartSelfTest.java:190)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.tearDown(GridAbstractTest.java:1804)
at 
org.apache.ignite.testframework.junits.JUnit3TestLegacySupport.runTestCase(JUnit3TestLegacySupport.java:70)
at 
org.apache.ignite.testframework.junits.GridAbstractTest$2.evaluate(GridAbstractTest.java:185)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.evaluateInsideFixture(GridAbstractTest.java:2579)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.access$500(GridAbstractTest.java:152)
at 
org.apache.ignite.testframework.junits.GridAbstractTest$BeforeFirstAndAfterLastTestRule$1.evaluate(GridAbstractTest.java:2559)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at 
com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
at 
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at 
com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
{noformat}



--


[jira] [Created] (IGNITE-11569) Enable baseline auto-adjust by default only for empty cluster

2019-03-19 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11569:
--

 Summary: Enable baseline auto-adjust by default only for empty 
cluster
 Key: IGNITE-11569
 URL: https://issues.apache.org/jira/browse/IGNITE-11569
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


It is required to enable baseline auto-adjust by default only for an empty cluster.





[jira] [Created] (IGNITE-11545) Logging baseline auto-adjust

2019-03-14 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11545:
--

 Summary: Logging baseline auto-adjust
 Key: IGNITE-11545
 URL: https://issues.apache.org/jira/browse/IGNITE-11545
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Some extra logging needs to be added to the baseline auto-adjust process.





[jira] [Created] (IGNITE-11391) Test on free list is freezes sometimes

2019-02-22 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11391:
--

 Summary: Test on free list is freezes sometimes
 Key: IGNITE-11391
 URL: https://issues.apache.org/jira/browse/IGNITE-11391
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


CheckpointFreeListTest#testRestoreFreeListCorrectlyAfterRandomStop - freezes sometimes
 CheckpointFreeListTest.testFreeListRestoredCorrectly - flaky





[jira] [Created] (IGNITE-11382) Stop managers from all caches before caches stop

2019-02-21 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11382:
--

 Summary: Stop managers from all caches before caches stop
 Key: IGNITE-11382
 URL: https://issues.apache.org/jira/browse/IGNITE-11382
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


All cache managers need to be stopped before stopping these caches.





[jira] [Created] (IGNITE-11377) Display time to baseline auto-adjust event in console.sh

2019-02-20 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11377:
--

 Summary: Display time to baseline auto-adjust event in console.sh
 Key: IGNITE-11377
 URL: https://issues.apache.org/jira/browse/IGNITE-11377
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Information about the next auto-adjust event needs to be added.





[jira] [Created] (IGNITE-11297) Improving read of hot variables in WAL

2019-02-12 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-11297:
--

 Summary: Improving read of hot variables in WAL
 Key: IGNITE-11297
 URL: https://issues.apache.org/jira/browse/IGNITE-11297
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


It looks unnecessary to mark some variables in FileWriteAheadLogManager as volatile: they are initialized only once on start, but they are read very frequently afterwards.
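The cost being described can be sketched as follows (hypothetical field and class names, not the actual FileWriteAheadLogManager code):

```java
// Hypothetical sketch: a field written once during start() but read on every
// hot-path call. A volatile read cannot be cached by the JIT across iterations,
// so reading it once into a local (or making the field non-volatile/final after
// refactoring) removes the repeated volatile loads.
public class HotFieldSketch {
    private volatile long segmentSize; // written once on start, read constantly

    public void start(long size) {
        segmentSize = size;
    }

    public long totalSize(int segments) {
        long size = segmentSize; // single volatile read, then a plain local
        long total = 0;
        for (int i = 0; i < segments; i++)
            total += size; // hot loop touches only the local variable
        return total;
    }
}
```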





[jira] [Created] (IGNITE-10720) Decrease time to save metadata during checkpoint

2018-12-17 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-10720:
--

 Summary: Decrease time to save metadata during checkpoint
 Key: IGNITE-10720
 URL: https://issues.apache.org/jira/browse/IGNITE-10720
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


It looks unnecessary to save all metadata (like the free list) under the checkpoint write lock, because sometimes this takes too long.
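A minimal sketch of the idea, with invented names (not Ignite's actual checkpoint code): take a cheap snapshot of the metadata under the write lock and do the expensive serialization after releasing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Invented names; illustrates moving slow metadata writes out of the lock.
public class CheckpointMetaSketch {
    private final ReentrantReadWriteLock cpLock = new ReentrantReadWriteLock();
    private final Queue<String> freeListMeta = new ConcurrentLinkedQueue<>();

    /** Cache operations run under the checkpoint read lock. */
    public void update(String meta) {
        cpLock.readLock().lock();
        try {
            freeListMeta.add(meta);
        } finally {
            cpLock.readLock().unlock();
        }
    }

    public List<String> checkpoint() {
        List<String> snapshot;
        cpLock.writeLock().lock();
        try {
            // Under the lock: only a cheap copy of references.
            snapshot = new ArrayList<>(freeListMeta);
        } finally {
            cpLock.writeLock().unlock();
        }
        // After release: the slow part (serialization, fsync) would run here,
        // no longer blocking threads waiting for the read lock.
        return snapshot;
    }
}
```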





[jira] [Created] (IGNITE-10636) Deadlock on stopping node due to segmentation

2018-12-11 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-10636:
--

 Summary: Deadlock on stopping node due to segmentation
 Key: IGNITE-10636
 URL: https://issues.apache.org/jira/browse/IGNITE-10636
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov


* The node has in-flight "put" operations
* The node detects segmentation
* The node calls the failure handler (StopNodeFailureHandler) to stop itself
* The failure handler tries to take the GridKernalGateway write lock but waits for all operations to finish
* GridNearTxLocal uninterruptibly awaits the rollbackNearTxLocalAsync future
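The hang can be sketched in miniature (invented names, not the real GridKernalGateway API): the stopping thread blocks forever behind an operation that will never finish once the node is segmented, so bounding the wait is one way out.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: the operation thread holds the read lock while awaiting a future
// that never completes; stop() wants the write lock. A bounded tryLock lets
// the stop path give up and escalate instead of deadlocking.
public class StopDeadlockSketch {
    public static boolean tryStop(ReentrantReadWriteLock gateway, long timeoutMs)
        throws InterruptedException {
        return gateway.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```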

Failure handler await:
{noformat}
Lock 
[object=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@2370ac7a, 
ownerName=null, ownerId=-1]
[03:24:53] : [Step 4/5] at sun.misc.Unsafe.park(Native Method)
[03:24:53] : [Step 4/5] at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
[03:24:53] : [Step 4/5] at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934)
[03:24:53] : [Step 4/5] at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247)
[03:24:53] : [Step 4/5] at 
java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
[03:24:53] : [Step 4/5] at 
o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.tryLock(StripedCompositeReadWriteLock.java:220)
[03:24:53] : [Step 4/5] at 
o.a.i.i.GridKernalGatewayImpl.tryWriteLock(GridKernalGatewayImpl.java:143)
[03:24:53] : [Step 4/5] at 
o.a.i.i.IgniteKernal.stop0(IgniteKernal.java:2313)
[03:24:53] : [Step 4/5] at 
o.a.i.i.IgniteKernal.stop(IgniteKernal.java:2230)
[03:24:53] : [Step 4/5] at 
o.a.i.i.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2613)
[03:24:53] : [Step 4/5] - locked 
o.a.i.i.IgnitionEx$IgniteNamedInstance@41294371
[03:24:53] : [Step 4/5] at 
o.a.i.i.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2576)
[03:24:53] : [Step 4/5] at 
o.a.i.i.IgnitionEx.stop(IgnitionEx.java:379)
[03:24:53] : [Step 4/5] at 
o.a.i.failure.StopNodeFailureHandler$1.run(StopNodeFailureHandler.java:36)
[03:24:53] : [Step 4/5] at java.lang.Thread.run(Thread.java:748)
{noformat}
Put await:
{noformat}
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
at 
org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.close(GridNearTxLocal.java:4358)
at 
org.apache.ignite.internal.processors.cache.GridCacheSharedContext.endTx(GridCacheSharedContext.java:1017)
at 
org.apache.ignite.internal.processors.cache.transactions.TransactionProxyImpl.close(TransactionProxyImpl.java:329)
at 
org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$3.run(GridCacheAbstractNodeRestartSelfTest.java:782)
at java.lang.Thread.run(Thread.java:748)
{noformat}


Reproduced by 
GridCacheAbstractNodeRestartSelfTest#testRestartWithPutTenNodesTwoBackups and 
other tests from this class





[jira] [Created] (IGNITE-10622) Undelivered ensure message to some nodes

2018-12-10 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-10622:
--

 Summary: Undelivered ensure message to some nodes
 Key: IGNITE-10622
 URL: https://issues.apache.org/jira/browse/IGNITE-10622
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


We have the following case:
* A grid of 5 nodes (node1, node2, node3, node4, node5)
* node1 detects that node4 has failed and sends a NodeFailed message to node2
* node2 sends the NodeFailedNode3 message to node3
* node3 accepts the message but does not handle it, because it has also failed
* node1 detects that node3 has failed and sends a NodeFailed message to node2
* node2 selects a new next node (node4) and sends the NodeFailedNode3 message to node4

As a result, node4 received only the NodeFailedNode3 message but never received the earlier one.

This also applies to other ensured messages which are sent one after another.
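One way to close the gap can be sketched as follows (invented names, not the actual discovery SPI code): keep every unacknowledged message, and on a next-node change resend the whole backlog in order, not just the latest message.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch: ensured messages stay in a pending queue until acknowledged;
// a new neighbour gets the full unacked backlog, oldest first.
public class PendingMessagesSketch {
    private final Deque<String> pending = new ArrayDeque<>();

    public void send(String msg) {
        pending.addLast(msg);
    }

    public void acked(String msg) {
        pending.remove(msg);
    }

    /** On next-node change: everything still unacked, in original order. */
    public List<String> resendTo(String newNextNode) {
        return new ArrayList<>(pending);
    }
}
```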





[jira] [Created] (IGNITE-10522) Remote node has not joined

2018-12-04 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-10522:
--

 Summary: Remote node has not joined
 Key: IGNITE-10522
 URL: https://issues.apache.org/jira/browse/IGNITE-10522
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Sometimes tests fail because of "Remote node has not joined".
suite - PDS (Indexing)
example test - IgniteWalRecoveryWithCompactionTest.testLargeRandomCrash





[jira] [Created] (IGNITE-10509) Rollback exception instead of timeout exception

2018-12-03 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-10509:
--

 Summary: Rollback exception instead of timeout exception
 Key: IGNITE-10509
 URL: https://issues.apache.org/jira/browse/IGNITE-10509
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


It looks like there is a race on changing the transaction state between timedOut and the state being set.
Reproducer - TxRollbackOnTimeoutNearCacheTest.testEnlistManyWrite





[jira] [Created] (IGNITE-10491) Out of memory: unable to create new native thread(test150Clients)

2018-11-30 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-10491:
--

 Summary:  Out of memory: unable to create new native 
thread(test150Clients)
 Key: IGNITE-10491
 URL: https://issues.apache.org/jira/browse/IGNITE-10491
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


IgniteCache150ClientsTest.test150Clients

https://ci.ignite.apache.org/viewLog.html?buildId=2424817&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_Cache6





[jira] [Created] (IGNITE-10423) Hangs grid-nio-worker-tcp-comm

2018-11-27 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-10423:
--

 Summary: Hangs grid-nio-worker-tcp-comm
 Key: IGNITE-10423
 URL: https://issues.apache.org/jira/browse/IGNITE-10423
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


{noformat}
[org.apache.ignite:ignite-core] [2018-11-24 
04:49:34,736][ERROR][tcp-disco-msg-worker-#89615%replicated.GridCacheReplicatedNodeRestartSelfTest2%][G]
 Blocked system-critical thread has be
en detected. This can lead to cluster-wide undefined behaviour 
[threadName=grid-nio-worker-tcp-comm-1, blockedFor=11s]
[org.apache.ignite:ignite-core] [2018-11-24 04:49:44,894][WARN 
][tcp-disco-msg-worker-#89615%replicated.GridCacheReplicatedNodeRestartSelfTest2%][G]
 Thread [name="grid-nio-worker-tcp-com
m-1-#454082%replicated.GridCacheReplicatedNodeRestartSelfTest2%", id=562184, 
state=RUNNABLE, blockCnt=1, waitCnt=0]

[org.apache.ignite:ignite-core] [2018-11-24 
04:49:44,897][ERROR][tcp-disco-msg-worker-#89615%replicated.GridCacheReplicatedNodeRestartSelfTest2%][IgniteTestResources]
 Critical system err
or detected. Will be handled accordingly to configured handler 
[hnd=NoOpFailureHandler [super=AbstractFailureHandler 
[ignoredFailureTypes=SingletonSet [SYSTEM_WORKER_BLOCKED]]], 
failureCtx=FailureContext [type=S
YSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker 
[name=grid-nio-worker-tcp-comm-1, 
igniteInstanceName=replicated.GridCacheReplicatedNodeRestartSelfTest2, 
finished=false, heartbeatTs=154303498488
9]]]
{noformat}





[jira] [Created] (IGNITE-10156) Invalid convertation DynamicCacheDescriptor to StoredCacheData

2018-11-07 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-10156:
--

 Summary: Invalid convertation DynamicCacheDescriptor to 
StoredCacheData
 Key: IGNITE-10156
 URL: https://issues.apache.org/jira/browse/IGNITE-10156
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Invalid conversion of DynamicCacheDescriptor to StoredCacheData in CacheRegistry#persistCacheConfigurations.





[jira] [Created] (IGNITE-10111) Affinity doesn't recalculate after lost partitions

2018-11-01 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-10111:
--

 Summary: Affinity doesn't recalculate after lost partitions
 Key: IGNITE-10111
 URL: https://issues.apache.org/jira/browse/IGNITE-10111
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
 Attachments: AffinityLostPartitionTest.java

Case:

1) Start 3 data nodes and activate the cluster with a cache with 1 backup and PartitionLossPolicy.READ_ONLY_SAFE.
2) Start a client and add data to the cache. Stop the client.
3) Stop DN2 and clear its PDS and WAL.
4) Start DN2. Rebalance will start.
5) During rebalance, stop DN3. At this moment some partitions on DN2 are marked as LOST.
6) Start DN3.

In fact all the data came back, but affinity uses DN2, which has (lost) partitions missing some data, instead of DN3.

Reproducer is attached.







[jira] [Created] (IGNITE-9962) Unhandled exception during BatchCacheChangeRequest

2018-10-22 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9962:
-

 Summary: Unhandled exception during BatchCacheChangeRequest
 Key: IGNITE-9962
 URL: https://issues.apache.org/jira/browse/IGNITE-9962
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Node hangs if an exception is thrown in GridQueryProcessor#onCacheChangeRequested.





[jira] [Created] (IGNITE-9909) Merge FsyncWalManager and WalManager

2018-10-17 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9909:
-

 Summary: Merge FsyncWalManager and WalManager
 Key: IGNITE-9909
 URL: https://issues.apache.org/jira/browse/IGNITE-9909
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


We now have two similar WAL managers, FileWriteAheadLogManager and 
FsyncModeFileWriteAheadLogManager, and because of their similarity they are too 
difficult to support. The unique parts need to be extracted from them, leaving 
only one manager.





[jira] [Created] (IGNITE-9761) Deadlock SegmentArchivedStorage <-> SegmentLockStorage

2018-10-02 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9761:
-

 Summary: Deadlock SegmentArchivedStorage <-> SegmentLockStorage
 Key: IGNITE-9761
 URL: https://issues.apache.org/jira/browse/IGNITE-9761
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


{noformat}
Found one Java-level deadlock:
=
"wal-file-archiver%cache.IgniteClusterActivateDeactivateTestWithPersistence2-#11729%cache.IgniteClusterActivateDeactivateTestWithPersistence2%":
  waiting to lock monitor 0x7fa33c0121e8 (object 0xf7142560, a 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentLockStorage),
  which is held by 
"exchange-worker-#11646%cache.IgniteClusterActivateDeactivateTestWithPersistence2%"
"exchange-worker-#11646%cache.IgniteClusterActivateDeactivateTestWithPersistence2%":
  waiting to lock monitor 0x7fa3503b6058 (object 0xf7142578, a 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentArchivedStorage),
  which is held by 
"wal-file-archiver%cache.IgniteClusterActivateDeactivateTestWithPersistence2-#11729%cache.IgniteClusterActivateDeactivateTestWithPersistence2%"
Java stack information for the threads listed above:
===
"wal-file-archiver%cache.IgniteClusterActivateDeactivateTestWithPersistence2-#11729%cache.IgniteClusterActivateDeactivateTestWithPersistence2%":
at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentLockStorage.locked(SegmentLockStorage.java:41)
- waiting to lock <0xf7142560> (a 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentLockStorage)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentArchivedStorage.markAsMovedToArchive(SegmentArchivedStorage.java:101)
- locked <0xf7142578> (a 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentArchivedStorage)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentAware.markAsMovedToArchive(SegmentAware.java:91)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.body(FileWriteAheadLogManager.java:1643)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:748)
"exchange-worker-#11646%cache.IgniteClusterActivateDeactivateTestWithPersistence2%":
at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentArchivedStorage.onSegmentUnlocked(SegmentArchivedStorage.java:135)
- waiting to lock <0xf7142578> (a 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentArchivedStorage)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentArchivedStorage$$Lambda$2/2113450692.accept(Unknown
 Source)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentObservable.lambda$notifyObservers$0(SegmentObservable.java:44)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentObservable$$Lambda$6/688404745.accept(Unknown
 Source)
at java.util.ArrayList.forEach(ArrayList.java:1257)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentObservable.notifyObservers(SegmentObservable.java:44)
- locked <0xf7142560> (a 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentLockStorage)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentLockStorage.releaseWorkSegment(SegmentLockStorage.java:74)
- locked <0xf7142560> (a 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentLockStorage)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.aware.SegmentAware.releaseWorkSegment(SegmentAware.java:226)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.io.LockedReadFileInput.ensure(LockedReadFileInput.java:81)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readSegmentHeader(RecordV1Serializer.java:260)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.initReadHandle(AbstractWalRecordsIterator.java:381)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.initReadHandle(FileWriteAheadLogManager.java:2942)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$RecordsIterator.advanceSegment(FileWriteAheadLogManager.java:3024)
at 
org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advance(AbstractWalRecordsIterator.java:163)
at 

[jira] [Created] (IGNITE-9760) NPE is possible during WAL flushing for FSYNC mode

2018-10-02 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9760:
-

 Summary: NPE is possible during WAL flushing for FSYNC mode
 Key: IGNITE-9760
 URL: https://issues.apache.org/jira/browse/IGNITE-9760
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


{noformat}
class org.apache.ignite.IgniteCheckedException: Failed to update keys (retry 
update if possible).: [9483]

at 
org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7409)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.resolve(GridFutureAdapter.java:261)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:172)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
at 
org.apache.ignite.testframework.GridTestUtils.lambda$runMultiThreadedAsync$96d302c5$1(GridTestUtils.java:853)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:464)
at 
org.apache.ignite.testframework.GridTestUtils.lambda$runAsync$2(GridTestUtils.java:1005)
at 
org.apache.ignite.testframework.GridTestUtils$7.call(GridTestUtils.java:1295)
at 
org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86)
Caused by: org.apache.ignite.cache.CachePartialUpdateException: Failed to 
update keys (retry update if possible).: [9483]
at 
org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1307)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.cacheException(IgniteCacheProxyImpl.java:1742)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1092)
at 
org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:820)
at 
org.apache.ignite.internal.processors.cache.persistence.db.wal.WalRolloverRecordLoggingTest.lambda$testAvoidInfinityWaitingOnRolloverOfSegment$0(WalRolloverRecordLoggingTest.java:119)
... 2 more
Caused by: class 
org.apache.ignite.internal.processors.cache.CachePartialUpdateCheckedException: 
Failed to update keys (retry update if possible).: [9483]
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.onPrimaryError(GridNearAtomicAbstractUpdateFuture.java:397)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.onPrimaryResponse(GridNearAtomicSingleUpdateFuture.java:253)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture$1.apply(GridNearAtomicAbstractUpdateFuture.java:303)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture$1.apply(GridNearAtomicAbstractUpdateFuture.java:300)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1855)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1668)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:299)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:483)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:443)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1153)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put0(GridDhtAtomicCache.java:611)
at 
org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2449)
at 

[jira] [Created] (IGNITE-9729) Ability to start GridQueryProcessor in parallel

2018-09-27 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9729:
-

 Summary: Ability to start GridQueryProcessor in parallel
 Key: IGNITE-9729
 URL: https://issues.apache.org/jira/browse/IGNITE-9729
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov


After the 
[StartCachesInParallel|https://issues.apache.org/jira/browse/IGNITE-8006] task we 
can start caches in parallel, but GridQueryProcessor is a bottleneck because it 
has to start sequentially, for the following reasons:
* checking indexes for duplicates (and other checks) requires the same order on every 
node;
* onCacheStart and createSchema contain a lot of mutexes;
* possibly other reasons as well.

After this task GridCacheProcessor#prepareStartCaches should be rewritten.
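The split between parallel work and order-sensitive work can be sketched like this (assumed structure with invented names, not Ignite's real API): heavy per-cache start work runs concurrently, while schema registration is funneled through a single-threaded executor in a fixed loop order, so every node registers schemas identically.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ParallelStartSketch {
    public static List<String> start(List<String> caches) throws Exception {
        ExecutorService parallel = Executors.newFixedThreadPool(4);
        // Single-threaded: guarantees one registration order on every node.
        ExecutorService ordered = Executors.newSingleThreadExecutor();
        List<String> schemaOrder = Collections.synchronizedList(new ArrayList<>());

        List<Future<?>> work = new ArrayList<>();
        for (String cache : caches) {
            // Heavy per-cache start work may run concurrently...
            work.add(parallel.submit(() -> { /* allocate memory, restore state */ }));
            // ...but schema registration is submitted in deterministic loop order.
            ordered.submit(() -> schemaOrder.add(cache));
        }
        for (Future<?> f : work)
            f.get();
        ordered.shutdown();
        ordered.awaitTermination(10, TimeUnit.SECONDS);
        parallel.shutdown();
        return schemaOrder;
    }
}
```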





[jira] [Created] (IGNITE-9441) Failed to read WAL record at position

2018-08-31 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9441:
-

 Summary: Failed to read WAL record at position
 Key: IGNITE-9441
 URL: https://issues.apache.org/jira/browse/IGNITE-9441
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


IgnitePdsAtomicCacheHistoricalRebalancingTest.testPartitionCounterConsistencyOnUnstableTopology
IgnitePdsAtomicCacheHistoricalRebalancingTest.testTopologyChangesWithConstantLoad
IgnitePdsTxHistoricalRebalancingTest.testPartitionCounterConsistencyOnUnstableTopology





[jira] [Created] (IGNITE-9424) Partition equal to -1 during insert to atomic cache

2018-08-29 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9424:
-

 Summary: Partition equal to -1 during insert to atomic cache
 Key: IGNITE-9424
 URL: https://issues.apache.org/jira/browse/IGNITE-9424
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Reproduced by IgnitePdsThreadInterruptionTest.testInterruptsOnWALWrite

{noformat}
org.apache.ignite.cache.CachePartialUpdateException: Failed to update keys 
(retry update if possible).: [31108]
at 
org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1261)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.cacheException(IgniteCacheProxyImpl.java:1740)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1090)
at 
org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:817)
at 
org.apache.ignite.internal.processors.cache.persistence.db.file.IgnitePdsThreadInterruptionTest$3.run(IgnitePdsThreadInterruptionTest.java:208)
at java.lang.Thread.run(Thread.java:748)
Caused by: class 
org.apache.ignite.internal.processors.cache.CachePartialUpdateCheckedException: 
Failed to update keys (retry update if possible).: [31108]
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.onPrimaryError(GridNearAtomicAbstractUpdateFuture.java:397)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.onPrimaryResponse(GridNearAtomicSingleUpdateFuture.java:253)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture$1.apply(GridNearAtomicAbstractUpdateFuture.java:303)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture$1.apply(GridNearAtomicAbstractUpdateFuture.java:300)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.map(GridDhtAtomicAbstractUpdateFuture.java:394)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1865)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1664)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:299)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:483)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:443)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1153)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put0(GridDhtAtomicCache.java:611)
at 
org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2430)
at 
org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2407)
at 
org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1087)
... 3 more
Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to 
update keys.
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.UpdateErrors.addFailedKey(UpdateErrors.java:108)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicUpdateResponse.addFailedKey(GridNearAtomicUpdateResponse.java:329)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2623)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:1942)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1776)
... 13 more
Suppressed: class 
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
 Runtime failure on search row: 
org.apache.ignite.internal.processors.cache.tree.SearchRow@371d7ce1
at 

[jira] [Created] (IGNITE-9407) Node is hang when it was stopping from several client in one time

2018-08-29 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9407:
-

 Summary: Node is hang when it was stopping from several client in 
one time
 Key: IGNITE-9407
 URL: https://issues.apache.org/jira/browse/IGNITE-9407
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov


Reproduced by IgniteChangeGlobalStateTest#testFailGetLock
{noformat}
[2018-08-27 
19:00:29,463][ERROR][sys-#32068%node0-backUp-client%][GridClosureProcessor] 
Closure execution failed with error.
[22:00:29]W: [org.apache.ignite:ignite-core] 
java.lang.AssertionError: ignite-sys-cache
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.internalCacheEx(GridCacheProcessor.java:3847)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.utilityCache(GridCacheProcessor.java:3829)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.service.GridServiceProcessor.updateUtilityCache(GridServiceProcessor.java:298)
[22:00:29] : [Step 3/4] [2018-08-27 19:00:29,463][INFO 
][sys-#32069%node2-backUp-client%][GridCacheProcessor] Stopped cache 
[cacheName=ignite-sys-cache]
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.service.GridServiceProcessor.onKernalStart0(GridServiceProcessor.java:241)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.service.GridServiceProcessor.onActivate(GridServiceProcessor.java:397)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor$6.run(GridClusterStateProcessor.java:1151)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6756)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
java.lang.Thread.run(Thread.java:748)
[22:00:29]W: [org.apache.ignite:ignite-core] [2018-08-27 
19:00:29,469][ERROR][sys-#32068%node0-backUp-client%][GridClosureProcessor] 
Runtime error caught during grid runnable execution: GridWorker 
[name=closure-proc-worker, igniteInstanceName=node0-backUp-client, 
finished=false, hashCode=669424318, interrupted=false, 
runner=sys-#32068%node0-backUp-client%]
[22:00:29]W: [org.apache.ignite:ignite-core] 
java.lang.AssertionError: ignite-sys-cache
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.internalCacheEx(GridCacheProcessor.java:3847)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.utilityCache(GridCacheProcessor.java:3829)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.service.GridServiceProcessor.updateUtilityCache(GridServiceProcessor.java:298)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.service.GridServiceProcessor.onKernalStart0(GridServiceProcessor.java:241)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.service.GridServiceProcessor.onActivate(GridServiceProcessor.java:397)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor$6.run(GridClusterStateProcessor.java:1151)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6756)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
[22:00:29]W: [org.apache.ignite:ignite-core]at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[22:00:29]W: 
{noformat}

[jira] [Created] (IGNITE-9402) IgnitePdsDiskErrorsRecoveringTest.testRecoveringOnWALWritingFail2 fails because of LogOnly mode

2018-08-28 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9402:
-

 Summary: 
IgnitePdsDiskErrorsRecoveringTest.testRecoveringOnWALWritingFail2 fails 
because of LogOnly mode
 Key: IGNITE-9402
 URL: https://issues.apache.org/jira/browse/IGNITE-9402
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


IgnitePdsDiskErrorsRecoveringTest.testRecoveringOnWALWritingFail2 failed 
because the last WAL data, which had not been flushed yet, can be lost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-9391) Incorrectly calculated estimated rebalancing finish time

2018-08-27 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9391:
-

 Summary: Incorrectly calculated estimated rebalancing finish time
 Key: IGNITE-9391
 URL: https://issues.apache.org/jira/browse/IGNITE-9391
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov


It looks like either the test 
CacheGroupsMetricsRebalanceTest.testRebalanceEstimateFinishTime is incorrect, 
or there is a bug in the estimated rebalancing finish time calculation.
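The shape of such an estimate can be sketched as follows. This is a minimal illustration with hypothetical names, not the actual Ignite metrics code; the real calculation lives in the rebalance metrics of the cache group.

```java
// Minimal sketch of a rebalancing ETA estimate (hypothetical names,
// not the actual Ignite implementation).
public class RebalanceEta {
    /**
     * Estimates remaining rebalance time from observed progress.
     *
     * @param processedKeys Keys rebalanced so far.
     * @param totalKeys     Total keys to rebalance.
     * @param elapsedMs     Time spent so far, in milliseconds.
     * @return Estimated remaining time in milliseconds, or -1 if no
     *         progress has been made yet (the rate is unknown).
     */
    public static long estimateRemainingMs(long processedKeys, long totalKeys, long elapsedMs) {
        if (processedKeys <= 0 || elapsedMs <= 0)
            return -1; // No observed rate yet: the estimate is undefined, not zero.

        double keysPerMs = (double)processedKeys / elapsedMs;

        long remaining = Math.max(0, totalKeys - processedKeys);

        return (long)(remaining / keysPerMs);
    }
}
```

A common source of bugs in such estimates is dividing by an elapsed time or processed count of zero early in rebalancing, which is why the undefined case is reported explicitly here.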





[jira] [Created] (IGNITE-9327) Client nodes hang because client reconnect is not handled

2018-08-20 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9327:
-

 Summary: Client nodes hang because client reconnect is not handled 
 Key: IGNITE-9327
 URL: https://issues.apache.org/jira/browse/IGNITE-9327
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Reproduced by 
IgniteCacheClientReconnectTest#testClientInForceServerModeStopsOnExchangeHistoryExhaustion

If IgniteNeedReconnectException happens and reconnect is not supported, the 
node should stop. But after https://issues.apache.org/jira/browse/IGNITE-8673 
this case started to be ignored.
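The intended decision can be sketched as follows. The names are hypothetical, not the actual Ignite handler; the point is only that the unsupported-reconnect case must map to an explicit stop rather than being silently dropped.

```java
// Sketch of the intended behavior after IgniteNeedReconnectException:
// reconnect if supported, otherwise stop the node (hypothetical names,
// not the actual Ignite API).
public class ReconnectPolicy {
    /** What the node should do after IgniteNeedReconnectException. */
    public enum Action { RECONNECT, STOP }

    public static Action onNeedReconnect(boolean clientReconnectSupported) {
        // The bug described above: the STOP branch was silently ignored,
        // leaving the client node hanging instead of shutting down.
        return clientReconnectSupported ? Action.RECONNECT : Action.STOP;
    }
}
```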





[jira] [Created] (IGNITE-9307) Node hangs when it is stopped during eviction

2018-08-17 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9307:
-

 Summary: Node hangs when it is stopped during eviction
 Key: IGNITE-9307
 URL: https://issues.apache.org/jira/browse/IGNITE-9307
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


{noformat}
"main" #1 prio=5 os_prio=0 tid=0x7f0ae800e000 nid=0x2e26 waiting on 
condition [0x7f0aef33]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.PartitionsEvictManager$GroupEvictionContext.awaitFinish(PartitionsEvictManager.java:362)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.PartitionsEvictManager$GroupEvictionContext$$Lambda$203/1143143890.accept(Unknown
 Source)
at 
java.util.concurrent.ConcurrentHashMap.forEach(ConcurrentHashMap.java:1597)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.PartitionsEvictManager$GroupEvictionContext.awaitFinishAll(PartitionsEvictManager.java:348)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.PartitionsEvictManager$GroupEvictionContext.access$100(PartitionsEvictManager.java:265)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.PartitionsEvictManager.onCacheGroupStopped(PartitionsEvictManager.java:103)
at 
org.apache.ignite.internal.processors.cache.CacheGroupContext.stopGroup(CacheGroupContext.java:725)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCacheGroup(GridCacheProcessor.java:2366)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCacheGroup(GridCacheProcessor.java:2359)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.stopCaches(GridCacheProcessor.java:959)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.stop(GridCacheProcessor.java:924)
at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2206)
at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2081)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2594)
- locked <0xf39b8770> (a 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2557)
at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:374)
at org.apache.ignite.Ignition.stop(Ignition.java:225)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:1153)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.stopAllGrids(GridAbstractTest.java:1196)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.stopAllGrids(GridAbstractTest.java:1174)
at 
org.apache.ignite.internal.processors.query.h2.IgniteSqlQueryMinMaxTest.afterTest(IgniteSqlQueryMinMaxTest.java:55)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.tearDown(GridAbstractTest.java:1766)
at 
org.apache.ignite.testframework.junits.common.GridCommonAbstractTest.tearDown(GridCommonAbstractTest.java:503)
at junit.framework.TestCase.runBare(TestCase.java:146)
at junit.framework.TestResult$1.protect(TestResult.java:122)
at junit.framework.TestResult.runProtected(TestResult.java:142)
at junit.framework.TestResult.run(TestResult.java:125)
at junit.framework.TestCase.run(TestCase.java:129)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
{noformat}

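The trace shows node stop blocked forever in awaitFinish() on an eviction future. A defensive direction is a bounded wait; the sketch below uses a CountDownLatch as a stand-in for the eviction futures (hypothetical names, not the actual PartitionsEvictManager code).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch of a bounded wait for in-flight eviction tasks during node stop
// (hypothetical names; the real code waits on GridFutureAdapter instances).
public class BoundedEvictionAwait {
    /**
     * Waits for eviction tasks to finish, but gives up after the timeout
     * so that node stop cannot hang forever.
     *
     * @return {@code true} if all tasks finished in time.
     */
    public static boolean awaitFinish(CountDownLatch tasksDone, long timeoutMs)
        throws InterruptedException {
        // A timed await instead of an unbounded get(): if eviction tasks
        // were cancelled or lost during stop, the stop sequence still
        // makes progress instead of deadlocking.
        return tasksDone.await(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```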
[jira] [Created] (IGNITE-9268) Hangs on await offheap read lock

2018-08-14 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9268:
-

 Summary: Hangs on await offheap read lock
 Key: IGNITE-9268
 URL: https://issues.apache.org/jira/browse/IGNITE-9268
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


While a thread was awaiting a read lock, the node failed and the failure 
handler began stopping it. Nobody can wake up the awaiting thread.
{noformat}
Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@65067d90,
 ownerName=null, ownerId=-1]
[12:24:51] : [Step 3/4] at sun.misc.Unsafe.park(Native Method)
[12:24:51] : [Step 3/4] at 
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
[12:24:51] : [Step 3/4] at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
[12:24:51] : [Step 3/4] at 
o.a.i.i.util.OffheapReadWriteLock.waitAcquireReadLock(OffheapReadWriteLock.java:435)
[12:24:51] : [Step 3/4] at 
o.a.i.i.util.OffheapReadWriteLock.readLock(OffheapReadWriteLock.java:142)
[12:24:51] : [Step 3/4] at 
o.a.i.i.pagemem.impl.PageMemoryNoStoreImpl.readLock(PageMemoryNoStoreImpl.java:463)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.util.PageHandler.readLock(PageHandler.java:185)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.util.PageHandler.readPage(PageHandler.java:157)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.DataStructure.read(DataStructure.java:334)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2348)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
[12:24:51] : [Step 3/4] at 
o.a.i.i.processors.cache.persistence.tree.BPlusTree.putDown(BPlusTree.java:2360)
{noformat}
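One way to avoid this class of hang is a timed wait that periodically rechecks a node-stop flag, so a waiter cannot sleep forever if the signalling side died. This is a minimal sketch with hypothetical names, not the actual OffheapReadWriteLock.waitAcquireReadLock implementation.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of a lock wait that rechecks a stop flag (hypothetical names).
public class StoppableWait {
    /**
     * @return {@code true} if the condition was signalled,
     *         {@code false} if the node started stopping first.
     */
    public static boolean awaitUnlessStopped(ReentrantLock lock, Condition cond,
        AtomicBoolean stopped) throws InterruptedException {
        lock.lock();
        try {
            while (!stopped.get()) {
                // Timed await: wake up periodically to recheck the flag
                // even if nobody ever signals the condition again.
                if (cond.await(100, TimeUnit.MILLISECONDS))
                    return true; // Woken by a signal before the timeout.
            }
            return false; // Node is stopping: abandon the wait.
        }
        finally {
            lock.unlock();
        }
    }
}
```

A production version would also re-verify the guarded predicate after wake-up, since timed awaits can return spuriously.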






[jira] [Created] (IGNITE-9250) Replace CacheAffinitySharedManager.CachesInfo by ClusterCachesInfo

2018-08-10 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9250:
-

 Summary: Replace CacheAffinitySharedManager.CachesInfo by 
ClusterCachesInfo
 Key: IGNITE-9250
 URL: https://issues.apache.org/jira/browse/IGNITE-9250
 Project: Ignite
  Issue Type: Improvement
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Currently, registered caches (and groups) are duplicated: they are held in 
ClusterCachesInfo (the main storage) and also in 
CacheAffinitySharedManager.CachesInfo. This looks redundant and can lead to 
inconsistency of the caches info.
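The proposed direction can be sketched as a single registry that owns the descriptors, with other components reading through it instead of keeping their own copy. The names below are illustrative, not the actual Ignite classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a single source of truth for cache registration info
// (illustrative names, not the actual ClusterCachesInfo API).
public class CacheRegistry {
    /** Cache name -> cache group name; the only copy of this mapping. */
    private final Map<String, String> cacheToGroup = new ConcurrentHashMap<>();

    public void register(String cacheName, String groupName) {
        cacheToGroup.put(cacheName, groupName);
    }

    /**
     * Components such as the affinity manager query the shared registry
     * instead of maintaining a duplicate that can drift out of sync.
     */
    public String groupOf(String cacheName) {
        return cacheToGroup.get(cacheName);
    }
}
```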





[jira] [Created] (IGNITE-9004) Failed to move temp file during segment creation

2018-07-13 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-9004:
-

 Summary: Failed to move temp file during segment creation
 Key: IGNITE-9004
 URL: https://issues.apache.org/jira/browse/IGNITE-9004
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Reproduced by the Activate/Deactivate suite, for example 
IgniteChangeGlobalStateTest#testStopPrimaryAndActivateFromClientNode

{noformat}

class org.apache.ignite.internal.pagemem.wal.StorageException: Failed to move 
temp file to a regular WAL segment file: /data/teamcity/work/c182b70f2dfa650
7/work/IgniteChangeGlobalStateTest/db/wal/node1/0002.wal
[13:56:05]W: [org.apache.ignite:ignite-core] at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.createFile(FileWriteAheadLogManager.java:1446)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.checkFiles(FileWriteAheadLogManager.java:2269)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.access$4500(FileWriteAheadLogManager.java:143)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.allocateRemainingFiles(FileWriteAheadLogManage
r.java:1862)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileArchiver.body(FileWriteAheadLogManager.java:1606)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
java.lang.Thread.run(Thread.java:748)
[13:56:05]W: [org.apache.ignite:ignite-core] Caused by: 
java.nio.file.NoSuchFileException: 
/data/teamcity/work/c182b70f2dfa6507/work/IgniteChangeGlobalStateTest/db/wal/node1/0002.wal.tmp
[13:56:05]W: [org.apache.ignite:ignite-core] at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
java.nio.file.Files.move(Files.java:1395)
[13:56:05]W: [org.apache.ignite:ignite-core] at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.createFile(FileWriteAheadLogManager.java:1442)
[13:56:05]W: [org.apache.ignite:ignite-core] ... 6 more

{noformat}
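The NoSuchFileException suggests the temp segment vanished before the rename, e.g. because a concurrent archiver already finished the same segment. A defensive sketch of the move (hypothetical helper, not the actual FileWriteAheadLogManager code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of moving a prepared temp segment into place with a tolerant
// handling of an already-missing temp file (hypothetical names).
public class SegmentMove {
    /**
     * @return {@code true} if the segment file is in place after the call,
     *         {@code false} if the temp file was gone and the destination
     *         does not exist either.
     */
    public static boolean moveTempSegment(Path tmp, Path dst) throws IOException {
        if (!Files.exists(tmp)) {
            // Likely already renamed by a concurrent thread: treat the
            // segment as created only if the destination is present.
            return Files.exists(dst);
        }
        // Atomic rename within the same directory: readers never observe
        // a partially written segment file.
        Files.move(tmp, dst, StandardCopyOption.ATOMIC_MOVE);
        return true;
    }
}
```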





[jira] [Created] (IGNITE-8998) Client hangs after merge exchange

2018-07-13 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-8998:
-

 Summary: Client hangs after merge exchange
 Key: IGNITE-8998
 URL: https://issues.apache.org/jira/browse/IGNITE-8998
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


Reproduced by CacheExchangeMergeTest#testConcurrentStartServersAndClients





[jira] [Created] (IGNITE-8969) Unable to await partitions release latch

2018-07-10 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-8969:
-

 Summary: Unable to await partitions release latch
 Key: IGNITE-8969
 URL: https://issues.apache.org/jira/browse/IGNITE-8969
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov


 Unable to await partitions release latch within timeout for ClientLatch after 
this node became the latch coordinator when the old latch coordinator failed.

Reproduced by 
TcpDiscoverySslSelfTest.testNodeShutdownOnRingMessageWorkerStartNotFinished, 
TcpDiscoverySslTrustedSelfTest.testNodeShutdownOnRingMessageWorkerStartNotFinished





[jira] [Created] (IGNITE-8953) Test fail: Bind address already in use(TcpDiscoverySpiFailureTimeoutSelfTest)

2018-07-06 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-8953:
-

 Summary: Test fail: Bind address already in 
use(TcpDiscoverySpiFailureTimeoutSelfTest)
 Key: IGNITE-8953
 URL: https://issues.apache.org/jira/browse/IGNITE-8953
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov


During the execution of beforeTestsStarted in 
TcpDiscoverySpiFailureTimeoutSelfTest and TcpDiscoverySpiSelfTest, registration 
of the MBean server failed with the error "Bind address already in use", but 
the tests continued to execute because of a try-catch block.
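The fix direction is to fail fast in setup instead of swallowing the error. A minimal sketch (illustrative; the real code registers an MBean server rather than running an arbitrary action):

```java
// Sketch of failing fast instead of swallowing a startup error during
// test setup (illustrative names).
public class FailFastSetup {
    public static void startService(Runnable bind) {
        try {
            bind.run();
        }
        catch (RuntimeException e) {
            // Propagate instead of logging and continuing: a half-initialized
            // service makes every later test fail in confusing ways.
            throw new IllegalStateException("Service startup failed, aborting tests", e);
        }
    }
}
```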





[jira] [Created] (IGNITE-8940) Activation job hangs (IgniteChangeGlobalStateTest#testFailGetLock)

2018-07-05 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-8940:
-

 Summary: Activation job hangs 
(IgniteChangeGlobalStateTest#testFailGetLock)
 Key: IGNITE-8940
 URL: https://issues.apache.org/jira/browse/IGNITE-8940
 Project: Ignite
  Issue Type: Test
Reporter: Anton Kalashnikov


given:
 # A cluster of 3 nodes whose activation should fail because the lock can't 
be acquired
 # 3 clients connected to the cluster

when:
 # Cluster activation is attempted from one of the clients
 # The activation job starts executing on one of the 3 server nodes
 # Activation triggers an exchange
 # The exchange finishes with an error because the lock cannot be acquired

then:

If the job is executed on the coordinator, its future finishes successfully; 
otherwise the job hangs.

 

expected:

The job finishes on any node.

 

why it happens:

During the handling of GridDhtPartitionsFullMessage on a non-coordinator node, 
execution finishes before the job's future is completed, because of the if 
clause (GridDhtPartitionsExchangeFuture#3230).

 

The reason may be the task from https://issues.apache.org/jira/browse/IGNITE-8657 
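The invariant the report expects can be sketched as follows: the job's future must be completed on every node, on failure as well as success. The names are hypothetical, not the actual exchange code.

```java
import java.util.concurrent.CompletableFuture;

// Sketch of the expected invariant for the activation job's future
// (hypothetical names; the real path is GridDhtPartitionsExchangeFuture).
public class ActivationJob {
    public static CompletableFuture<Void> run(boolean lockAcquired) {
        CompletableFuture<Void> fut = new CompletableFuture<>();

        if (lockAcquired)
            fut.complete(null);
        else
            // The reported bug: on non-coordinator nodes the failure branch
            // never completed the future, so callers waited forever.
            fut.completeExceptionally(new IllegalStateException("can't get lock"));

        return fut;
    }
}
```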





[jira] [Created] (IGNITE-8872) WAL scanner for crash recovery

2018-06-25 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-8872:
-

 Summary: WAL scanner for crash recovery
 Key: IGNITE-8872
 URL: https://issues.apache.org/jira/browse/IGNITE-8872
 Project: Ignite
  Issue Type: Task
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov








[jira] [Created] (IGNITE-8739) Implement WA for TCP communication related to hanging on descriptor reservation

2018-06-07 Thread Anton Kalashnikov (JIRA)
Anton Kalashnikov created IGNITE-8739:
-

 Summary: Implement WA for TCP communication related to hanging on 
descriptor reservation
 Key: IGNITE-8739
 URL: https://issues.apache.org/jira/browse/IGNITE-8739
 Project: Ignite
  Issue Type: Bug
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov







