[jira] [Created] (IGNITE-8755) NegativeArraySizeException when trying to serialize in GridClientOptimizedMarshaller humongous object

2018-06-08 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-8755:


 Summary: NegativeArraySizeException when trying to serialize in 
GridClientOptimizedMarshaller humongous object
 Key: IGNITE-8755
 URL: https://issues.apache.org/jira/browse/IGNITE-8755
 Project: Ignite
  Issue Type: Bug
  Components: binary
Affects Versions: 2.5
Reporter: Ivan Daschinskiy
 Fix For: 2.6


A NegativeArraySizeException is thrown when GridClientOptimizedMarshaller 
tries to serialize a humongous object. See the stack trace below.



{code:java}
java.io.IOException: class org.apache.ignite.IgniteCheckedException: Failed to 
serialize object: GridClientResponse [clientId=null, reqId=0, destId=null, 
status=0, errMsg=null, 
result=org.apache.ignite.internal.processors.rest.protocols.tcp.TcpRestParserSelfTest$HugeObject@60a582c1]

at 
org.apache.ignite.internal.client.marshaller.optimized.GridClientOptimizedMarshaller.marshal(GridClientOptimizedMarshaller.java:101)
at 
org.apache.ignite.internal.processors.rest.protocols.tcp.TcpRestParserSelfTest.testHugeObject(TcpRestParserSelfTest.java:103)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at junit.framework.TestCase.runTest(TestCase.java:176)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:2086)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:140)
at 
org.apache.ignite.testframework.junits.GridAbstractTest$5.run(GridAbstractTest.java:2001)
at java.lang.Thread.run(Thread.java:748)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to serialize 
object: GridClientResponse [clientId=null, reqId=0, destId=null, status=0, 
errMsg=null, 
result=org.apache.ignite.internal.processors.rest.protocols.tcp.TcpRestParserSelfTest$HugeObject@60a582c1]
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedMarshaller.marshal0(OptimizedMarshaller.java:206)
at 
org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:58)
at 
org.apache.ignite.internal.util.IgniteUtils.marshal(IgniteUtils.java:10059)
at 
org.apache.ignite.internal.client.marshaller.optimized.GridClientOptimizedMarshaller.marshal(GridClientOptimizedMarshaller.java:88)
... 10 more
Caused by: java.lang.NegativeArraySizeException
at 
org.apache.ignite.internal.util.io.GridUnsafeDataOutput.requestFreeSize(GridUnsafeDataOutput.java:131)
at 
org.apache.ignite.internal.util.io.GridUnsafeDataOutput.write(GridUnsafeDataOutput.java:166)
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedObjectOutputStream.write(OptimizedObjectOutputStream.java:142)
at 
org.apache.ignite.internal.processors.rest.protocols.tcp.TcpRestParserSelfTest$HugeObject.writeExternal(TcpRestParserSelfTest.java:122)
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedObjectOutputStream.writeExternalizable(OptimizedObjectOutputStream.java:319)
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedClassDescriptor.write(OptimizedClassDescriptor.java:814)
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedObjectOutputStream.writeObject0(OptimizedObjectOutputStream.java:242)
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedObjectOutputStream.writeObjectOverride(OptimizedObjectOutputStream.java:159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:344)
at 
org.apache.ignite.internal.processors.rest.client.message.GridClientResponse.writeExternal(GridClientResponse.java:103)
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedObjectOutputStream.writeExternalizable(OptimizedObjectOutputStream.java:319)
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedClassDescriptor.write(OptimizedClassDescriptor.java:814)
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedObjectOutputStream.writeObject0(OptimizedObjectOutputStream.java:242)
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedObjectOutputStream.writeObjectOverride(OptimizedObjectOutputStream.java:159)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:344)
at 
org.apache.ignite.internal.marshaller.optimized.OptimizedMarshaller.marshal0(OptimizedMarshaller.java:201)
{code}

The main cause is that GridClientOptimizedMarshaller marshals the object 
through OptimizedMarshaller without a backing OutputStream, so an arithmetic 
overflow occurs in
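
The overflow mechanism can be illustrated with a minimal sketch. This is not the code of GridUnsafeDataOutput; the class name, method names and the MAX_ARRAY_SIZE bound are assumptions made for illustration. It only shows how growing a buffer size in int arithmetic wraps to a negative value (which would make an array allocation fail with NegativeArraySizeException), and one overflow-aware alternative.

```java
final class BufferGrowthSketch {
    /** Conservative upper bound for an array length (an assumption, not an Ignite constant). */
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /** Naive growth in int arithmetic: wraps to a negative value once size reaches 2^30. */
    static int naiveNewSize(int size) {
        return size << 1;
    }

    /** Overflow-aware growth: computes in long, then clamps to MAX_ARRAY_SIZE. */
    static int safeNewSize(int pos, int extra) {
        long required = (long) pos + extra;

        if (required > MAX_ARRAY_SIZE)
            throw new IllegalStateException("Required size exceeds max array size: " + required);

        return (int) Math.min(required * 2, MAX_ARRAY_SIZE);
    }
}
```

With the naive variant, a size of 1 500 000 000 bytes yields a negative result; the overflow-aware variant either clamps the result or reports a clear error.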

[jira] [Created] (IGNITE-8820) Add ability to accept changing txTimeoutOnPartitionMapExchange while waiting for pending transactions.

2018-06-18 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-8820:


 Summary: Add ability to accept changing 
txTimeoutOnPartitionMapExchange while waiting for pending transactions.
 Key: IGNITE-8820
 URL: https://issues.apache.org/jira/browse/IGNITE-8820
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 2.5
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.6


Currently, if an ExchangeFuture is waiting with the old value of 
txTimeoutOnPartitionMapExchange, a new value is not accepted until the next 
exchange starts. Applying the new value immediately would be very useful, for 
example when the timeout is too long and must be shortened.
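
One possible way to pick up a changed timeout while waiting is to re-read it on every poll instead of capturing it once. The sketch below is a generic illustration, not Ignite code; the class name, the await method and its parameters are hypothetical.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

final class AdjustableTimeoutWait {
    /**
     * Waits until done reports true or the timeout elapses.
     * The timeout is re-read on every iteration, so a new value takes effect immediately.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean await(BooleanSupplier done, LongSupplier timeoutMs, long pollMs) {
        long start = System.nanoTime();

        while (!done.getAsBoolean()) {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            long remainingMs = timeoutMs.getAsLong() - elapsedMs;

            if (remainingMs <= 0)
                return false;

            try {
                Thread.sleep(Math.min(pollMs, remainingMs));
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // Preserve the interrupt status for the caller.

                return false;
            }
        }

        return true;
    }
}
```

Passing something like an AtomicLong's get method as timeoutMs lets another thread shorten the wait while it is in progress.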



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-8624) Add test coverage for NPE in TTL Manager [IGNITE-7972]

2018-05-28 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-8624:


 Summary: Add test coverage for NPE in TTL Manager [IGNITE-7972]
 Key: IGNITE-8624
 URL: https://issues.apache.org/jira/browse/IGNITE-8624
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


Add test coverage (reproducer) to the [IGNITE-7972] case.





[jira] [Created] (IGNITE-8869) PartitionsExchangeOnDiscoveryHistoryOverflowTest

2018-06-25 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-8869:


 Summary: PartitionsExchangeOnDiscoveryHistoryOverflowTest 
 Key: IGNITE-8869
 URL: https://issues.apache.org/jira/browse/IGNITE-8869
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.5
Reporter: Ivan Daschinskiy
 Fix For: 2.6


After the introduction of exchange latches, 
PartitionsExchangeOnDiscoveryHistoryOverflowTest hangs permanently. In the 
current implementation, ExchangeLatchManager retrieves alive nodes from 
discoveryCache for a specific affinity topology version and fails because the 
discovery history is too short. This causes the exchange-worker to fail, and 
NoOpFailureHandler then leaves the node in a hanging state.





[jira] [Created] (IGNITE-8429) Unexpected error during incorrect WAL segment decompression, causes node termination.

2018-05-03 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-8429:


 Summary: Unexpected error during incorrect WAL segment 
decompression, causes node termination.
 Key: IGNITE-8429
 URL: https://issues.apache.org/jira/browse/IGNITE-8429
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Affects Versions: 2.5
Reporter: Ivan Daschinskiy
 Fix For: 2.5


File decompressor failure due to incorrect (zero-length) archived segment. 

2018-04-30 00:00:02.811 
[ERROR][wal-file-decompressor%DPL_GRID%DplGridNodeName][org.apache.ignite.Ignite]
 Critical system error detected. Will be handled accordingly to configured 
handler [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler, 
failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, 
err=java.lang.IllegalStateException: Thread 
wal-file-decompressor%DPL_GRID%DplGridNodeName is terminated unexpectedly]]
java.lang.IllegalStateException: Thread 
wal-file-decompressor%DPL_GRID%DplGridNodeName is terminated unexpectedly
at 
org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager$FileDecompressor.run(FileWriteAheadLogManager.java:2104)
2018-04-30 00:00:02.812 
[ERROR][wal-file-decompressor%DPL_GRID%DplGridNodeName][org.apache.ignite.Ignite]
 JVM will be halted immediately due to the failure: [failureCtx=FailureContext 
[type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Thread 
wal-file-decompressor%DPL_GRID%DplGridNodeName is terminated unexpectedly]]


touch 0754.wal
zip 0754.wal.zip 0754.wal
ls -l
-rw-rw-r-- 1 dmitriy dmitriy   0 май  1 16:40 0754.wal
-rw-rw-r-- 1 dmitriy dmitriy 190 май  1 16:46 0754.wal.zip

Archive:  /tmp/temp/0754.wal.zip
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
       0  Stored        0   0% 2018-05-01 16:40           0754.wal
--------          -------  ---                            -------
       0                0   0%                            1 file

We should handle this situation gracefully: print a message to the log and 
continue the compression with the next segment.
We should also handle "skipped" segments and not delete them in 
deleteObsoleteRawSegments().
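
The "skip instead of fail" behaviour can be sketched with the standard java.util.zip API. This is only an illustration working on in-memory bytes, not the FileDecompressor implementation; the class and both method names are hypothetical, and zipOf exists only to build demo input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class WalSegmentZipCheck {
    /** Demo helper: builds a single-entry zip archive in memory. */
    static byte[] zipOf(String name, byte[] content) {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ZipOutputStream zos = new ZipOutputStream(bos)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(content);
            zos.closeEntry();
            zos.finish();

            return bos.toByteArray();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Returns the decompressed bytes of the first entry, or null when the archive
     * has no entry or a zero-length one, so the caller can log and skip the segment.
     */
    static byte[] unzipOrSkip(byte[] zipBytes) {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            if (zis.getNextEntry() == null)
                return null;

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;

            while ((n = zis.read(buf)) != -1)
                out.write(buf, 0, n);

            return out.size() == 0 ? null : out.toByteArray();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller that receives null would record the segment as skipped and move on to the next one rather than letting the worker thread terminate.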





[jira] [Created] (IGNITE-8920) Node should be failed when during tx finish indices are corrupted.

2018-07-03 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-8920:


 Summary: Node should be failed when during tx finish indices are 
corrupted.
 Key: IGNITE-8920
 URL: https://issues.apache.org/jira/browse/IGNITE-8920
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.5
Reporter: Ivan Daschinskiy
 Fix For: 2.7


While a transaction is being processed after a finish request is received 
(IgniteTxHandler.finish), the node should be failed by the FailureHandler if the 
page content of indices is corrupted. Currently this case is not handled 
properly and leads to long-running transactions across the grid. 





[jira] [Created] (IGNITE-9183) Proper handling UUID columns, that are added by DDL.

2018-08-03 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-9183:


 Summary: Proper handling UUID columns, that are added by DDL.
 Key: IGNITE-9183
 URL: https://issues.apache.org/jira/browse/IGNITE-9183
 Project: Ignite
  Issue Type: Bug
  Components: sql
Affects Versions: 2.6, 2.5
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.7


Currently, if a new UUID column is added through DDL, it is saved to the schema 
as byte[]. As a result, it is impossible to use the column in DML without 
placeholders, or to put values through the cache API without converting UUID 
to byte[].
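
The manual conversion mentioned above could look like the following sketch. The byte layout (most significant 8 bytes first, big-endian) is an assumption chosen for illustration; the ticket does not state which layout the schema expects. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class UuidBytes {
    /** Encodes a UUID as 16 bytes: most significant long first, big-endian (assumed layout). */
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
            .putLong(id.getMostSignificantBits())
            .putLong(id.getLeastSignificantBits())
            .array();
    }

    /** Decodes a UUID from the 16-byte layout produced by toBytes. */
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16)
            throw new IllegalArgumentException("Expected 16 bytes, got " + bytes.length);

        ByteBuffer buf = ByteBuffer.wrap(bytes);

        return new UUID(buf.getLong(), buf.getLong());
    }
}
```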





[jira] [Created] (IGNITE-9192) Dump statistics of processing IO messages.

2018-08-06 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-9192:


 Summary: Dump statistics of processing IO messages. 
 Key: IGNITE-9192
 URL: https://issues.apache.org/jira/browse/IGNITE-9192
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


When debugging performance problems, it's crucial to understand which messages 
are being processed and how long processing takes. When enabled, these 
statistics should be collected and dumped to the log at a predefined frequency.
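
A minimal sketch of such a collector is shown below. It is not an Ignite API; the class, its methods and the output format are hypothetical and only illustrate counting messages and accumulating processing time per message type.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class MessageStatsSketch {
    /** Per-type counters; LongAdder keeps contention low under concurrent updates. */
    private static final class Stat {
        final LongAdder cnt = new LongAdder();
        final LongAdder nanos = new LongAdder();
    }

    private final Map<String, Stat> stats = new ConcurrentHashMap<>();

    /** Records one processed message of the given type and its processing time. */
    void record(String msgType, long durationNanos) {
        Stat s = stats.computeIfAbsent(msgType, k -> new Stat());

        s.cnt.increment();
        s.nanos.add(durationNanos);
    }

    /** Builds a report sorted by message type and resets the counters. */
    String dumpAndReset() {
        StringBuilder sb = new StringBuilder();

        for (Map.Entry<String, Stat> e : new TreeMap<>(stats).entrySet()) {
            long cnt = e.getValue().cnt.sumThenReset();
            long ms = e.getValue().nanos.sumThenReset() / 1_000_000;

            if (cnt > 0)
                sb.append(e.getKey()).append(": count=").append(cnt)
                    .append(", totalMs=").append(ms).append('\n');
        }

        return sb.toString();
    }
}
```

The periodic dump could be driven by a ScheduledExecutorService that calls dumpAndReset at a fixed rate and writes the result to the log.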





[jira] [Created] (IGNITE-9023) LinkageError or ClassNotFoundException should not be swallowed by GridDeploymentCommunication during processing deployment request.

2018-07-17 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-9023:


 Summary: LinkageError or ClassNotFoundException should not be 
swallowed by GridDeploymentCommunication during processing deployment request.
 Key: IGNITE-9023
 URL: https://issues.apache.org/jira/browse/IGNITE-9023
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.5
Reporter: Ivan Daschinskiy
 Fix For: 2.7


In the current implementation, any error thrown in 
GridDeploymentCommunication#processResourceRequest is silently ignored.

Any error should be logged and sent to the client.
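
The "log and report" behaviour can be sketched generically as follows. This is not the GridDeploymentCommunication code; the ResourceLoader interface, the Response type and the process method are hypothetical stand-ins. The point is that LinkageError is an Error, not an Exception, so it has to be caught explicitly.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class ResourceRequestSketch {
    /** Hypothetical loader of class or resource bytes. */
    interface ResourceLoader {
        byte[] load(String name) throws Exception;
    }

    /** Hypothetical response: either the data or an error message for the requester. */
    static final class Response {
        final byte[] data;
        final String errMsg;

        Response(byte[] data, String errMsg) {
            this.data = data;
            this.errMsg = errMsg;
        }
    }

    /** Loads the resource; failures are logged and returned to the requester instead of being dropped. */
    static Response process(String rsrcName, ResourceLoader ldr, Logger log) {
        try {
            return new Response(ldr.load(rsrcName), null);
        }
        catch (Exception | LinkageError e) {
            log.log(Level.SEVERE, "Failed to process resource request: " + rsrcName, e);

            return new Response(null, e.toString());
        }
    }
}
```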






[jira] [Created] (IGNITE-8945) Stored cache data files corruption when node stops abruptly.

2018-07-05 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-8945:


 Summary: Stored cache data files corruption when node stops 
abruptly.
 Key: IGNITE-8945
 URL: https://issues.apache.org/jira/browse/IGNITE-8945
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.5
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.7


When a node is halted while saving stored cache data, the content of the file 
can be corrupted. 
1. An additional check should be implemented in 
FilePageStoreManager.readCacheData (print the name of the corrupted file).
2. In storeCacheData we need to serialize StoredCacheData to a temp file and 
then swap it with the target file.
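
The "write to a temp file, then swap" step can be sketched with java.nio.file. This is a generic illustration, not the storeCacheData implementation; the class name, method name and the ".tmp" suffix are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

final class AtomicFileWriteSketch {
    /** Writes data to a sibling temp file, syncs it to disk, then moves it over the target. */
    static void writeAtomically(Path target, byte[] data) {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");

        try {
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
                ByteBuffer buf = ByteBuffer.wrap(data);

                while (buf.hasRemaining())
                    ch.write(buf);

                ch.force(true); // Flush content and metadata before the swap.
            }

            try {
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            }
            catch (AtomicMoveNotSupportedException e) {
                // Fallback for file systems without atomic rename; not crash-safe.
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            }
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the process dies before the move, the target still holds its previous complete content and only the temp file is left partially written.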





[jira] [Created] (IGNITE-8975) Invalid initialization of compressed archived WAL segment when WAL compression is switched off.

2018-07-10 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-8975:


 Summary: Invalid initialization of compressed archived WAL segment 
when WAL compression is switched off.
 Key: IGNITE-8975
 URL: https://issues.apache.org/jira/browse/IGNITE-8975
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.5
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.7


When a node is restarted with WAL compression disabled while a compressed WAL 
archive is present, the current implementation of FileWriteAheadLogManager 
ignores the existing compressed WAL segment and initializes a brand-new empty 
one. This causes the following error:

{code:java}
2018-07-05 16:14:25.761 
[ERROR][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.c.CheckpointHistory]
 Failed to process checkpoint: CheckpointEntry 
[id=8dc4b1cc-dedd-4a57-8748-f5a7ecfd389d, timestamp=1530785506909, 
ptr=FileWALPointer [idx=4520, fileOff=860507725, len=691515]]
org.apache.ignite.IgniteCheckedException: Failed to find checkpoint record at 
the given WAL pointer: FileWALPointer [idx=4520, fileOff=860507725, len=691515]
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry$GroupStateLazyStore.initIfNeeded(CheckpointEntry.java:346)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry$GroupStateLazyStore.access$300(CheckpointEntry.java:231)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry.initIfNeeded(CheckpointEntry.java:123)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointEntry.groupState(CheckpointEntry.java:105)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.isCheckpointApplicableForGroup(CheckpointHistory.java:377)
at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointHistory.searchAndReserveCheckpoints(CheckpointHistory.java:304)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.reserveHistoryForExchange(GridCacheDatabaseSharedManager.java:1614)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1139)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:724)
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2477)
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2357)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:745)
{code}






[jira] [Created] (IGNITE-8203) Interrupting task can cause node fail with PersistenceStorageIOException.

2018-04-10 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-8203:


 Summary: Interrupting task can cause node fail with 
PersistenceStorageIOException. 
 Key: IGNITE-8203
 URL: https://issues.apache.org/jira/browse/IGNITE-8203
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Affects Versions: 2.4
Reporter: Ivan Daschinskiy
 Fix For: 2.6
 Attachments: GridFailNodesOnCanceledTaskTest.java

Interrupting a task that performs simple cache operations (e.g. get, put) can 
cause PersistenceStorageIOException. The main cause of this failure is the lack 
of proper handling of InterruptedException in FilePageStore.init() and similar 
methods. This leads to ClosedByInterruptException being thrown by 
FileChannel.write() and similar calls. 

PersistenceStorageIOException is a critical failure and typically causes the 
node to stop.

A reproducer is attached.
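
The underlying JDK behaviour can be shown in isolation: a FileChannel operation invoked on a thread whose interrupt status is set closes the channel and throws ClosedByInterruptException. The sketch below is a standalone demonstration, not the attached reproducer; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class InterruptedWriteSketch {
    /**
     * Writes to a file channel with the current thread's interrupt status set.
     * Returns true if the write failed with ClosedByInterruptException and the channel was closed.
     */
    static boolean writeWhileInterrupted(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            Thread.currentThread().interrupt(); // Simulate a cancelled task.

            try {
                ch.write(ByteBuffer.wrap(new byte[] {1}));

                return false;
            }
            catch (ClosedByInterruptException e) {
                return !ch.isOpen(); // The channel is unusable for every other thread as well.
            }
            finally {
                Thread.interrupted(); // Clear the flag so the demo does not affect the caller.
            }
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the channel is closed for all users, a single interrupted task can break a shared page store file, which is why the interrupt needs dedicated handling.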





[jira] [Created] (IGNITE-8120) Improve test coverage of rebalance failing

2018-04-03 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-8120:


 Summary: Improve test coverage of rebalance failing
 Key: IGNITE-8120
 URL: https://issues.apache.org/jira/browse/IGNITE-8120
 Project: Ignite
  Issue Type: Test
  Components: general
Affects Versions: 2.4
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.5








[jira] [Created] (IGNITE-10242) NPE in GridDhtPartitionDemander#handleSupplyMessage when concurrently rebalancing and stopping cache in same cache group.

2018-11-13 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-10242:

 Summary: NPE in GridDhtPartitionDemander#handleSupplyMessage when 
concurrently rebalancing and stopping cache in same cache group.
 Key: IGNITE-10242
 URL: https://issues.apache.org/jira/browse/IGNITE-10242
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.6, 2.5
Reporter: Ivan Daschinskiy
 Fix For: 2.8


An NPE in GridDhtPartitionDemander#handleSupplyMessage occurs when rebalancing 
and stopping a cache in the same cache group run concurrently. A reproducer is 
attached.


{noformat}
java.lang.NullPointerException
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.preloadEntry(GridDhtPartitionDemander.java:893)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.handleSupplyMessage(GridDhtPartitionDemander.java:772)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleSupplyMessage(GridDhtPreloader.java:331)
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:411)
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:401)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1058)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:583)
at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$700(GridCacheIoManager.java:101)

{noformat}










[jira] [Created] (IGNITE-9854) NullPointerException in PageMemoryImpl.refreshOutdatedPages during removing from segCheckpointPages

2018-10-11 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-9854:


 Summary: NullPointerException in 
PageMemoryImpl.refreshOutdatedPages during removing from segCheckpointPages
 Key: IGNITE-9854
 URL: https://issues.apache.org/jira/browse/IGNITE-9854
 Project: Ignite
  Issue Type: Bug
  Components: persistence
Affects Versions: 2.6
Reporter: Ivan Daschinskiy
 Fix For: 2.8


Because segCheckpointPages of a segment can be concurrently set to null 
without holding the segment writeLock (e.g. in 
PageMemoryImpl#finishCheckpoint), a NullPointerException is possible. This 
causes immediate node failure. 

An example stack trace is attached (failure during iteration in the rebalance 
supplier).


{code:java}
java.lang.NullPointerException: null
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.refreshOutdatedPage(PageMemoryImpl.java:840)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.access$5100(PageMemoryImpl.java:120)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl$Segment.removePageForReplacement(PageMemoryImpl.java:2175)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl$Segment.access$900(PageMemoryImpl.java:1841)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:686)
at 
org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:627)
at 
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:140)
at 
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:102)
at 
org.apache.ignite.internal.processors.cache.tree.DataRow.<init>(DataRow.java:54)
at 
org.apache.ignite.internal.processors.cache.tree.CacheDataRowStore.dataRow(CacheDataRowStore.java:73)
at 
org.apache.ignite.internal.processors.cache.tree.CacheDataTree.getRow(CacheDataTree.java:146)
at 
org.apache.ignite.internal.processors.cache.tree.CacheDataTree.getRow(CacheDataTree.java:41)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.fillFromBuffer(BPlusTree.java:4660)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.nextPage(BPlusTree.java:4760)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.next(BPlusTree.java:4689)
{code}
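
One common way to guard against a field being nulled concurrently is to read it once into a local variable before use. The sketch below illustrates that pattern only; it is not PageMemoryImpl code and not necessarily the fix chosen for this ticket (taking the segment write lock is another option). All names are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class CheckpointPagesSketch {
    /** May be set to null by another thread at any time. */
    private volatile Set<Long> checkpointPages = ConcurrentHashMap.newKeySet();

    /** Tracks a page if a checkpoint set is currently present. */
    void add(long pageId) {
        Set<Long> pages = checkpointPages;

        if (pages != null)
            pages.add(pageId);
    }

    /** Drops the set, as a checkpoint finish would in this sketch. */
    void finishCheckpoint() {
        checkpointPages = null;
    }

    /**
     * Reads the field once, so a concurrent null assignment between
     * the null check and the remove call cannot cause an NPE.
     */
    boolean removeIfTracked(long pageId) {
        Set<Long> pages = checkpointPages;

        return pages != null && pages.remove(Long.valueOf(pageId));
    }
}
```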






[jira] [Created] (IGNITE-9452) Correct GridInternalTaskUnusedWalSegmentsTest after merging IGNITE-6552

2018-09-03 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-9452:


 Summary: Correct GridInternalTaskUnusedWalSegmentsTest after 
merging IGNITE-6552
 Key: IGNITE-9452
 URL: https://issues.apache.org/jira/browse/IGNITE-9452
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.8


After merging IGNITE-6552, GridInternalTaskUnusedWalSegmentsTest needs a minor 
correction.





[jira] [Created] (IGNITE-9658) Add ability to disable memory deallocation on deactivation.

2018-09-20 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-9658:


 Summary: Add ability to disable memory deallocation on 
deactivation.
 Key: IGNITE-9658
 URL: https://issues.apache.org/jira/browse/IGNITE-9658
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.8


Currently, on some systems (e.g. RHEL 7.4), the process freezes during massive 
UNSAFE.freeMemory calls. This behaviour can lead to segmentation of the node, 
especially when ZookeeperDiscoverySPI is used. There should be an ability to 
disable memory deallocation during cluster deactivation.





[jira] [Created] (IGNITE-10997) Add new property to DataRegionMetrics: empty pages count in reuseList.

2019-01-21 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-10997:

 Summary: Add new property to DataRegionMetrics: empty pages count 
in reuseList.
 Key: IGNITE-10997
 URL: https://issues.apache.org/jira/browse/IGNITE-10997
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Daschinskiy
 Fix For: 2.8


In order to estimate the available space in a data region, a new property 
should be added to the data region metrics: the empty pages count from 
org.apache.ignite.internal.processors.cache.persistence.freelist.AbstractFreeList#emptyDataPages





[jira] [Created] (IGNITE-10339) Connection to cluster failed in control.sh while using --cache

2018-11-20 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-10339:

 Summary: Connection to cluster failed in control.sh while using 
--cache
 Key: IGNITE-10339
 URL: https://issues.apache.org/jira/browse/IGNITE-10339
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.6
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.8








[jira] [Created] (IGNITE-10406) .NET Failed to run ScanQuery with custom filter after server node restart

2018-11-26 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-10406:

 Summary: .NET Failed to run ScanQuery with custom filter after 
server node restart
 Key: IGNITE-10406
 URL: https://issues.apache.org/jira/browse/IGNITE-10406
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Daschinskiy


Scenario:
1. Start the server.
2. Start the client.
3. Restart the server and wait for the client to reconnect.
4. Put some data into the cache and run a ScanQuery with a custom filter.
 
StackTrace:

{code:java}
class org.apache.ignite.IgniteCheckedException: Failed to inject resource 
[method=setIgniteInstance, 
target=org.apache.ignite.internal.processors.platform.cache.PlatformCacheEntryFilterImpl@6225c21c,
 rsrc=IgniteKernal [cfg=IgniteConfiguration 
[igniteInstanceName=CashflowCluster, pubPoolSize=8, svcPoolSize=8, 
callbackPoolSize=8, stripedPoolSize=8, sysPoolSize=8, mgmtPoolSize=4, 
igfsPoolSize=4, dataStreamerPoolSize=8,
 utilityCachePoolSize=8, utilityCacheKeepAliveTime=6, p2pPoolSize=2, 
qryPoolSize=8, 
igniteHome=C:\Job\fd-tasks\7404\IgniteTests2\packages\Apache.Ignite.2.6.0, 
igniteWorkDir=C:\Job\fd-tasks\7404\IgniteTests2\packages\Apache.Ignite.2.6.0\work,
 mbeanSrv=com.sun.jmx.mbeanserver.JmxMBeanServer@49993335, 
nodeId=3f4aadd9-01b3-4ffe-b629-895fb6ac886f, 
marsh=org.apache.ignite.internal.binary.BinaryMarshaller@77a57272, mar
shLocJobs=false, daemon=false, p2pEnabled=false, netTimeout=5000, 
sndRetryDelay=1000, sndRetryCnt=3, metricsHistSize=1, 
metricsUpdateFreq=2000, metricsExpTime=9223372036854775807, 
discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=5000, ackTimeout=5000, 
marsh=JdkMarshaller 
[clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@65b1c1e3], 
reconCnt=10, reconDelay=2000, maxAckTimeout=60, forceSrvMode=fals
e, clientReconnectDisabled=false, internalLsnr=null], segPlc=STOP, 
segResolveAttempts=2, waitForSegOnStart=true, allResolversPassReq=true, 
segChkFreq=1, commSpi=TcpCommunicationSpi 
[connectGate=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$ConnectGateway@4737110c,
 connPlc=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$6@bce0ed4, 
enableForcibleNodeKill=false, enableTroubleshootingLog=fa
lse, 
srvLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2@11c20519, 
locAddr=null, locHost=0.0.0.0/0.0.0.0, locPort=47100, locPortRange=100, 
shmemPort=-1, directBuf=true, directSndBuf=false, idleConnTimeout=60, 
connTimeout=5000, maxConnTimeout=60, reconCnt=10, sockSndBuf=32768, 
sockRcvBuf=32768, msgQueueLimit=0, slowClientQueueLimit=0, 
nioSrvr=GridNioServer [selectorSpins=0, filterChain=Filte
rChain[filters=[GridNioCodecFilter 
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@6839fd4e, 
directMode=true], GridConnectionBytesVerifyFilter], 
lsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2@11c20519, 
closed=false, directBuf=true, tcpNoDelay=true, sockSndBuf=32768, 
sockRcvBuf=32768, writeTimeout=2000, idleTimeout=60, skipWrite=false, 
skipRead=false, locAddr=0.0.0.0/0.0.0.0:47100
, order=LITTLE_ENDIAN, sndQueueLimit=0, directMode=true, 
metricsLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationMetricsListener@4e41089d,
 sslFilter=null, msgQueueLsnr=null, readerMoveCnt=0, writerMoveCnt=0, 
readWriteSelectorsAssign=false], shmemSrv=null, usePairedConnections=false, 
connectionsPerNode=1, tcpNoDelay=true, filterReachableAddresses=false, 
ackSndThreshold=32, unackedMsgsBufSize=0, sockWriteT
imeout=2000, 
lsnr=org.apache.ignite.internal.managers.communication.GridIoManager$2@432d2e4e,
 boundTcpPort=47100, boundTcpShmemPort=-1, selectorsCnt=4, selectorSpins=0, 
addrRslvr=null, ctxInitLatch=java.util.concurrent.CountDownLatch@70beb599[Count 
= 0], stopping=false, 
metricsLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationMetricsListener@4e41089d],
 evtSpi=org.apache.ignite.spi.eventstorage.NoopEventSt
orageSpi@32a068d1, colSpi=NoopCollisionSpi [], deploySpi=LocalDeploymentSpi 
[lsnr=org.apache.ignite.internal.managers.deployment.GridDeploymentLocalStore$LocalDeploymentListener@3c6df856],
 indexingSpi=org.apache.ignite.spi.indexing.noop.NoopIndexingSpi@282003e1, 
addrRslvr=null, clientMode=false, rebalanceThreadPoolSize=1, 
txCfg=org.apache.ignite.configuration.TransactionConfiguration@7fad8c79, 
cacheSanityCheckEnable
d=true, discoStartupDelay=6, deployMode=SHARED, p2pMissedCacheSize=100, 
locHost=null, timeSrvPortBase=31100, timeSrvPortRange=100, 
failureDetectionTimeout=1, clientFailureDetectionTimeout=3, 
metricsLogFreq=6, hadoopCfg=null, 
connectorCfg=org.apache.ignite.configuration.ConnectorConfiguration@71a794e5, 
odbcCfg=null, warmupClos=null, atomicCfg=AtomicConfiguration 
[seqReserveSize=1000, cacheMode=PARTITI
ONED, backups=1, aff=null, grpName=null], classLdr=null, sslCtxFactory=null, 
platformCfg=PlatformDotNetConfiguration [binaryCfg=null], binaryCfg=null, 

[jira] [Created] (IGNITE-11400) Rebalancing caches with TTL enabled can cause data corruption.

2019-02-24 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-11400:

 Summary: Rebalancing caches with TTL enabled can cause data 
corruption.
 Key: IGNITE-11400
 URL: https://issues.apache.org/jira/browse/IGNITE-11400
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.5
Reporter: Ivan Daschinskiy


During or just after rebalancing caches with TTL enabled, data corruption can 
occur while ttl-cleanup-worker purges expired data.

See details in the log:

{code:java}
[15:24:01,677][INFO 
][sys-#49%datafabric-dev-21.example.com%][GridDhtPartitionDemander] Started 
rebalance routine [M2_PRODUCT_CACHE, 
supplier=14c0d3aa-6720-4c7f-a0e5-3ae1a00948b6, topic=0, fullPartitions=[1, 55, 
112, 153, 170, 175, 204, 236, 247, 331, 347, 417, 473, 503, 514, 524, 551, 745, 
748, 752, 762, 803, 816, 831, 851, 877, 928, 939], histPartitions=[]]
[15:24:02,031][ERROR][ttl-cleanup-worker-#39%datafabric-dev-21.example.com%][GridCacheTtlManager]
  Failed to process entry expiration: class 
o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: Runtime 
failure on bounds: [lower=null, upper=PendingRow []]
class 
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
 Runtime failure on bounds: [lower=null, upper=PendingRow []]
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1000)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:979)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:1957)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:1913)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:861)
at 
org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:207)
at 
org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:142)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Invalid object type: 0
at 
org.apache.ignite.internal.processors.cacheobject.IgniteCacheObjectProcessorImpl.toKeyCacheObject(IgniteCacheObjectProcessorImpl.java:166)
at 
org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.toKeyCacheObject(CacheObjectBinaryProcessorImpl.java:980)
at 
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.readFullRow(CacheDataRowAdapter.java:299)
at 
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:159)
at 
org.apache.ignite.internal.processors.cache.persistence.CacheDataRowAdapter.initFromLink(CacheDataRowAdapter.java:102)
at 
org.apache.ignite.internal.processors.cache.tree.PendingRow.initKey(PendingRow.java:72)
at 
org.apache.ignite.internal.processors.cache.tree.PendingEntriesTree.getRow(PendingEntriesTree.java:118)
at 
org.apache.ignite.internal.processors.cache.tree.PendingEntriesTree.getRow(PendingEntriesTree.java:31)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.fillFromBuffer(BPlusTree.java:4702)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.init(BPlusTree.java:4604)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$ForwardCursor.access$5000(BPlusTree.java:4543)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:956)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:988)
... 8 more
[15:24:02,348][INFO 
][sys-#52%datafabric-dev-21.example.com%][GridDhtPartitionDemander] Started 
rebalance routine [CART_CACHE, supplier=70be2776-9e8f-4940-8a07-5e3c0ad43bdd, 
topic=0, fullPartitions=[561, 897], histPartitions=[]]
[15:24:02,439][INFO 
][sys-#48%datafabric-dev-21.example.com%][GridDhtPartitionDemander] Completed 
(final) rebalancing [grp=CART_CACHE, 
supplier=14c0d3aa-6720-4c7f-a0e5-3ae1a00948b6, topVer=AffinityTopologyVersion 
[topVer=921, minorTopVer=0], progress=5/5, time=95 ms]
[15:24:02,558][ERROR][ttl-cleanup-worker-#39%datafabric-dev-21.example.com%][GridCacheTtlManager]
  Failed to process entry expiration: class 
o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: Runtime 
failure on bounds: [lower=null, upper=PendingRow []]
class 

[jira] [Created] (IGNITE-11364) Segmenting node can cause ring topology broke

2019-02-20 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-11364:
-

 Summary: Segmenting node can cause ring topology broke
 Key: IGNITE-11364
 URL: https://issues.apache.org/jira/browse/IGNITE-11364
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.7, 2.6, 2.5
Reporter: Ivan Daschinskiy
 Fix For: 2.8


Segmenting a node with a partial network drop, e.g. by applying iptables rules, 
can break the ring.
Scenario:
Each machine runs two nodes, a client and a server.

Let's draw a diagram (server nodes only, for brevity; they were started before 
the clients).

=> grid915 => ... => grid947 => grid945 => grid703 => ..skip 12 nodes...=> 
grid952 => grid946.
On the grid945 machine we drop incoming/outgoing connections with iptables.

While the connections are being dropped, grid945 sends a 
TcpDiscoveryStatusCheckMessage. It cannot deliver the message to grid703 or to 
the 12 nodes after it, but a later node accepts it together with the collection 
of failedNodes (the 13 nodes above). grid947 receives this message and skips 
these 13 nodes in 
org.apache.ignite.spi.discovery.tcp.ServerImpl.RingMessageWorker#sendMessageAcrossRing.

So we see the following situation in the topology:

.. => grid947 => grid952
 ^ 
//
grid703=>=>grid662

These nodes are not considered failed by the topology.
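
The skipping behaviour can be illustrated with a toy model of a ring. This is 
only a sketch, not the logic of ServerImpl: the class and method names are 
hypothetical, the ring is shortened, and grid662 stands in for the 12 skipped 
nodes (an assumption based on the diagram above).

```java
import java.util.List;
import java.util.Set;

/** Toy model of a discovery ring; it is not Ignite's ServerImpl logic. */
class RingSkipSketch {
    /**
     * Returns the first node after {@code self} in ring order that is not in
     * {@code failed}, or {@code self} if every other node is marked failed.
     */
    static String nextAlive(List<String> ring, String self, Set<String> failed) {
        int start = ring.indexOf(self);

        if (start < 0)
            throw new IllegalArgumentException("Unknown node: " + self);

        // Walk the ring once, starting right after self and wrapping around.
        for (int i = 1; i < ring.size(); i++) {
            String candidate = ring.get((start + i) % ring.size());

            if (!failed.contains(candidate))
                return candidate;
        }

        return self;
    }
}
```

With the ring grid915, grid947, grid945, grid703, grid662, grid952, grid946 and 
a failed set of grid703 and grid662, nextAlive for grid945 yields grid952. With 
an empty failed set, grid703 still sees grid662 as its next node, which is the 
kind of disagreement described above.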


 









--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11139) Remove deprecated snapshot tags from PageMetaIO.

2019-01-30 Thread Ivan Daschinskiy (JIRA)
Ivan Daschinskiy created IGNITE-11139:
-

 Summary: Remove deprecated snapshot tags from PageMetaIO.
 Key: IGNITE-11139
 URL: https://issues.apache.org/jira/browse/IGNITE-11139
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Daschinskiy
 Fix For: 3.0


After IGNITE-9672 is resolved, the unnecessary methods should be removed from 
PageMetaIO.
The corresponding PageDeltaRecords should be removed as well.





[jira] [Created] (IGNITE-12897) Add .NET api to enabling SQL indexing for existing cache.

2020-04-14 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-12897:
-

 Summary: Add .NET api to enabling SQL indexing for existing cache.
 Key: IGNITE-12897
 URL: https://issues.apache.org/jira/browse/IGNITE-12897
 Project: Ignite
  Issue Type: Bug
  Components: platforms
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


Add a .NET API for enabling SQL indexing on an existing cache.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12999) Fix broken ZookeeperDiscoverySpiSslTest.testIgniteSslWrongPort

2020-05-11 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-12999:
-

 Summary: Fix broken 
ZookeeperDiscoverySpiSslTest.testIgniteSslWrongPort
 Key: IGNITE-12999
 URL: https://issues.apache.org/jira/browse/IGNITE-12999
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


After [IGNITE-12992|https://issues.apache.org/jira/browse/IGNITE-12992] was 
merged to master,
the test mentioned above, which was initially broken, started to fail in 
master. This is because the actual zk connection string was set instead of a 
wrong one, so the node joins and the assertion fails. The fix is trivial.





[jira] [Created] (IGNITE-12992) Fix multijvm failing tests in ZookeeperDiscoverySpiTestSuite3

2020-05-08 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-12992:
-

 Summary: Fix multijvm failing tests in 
ZookeeperDiscoverySpiTestSuite3
 Key: IGNITE-12992
 URL: https://issues.apache.org/jira/browse/IGNITE-12992
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy








[jira] [Created] (IGNITE-13043) Fix compilation error in Ignite C++, when boost version is greater than 1.70

2020-05-20 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13043:
-

 Summary: Fix compilation error in Ignite C++, when boost version 
is greater than 1.70 
 Key: IGNITE-13043
 URL: https://issues.apache.org/jira/browse/IGNITE-13043
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


Fix the compilation issue in TeamcityBoostLogFormatter when the libboost 
version is greater than 1.71.





[jira] [Created] (IGNITE-13042) Update SSL certificates in C++ test suites to more secure signature

2020-05-20 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13042:
-

 Summary: Update SSL certificates in C++ test suites to more secure 
signature
 Key: IGNITE-13042
 URL: https://issues.apache.org/jira/browse/IGNITE-13042
 Project: Ignite
  Issue Type: Test
  Components: platforms
Reporter: Ivan Daschinskiy


When a modern OpenSSL is used (e.g. OpenSSL 1.1.1f, which is the default on 
Ubuntu 20.04), the provided certificates are not accepted because they use the 
sha1WithRSAEncryption signature, which is widely considered weak. So the 
certificates need to be renewed.

Example error:

{code}
Connecting to 127.0.0.1:0
140246535644992:error:140AB18E:SSL routines:SSL_CTX_use_certificate:ca md too 
weak:../ssl/ssl_rsa.c:310:
Failed to connect :Can not set client certificate file for secure connection: 
path 
/home/ivandasch/ignite/modules/platforms/cpp/thin-client-test/config/ssl/client_full.pem

{code}
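
To find which certificates are affected, the signature algorithm can be read 
from each file. The following is a minimal sketch using the standard JDK 
certificate API. The class and method names are hypothetical, the weak-name 
check is a simple prefix heuristic, and it assumes the file holds a 
certificate on its own (not a bundle that starts with a private key).

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.util.Locale;

class CertSignatureCheck {
    /** True if the JCA signature algorithm name is based on MD5 or SHA-1, e.g. "SHA1withRSA". */
    static boolean isWeakSignature(String sigAlgName) {
        String name = sigAlgName.toUpperCase(Locale.ROOT);

        return name.startsWith("MD5") || name.startsWith("SHA1") || name.startsWith("SHA-1");
    }

    /** Reads the first X.509 certificate from the file and returns its signature algorithm name. */
    static String signatureAlgorithm(String certPath) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(certPath))) {
            CertificateFactory factory = CertificateFactory.getInstance("X.509");
            X509Certificate cert = (X509Certificate)factory.generateCertificate(in);

            return cert.getSigAlgName();
        }
    }
}
```

A certificate for which isWeakSignature(signatureAlgorithm(path)) is true is a 
candidate for reissuing with a SHA-256 based signature.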






[jira] [Created] (IGNITE-13430) Create minimal documentation for ducktape tests

2020-09-10 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13430:
-

 Summary: Create minimal documentation for ducktape tests
 Key: IGNITE-13430
 URL: https://issues.apache.org/jira/browse/IGNITE-13430
 Project: Ignite
  Issue Type: Task
  Components: documentation
Reporter: Ivan Daschinskiy
Assignee: Sergei Ryzhov


Create minimal quickstart documentation in {{README.md}}.

The documentation should contain the following:
# Requirements for development
# Requirements for running tests locally
# Exact steps for running tests locally (full suite, particular suite, 
particular test)





[jira] [Created] (IGNITE-13429) Implement integration tests for control.sh transactions management

2020-09-10 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13429:
-

 Summary: Implement integration tests for control.sh transactions 
management
 Key: IGNITE-13429
 URL: https://issues.apache.org/jira/browse/IGNITE-13429
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy








[jira] [Created] (IGNITE-13508) Test scenario of two-phased rebalance (PDS reduce)

2020-10-02 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13508:
-

 Summary: Test scenario of two-phased rebalance (PDS reduce)
 Key: IGNITE-13508
 URL: https://issues.apache.org/jira/browse/IGNITE-13508
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Daschinskiy


Let us assume a cluster of 16 affinity nodes.

Let's divide the cluster into 4 equal cells.
Each node in a cell has the same node attribute {{CELL=CELL_}}

The caches started on these nodes should use an affinity function with this 
backup filter:

{code:java}
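import java.util.List;
import java.util.Objects;
import org.apache.ignite.cluster.ClusterNode;
import org.apache.ignite.lang.IgniteBiPredicate;

/** Accepts a backup candidate only if it is in the same cell as the already selected nodes. */
public class CellularAffinityBackupFilter implements 
IgniteBiPredicate<ClusterNode, List<ClusterNode>> {
    private static final long serialVersionUID = 1L;

    /** Name of the node attribute that identifies the cell. */
    private final String attrName;

    public CellularAffinityBackupFilter(String attrName) {
        this.attrName = attrName;
    }

    @Override public boolean apply(ClusterNode candidate, List<ClusterNode> 
previouslySelected) {
        // Only the first selected node is compared: the loop returns on its first iteration.
        for (ClusterNode node : previouslySelected)
            return Objects.equals(candidate.attribute(attrName), 
node.attribute(attrName));

        // Nothing has been selected yet, so any candidate is acceptable.
        return true;
    }
}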
{code}


Steps:
*  Preparations.
1. Start all 4 cells.
2. Load data into a cache with the affinity function mentioned above and record 
the PDS size on all nodes.
3. Delete 80% of the data and record the PDS size on all nodes.
*  Phase 1
1. Stop two nodes in each cell (half of all nodes in total) and clean their PDS.
2. Start the cleaned nodes, preserving their consistent IDs and cell attributes.
3. Wait for rebalance to finish.
* Phase 2
Run steps 1-3 of Phase 1 on the other half of the cluster.
* Verifications
1. Check that the PDS size has decreased (compared to step 3 of Preparations).
2. Check data consistency (idle_verify --dump)
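
The size checks in the steps above need some way to measure a persistence 
directory. A minimal sketch, assuming the PDS of a node lives under a single 
directory whose path is known to the test; the class and method names are 
hypothetical and this is not part of any Ignite test framework:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class PdsSizeSketch {
    /** Sums the sizes in bytes of all regular files under {@code root}, recursively. */
    static long directorySize(Path root) {
        try (Stream<Path> files = Files.walk(root)) {
            return files
                .filter(Files::isRegularFile)
                .mapToLong(p -> p.toFile().length())
                .sum();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The value measured after the deletion step can then be compared with the 
value measured after both rebalance phases have finished.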











[jira] [Created] (IGNITE-13540) Exchange worker, waiting for new task from queue, considered as blocked.

2020-10-07 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13540:
-

 Summary: Exchange worker, waiting for new task from queue, 
considered as blocked.
 Key: IGNITE-13540
 URL: https://issues.apache.org/jira/browse/IGNITE-13540
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


Waiting for a new task in ExchangeWorker#body is currently not marked as a 
blocking section.
So if the network timeout (the timeout for polling a task from the queue) is 
greater than the system worker blocked timeout, the exchange worker thread is 
considered blocked. Sometimes this is reported in the logs a few seconds after 
PME has actually finished:


{noformat}
[2020-10-06 16:55:45,939][INFO 
][exchange-worker-#50][org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager1]
 Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion 
[topVer=6, minorTopVer=1], force=false, evt=DISCOVERY_CUSTOM_EVT, 
node=163fd0f0-b9a4-4317-a28f-f7dbdb776076]
[2020-10-06 16:55:48,822][ERROR][tcp-disco-msg-worker-[9e18957a 
172.18.0.5:47500]-#2-#44][org.apache.ignite.internal.util.typedef.G1] Blocked 
system-critical thread has been detected. This can lead to cluster-wide 
undefined behaviour [workerName=partition-exchanger, 
threadName=exchange-worker-#50, blockedFor=2s]
[2020-10-06 16:55:48,824][WARN ][tcp-disco-msg-worker-[9e18957a 
172.18.0.5:47500]-#2-#44][org.apache.ignite.internal.util.typedef.G1] Thread 
[name="exchange-worker-#50", id=90, state=TIMED_WAITING, blockCnt=20, 
waitCnt=48]
Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@14f29e0e,
 ownerName=null, ownerId=-1]

[2020-10-06 16:55:48,827][WARN ][tcp-disco-msg-worker-[9e18957a 
172.18.0.5:47500]-#2-#44][root1] Possible failure suppressed accordingly to a 
configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class 
o.a.i.IgniteException: GridWorker [name=partition-exchanger, 
igniteInstanceName=null, finished=false, heartbeatTs=1601992545941]]]
class org.apache.ignite.IgniteException: GridWorker [name=partition-exchanger, 
igniteInstanceName=null, finished=false, heartbeatTs=1601992545941]
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1860)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1855)
at 
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:234)
at 
org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:299)
{noformat}






[jira] [Created] (IGNITE-13572) Duplicates in select query during partition eviction.

2020-10-12 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13572:
-

 Summary: Duplicates in select query during partition eviction.
 Key: IGNITE-13572
 URL: https://issues.apache.org/jira/browse/IGNITE-13572
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.8.1, 2.9
Reporter: Ivan Daschinskiy


Scenario:

# Start 2 nodes with an indexed atomic partitioned cache with 0 backups.
# Load a sufficient amount of data (or emulate slow removal from the index).
# Start another node.
# Perform SELECT * FROM .

A reproducer is attached.





[jira] [Created] (IGNITE-13575) Invalid blocking section in GridNioWorker and GridNioClientWorker leads to false positive blocking thread detection

2020-10-12 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13575:
-

 Summary: Invalid blocking section in GridNioWorker and 
GridNioClientWorker leads to false positive blocking thread detection
 Key: IGNITE-13575
 URL: https://issues.apache.org/jira/browse/IGNITE-13575
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.8.1, 2.9
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


If {{IGNITE_SYSTEM_WORKER_BLOCKED_TIMEOUT}} is less than 2000 ms, then a plain 
{{epoll_wait}} for 2000 ms on an idle cluster is considered a critical failure. 

We should surround {{selector.select}} with {{blockingSectionBegin}} and 
{{blockingSectionEnd}} instead of calling {{updateHeartbeat}}.
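
The difference between the two approaches can be shown with a simplified 
model of a watched worker. This sketch only mirrors the idea and is not the 
code of GridWorker. The class name, the {{looksBlocked}} method and the use of 
Long.MAX_VALUE as the "inside a blocking section" marker are assumptions made 
for illustration, and timestamps are passed in explicitly to keep the example 
deterministic.

```java
/** Simplified model of a worker watched by a liveness checker. */
class HeartbeatSketch {
    /** Last sign of life; Long.MAX_VALUE marks "inside a blocking section". */
    private long heartbeatTs;

    /** Records that the worker was alive at time {@code now}. */
    void updateHeartbeat(long now) {
        heartbeatTs = now;
    }

    /** Tells the checker that the worker is about to wait on purpose. */
    void blockingSectionBegin() {
        heartbeatTs = Long.MAX_VALUE;
    }

    /** Ends the intentional wait and resumes normal heartbeat tracking. */
    void blockingSectionEnd(long now) {
        heartbeatTs = now;
    }

    /** What a checker thread would conclude at time {@code now}. */
    boolean looksBlocked(long now, long blockedTimeoutMs) {
        return now - heartbeatTs > blockedTimeoutMs;
    }
}
```

With a blocked timeout of 1000 ms, a worker that calls updateHeartbeat(0) and 
then waits in a 2000 ms select looks blocked at time 1999. A worker that calls 
blockingSectionBegin() before the same wait does not.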







[jira] [Created] (IGNITE-13564) Improve SYSTEM_WORKER_BLOCKED reporting.

2020-10-09 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13564:
-

 Summary: Improve SYSTEM_WORKER_BLOCKED reporting.
 Key: IGNITE-13564
 URL: https://issues.apache.org/jira/browse/IGNITE-13564
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 2.8.1, 2.9
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.10


Currently, reporting of system thread blocking has major drawbacks.

1. Because a blocked system worker is detected by another thread, the failure 
handler does not receive full information about the problem. In 
{{FailureContext}} we have only two fields, {{type}} and {{err}}. The throwable 
{{err}} is created in the detecting thread, so the context of the actual 
problem is lost. 
2. Currently, the stack trace of the blocked thread printed in 
{{org.apache.ignite.internal.worker.WorkersRegistry#onIdle}} is incomplete. 
3. The current approach doesn't work when there is only one thread in the 
registry. This is not checked, and it can cause a single thread to loop 
infinitely, calling {{onIdle}}.

These drawbacks can lead to a complete loss of information about the blocked 
system thread.

I suggest:
1. Add another parameter to {{FailureContext}}, namely {{worker}}.
2. Fix thread dump printing.
3. Add an assertion for the case when there is only one system thread in the 
registry.
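
The first two suggestions can be sketched as follows. This is a hypothetical, 
simplified stand-in and not the real {{FailureContext}}: the class name, the 
constructor and the {{workerDump}} method are assumptions made for 
illustration.

```java
/** Hypothetical failure context that also carries the blocked worker thread. */
class FailureContextSketch {
    final String type;
    final Throwable err;
    final Thread worker;

    FailureContextSketch(String type, Throwable err, Thread worker) {
        this.type = type;
        this.err = err;
        this.worker = worker;
    }

    /** Renders the full stack trace of the blocked worker, not of the detecting thread. */
    String workerDump() {
        StringBuilder sb = new StringBuilder();

        sb.append("Thread [name=\"").append(worker.getName())
            .append("\", state=").append(worker.getState()).append("]\n");

        // Thread.getStackTrace() returns every frame, so nothing is truncated here.
        for (StackTraceElement frame : worker.getStackTrace())
            sb.append("    at ").append(frame).append('\n');

        return sb.toString();
    }
}
```

A failure handler that receives such a context can log the state of the 
blocked worker itself, even though {{err}} was created in another thread.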





[jira] [Created] (IGNITE-13495) ZookeeperDiscoverySpiMBeanImpl#getCoordinator can return invalid node as coordinator

2020-09-29 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13495:
-

 Summary: ZookeeperDiscoverySpiMBeanImpl#getCoordinator can return 
invalid node as coordinator
 Key: IGNITE-13495
 URL: https://issues.apache.org/jira/browse/IGNITE-13495
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.9
Reporter: Ivan Daschinskiy
 Fix For: 2.10


Due to an invalid algorithm in 
{{org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoveryImpl#getCoordinator}},
an invalid coordinator can sometimes be returned.

Consider the scenario:
1. Start server #1
2. Start client
3. Start server #2
4. Stop server #1

After this, {{ZookeeperDiscoverySpiMBeanImpl#getCoordinator}} returns a client 
as the coordinator, because it is the oldest node in the topology.

We should rewrite 
{{org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoveryImpl#getCoordinator}}
 to return the *oldest server*, not any node.
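
The difference between the two selection rules can be sketched with a small 
self-contained model. The {{Node}} class and both methods are hypothetical 
stand-ins and do not reproduce the code of {{ZookeeperDiscoveryImpl}}.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class CoordinatorSketch {
    /** Minimal stand-in for a cluster node: a join order plus a client flag. */
    static final class Node {
        final String name;
        final long order;
        final boolean client;

        Node(String name, long order, boolean client) {
            this.name = name;
            this.order = order;
            this.client = client;
        }
    }

    /** Oldest node of any kind; picks a client if the client joined first. */
    static Optional<Node> oldestNode(List<Node> top) {
        return top.stream().min(Comparator.comparingLong((Node n) -> n.order));
    }

    /** Oldest server node; clients are filtered out before join orders are compared. */
    static Optional<Node> oldestServer(List<Node> top) {
        return top.stream()
            .filter(n -> !n.client)
            .min(Comparator.comparingLong((Node n) -> n.order));
    }
}
```

For the scenario above, the topology after step 4 holds the client (order 2) 
and server #2 (order 3): oldestNode returns the client, while oldestServer 
returns server #2.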






[jira] [Created] (IGNITE-13481) Decorators @version_if and @ignite_versions injects incorrect variables.

2020-09-24 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13481:
-

 Summary: Decorators @version_if and @ignite_versions injects 
incorrect variables.
 Key: IGNITE-13481
 URL: https://issues.apache.org/jira/browse/IGNITE-13481
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


Sometimes these decorators inject variables incorrectly, especially when 
mixed. We need
to fix the corner cases and check them in unit tests.
As a side effect, introduce unit testing in the ducktests module.





[jira] [Created] (IGNITE-13491) Fix incorrect topology snapshot logger output about coordinator change.

2020-09-29 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13491:
-

 Summary: Fix incorrect topology snapshot logger output about 
coordinator change.
 Key: IGNITE-13491
 URL: https://issues.apache.org/jira/browse/IGNITE-13491
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Daschinskiy


Currently, the logic in 
{{org.apache.ignite.internal.managers.discovery.GridDiscoveryManager#topologySnapshotMessage}}
 has a major drawback: the condition doesn't check that a failed node with an 
order lower than that of the oldest server node is actually a server node. So 
we can see an invalid message about a coordinator change, even though the 
previous node was a client.

Reproducer:
1. Start server #1
2. Start client
3. Start server #2
4. Stop server #1 and client

We will see something like this in the logs of server #2:

{{[2020-09-29 10:41:25,909][INFO 
][disco-event-worker-#150%tcp.TcpDiscoverySpiMBeanTest2%][GridDiscoveryManager] 
Coordinator changed [prev=TcpDiscoveryNode 
[id=371896fb-f612-4640-bfcd-cef6d281, 
consistentId=371896fb-f612-4640-bfcd-cef6d281, addrs=ArrayList [127.0.0.1], 
sockAddrs=HashSet [/127.0.0.1:0], discPort=0, order=2, intOrder=2, 
lastExchangeTime=1601365285287, loc=false, ver=2.10.0#20200929-sha1:, 
*isClient=true*], cur=TcpDiscoveryNode 
[id=9d90f4b0-1374-4147-b7a7-d821f002, consistentId=127.0.0.1:47501, 
addrs=ArrayList [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47501], 
discPort=47501, order=3, intOrder=3, lastExchangeTime=1601365285900, loc=true, 
ver=2.10.0#20200929-sha1:, isClient=false]]}}








[jira] [Created] (IGNITE-13078) С++: Add CMake build support

2020-05-26 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13078:
-

 Summary: С++: Add CMake build support
 Key: IGNITE-13078
 URL: https://issues.apache.org/jira/browse/IGNITE-13078
 Project: Ignite
  Issue Type: Improvement
  Components: platforms
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.9


Currently, it is hard to build Ignite.C++: the build process differs between 
Windows and Linux, building on Mac OS X (quite a popular OS among developers) 
is not supported, and there is no IDE support at all except Visual Studio on 
Windows.

I'd suggest migrating to the CMake build system. It is very popular among open 
source projects, including those of the Apache Software Foundation. Notable 
users: Apache Mesos, Apache Zookeeper (its C client offers CMake as an 
alternative to autoconf and as the only option on Windows), Apache Kafka 
(librdkafka, the C/C++ client), Apache Thrift. The popular column-oriented 
database ClickHouse also uses CMake.

CMake is widely supported by many IDEs on various platforms, notably Visual 
Studio, CLion, Xcode, QtCreator, KDevelop.





[jira] [Created] (IGNITE-13459) Document new build process for Ignite C++

2020-09-18 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13459:
-

 Summary: Document new build process for Ignite C++
 Key: IGNITE-13459
 URL: https://issues.apache.org/jira/browse/IGNITE-13459
 Project: Ignite
  Issue Type: Task
  Components: documentation
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.9








[jira] [Created] (IGNITE-13321) Control utility doesn't print results to stdout.

2020-08-03 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13321:
-

 Summary: Control utility doesn't print results to stdout.
 Key: IGNITE-13321
 URL: https://issues.apache.org/jira/browse/IGNITE-13321
 Project: Ignite
  Issue Type: Bug
  Components: control.sh
Affects Versions: 2.10
Reporter: Ivan Daschinskiy
 Fix For: 2.10


After merging [IGNITE-13123|https://issues.apache.org/jira/browse/IGNITE-13123], 
{{control.sh}} doesn't work properly in either dev mode or release mode. 
Specifically,
no output is printed to stdout. However, 

For example, incorrect output for {{control.sh --activate}} after the commit is:
{code:sh}
Control utility [ver. 2.9.0-SNAPSHOT#20200803-sha1:DEV]
2020 Copyright(C) Apache Software Foundation
User: ivandasch
Time: 2020-08-03T17:21:06.246
Command [BASELINE] started
Arguments: --baseline

Failed to execute baseline command='collect'
Latest topology update failed.
Connection to cluster failed. Latest topology update failed.
Command [BASELINE] finished with code: 2
Control utility has completed execution at: 2020-08-03T17:21:09.613
Execution time: 3367 ms
{code}
Correct output for {{control.sh --activate}} before the commit is:
{code}
Control utility [ver. 2.8.1#20200521-sha1:86422096], 
2020 Copyright(C) Apache Software Foundation, 
User: ducker, 
Time: 2020-08-03T14:23:55.793, 
Command [BASELINE] started, 
 Arguments: --host ducker04 --baseline set ducker02,ducker03,ducker04 --yes, 
 
,
 
 Cluster state: active, 
 Current topology version: 3, 
 Baseline auto adjustment disabled: softTimeout=30
   Current topology version: 3 (Coordinator: ConsistentId=ducker02, Order=1)
Baseline nodes: 
   ConsistentId=ducker02, State=ONLINE, Order=1, 
   ConsistentId=ducker03, State=ONLINE, Order=2, 
   ConsistentId=ducker04, State=ONLINE, Order=3, 
,
 
  "Number of baseline nodes: 3\n", 
  "\n", 
  "Other nodes not found.\n", 
  "Command [BASELINE] finished with code: 0\n", 
  "Control utility has completed execution at: 2020-08-03T14:23:57.351\n", 
  "Execution time: 1558 ms\n"
{code}






[jira] [Created] (IGNITE-13328) Control.sh bash script swallow return code of CommandHandler and always return 0

2020-08-05 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13328:
-

 Summary: Control.sh bash script swallow return code of 
CommandHandler and always return 0
 Key: IGNITE-13328
 URL: https://issues.apache.org/jira/browse/IGNITE-13328
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.8.1, 2.8
Reporter: Ivan Daschinskiy
 Fix For: 2.9


After merging [IGNITE-12367|https://issues.apache.org/jira/browse/IGNITE-12367],
control.sh always returns 0, even though CommandHandler returns the 
correct code.

For example:
Ignite 2.8.1
{code}
Failed to execute baseline command='collect'
Latest topology update failed.
Connection to cluster failed. Latest topology update failed.
Command [BASELINE] finished with code: 2
Control utility has completed execution at: 2020-08-05T15:01:34.123
Execution time: 26627 ms
>>> echo $?
0
{code}






[jira] [Created] (IGNITE-13176) C++: Remove autotools build after merging CMake

2020-06-23 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13176:
-

 Summary: C++: Remove autotools build after merging CMake
 Key: IGNITE-13176
 URL: https://issues.apache.org/jira/browse/IGNITE-13176
 Project: Ignite
  Issue Type: Improvement
  Components: platforms
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.9








[jira] [Created] (IGNITE-13187) Jar hell in classpath leads to failed tests in C++ and .NET suites

2020-06-25 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13187:
-

 Summary: Jar hell in classpath leads to failed tests in C++ and 
.NET suites
 Key: IGNITE-13187
 URL: https://issues.apache.org/jira/browse/IGNITE-13187
 Project: Ignite
  Issue Type: Test
  Components: platforms
Affects Versions: 2.8.1
 Environment: Apache Ignite TC.
Reporter: Ivan Daschinskiy


On some agents, tests and examples start failing with this stack trace:

{code:java}
[13:53:52]java.lang.NoSuchFieldError: logger
[13:53:52]  at 
org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:723)
[13:53:52]  at 
org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:881)
[13:53:52]  at 
org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:551)
[13:53:52]  at 
org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.applicationContext(IgniteSpringHelperImpl.java:381)
[13:53:52]  at 
org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.loadConfigurations(IgniteSpringHelperImpl.java:104)
[13:53:52]  at 
org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.loadConfigurations(IgniteSpringHelperImpl.java:98)
[13:53:52]  at 
org.apache.ignite.internal.IgnitionEx.loadConfigurations(IgnitionEx.java:709)
[13:53:52]  at 
org.apache.ignite.internal.IgnitionEx.loadConfiguration(IgnitionEx.java:767)
[13:53:52]  at 
org.apache.ignite.internal.processors.platform.PlatformIgnition.configuration(PlatformIgnition.java:152)
[13:53:52]  at 
org.apache.ignite.internal.processors.platform.PlatformIgnition.start(PlatformIgnition.java:67)
{code}


The root cause of the failure is jar hell. When .NET or C++ tests are started 
with IGNITE_NATIVE_TEST_CLASSPATH set to true, the source directory is iterated 
and libs, target/classes, etc. are added to the classpath. But none of 
readdir(), FindNextFileA() or Directory.EnumerateDirectories() guarantees any 
ordering. spring-data-2.0 and spring-data-2.2 contain different versions of 
Spring, so jar hell occurs and the tests fail. 
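
The ordering problem itself can be illustrated in Java, where 
File.listFiles() makes no ordering promise either. The sketch below sorts the 
entries before joining them, which makes the resulting classpath repeatable 
across agents. It is one possible mitigation shown under assumptions, not 
necessarily the fix chosen for this issue, and the class and method names are 
hypothetical. Sorting alone does not remove the conflicting Spring versions; 
it only makes the outcome deterministic.

```java
import java.io.File;
import java.util.Arrays;
import java.util.stream.Collectors;

class ClasspathOrderSketch {
    /** Joins the entries of {@code dir} into a classpath string in sorted, repeatable order. */
    static String classpathOf(File dir) {
        File[] entries = dir.listFiles();

        if (entries == null)
            throw new IllegalArgumentException("Not a readable directory: " + dir);

        // The listing order is unspecified, so sort by path explicitly.
        Arrays.sort(entries);

        return Arrays.stream(entries)
            .map(File::getPath)
            .collect(Collectors.joining(File.pathSeparator));
    }
}
```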





[jira] [Created] (IGNITE-13291) Remove unnecessary dependency to curator-client from ZookeeperDiscoverySpi

2020-07-23 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13291:
-

 Summary: Remove unnecessary dependency to curator-client from 
ZookeeperDiscoverySpi
 Key: IGNITE-13291
 URL: https://issues.apache.org/jira/browse/IGNITE-13291
 Project: Ignite
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 2.9
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.10


Currently, I suppose by mistake, we use 
{{org.apache.curator.utils.PathUtils#validatePath(java.lang.String)}} from 
{{curator-client}}
in {{ZookeeperDiscoverySpi}}. Generally, this discovery implementation doesn't 
depend on the Curator framework at all, except in some test code. We should 
remove this dependency and add this utility method to our codebase.
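
A local replacement could look roughly like the sketch below. It is a 
simplified illustration that covers only a few basic rules for a ZooKeeper 
path (absolute, no trailing slash, no empty or relative components, no null 
character). It is not a copy of the Curator method, which performs more 
checks, and the class name is hypothetical.

```java
class ZkPathSketch {
    /** Returns {@code path} unchanged if it passes the basic checks, otherwise throws. */
    static String validatePath(String path) {
        if (path == null)
            throw new IllegalArgumentException("Path cannot be null");

        if (path.isEmpty())
            throw new IllegalArgumentException("Path length must be > 0");

        if (path.charAt(0) != '/')
            throw new IllegalArgumentException("Path must start with / character");

        // The root path "/" is valid on its own.
        if (path.length() == 1)
            return path;

        if (path.charAt(path.length() - 1) == '/')
            throw new IllegalArgumentException("Path must not end with / character");

        // Limit -1 keeps empty components, so "//" is detected below.
        for (String part : path.substring(1).split("/", -1)) {
            if (part.isEmpty())
                throw new IllegalArgumentException("Empty node name in path: " + path);

            if (part.equals(".") || part.equals(".."))
                throw new IllegalArgumentException("Relative paths are not allowed: " + path);

            if (part.indexOf('\0') >= 0)
                throw new IllegalArgumentException("Null character is not allowed: " + path);
        }

        return path;
    }
}
```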





[jira] [Created] (IGNITE-13292) Remove unneeded ZkPinger from ZookeeperDiscovery

2020-07-23 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13292:
-

 Summary: Remove unneeded ZkPinger from ZookeeperDiscovery
 Key: IGNITE-13292
 URL: https://issues.apache.org/jira/browse/IGNITE-13292
 Project: Ignite
  Issue Type: Improvement
  Components: zookeeper
Affects Versions: 2.9
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.10


We need to remove the unneeded {{ZkPinger}}, introduced in 
[IGNITE-9683|https://issues.apache.org/jira/browse/IGNITE-9683], from our 
codebase. This pinger was introduced to solve issues with server node 
segmentation when a cluster is deactivated. The root cause of those issues is a 
strange freeze of all threads when a huge amount of memory is deallocated with 
{{Unsafe.freeMemory}}; such a freeze can last for a minute or more. So this 
pinger doesn't solve the problem at all, and this has been proven. The working 
solution to this problem was introduced in
[IGNITE-9658|https://issues.apache.org/jira/browse/IGNITE-9658]





[jira] [Created] (IGNITE-13308) C++: Thin client transactions

2020-07-28 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13308:
-

 Summary: C++: Thin client transactions
 Key: IGNITE-13308
 URL: https://issues.apache.org/jira/browse/IGNITE-13308
 Project: Ignite
  Issue Type: Improvement
  Components: platforms
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
 Fix For: 2.10








[jira] [Created] (IGNITE-13967) Refactor and imrpove performance of python thin client marshaller

2021-01-11 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13967:
-

 Summary: Refactor and imrpove performance of python thin client 
marshaller
 Key: IGNITE-13967
 URL: https://issues.apache.org/jira/browse/IGNITE-13967
 Project: Ignite
  Issue Type: Improvement
  Components: thin client
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


Currently implemented serialization has questionable design and suffers from 
some problems
1. It is tightly coupled with Client object
2. It doesn't use protocol feature that total length of message is in the 
header,
thus it constantly load from Client some data instead of iteration over byte 
array.
3. It uses some tricky hacks and sometimes new connection is created when 
deserializing object.
4. It constantly allocates bytes (immutable data structure).


I suggest rewriting serialization and deserialization:
1. Pass a specific SerDe context + BytesIO to the corresponding methods.
2. The context can be sync or async and contains specific flags and methods for 
loading/uploading binary object schemas.
3. Refactor Client to retrieve the full packet from the socket at once and then 
pass the full packet further.

These steps can significantly improve performance, reduce the number of 
allocations and lay the foundation for implementing an asyncio version of the 
client.
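A minimal sketch of the proposed flow, not the actual pyignite code. It assumes that every message starts with a 4-byte little-endian length header; the names SerDeContext, read_packet and parse_int_list are hypothetical and exist only for illustration. The full packet is read once, and parsers then work on a BytesIO plus a context instead of the Client.

```python
import io
import struct
from dataclasses import dataclass

# Assumption: every message starts with a 4-byte little-endian length header.
LENGTH_HEADER = struct.Struct('<i')


@dataclass
class SerDeContext:
    """Hypothetical context handed to parsers instead of the Client itself."""
    is_async: bool = False
    compact_footer: bool = True


def read_packet(recv) -> bytes:
    """Read one full packet, using the length stored in the header.

    `recv(n)` is any callable returning up to n bytes (e.g. socket.recv).
    """
    def read_exact(n: int) -> bytes:
        buf = bytearray()
        while len(buf) < n:
            chunk = recv(n - len(buf))
            if not chunk:
                raise ConnectionError('connection closed while reading packet')
            buf += chunk
        return bytes(buf)

    (length,) = LENGTH_HEADER.unpack(read_exact(LENGTH_HEADER.size))
    return read_exact(length)


def parse_int_list(stream: io.BytesIO, ctx: SerDeContext) -> list:
    """Toy parser: an int32 count followed by that many int32 values.

    It only touches the in-memory stream; `ctx` is where schema loading
    hooks would live in a real parser (unused in this toy example).
    """
    (count,) = struct.unpack('<i', stream.read(4))
    return list(struct.unpack(f'<{count}i', stream.read(4 * count)))
```

Because parsers only see the stream and the context, the same parsing code could be shared by a sync and an async client; only the packet retrieval would differ.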





[jira] [Created] (IGNITE-13872) Hardcoded retry timeout in ZookeeperClient

2020-12-17 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13872:
-

 Summary: Hardcoded retry timeout in ZookeeperClient
 Key: IGNITE-13872
 URL: https://issues.apache.org/jira/browse/IGNITE-13872
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


Currently, the retry timeout is hardcoded (2000 ms) in ZookeeperClient. We 
should calculate this timeout using some strategy that depends on the session 
timeout.
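One possible strategy, sketched in Python for illustration only (ZookeeperClient itself is Java). The issue does not name a strategy; the divisor and the bounds below are assumptions, and retry_timeout_ms is a hypothetical helper.

```python
def retry_timeout_ms(session_timeout_ms: int, divisor: int = 10,
                     min_ms: int = 100, max_ms: int = 2000) -> int:
    """Derive the retry timeout from the session timeout.

    Illustrative strategy: take a fixed fraction of the session timeout and
    clamp it, so several retries always fit into one session. The default
    divisor and bounds are assumptions, not values from the issue.
    """
    if session_timeout_ms <= 0:
        raise ValueError('session timeout must be positive')
    return max(min_ms, min(max_ms, session_timeout_ms // divisor))
```

With these defaults, a 10000 ms session timeout gives a 1000 ms retry timeout, and the old 2000 ms value is only reached for session timeouts of 20000 ms and above.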





[jira] [Created] (IGNITE-13903) Python thin client tests automation.

2020-12-24 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13903:
-

 Summary: Python thin client tests automation.
 Key: IGNITE-13903
 URL: https://issues.apache.org/jira/browse/IGNITE-13903
 Project: Ignite
  Issue Type: Improvement
  Components: python
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


It would be nice to further improve the development process of 
python-thin-client:
1. Add docker-compose.yml to simplify local development.
2. Add tox.ini to simplify test automation.
3. Integrate the travis-ci build.





[jira] [Created] (IGNITE-13882) Support configurable install root and work root for ducktape

2020-12-21 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13882:
-

 Summary: Support configurable install root and work root for 
ducktape
 Key: IGNITE-13882
 URL: https://issues.apache.org/jira/browse/IGNITE-13882
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy








[jira] [Created] (IGNITE-13690) Failed to init coordinator caches on concurrent start of nodes with different cache configurations.

2020-11-10 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13690:
-

 Summary: Failed to init coordinator caches on concurrent start of 
nodes with different cache configurations.
 Key: IGNITE-13690
 URL: https://issues.apache.org/jira/browse/IGNITE-13690
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.9
Reporter: Ivan Daschinskiy


Scenario:
1. Start nodes with different cache configurations simultaneously
(for simplicity, let the client nodes have configured caches and the servers 
none).
2. When processing the first exchange, the coordinator fails with:

{code:java}
[2020-11-10 13:23:57,232][ERROR][start-node-1][DifferentCacheConfigurationConcurrentStart0] Got exception while starting (will rollback startup routine).
java.lang.AssertionError: Invalid exchange futures state [cur=6, total=7]
    at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$17.applyx(CacheAffinitySharedManager.java:1964)
    at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$17.applyx(CacheAffinitySharedManager.java:1935)
    at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.lambda$forAllRegisteredCacheGroups$e0a6939d$1(CacheAffinitySharedManager.java:1265)
    at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11157)
    at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11059)
    at org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11039)
    at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.forAllRegisteredCacheGroups(CacheAffinitySharedManager.java:1264)
    at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.initCoordinatorCaches(CacheAffinitySharedManager.java:1935)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.initCoordinatorCaches(GridDhtPartitionsExchangeFuture.java:716)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:850)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3175)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3021)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.lang.Thread.run(Thread.java:748)
{code}


The main reason is a race on creating {{LocalJoinCachesContext}}, so the local 
join caches differ from the caches registered from other nodes. 

Reproducers for zk and ring discoveries are attached. 
NB! Not always reproducible. To increase the probability of failure, add a 
sleep in {{GridDhtPartitionsExchangeFuture#init}}:

{code:java}
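public void init(boolean newCrd) throws IgniteInterruptedCheckedException {
    // Added only for reproduction: delays init on a new coordinator.
    if (newCrd)
        U.sleep(500);
    // ... the original method body continues here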
{code}






[jira] [Created] (IGNITE-13699) Support new metrics framework in ZookeeperDiscovery

2020-11-12 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13699:
-

 Summary: Support new metrics framework in ZookeeperDiscovery
 Key: IGNITE-13699
 URL: https://issues.apache.org/jira/browse/IGNITE-13699
 Project: Ignite
  Issue Type: Improvement
  Components: zookeeper
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy








[jira] [Created] (IGNITE-13911) asyncio version of python ignite thin client

2020-12-25 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13911:
-

 Summary: asyncio version of python ignite thin client
 Key: IGNITE-13911
 URL: https://issues.apache.org/jira/browse/IGNITE-13911
 Project: Ignite
  Issue Type: Improvement
  Components: python
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy


Currently, asyncio is the default event loop and coroutine engine for Python 
3.6+. This approach can drastically improve the performance of IO-bound tasks, 
so it is important to implement an asyncio version of the Python Ignite client. 
The old synchronous version should remain and share common code with the 
asyncio version.





[jira] [Created] (IGNITE-13994) Rebalance huge cache for in-memory cluster

2021-01-14 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-13994:
-

 Summary: Rebalance huge cache for in-memory cluster
 Key: IGNITE-13994
 URL: https://issues.apache.org/jira/browse/IGNITE-13994
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Daschinskiy


There is some evidence that rebalancing a huge cache without rebalance 
throttling can cause OOM on the supplier. We need to cover this scenario.

Scenario:
1. Start two nodes and 1 replicated cache with a data region much larger than 
the heap.
2. Stop one of the nodes.
3. Load data into the cache, almost equal in size to the data region.
4. Start the stopped node.

The goal is to run experiments with the following parameters:
1. Heap size
2. Cache size
3. Rebalance batch size
4. Rebalance throttle





[jira] [Created] (IGNITE-14003) OOM on creating rebalance iterator while rebalancing cache with large values.

2021-01-16 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14003:
-

 Summary: OOM on creating rebalance iterator while rebalancing 
cache with large values.
 Key: IGNITE-14003
 URL: https://issues.apache.org/jira/browse/IGNITE-14003
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.9.1, 2.8.1, 2.9
Reporter: Ivan Daschinskiy


Scenario:
1. Start a replicated cache on an Ignite node with a memory region of approx. 
6 Gb and a 1 Gb heap.
2. Load a significant amount of data into the cache, with values of approx. 
200 Kb (~20K kv pairs).
3. Start another node.

The first node (supplier) crashes with OOM while initializing the rebalance 
iterator.
Main reason: all values pointed to from a BTree leaf are loaded into the 
buffer in BPlusTree#ForwardCursor. For a replicated cache, iterators for each 
of the 512 partitions are created at once.

Reproducer is attached.






[jira] [Created] (IGNITE-14046) Update ducktape version to 0.8.2 in ignite-ducktape module

2021-01-25 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14046:
-

 Summary: Update ducktape version to 0.8.2 in ignite-ducktape module
 Key: IGNITE-14046
 URL: https://issues.apache.org/jira/browse/IGNITE-14046
 Project: Ignite
  Issue Type: Task
Reporter: Ivan Daschinskiy








[jira] [Created] (IGNITE-14074) Add ability to skip affinity tests for testing with older ignite versions.

2021-01-27 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14074:
-

 Summary: Add ability to skip affinity tests for testing with older 
ignite versions.
 Key: IGNITE-14074
 URL: https://issues.apache.org/jira/browse/IGNITE-14074
 Project: Ignite
  Issue Type: Task
  Components: python, thin client
Reporter: Ivan Daschinskiy








[jira] [Created] (IGNITE-14154) Remove test test_unsupported_affinity_cache_operation_routed_to_random_node

2021-02-10 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14154:
-

 Summary: Remove test 
test_unsupported_affinity_cache_operation_routed_to_random_node
 Key: IGNITE-14154
 URL: https://issues.apache.org/jira/browse/IGNITE-14154
 Project: Ignite
  Issue Type: Test
Reporter: Ivan Daschinskiy


Currently, this test is simply the same as 
test_replicated_cache_operation_routed_to_random_node, but it requires a custom 
affinity function, which was introduced only in 2.9.1. I suggest removing it.





[jira] [Created] (IGNITE-14167) Concurrency issues in reconnect and too short backoff strategy for reconnect timeout.

2021-02-11 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14167:
-

 Summary: Concurrency issues in reconnect and too short backoff 
strategy for reconnect timeout.
 Key: IGNITE-14167
 URL: https://issues.apache.org/jira/browse/IGNITE-14167
 Project: Ignite
  Issue Type: Bug
  Components: python, thin client
Reporter: Ivan Daschinskiy


Currently, the code in the Connection class is not properly synchronized, and 
the socket can be set to None when reconnecting and sending requests 
simultaneously.

Also, the reconnection attempts are too short (8 Fibonacci sequence items), 
totaling only 33 sec.

These issues lead to flaky tests in the affinity suite.
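The 33 sec total can be checked with a short sketch. It assumes that the delay sequence starts at 0 (0, 1, 1, 2, ...), which is the reading under which 8 items sum to 33; backoff_delays is a hypothetical helper, not the client's actual code.

```python
from itertools import islice


def fibonacci():
    """Yield the Fibonacci sequence starting from 0: 0, 1, 1, 2, 3, ..."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def backoff_delays(attempts: int) -> list:
    """Return the reconnect delays (in seconds) for the given attempt count."""
    return list(islice(fibonacci(), attempts))
```

Under this assumption, backoff_delays(8) is [0, 1, 1, 2, 3, 5, 8, 13], which sums to 33 sec. Two more attempts would add delays of 21 and 34 sec and bring the total to 88 sec.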





[jira] [Created] (IGNITE-14072) Remove copy-paste of response for different versions

2021-01-27 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14072:
-

 Summary: Remove copy-paste of response for different versions
 Key: IGNITE-14072
 URL: https://issues.apache.org/jira/browse/IGNITE-14072
 Project: Ignite
  Issue Type: Task
  Components: python, thin client
Reporter: Ivan Daschinskiy


Currently, there is a lot of common code in the classes Response140, 
SqlResponse140, Response130 and SqlResponse130. This should be fixed.





[jira] [Created] (IGNITE-14429) Python thin client cache.get_size works not as expected and PeekModes are incorrect.

2021-03-26 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14429:
-

 Summary: Python thin client cache.get_size works not as expected 
and PeekModes are incorrect.
 Key: IGNITE-14429
 URL: https://issues.apache.org/jira/browse/IGNITE-14429
 Project: Ignite
  Issue Type: Bug
  Components: python, thin client
Reporter: Ivan Daschinskiy


1. PeekModes is now ByteArray, so the class variables should be changed.
Currently these values are incorrect and look like masks. They should be 
changed to ordinal values in order to match the Java enum.
2. By default, peek_modes in get_size should be None, not 0.
* If 0 is passed, the behaviour is not that of PeekModes.ALL, but that of 
PeekModes.PRIMARY.






[jira] [Created] (IGNITE-14418) Document asyncio version of python ignite thin client

2021-03-25 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14418:
-

 Summary: Document asyncio version of python ignite thin client
 Key: IGNITE-14418
 URL: https://issues.apache.org/jira/browse/IGNITE-14418
 Project: Ignite
  Issue Type: New Feature
  Components: python, thin client
Reporter: Ivan Daschinskiy








[jira] [Created] (IGNITE-14444) Move affinity calculation and storage to client

2021-03-30 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14444:
-

 Summary: Move affinity calculation and storage to client
 Key: IGNITE-14444
 URL: https://issues.apache.org/jira/browse/IGNITE-14444
 Project: Ignite
  Issue Type: Improvement
  Components: python
Reporter: Ivan Daschinskiy


In the current implementation, affinity storage and affinity calculation are 
located in Cache.
This is not optimal:
1. Affinity is not shared between Cache instances with the same name.
2. Affinity mapping is requested per cache, which adds extra load.
3. If we start implementing transactions or expiry policy, this can become an 
issue.

I propose to move affinity storage to Client and AioClient.





[jira] [Created] (IGNITE-14472) Performance drop on primitive operations.

2021-04-02 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14472:
-

 Summary: Performance drop on primitive operations.
 Key: IGNITE-14472
 URL: https://issues.apache.org/jira/browse/IGNITE-14472
 Project: Ignite
  Issue Type: Bug
  Components: python, thin client
Affects Versions: python-0.4.0
Reporter: Ivan Daschinskiy


Reason for the performance drop: the header struct of Response is not cached 
(it is now an instance variable; earlier it was a class variable).

The performance drop is approx. 15%.
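The difference between the two variants can be sketched as follows. The class names and the header layout ('<iqi') are hypothetical, and the real client may build its header type differently; the sketch only illustrates per-instance construction versus a class-level object built once.

```python
import struct
import timeit


class ResponsePerInstance:
    """Header struct rebuilt for every response (the slower variant)."""

    def __init__(self):
        # A new Struct object is created on each instantiation.
        self.header = struct.Struct('<iqi')


class ResponseCached:
    """Header struct built once and shared by all instances."""

    # Created a single time, when the class body is executed.
    header = struct.Struct('<iqi')


def time_instantiation(cls, number: int = 100_000) -> float:
    """Return the seconds spent creating `number` instances of `cls`."""
    return timeit.timeit(cls, number=number)
```

Comparing time_instantiation(ResponsePerInstance) with time_instantiation(ResponseCached) shows the per-response overhead on a given machine; the sketch makes no claim about the size of the difference.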





[jira] [Created] (IGNITE-14422) Version management for ducktape.

2021-03-26 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14422:
-

 Summary: Version management for ducktape.
 Key: IGNITE-14422
 URL: https://issues.apache.org/jira/browse/IGNITE-14422
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Daschinskiy


I propose the following:

1. Add a sub-task to the `update-versions` task that bumps the version in 
`ignitetests.__init__.py` (i.e. `2.11-SNAPSHOT` to `2.11-dev`).
2. Change `ignitetests.versions.DEV` to `IgniteVersion(ignitetests.__version__)`.

This automatically sets `DEV` as the latest version. 
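The version mapping in step 1 could look like the sketch below. to_dev_version is a hypothetical helper, not part of the existing `update-versions` task; it only shows the `2.11-SNAPSHOT` to `2.11-dev` conversion.

```python
def to_dev_version(maven_version: str) -> str:
    """Map a Maven snapshot version to the ducktape-style dev version.

    Versions without the -SNAPSHOT suffix are returned unchanged.
    """
    suffix = '-SNAPSHOT'
    if maven_version.endswith(suffix):
        return maven_version[:-len(suffix)] + '-dev'
    return maven_version
```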





[jira] [Created] (IGNITE-14245) Infinite loop while trying to get affinity mapping on failed node

2021-02-25 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14245:
-

 Summary: Infinite loop while trying to get affinity mapping on 
failed node
 Key: IGNITE-14245
 URL: https://issues.apache.org/jira/browse/IGNITE-14245
 Project: Ignite
  Issue Type: Bug
  Components: python, thin client
Reporter: Ivan Daschinskiy


Currently, it is possible to get stuck in an infinite loop trying to reconnect 
to a failed node while requesting affinity mapping.





[jira] [Created] (IGNITE-14240) Handle authentication error on python thin client properly

2021-02-24 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14240:
-

 Summary: Handle authentication error on python thin client properly
 Key: IGNITE-14240
 URL: https://issues.apache.org/jira/browse/IGNITE-14240
 Project: Ignite
  Issue Type: Bug
  Components: python, thin client
Reporter: Ivan Daschinskiy








[jira] [Created] (IGNITE-14186) Develop C module for python thin client to speedup hashcode calculation

2021-02-15 Thread Ivan Daschinskiy (Jira)
Ivan Daschinskiy created IGNITE-14186:
-

 Summary: Develop C module for python thin client to speedup 
hashcode calculation
 Key: IGNITE-14186
 URL: https://issues.apache.org/jira/browse/IGNITE-14186
 Project: Ignite
  Issue Type: Improvement
  Components: python, thin client
Reporter: Ivan Daschinskiy


Pure Python calculation of hashcode is very slow. It leads to inadequate 
performance of simple operations.

For example, putting an object with 1 Mb of data takes 500 ms. 
After rewriting hashcode in C, the operation took only 7 ms.
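To show why a pure Python hashcode is slow, here is a sketch of a Java-style 31-based polynomial hash over a string. It is an illustration, not the client's actual function: java_string_hashcode is a hypothetical name, and the result matches Java's String.hashCode() only for characters in the Basic Multilingual Plane. The loop runs one interpreted iteration per element, which adds up on 1 Mb of data.

```python
def java_string_hashcode(s: str) -> int:
    """Compute a Java-compatible String.hashCode() for BMP-only strings."""
    h = 0
    for ch in s:
        # Emulate 32-bit integer overflow on every step.
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Convert the unsigned 32-bit value to a signed Java int.
    return h - 0x100000000 if h >= 0x80000000 else h
```

For example, java_string_hashcode('abc') returns 96354, the same value as "abc".hashCode() in Java.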


